IPFIX Working Group B. Trammell
Internet-Draft CERT/NetSA
Intended status: Informational E. Boschi
Expires: April 23, 2007 Hitachi Europe
L. Mark
T. Zseby
Fraunhofer FOKUS
October 20, 2006
An IPFIX-Based File Format
draft-trammell-ipfix-file-02.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 23, 2007.
Copyright Notice
Copyright (C) The Internet Society (2006).
Abstract
This document describes a file format for the storage of flow data
based upon the IPFIX message format. It proposes a set of
requirements for flat-file, binary flow data file formats, evaluates
flow storage systems presently in use for their conformance to these
Trammell, et al. Expires April 23, 2007 [Page 1]
Internet-Draft IPFIX Files October 2006
requirements, then applies the IPFIX message format to these
requirements to build a new file format. This IPFIX-based file
format is designed to facilitate interoperability and reusability
among a wide variety of flow storage, processing, and analysis tools.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.1. Record Format Flexibility . . . . . . . . . . . . . . . . 6
4.2. Self Description . . . . . . . . . . . . . . . . . . . . . 7
4.3. Data Compression . . . . . . . . . . . . . . . . . . . . . 7
4.4. Indexing and Searching . . . . . . . . . . . . . . . . . . 8
4.5. Data Integrity . . . . . . . . . . . . . . . . . . . . . . 8
4.6. Creator Authentication and Confidentiality . . . . . . . . 9
4.7. Anonymization and Obfuscation . . . . . . . . . . . . . . 9
4.8. Performance Characteristics . . . . . . . . . . . . . . . 10
5. Survey of Existing Flow and Trace File Formats . . . . . . . . 10
5.1. NetFlow V5/V7 . . . . . . . . . . . . . . . . . . . . . . 10
5.2. Argus 2 . . . . . . . . . . . . . . . . . . . . . . . . . 10
5.3. SiLK . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
5.4. libpcap dumpfile . . . . . . . . . . . . . . . . . . . . . 11
6. IPFIX File Format Description . . . . . . . . . . . . . . . . 12
6.1. Recommended Information Elements for IPFIX Files . . . . . 14
6.1.1. informationElementId . . . . . . . . . . . . . . . . . 14
6.1.2. informationElementAnonymizationType . . . . . . . . . 14
6.1.3. informationElementSemanticType . . . . . . . . . . . . 15
6.1.4. informationElementStorageType . . . . . . . . . . . . 15
6.1.5. messageMD5Checksum . . . . . . . . . . . . . . . . . . 16
6.1.6. messageScope . . . . . . . . . . . . . . . . . . . . . 16
6.1.7. privateEnterpriseNumber . . . . . . . . . . . . . . . 17
6.2. Recommended Options Templates for IPFIX Files . . . . . . 17
6.2.1. Information Element Semantics Options Template . . . . 17
6.2.2. Message Checksum Options Template . . . . . . . . . . 18
6.2.3. Template Anonymization Options Template . . . . . . . 19
6.3. Recommended Compression Strategy for File Writers . . . . 20
7. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
8. Security Considerations . . . . . . . . . . . . . . . . . . . 21
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
10. Open Issues and Notes . . . . . . . . . . . . . . . . . . . . 21
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22
12.1. Normative References . . . . . . . . . . . . . . . . . . . 22
12.2. Informative References . . . . . . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22
Intellectual Property and Copyright Statements . . . . . . . . . . 24
1. Introduction
This document proposes a file format based upon IPFIX. It begins by
exploring the motivation for proposing a standardized flow file
format, and using IPFIX as the basis for this new file format. It
then proposes a set of requirements for this file format, evaluates
existing flow storage file formats for their conformance to these
requirements, and describes either how the IPFIX message format meets
each requirement, how a file format based upon it could meet the
requirement, or how the message format must be extended to meet the
requirement. It closes by proposing an initial specification of the
new file format and providing examples of IPFIX Files meeting this
specification.
The purpose of this revision of the document is to foster discussion
on the requirements and the initial proposed design of this new file
format. It aims to do so without requiring any protocol or message
format extensions, as such are currently out of scope for the IPFIX
working group. Requirements proposed in this document which cannot
be met without such extensions are out of scope for this revision,
and may be addressed in other Internet-Drafts.
2. Terminology
Terms used in this document that are defined in the Terminology
section of the IPFIX Protocol [I-D.ietf-ipfix-protocol] document are
to be interpreted as defined there.
IPFIX File: An IPFIX File is a serialized stream of IPFIX Messages
stored on a filesystem. Any IPFIX Message stream that would be
considered valid when transported over one or more of the specified
IPFIX transports (SCTP, TCP, or UDP) as defined in the IPFIX
Protocol draft [I-D.ietf-ipfix-protocol] is considered an IPFIX
File for purposes of this draft; however, this draft further
restricts that definition with recommendations on the construction
of IPFIX Files that meet the requirements identified herein.
IPFIX File Reader: An IPFIX File Reader is a Process which reads
IPFIX Files from a filesystem, and is analogous to an IPFIX
Collecting Process. An IPFIX File Reader MUST behave as an IPFIX
Collecting Process as outlined in the IPFIX Protocol draft
[I-D.ietf-ipfix-protocol], except as modified by this document.
IPFIX File Writer: An IPFIX File Writer is a Process which writes
IPFIX Files to a filesystem, and is analogous to an IPFIX
Exporting Process. An IPFIX File Writer MUST behave as an IPFIX
Exporting Process as outlined in the IPFIX Protocol draft
[I-D.ietf-ipfix-protocol], except as modified by this document.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Motivation
There are a wide variety of applications for the file-based storage
of IP flow data, across a continuum of time scales. Tools used in
the analysis of flow data and creation of analysis products often use
files as a convenient unit of work, with an ephemeral lifetime. A
set of flows relevant to a security investigation may be stored in a
file for the duration of that investigation, and further exchanged
among incident handlers via email or within an external incident
handling workflow application. Sets of flow data relevant to
Internet measurement research may be published as files, much as
libpcap packet trace files are, to provide common data sets for the
repeatability of research efforts; these files would have lifetimes
measured in months or years. Operational flow measurement systems
also have a need for long-term, archival storage of flow data, either
as a primary flow data repository, or as a backing tier for online
storage in a relational database management system (RDBMS).
The variety of applications of flow data, and the variety of
presently deployed storage approaches, would seem to indicate the
need for a standard approach to flow storage with applicability
across the continuum of time scales over which flow data is stored.
A storage format based around flat files would best address the
variety of storage requirements. While much work has been done on
structured storage via RDBMS, relational database systems are not a
good basis for format standardization owing to the fact that their
internal data structures are generally private to a single
implementation and subject to change for internal reasons. Also,
there are a wide variety of operations available on flat files, and
external tools and standards can be leveraged to meet file-based flow
storage requirements. Further, flow data is often not very
semantically complicated and is managed in very high volume; an
RDBMS-based flow storage system would therefore not benefit much
from the advantages of relational database technology.
The simplest way to create a new file format is simply to serialize
some internal data model to disk, with either textual or binary
representation of data elements, and some framing strategy for
delimiting fields and records. "Ad-hoc" file formats such as this
have several important disadvantages. One, they impose the semantics
of the data model from which they are derived on the file format; as
such, they are difficult to extend, describe, and standardize.
The emergence over the past decade of XML as a new "universal"
framing format for flat as well as hierarchical data addresses these
concerns; however, XML is not necessarily ideal for a storage format
for flow data. First, flow data, being inherently simple and record-
oriented, does not benefit from the more advanced semantics available
with XML. There is not much to be gained by describing each record
individually when the records all have the same format, or one of a
small set of formats. Second, XML processing introduces potentially
significant overhead. While an XML stream should in theory be
approximately as compressible as any other stream representation, the
additional compression/decompression and generation/parsing of XML
data is not worth the benefit in this case.
This leads us to propose the IPFIX message format as the basis for a
new flow data file format. The IPFIX working group, in defining the
IPFIX protocol, has already defined an information model and data
formatting rules for representation of flow data. Especially at
shorter time scales, when a file is a unit of data interchange, the
filesystem may be viewed as simply another IPFIX message transport
between processes. This format is especially well suited to
representing flow data, as it was designed specifically for flow data
export; it is easily extensible unlike ad-hoc serialization, and
compact unlike XML. In addition, IPFIX is an emerging standard for
the export and collection of flow data; using a common format for
storage and analysis at the collection side allows implementors to
use substantially the same information model and data formatting
implementation for transport as well as storage.
4. Requirements
In this section, we outline a proposed set of requirements for any
persistent storage format for flow data. First and foremost, a flow
data file format should support storage across the continuum of time
scales important to flow storage applications. Each of the
requirements enumerated in the sections below is broadly applicable
to flow storage applications, though each may be more important at
certain time scales. For each, we first identify the requirement,
then explain how the IPFIX message format addresses it, or briefly
outline the changes that must be made in order for an IPFIX-based
file format to meet the requirement.
4.1. Record Format Flexibility
Due to the wide variety of flow attributes collected by different
network flow attribute measurement systems, the ideal flow storage
format will not impose a single data model or a specific record type
on the flows it stores. The file format must be flexible and
extensible; that is, it must support multiple record types definable
within the file itself, and must be able to support new field types
for data within the records in a graceful way.
IPFIX provides extensibility through the use of Templates to describe
each Data Record, through the use of an IANA Registry to define its
Information Elements, and through the use of enterprise-specific
Information Elements.
4.2. Self Description
Archived data may be read at a time in the future when any external
reference to the meaning of the data may have been lost. The ideal flow
storage format should be self-describing; that is, a process reading
flow data from storage should be able to properly interpret the
stored flows without reference to anything other than standard
sources (e.g., the standards document describing the file format) and
the stored flow data itself.
The IPFIX message format is partially self-describing; that is, IPFIX
Templates containing only IANA-assigned Information Elements can be
completely interpreted according to the IPFIX Information Model
without additional external data.
However, Templates containing private information elements lack
detailed type and semantic information; a Collecting Process
receiving data described by a template containing private Information
Elements it does not understand can only treat the data contained
within those Information Elements as octet arrays. To be fully self-
describing, enterprise-specific Information Elements must be
additionally described via IPFIX Options according to the Information
Element Semantics Options Template defined below.
4.3. Data Compression
Regardless of the representation format, flow data describing traffic
on real networks tends to be highly compressible. Compression tends
to improve the scalability of flow collection systems, by reducing
the disk storage and I/O bandwidth requirement for a given workload.
The ideal flow storage format should support applications which wish
to leverage this fact by supporting compression of stored data.
The IPFIX message format has no support for data compression, as the
IPFIX protocol was designed for speed and simplicity of export. Of
course, any flat file is readily compressible using a wide variety of
external data compression tools, formats, and algorithms; therefore,
this requirement can be met externally.
However, a couple of simple optimizations can be made by File Writers
to increase the integrity and usability of compressed IPFIX data;
these are outlined in the Recommended Compression Strategy, which
appears below.
4.4. Indexing and Searching
Binary, record stream oriented file formats natively support only one
form of searching: sequential scan in file order. By choosing the
order of records in a file carefully (e.g., by time), a file can be
"indexed" by a single key. Adding additional indexes to the file can
speed searches considerably. The ideal flow storage format will
support a method for noting that the records in a file are sorted by
a certain key or set of keys, and for providing index information for
keys on which the file is not sorted.
There is presently no support for indexing or sort order notation in
the IPFIX message format. If internal indexing is required, it would
need to be added to an IPFIX-based file format by extension. This
revision of this draft does not address this requirement further,
though it may be addressable without protocol or message format
changes.
4.5. Data Integrity
When storing flow data over long time scales, especially for archival
purposes, it is important to ensure that hardware or software faults
do not introduce errors into the data over time. The ideal flow
storage format will support the detection and correction of encoding-
level errors in the data.
Note that more advanced error correction is almost certainly best
handled at a layer below that addressed by this document. Error
correction is a topic well addressed by the storage industry in
general (e.g., by RAID and other technologies), and by specifying a
flow storage format based upon files, we can leverage these features
to meet this requirement.
However, the ideal flow storage format will be resilient against
errors, providing an internal facility for the detection of errors
and the ability to isolate errors to as few data records as possible.
Note that this requirement interacts with the choice of data
compression algorithm. The use of block compression algorithms can
serve to isolate errors to a single compression block, unlike stream
compressors, which may fail to resynchronize after a single bit
error, invalidating the entire message stream. See the Recommended
Compression Strategy below for more on this interaction.
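As a non-normative illustration of this isolation property, the
following Python sketch compresses groups of messages as independent
zlib blocks and shows that damage to one block leaves the others
readable. The function names, the grouping of two messages per block,
and the choice of zlib are assumptions made for this example only;
they are not part of any format described in this document, and a
real file would additionally need framing to delimit the blocks.

```python
import zlib


def compress_blocks(messages, per_block=2):
    """Compress messages in independent zlib blocks of per_block messages."""
    blocks = []
    for i in range(0, len(messages), per_block):
        blocks.append(zlib.compress(b"".join(messages[i:i + per_block])))
    return blocks


def decompress_blocks(blocks):
    """Return (list of recovered block contents, indexes of damaged blocks)."""
    recovered, damaged = [], []
    for index, block in enumerate(blocks):
        try:
            recovered.append(zlib.decompress(block))
        except zlib.error:
            # Only the messages in this block are lost; decoding of the
            # following blocks does not depend on it.
            damaged.append(index)
    return recovered, damaged
```

Because each block is compressed on its own, the cost of isolation is
a somewhat lower compression ratio than a single stream over the whole
file would achieve.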
The IPFIX message format does not support data integrity assurance.
It is assumed that advanced error correction will be provided
externally. For simple error detection support, checksums may be
attached to messages via IPFIX Options according to the Message
Checksum Options Template defined below.
4.6. Creator Authentication and Confidentiality
Storage of flow data across long time scales may also require
assurance that no unauthorized entity can read or modify the stored
data. Asymmetric-key cryptography can be applied to this problem, by
signing flow data with the private key of the creator, and encrypting
it with the public keys of those authorized to read it. The ideal
flow storage format will support the encryption and signing of flow
data.
As with error correction, this problem has been addressed well at a
layer below that addressed by this document. Instead of specifying a
particular choice of encryption technology, we can leverage the fact
that existing cryptographic technologies work quite well on data
stored in files to meet this requirement.
Beyond support for the use of TLS for transport over TCP or SCTP,
both of which provide transient authentication and confidentiality,
the IPFIX message format does not support this requirement directly.
It is assumed that this requirement will be met externally.
4.7. Anonymization and Obfuscation
To ensure the privacy of individuals and organizations at the
endpoints of communications represented by flow records, it is often
necessary to obfuscate or anonymize stored and exported flow data.
The ideal flow storage format will provide for a notation that a
given information element on a given record type represents
anonymized, rather than real, data.
The IPFIX message format presently has no support for anonymization
notation. It should be noted that anonymization is one of the
requirements given for IPFIX in RFC 3917 [RFC3917]. The decision to
qualify this requirement with 'MAY' and not 'MUST' in the
requirements document, and its subsequent lack of specification in
the current version of the IPFIX protocol, is due to the fact that
anonymization algorithms are still a research issue, and that there
currently exist no standardized methods for anonymization.
Simple anonymization notation may be attached to templates via IPFIX
Options according to the Template Anonymization Options Template
defined below.
4.8. Performance Characteristics
The ideal standard flow storage format will not have a significant
negative impact on the performance of the application implementing
it. This is a non-functional requirement, but it is important to
note that a standard that implies a performance penalty is unlikely
to be widely implemented.
A static analysis of the IPFIX message format would seem to suggest
that implementations of it are not particularly prone to slowness;
indeed, a template-based data representation is more easily subject
to optimization for common cases than representations that embed
structural information directly in the data stream (e.g., XML).
However, a full analysis of the impact of using IPFIX messages
as a basis for flow data storage on read/write performance will
require more implementation experience and performance measurement.
5. Survey of Existing Flow and Trace File Formats
5.1. NetFlow V5/V7
One de facto standard for the storage of flow data collected via
Cisco NetFlow V5 or V7 is to serialize a stream of "raw" NetFlow
datagrams into files. These NetFlow PDU files consist of a
collection of header-prefixed blocks (corresponding to the datagrams
as received on the wire) containing fixed-length binary flow records.
NetFlow V5 and V7 data may be mixed within a given file, as the
header on each datagram defines the NetFlow version of the records
following; there is indeed very little difference between the two
record formats.
NetFlow V5/V7 PDU files are neither extensible nor self-describing;
however, their status as a de facto standard means the definition of
the data format is well-understood. Indexing, compression, error
detection and correction, authentication, and confidentiality must be
handled externally.
5.2. Argus 2
QoSient's Argus (as of version 2.0.6) uses a file format based upon a
stream of type-and-length prefixed records. There are two general
types of records in this stream, management records and flow records.
Management records export flow collection statistics, much like the
recommended scoped data records in the IPFIX protocol. Flow records
contain information about a single flow each, and are further typed
based upon the protocol of the flow (e.g., IP, ICMP, ARP). The Argus
file format natively supports bidirectional flow export, as each flow
record contains both forward and reverse counters.
The Argus tools support a transport protocol that simply encapsulates
a record stream over a TCP connection. Transport is collector-
initiated; that is, a collector establishes a connection to an
exporter in order to read a record stream.
Argus files are not self-describing; that is, only the Argus tools
themselves encapsulate the definition of each of the record types.
The Argus file format is not extensible without changing the Argus
implementation. Argus provides no indexing facility for its file
format, though records are roughly sorted by record generation time.
Compression, error correction, authentication, and confidentiality
are handled externally to the format, and are available as with all
files. There is no special support for data obfuscation in the
format.
5.3. SiLK
The CERT/NetSA SiLK tools use a set of fixed-length binary record
formats. Each file is prefixed with a header which denotes which
record format the file is stored in. These record formats are
differentiated by the presence or absence of certain fields; in this
way, each format identifier is essentially a short-hand identifier
for a template describing the record. This also implies that only
one type of record may be stored in any given file.
As with Argus, SiLK files are not self-describing and are not
extensible. SiLK provides no indexing facility, though files are
generally stored in flow end time order; and when used for archival
storage, information about sensors and flow times appearing in each
file is stored in the file path name. Compression is handled
internally to the file format, and allows the storage of compressed
data in a file with uncompressed headers, and a guarantee of
compression block boundary alignment with record boundaries. Error
correction, authentication, and confidentiality can be handled
externally. There is no special support for data obfuscation in the
SiLK file format.
5.4. libpcap dumpfile
The libpcap dumpfile format is a packet trace format rather than a
flow file format, so it does not address any of the requirements
outlined above. However, it is used widely in a use case (data
storage and distribution for network measurement research) similar to
one addressed by the format proposed in this draft, so we include it
here.
libpcap dumpfiles consist of a file header containing information
common to the whole file (most importantly, the datalink layer, for
interpretation of the datalink headers on each frame), followed by a
set of raw captured frame records each prefixed by a frame header
containing timestamp and length information. The format is not
particularly flexible or self-describing, nor does it need to be:
undecoded frames are about as semantically simple as network traffic
data can get.
However, the simplicity and ubiquity of the libpcap dumpfile format
has led to its becoming a de facto standard for the distribution of
packet trace data for Internet measurement applications. We propose
the file format described in this draft in part as an analogue to the
libpcap dumpfile format for flow data.
Note that libpcap dumpfiles could be used as a storage format for any
unidirectional, datagram-oriented protocol such as IPFIX or NetFlow,
simply by storing the captured export session. However, this has
several important drawbacks. First, the additional per-packet
headers provided by pcap are redundant in the case of IPFIX, as
length and export time are already available in the IPFIX Message
Header. Second, the link, network, and transport layer headers are
stored in a dumpfile; these are not necessary for the successful
interpretation of an IPFIX Message, and add additional decode
overhead. Third, a file created by capturing an export session may
require additional processing to reassemble fragmented datagrams in
the message stream.
6. IPFIX File Format Description
An IPFIX file, as defined by this draft and elaborated below, is at
its core simply an IPFIX Message stream serialized to some
filesystem. Any valid serialized IPFIX Message stream MUST be
accepted by a File Reader as a valid IPFIX file. In this way, the
filesystem is simply treated as another IPFIX Transport alongside
SCTP, TCP, and UDP, although one with unusually high latency, as the
File Reader and File Writer are not necessarily synchronized in time,
unlike IPFIX Collecting and Exporting Processes.
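The view of a file as a serialized message stream can be sketched as
follows. This Python fragment is illustrative only: it assumes the
16-octet Message Header (version, length, export time, sequence
number, observation domain ID) of the IPFIX Protocol draft
[I-D.ietf-ipfix-protocol], and the name read_messages is hypothetical.
It does not decode Sets; it only walks the stream message by message.

```python
import io
import struct

# Message Header: version, length, export time, sequence number,
# observation domain ID; 16 octets in network byte order.
HEADER = struct.Struct("!HHIII")
IPFIX_VERSION = 10


def read_messages(stream):
    """Yield (header_fields, body_bytes) for each IPFIX Message in stream."""
    while True:
        raw = stream.read(HEADER.size)
        if not raw:
            return  # clean end of file
        if len(raw) < HEADER.size:
            raise ValueError("truncated message header")
        version, length, export_time, sequence, domain = HEADER.unpack(raw)
        if version != IPFIX_VERSION:
            raise ValueError("not an IPFIX Message (version %d)" % version)
        if length < HEADER.size:
            raise ValueError("invalid message length %d" % length)
        body = stream.read(length - HEADER.size)
        if len(body) != length - HEADER.size:
            raise ValueError("truncated message body")
        yield (version, length, export_time, sequence, domain), body


# Usage: two messages with 4-octet placeholder bodies (not real Sets),
# serialized back to back as they would be in a file.
example = (HEADER.pack(10, 20, 0, 0, 1) + b"\x00" * 4 +
           HEADER.pack(10, 20, 0, 1, 1) + b"\x00" * 4)
message_count = sum(1 for _ in read_messages(io.BytesIO(example)))
```

Since the length field of each header delimits the message, no
additional framing is needed between messages in the file.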
An IPFIX File Reader MUST accept as valid any IPFIX message stream
that would be considered valid by one or more of the other defined
IPFIX transport layers. Practically, this means that the union of
template management features supported by SCTP, TCP, and UDP MUST be
supported in IPFIX Files:
o Template Sets and Options Template Sets MAY appear in the same
IPFIX Message as Data Sets, as with TCP and UDP.
o Template Sets that define already-defined templates may appear
multiple times in an IPFIX Message Stream, as they would with UDP
template retransmission (as described in section 10.3.6 of the
IPFIX Protocol draft [I-D.ietf-ipfix-protocol]). In the event of
a conflict between a resent definition and a previous definition,
the new template replaces the old, consistent with UDP template
expiration and ID reuse.
o Template Withdrawals (as described in section 8 of the IPFIX
Protocol draft [I-D.ietf-ipfix-protocol]) may appear and are valid
as long as the Template to be withdrawn is defined, as in TCP and
SCTP. However, as Template IDs may be directly reused as
described above, Template Withdrawals are completely optional in
IPFIX Files.
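The template management rules listed above might be realized in a
File Reader roughly as follows. This is a minimal sketch under stated
assumptions: the class name TemplateTable, its methods, and the
representation of a template as a sequence of (Information Element
ID, length) pairs are all hypothetical and chosen for illustration.

```python
class TemplateTable:
    """Templates known to a File Reader, keyed by Template ID."""

    def __init__(self):
        self._templates = {}

    def define(self, template_id, fields):
        # A repeated or conflicting definition replaces the old one,
        # as with UDP template expiration and ID reuse.
        self._templates[template_id] = tuple(fields)

    def withdraw(self, template_id):
        # A withdrawal is valid only for a currently defined Template.
        if template_id not in self._templates:
            raise KeyError("withdrawal of undefined template %d" % template_id)
        del self._templates[template_id]

    def lookup(self, template_id):
        """Return the current definition, or None if undefined."""
        return self._templates.get(template_id)
```

Because define() overwrites unconditionally, a writer never needs to
issue a withdrawal before reusing a Template ID, which is why
withdrawals are optional in files.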
However, for representation simplicity and read performance, File
Writers SHOULD use the following template and scope management
strategy:
o Template Sets and Options Template Sets SHOULD appear in the file
before any Data Sets, to ensure all Templates are available before
any data is read.
o Data Records described by Options Templates SHOULD appear in the
file before any Data Records which depend on the scopes defined by
those options.
Practically speaking, this means an IPFIX File SHOULD consist of
Template Sets, followed by Options, followed by Data Sets.
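A tool could check a file against this recommended layout with a test
along the following lines. The sketch assumes the caller has already
classified the file's contents into three hypothetical labels:
"template" (Template Sets and Options Template Sets), "options" (Data
Records described by Options Templates), and "data" (all other Data
Sets). The labels and the function name are illustrative only.

```python
# Rank of each kind of content in the recommended file layout.
RECOMMENDED_ORDER = {"template": 0, "options": 1, "data": 2}


def follows_recommended_order(kinds):
    """True if the sequence never steps back to an earlier layout stage."""
    ranks = [RECOMMENDED_ORDER[kind] for kind in kinds]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))
```

A file that fails this check is still a valid IPFIX File, since the
layout is a SHOULD for File Writers; a File Reader is expected to
accept it regardless.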
A Transport Session SHOULD be synonymous with a single File. In
other words, the beginning of a file SHOULD be interpreted by a
File Reader as the beginning of a Transport Session, and the end of a
file SHOULD be interpreted by a File Reader as the end of a Transport
Session. This implies that Templates and Options are limited in
scope to the single File in which they are defined.
However, depending on the application, File Readers and File Writers
MAY be flexible with respect to their definition of a Transport
Session. A File Reader MAY be configurable to treat a collection of
Files (e.g., all the files in a directory) as a single Transport
Session, especially when used for archival purposes.
6.1. Recommended Information Elements for IPFIX Files
The following information elements are used by the options templates
below to allow IPFIX message streams to meet the requirements
outlined above without extension to the message format or protocol.
IPFIX File Readers and Writers SHOULD support these information
elements as defined below.
6.1.1. informationElementId
Description: An information element ID, as would appear in an IPFIX
Template Record. This element can be used to scope properties to
a specific information element within a Template. This IE should
be encoded with the Enterprise ID bit set to 0, regardless of
whether the Enterprise ID bit is set in the template to which this
IE refers. See the definition of privateEnterpriseNumber below
for more on the use of this IE to describe vendor-specific IEs.
Abstract Data Type: unsigned16
Data Type Semantics: identifier
ElementId: TBD
Status: Proposed
Reference: Section 3.4.1 of the IPFIX Protocol draft
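The encoding rule in the description above can be illustrated with a
short Python sketch. It assumes that the Enterprise bit is the
high-order bit of the 16-bit Information Element ID field of a
Template Record, as in the IPFIX Protocol draft; the function name is
hypothetical.

```python
ENTERPRISE_BIT = 0x8000  # high-order bit of the 16-bit IE ID field


def to_information_element_id(template_field_id):
    """Return the IE ID with the Enterprise bit cleared."""
    return template_field_id & 0xFFFF & ~ENTERPRISE_BIT
```

The enterprise to which the element belongs is thus not recoverable
from this value alone and has to be carried separately.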
6.1.2. informationElementAnonymizationType
Description: A description of the anonymization status of an IPFIX
information element within a template. If this field is FALSE,
the corresponding IE is not anonymized; to the best ability of the
Exporting Process to determine, it represents a real value. If
this field is TRUE, the corresponding IE is anonymized; to the
best ability of the Exporting Process to determine, it represents
a value that has been transformed to maintain privacy. Note that
if no informationElementAnonymizationType is specified for an
information element, it is assumed to be FALSE, or not anonymized.
Abstract Data Type: boolean
ElementId: TBD
Status: Proposed
6.1.3. informationElementSemanticType
Description: A description of the semantics of an IPFIX information
element within a template. The possible values of this field are
not yet defined; this is an open issue.
Abstract Data Type: octet
ElementId: TBD
Status: Proposed
6.1.4. informationElementStorageType
Description: A description of the storage type of an IPFIX
information element within a template. The values correspond to the
abstract data types defined in section 3.1 of the IPFIX
Information Model [I-D.ietf-ipfix-info]; see that section for more
information on the types described below. This field may take the
following values:
+-------+----------------------+
| Value | Description |
+-------+----------------------+
| 0x00 | octetArray |
| 0x01 | unsigned8 |
| 0x02 | unsigned16 |
| 0x03 | unsigned32 |
| 0x04 | unsigned64 |
| 0x05 | signed8 |
| 0x06 | signed16 |
| 0x07 | signed32 |
| 0x08 | signed64 |
| 0x09 | float32 |
| 0x0A | float64 |
| 0x0B | boolean |
| 0x0C | macAddress |
| 0x0D | string |
| 0x0E | dateTimeSeconds |
| 0x0F | dateTimeMilliseconds |
| 0x10 | dateTimeMicroseconds |
| 0x11 | dateTimeNanoseconds |
| 0x12 | ipv4Address |
| 0x13 | ipv6Address |
+-------+----------------------+
Abstract Data Type: octet
ElementId: TBD
Status: Proposed
Reference: Section 3.1 of the IPFIX Information Model
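As an illustration only, a File Reader might hold the table above as
a simple lookup. The following is a minimal sketch; the names
STORAGE_TYPES and storage_type_name are hypothetical and not defined
by this document, and returning None for an unlisted value is an
assumption about reader behavior, not a requirement.

```python
# Values and names copied from the informationElementStorageType table.
STORAGE_TYPES = {
    0x00: "octetArray",
    0x01: "unsigned8",
    0x02: "unsigned16",
    0x03: "unsigned32",
    0x04: "unsigned64",
    0x05: "signed8",
    0x06: "signed16",
    0x07: "signed32",
    0x08: "signed64",
    0x09: "float32",
    0x0A: "float64",
    0x0B: "boolean",
    0x0C: "macAddress",
    0x0D: "string",
    0x0E: "dateTimeSeconds",
    0x0F: "dateTimeMilliseconds",
    0x10: "dateTimeMicroseconds",
    0x11: "dateTimeNanoseconds",
    0x12: "ipv4Address",
    0x13: "ipv6Address",
}


def storage_type_name(value):
    """Return the abstract data type name for a storage type octet.

    Returns None for values not listed in the table (assumed behavior).
    """
    return STORAGE_TYPES.get(value)
```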
6.1.5. messageMD5Checksum
Description: The MD5 checksum of the IPFIX Message containing this
record. This IE SHOULD be bound to its containing IPFIX Message
via an options record and the messageScope IE, as defined below,
and SHOULD appear only once in a given IPFIX Message. To
calculate the value of this IE, first buffer the containing IPFIX
Message, setting the value of this IE to all zeroes. Then
calculate the MD5 checksum of the resulting buffer as defined in
RFC 1321 [RFC1321], place the resulting value in this IE, and
export the buffered message.
Abstract Data Type: octetArray (16 bytes)
ElementId: TBD
Status: Proposed
Reference: RFC 1321, The MD5 Message-Digest Algorithm [RFC1321]
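The calculation described above can be sketched as follows. This is
a minimal illustration, assuming the caller already knows the byte
offset of the 16-byte messageMD5Checksum field within the buffered
message; the function names and the checksum_offset parameter are
hypothetical. The verification routine is the natural inverse of the
procedure and is likewise an assumption, since this document does not
describe it.

```python
import hashlib

MD5_LEN = 16  # length of the messageMD5Checksum field in bytes


def fill_md5_checksum(message, checksum_offset):
    """Return a copy of message with its MD5 checksum field filled in.

    checksum_offset is the (caller-supplied) offset of the 16-byte
    messageMD5Checksum value within the buffered IPFIX Message.
    """
    buf = bytearray(message)
    # Step 1: set the checksum field to all zeroes.
    buf[checksum_offset:checksum_offset + MD5_LEN] = bytes(MD5_LEN)
    # Step 2: compute MD5 (RFC 1321) over the whole buffered message.
    digest = hashlib.md5(bytes(buf)).digest()
    # Step 3: place the digest in the field; the result is ready to export.
    buf[checksum_offset:checksum_offset + MD5_LEN] = digest
    return bytes(buf)


def verify_md5_checksum(message, checksum_offset):
    """Recompute the checksum over a zeroed field and compare (assumed)."""
    stored = message[checksum_offset:checksum_offset + MD5_LEN]
    recomputed = fill_md5_checksum(message, checksum_offset)
    return recomputed[checksum_offset:checksum_offset + MD5_LEN] == stored
```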
6.1.6. messageScope
Description: The presence of this Information Element as scope in
an Options Template signifies that the options described by the
Template apply to the IPFIX Message that contains them. It is
defined for general purpose message scoping of options, and
proposed specifically to allow the attachment of a checksum to a
message via IPFIX Options. The value of this Information Element
SHOULD be ignored by the File Reader or the Collecting Process.
Abstract Data Type: octet
ElementId: TBD
Status: Proposed
6.1.7. privateEnterpriseNumber
Description: A private enterprise number used to scope an
information element ID, as would appear in an IPFIX Template
Record. This element can be used to scope properties to a
specific information element within a Template. If the Enterprise
ID bit of the corresponding Information Element is cleared (has
the value 0), this IE should be set to 0. The presence of a non-
zero value in this IE implies that the Enterprise ID bit of the
corresponding Information Element is set (has the value 1).
Abstract Data Type: unsigned32
Data Type Semantics: identifier
ElementId: TBD
Status: Proposed
Reference: Section 3.4.1 of the IPFIX Protocol draft
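To illustrate how informationElementId and privateEnterpriseNumber
together identify a field, the sketch below derives both values from
a single Field Specifier as it would appear in a Template Record. It
assumes the Field Specifier layout of the IPFIX Protocol (a 16-bit
field holding the Enterprise bit and a 15-bit element ID, a 16-bit
Field Length, and a 32-bit Enterprise Number only when the Enterprise
bit is set); the function name scope_key is hypothetical.

```python
import struct

ENTERPRISE_BIT = 0x8000  # top bit of the 16-bit ID field


def scope_key(field_specifier):
    """Return (informationElementId, privateEnterpriseNumber).

    The returned element ID always has the Enterprise bit cleared;
    the enterprise number is 0 for non-enterprise-specific elements.
    """
    raw_id, _field_length = struct.unpack_from("!HH", field_specifier, 0)
    ie_id = raw_id & 0x7FFF
    if raw_id & ENTERPRISE_BIT:
        # Enterprise Number follows the Field Length.
        (pen,) = struct.unpack_from("!I", field_specifier, 4)
    else:
        pen = 0
    return ie_id, pen
```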
6.2. Recommended Options Templates for IPFIX Files
The following options templates allow IPFIX message streams to meet
the requirements outlined above without extension to the message
format or protocol. They are defined in terms of existing
Information Elements defined in the IPFIX Information Model
[I-D.ietf-ipfix-info], as well as new Information Elements defined in
the section above. IPFIX File Readers and Writers SHOULD support
these options templates as defined below.
6.2.1. Information Element Semantics Options Template
The Information Element Semantics Options Template specifies the
structure of a Data Record for attaching semantic and storage type
information to enterprise-specific Information Elements in specified
Template Records. Data Records described by this Template SHOULD
appear for each enterprise-specific Information Element used within a
File. Collecting Processes and IPFIX File Readers can use options
data described by this template to improve handling of unknown
information elements. Note that the template MAY be used to describe
public Information Elements, such as Information Elements that may
have been added to the IANA registry after the last update of a given
Collecting Process or File Reader; however, Collecting Processes or
File Readers MUST NOT allow semantic or storage type information
contained within these records to override their own specified
handling of public Information Elements.
The template SHOULD contain the following Information Elements as
defined in the IPFIX Information Model [I-D.ietf-ipfix-info] and
above:
+--------------------------------+----------------------------------+
| IE | Description |
+--------------------------------+----------------------------------+
| templateId | The Template ID of the template |
| | this record describes; it is |
| | assumed to be valid within the |
| | Observation Domain ID of the |
| | containing IPFIX Message, and |
| | MUST identify a Template that |
| | has already been exported. This |
| | Information Element MUST be |
| | defined as a Scope Field. |
| informationElementId | The Information Element |
| | identifier of the Information |
| | Element within the specified |
| | Template this record describes. |
| | This Information Element MUST be |
| | defined as a Scope Field. |
| privateEnterpriseNumber | The Private Enterprise number of |
| | the Information Element within |
| | the specified Template this |
| | record describes. May be 0 if |
| | this record describes a public |
| | Information Element. This |
| | Information Element MUST be |
| | defined as a Scope Field. |
| informationElementStorageType | The storage type of the |
| | specified Information Element. |
| informationElementSemanticType | The semantic type of the |
| | specified Information Element. |
+--------------------------------+----------------------------------+
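As a rough illustration, a Data Record described by this template
could be packed as shown below. The sketch assumes the fields appear
in the order of the table above, each at its full native length
(unsigned16, unsigned16, unsigned32, octet, octet) with no
reduced-length encoding; the function name is hypothetical. Because
the values of informationElementSemanticType are not yet defined,
the caller must supply that octet.

```python
import struct

# templateId, informationElementId, privateEnterpriseNumber,
# informationElementStorageType, informationElementSemanticType
SEMANTICS_RECORD = struct.Struct("!HHIBB")


def pack_semantics_record(template_id, ie_id, pen, storage_type,
                          semantic_type):
    """Pack one Information Element Semantics Data Record (10 bytes)."""
    # informationElementId is encoded with the Enterprise bit set to 0.
    return SEMANTICS_RECORD.pack(template_id, ie_id & 0x7FFF, pen,
                                 storage_type, semantic_type)
```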
6.2.2. Message Checksum Options Template
The Message Checksum Options Template specifies the structure of a
Data Record for attaching an MD5 message checksum to an IPFIX
Message. An MD5 message checksum as described MAY be used if long-
term data integrity is important to the application. The described
Data Record MUST appear only once per IPFIX Message.
The template SHOULD contain the following Information Elements as
defined above:
+--------------------+----------------------------------------------+
| IE | Description |
+--------------------+----------------------------------------------+
| messageScope | A marker denoting this Option applies to the |
| | whole IPFIX message; content is ignored. |
| | This Information Element MUST be defined as |
| | a Scope Field. |
| messageMD5Checksum | The MD5 checksum of the containing IPFIX |
| | Message. |
+--------------------+----------------------------------------------+
6.2.3. Template Anonymization Options Template
The Template Anonymization Options Template specifies the structure
of a Data Record for attaching anonymization notation information to
Information Elements in specified Template Records. A Data Record
described by this Template SHOULD appear for each Information Element
within a Template known by the Exporting Process or File Writer to
contain anonymized data.
The template SHOULD contain the following Information Elements as
defined in the IPFIX Information Model [I-D.ietf-ipfix-info] and
above:
+-------------------------------------+-----------------------------+
| IE | Description |
+-------------------------------------+-----------------------------+
| templateId | The Template ID of the |
| | template this record |
| | describes; it is assumed to |
| | be valid within the |
| | Observation Domain ID of |
| | the containing IPFIX |
| | Message, and MUST identify |
| | a Template that has already |
| | been exported. This |
| | Information Element MUST be |
| | defined as a Scope Field. |
| informationElementId | The Information Element |
| | identifier of the |
| | Information Element within |
| | the specified Template this |
| | record describes. This |
| | Information Element MUST be |
| | defined as a Scope Field. |
| privateEnterpriseNumber | The Private Enterprise |
| | number of the Information |
| | Element within the |
| | specified Template this |
| | record describes. May be 0 |
| | if this record describes a |
| | public Information Element. |
| | This Information Element |
| | MUST be defined as a Scope |
| | Field. |
| informationElementAnonymizationType | The anonymization type of |
| | the specified Information |
| | Element. |
+-------------------------------------+-----------------------------+
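A File Reader might apply these records as sketched below. The
sketch works on already-decoded records, given as (templateId,
informationElementId, privateEnterpriseNumber, anonymized) tuples,
and applies the default stated in section 6.1.2: an Information
Element with no record is treated as not anonymized. The function
names are hypothetical.

```python
def build_anonymization_map(records):
    """Index decoded records by (templateId, element ID, enterprise number)."""
    return {
        (template_id, ie_id & 0x7FFF, pen): bool(anonymized)
        for template_id, ie_id, pen, anonymized in records
    }


def is_anonymized(anon_map, template_id, ie_id, pen=0):
    """Return the anonymization status, defaulting to FALSE if unspecified."""
    return anon_map.get((template_id, ie_id & 0x7FFF, pen), False)
```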
6.3. Recommended Compression Strategy for File Writers
Note that, since any file may be compressed and decompressed with a
variety of widely available tools implementing a variety of
compression standards (both specified and de facto), compression of
IPFIX File data can be accomplished externally. However, compression
at the file level may not be particularly resilient to errors; in the
worst case, a single bit error in a stream-compressed file may result
in the loss of the entire file.
To limit the impact of errors on the recoverability of compressed
data, we recommend the use of block compression where possible.
However, block-compressed IPFIX Files also have some recovery
problems: a partially damaged IPFIX Message stream is difficult to
resynchronize, because the IPFIX beginning-of-message marker (the
Version field of the Message Header, 0x00 0x0A, i.e. version 10)
may commonly appear in the body of an IPFIX Message.
Therefore, in applications (e.g. archival storage) in which error
resilience is very important, we recommend that File Writers align
compression block boundaries with IPFIX Message boundaries, so that
each new compression block starts with a new IPFIX Message. This can
be achieved either by manually adjusting the block boundaries (for
compression facilities which support this), or by padding out the
IPFIX Message Stream with a Data Set described by a Template
containing a single one-byte paddingOctets Information Element to
reach a known compression block boundary. Note that this latter
strategy requires a minimum padding of 5 bytes (4-byte Set Header
followed by at least one byte of padding).
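The padding arithmetic can be sketched as follows. This is an
illustration under stated assumptions: the block size is fixed and
known to the File Writer, the padding Template has already been
exported under the given Set ID, and the padded IPFIX Message still
fits within the 16-bit Message Length. The function names are
hypothetical. When the gap to the next boundary is shorter than the
5-byte minimum, the sketch pads to the following boundary instead.

```python
import struct

SET_HEADER_LEN = 4       # Set ID (2 bytes) + Length (2 bytes)
MIN_PADDING_SET_LEN = 5  # Set Header plus at least one padding byte


def padding_set_length(stream_offset, block_size):
    """Total length of the padding Set needed to reach a block boundary.

    stream_offset is the number of bytes written so far. Returns 0
    if the stream is already aligned.
    """
    gap = (-stream_offset) % block_size
    if gap == 0:
        return 0
    while gap < MIN_PADDING_SET_LEN:
        gap += block_size  # too small for a Set; use a later boundary
    return gap


def build_padding_set(set_id, total_len):
    """Build a Data Set of one-byte paddingOctets records (all zero)."""
    if total_len < MIN_PADDING_SET_LEN:
        raise ValueError("padding Set needs at least 5 bytes")
    header = struct.pack("!HH", set_id, total_len)
    return header + bytes(total_len - SET_HEADER_LEN)
```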
7. Examples
Examples are not yet available as the file format has not yet been
fully described. A future revision of this document will contain
examples.
8. Security Considerations
The IPFIX-based file format itself does not directly introduce
security issues. Rather, it is used to store information that may be
considered sensitive for privacy or business reasons. The file
format must therefore provide appropriate procedures to guarantee the
integrity and confidentiality of the stored information.
The underlying protocol used to exchange the information that will be
stored using the format proposed in this document must also apply
appropriate procedures to guarantee the integrity and confidentiality
of the exported information. Such issues are addressed in separate
documents, specifically in the IPFIX Protocol
[I-D.ietf-ipfix-protocol].
9. IANA Considerations
This document requests the addition of the information elements in
section 6.1 to the IANA IPFIX Information Element Registry.
10. Open Issues and Notes
The survey of existing file formats is incomplete, and includes only
file formats with which one of the authors has personal experience.
[bht]
The set of semantic data types is unspecified. [bht]
Need to be a bit more concrete with respect to what requirements we
cannot meet with XML. [bht]
Need to address indexing and searching, or at least be more explicit
about why we're not doing so. [bht]
11. Acknowledgements
Thanks to Arno Wagner for technical assistance with the requirements.
12. References
12.1. Normative References
[I-D.ietf-ipfix-protocol]
Claise, B., "Specification of the IPFIX Protocol for the
Exchange of IP Traffic Flow Information",
draft-ietf-ipfix-protocol-23 (work in progress),
October 2006.
[I-D.ietf-ipfix-info]
Quittek, J., "Information Model for IP Flow Information
Export", draft-ietf-ipfix-info-13 (work in progress),
September 2006.
[RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
April 1992.
12.2. Informative References
[RFC3917] Quittek, J., Zseby, T., Claise, B., and S. Zander,
"Requirements for IP Flow Information Export (IPFIX)",
RFC 3917, October 2004.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
Authors' Addresses
Brian H. Trammell
CERT Network Situational Awareness
Software Engineering Institute
4500 Fifth Avenue
Pittsburgh, Pennsylvania 15213
United States
Phone: +1 412 268 9748
Email: bht@cert.org
Elisa Boschi
Hitachi Europe SAS
Immeuble Le Theleme
1503 Route les Dolines
06560 Valbonne
France
Phone: +33 4 89874100
Email: elisa.boschi@hitachi-eu.com
Lutz Mark
Fraunhofer Institute for Open Communication Systems
Kaiserin-Augusta-Allee 31
10589 Berlin
Germany
Phone: +49 30 3463 7306
Email: mark@fokus.fraunhofer.de
Tanja Zseby
Fraunhofer Institute for Open Communication Systems
Kaiserin-Augusta-Allee 31
10589 Berlin
Germany
Phone: +49 30 3463 7153
Email: zseby@fokus.fraunhofer.de
Full Copyright Statement
Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).