One document matched: draft-trammell-ipfix-a8n-00.xml


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY draftIpfixMedps PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-mediators-problem-statement.xml'>
<!ENTITY draftIpfixMedframe PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ipfix-mediators-framework.xml'>
<!ENTITY rfc3330 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3330.xml'>
<!ENTITY rfc3917 PUBLIC '' 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.3917.xml'>
<!ENTITY rfc5101 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5101.xml">
<!ENTITY rfc5102 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5102.xml">
<!ENTITY rfc5103 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5103.xml">
<!ENTITY rfc5153 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5153.xml">
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc5470 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5470.xml">
<!ENTITY rfc5472 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5472.xml">
<!ENTITY rfc5610 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5610.xml">
<!ENTITY rfc5655 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5655.xml">
] >
<rfc ipr="trust200902" category="exp" docName="draft-trammell-ipfix-a8n-00.txt">
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<?rfc toc="yes"?>
<?rfc symrefs="yes"?>

<front>
  <title abbrev="IPFIX Aggregation">
    Exporting Aggregated Flow Data using the IP Flow Information Export (IPFIX) Protocol 
  </title>
  <author initials="B." surname="Trammell" fullname="Brian Trammell">
    <organization abbrev="ETH Zurich">
      Swiss Federal Institute of Technology Zurich 
    </organization>
    <address>
      <postal>
        <street>Gloriastrasse 35</street>
        <city>8092 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <phone>+41 44 632 70 13</phone>
      <email>trammell@tik.ee.ethz.ch</email>
    </address>
  </author>
  <author initials="E." surname="Boschi" fullname="Elisa Boschi">
    <organization abbrev="ETH Zurich">
      Swiss Federal Institute of Technology Zurich 
    </organization>
    <address>
      <postal>
        <street>Gloriastrasse 35</street>
        <city>8092 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <email>boschie@tik.ee.ethz.ch</email>
    </address>
  </author>
  <author initials="A." surname="Wagner" fullname="Arno Wagner">
    <organization abbrev="Consecom AG">
      Consecom AG 
    </organization>
    <address>
      <postal>
        <street>Bellariastrasse 11</street>
        <city>8002 Zurich</city>
        <country>Switzerland</country>
      </postal>
      <email>arno@wagner.name</email>
    </address>
  </author>
  <date month="July" day="5" year="2010"></date>
  <area>Operations</area>
  <workgroup>IPFIX Working Group</workgroup>
  <abstract> 

    <t>This document describes the export of aggregated Flow information using
    IPFIX. An aggregated Flow is essentially an IPFIX Flow with an externally
    imposed time interval, generally representing packets from one or more
    original Flows. The document describes aggregated Flow export within the
    framework of IPFIX Mediators, defines an interoperable,
    implementation-independent method for aggregated Flow export, covering
    time interval export, counter distribution and export, and counting of
    original Flows.</t>

  </abstract>
</front>

<middle>

    <section title="Introduction">

        <t>The aggregation of packet data into flows serves a variety of
        different purposes, as noted in <xref target="RFC3917"/> and <xref
        target="RFC5472"/>. Aggregation beyond the flow level, into records
        representing multiple flows, is a common analysis and data reduction
        technique as well. Often these aggregation operations result in a time
        series of keys and counters, which is useful for gaining insight into
        trends and anomalies. Aggregation may also has added benefits for
        privacy: when data about specific transactions and hosts are
        aggregated together, it has an anonymising effect on the data.</t>

        <t>Aggregation can be applied in parallel with storage of original
        packet and flow data. In certain situations, aggregation can be
        performed at mediator, with aggregated data can be used for initial
        analysis (e.g., to provide a reduced data stream for analysis at a
        central point, with original flow data stored in a
        higher-access-latency manner (e.g. tape storage at the mediator at a
        remote site). Other applications may benefit from a reverse
        application: online analysis of original flows, with long-term storage
        of aggregated data. Aggregation can also be used to provide a
        lower-volume, lower-privacy-risk source of data for reporting or data
        exchange across organizational boundaries.</t>

        <t>While IPFIX can be used to export <xref target="RFC5101"/> and
        store <xref target="RFC5655"/> aggregated data, there exists as yet no
        common terminology or aggregation metadata representation for this
        purpose, no set of requirements to define what is meant by
        aggregation, and no facility for counting original Flows from which
        Aggregated Flows are derived. This document seeks to remedy this
        situation, defining a common basis for the application of IPFIX to the
        handling of aggregated data, with applicability to large-scale network
        data processing, archiving, and inter-organization exchange.</t>
        
    </section>

    <section title="Terminology" anchor="sec-terminology">

        <t>Terms used in this document that are defined in the Terminology
        section of the <xref target="RFC5101">IPFIX Protocol</xref> document
        are to be interpreted as defined there.</t>

        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>

        <t>In addition, this document defines the following terms</t>:
        
        <list style="hanging">

            <t hangText="Aggregated Flow: ">A Flow, as defined by <xref
            target="RFC5101"/>, derived from a set of zero or more original
            Flows within a defined time interval. The two primary differences
            between a Flow and an Aggregated Flow are (1) that the time
            interval of a Flow is generally derived from information about the
            timing of the packets comprising the Flow, while the time interval
            of an Aggregated Flow are generally externally imposed; and (2)
            that an Aggregated Flow may represent zero packets (i.e., an
            assertion that no packets were seen for a given Flow Key in a
            given time interval).</t>

            <t hangText="(Intermediate) Aggregation Function: ">A mapping from
            a set of zero or more original Flows into a set of aggregated
            Flow, that separates the original Flows into a set of one or more
            given time intervals.</t>

            <t hangText="(Intermediate) Aggregation Process: ">An Intermediate
            Process, as in <xref
            target="I-D.ietf-ipfix-mediators-framework"/>, hosting an
            Intermediate Aggregation Function.</t>

        </list>

        <t>[TODO: may want to define a more precise terminology for aggregated flow time interval, "contribution"/"presence" of an original Flow within an aggregated Flow...]</t>

    </section>

    <section title="Requirements for Aggregation Support in IPFIX">

    <t>In defining a terminology, additional information elements, and
    metadata export methods for Aggregated Flow export using IPFIX, we have
    sought to meet the following requirements.</t>

    <t>First, a specification of Aggregated Flow export must seek to be as
    interoperable as possible. Export of Aggregated Flows using the techniques
    described in this document will result in Flow data which can be collected
    by Collecting Processes and read by File Readers which do not provide any
    special support for Aggregated Flow export.</t>

    <t>Second, a specification of Aggregated Flow export must seek to be as
    implementation-independent as the IPFIX protocol itself. In <xref
    target="sec-arch"/>, we specify the flow aggregation process as an
    intermediate process within the <xref
    target="I-D.ietf-ipfix-mediators-framework">IPFIX Mediator framework</xref>,
    and specify a variety of different architectural arrangements for flow
    aggregation; these are meant to be descriptive as opposed to proscriptive.
    In metadata export, we seek to define properties of the set of exported
    Aggregated Flows, as opposed to the properties of the specific algorithms
    used to aggregate these Flows. Specifically out of scope for this effort
    are any definition of a language for proscribing aggregation records, or
    the configuration parameters of Aggregation Processes.</t>

    <t>From the definition of presented in <xref target="sec-terminology"/>,
    an Aggregated Flow is a Flow as in <xref target="RFC5101"/>, with a
    restricted defintion as to the packets making up the Flow. Practically
    speaking, Aggregated Flows are derived from original Flows, as opposed to
    a raw packet stream. Key to this definition of Aggregated Flow is how
    timing affects the process of aggregation, as for the most part flow
    aggregation takes place within some set of (usually regular) time
    intervals. Any specification for Aggregated Flow export must account for
    the special role time intervals play in aggregation, and the many-to-many
    relationship between Aggregated Flows and original Flows which this
    implies.</t>

    </section>
    
     <section title="Aggregation of IP Flows">

        <t>[TODO: frontmatter: restate definition: a flow is just a flow. Here
        we talk about Aggregation Process actions.]</t>

        <section title="A general model for IP Flow Aggregation">

            <t>[TODO: introduce as little of an Aggregation Process as
            necessary to be able to talk about things practically. Define
            in simple terms the role of externally imposed time intervals on
            aggregation. ]</t>

        </section>

        <section title="Aggregating and Distributing Counters" anchor="sec-distro">

            <t>In general, counters in Aggregated Flows are treated the same
            as in any Flow: on a per-Information Element basis, the counters
            are calculated as if they were derived from the set of packets in
            the original flow. For the most part, when aggregating original
            Flows into Aggregated Flows, this is simply done by summation.</t>

            <t>However, this raises a complication when aggregating original
            Flows for which original packet data is not available: often, when
            imposing an external time interval on an original Flow, the
            original Flow will incompletely cover one or more time intervals,
            and apply to one or more Aggregated Flows; in this case, the
            Aggregation Process must distribute the counters in the original
            Flows across the multiple Aggregated Flows. There are several
            methods for doing this, listed here in increasing order of
            complexity and accuracy:</t>

            <list style="hanging">

                <t hangText="End Interval: ">The counters for an original Flow
                are added to the counters of the appropriate Aggregated Flow
                containing the end time of the original Flow.</t>

                <t hangText="Start Interval: ">The counters for an original
                Flow are added to the counters of the appropriate Aggregated
                Flow containing the start time of the original Flow.</t>

                <t hangText="Mid Interval: ">The counters for an original Flow
                are added to the counters of a single appropriate Aggregated
                Flow containing some timestamp between start and end time of
                the original Flow.</t>

                <t hangText="Simple Uniform Distribution: ">Each counter for
                an original Flow is divided by the number of time intervals
                the original Flow covers (i.e., of appropriate Aggregated
                Flows sharing the same Flow Key), and this number is added to
                each corresponding counter in each Aggregated Flow.</t>

                <t hangText="Proportional Uniform Distribution: ">Each counter
                for an original Flow is divided by the number of time _units_
                the original Flow covers, to derive a mean count rate. This
                mean count rate is then multiplied by the number of time units
                in the intersection of the duration of the original Flow and
                the time interval of each Aggregated Flow. This is like simple
                uniform distribtion, but accounts for the fractional portions
                of a time interval covered by an original Flow in the first
                and last time interval.</t>

                <t hangText="Simulated Process: ">Each counter of the original
                Flow is distributed among the intervals of the Aggregated
                Flows according to some function the Aggregation Process uses
                based upon properties of Flows presumed to be like the
                original Flow. For example, bulk transfer flows might follow a
                more or less proportional uniform distribtion, while
                interactive processes are far more bursty.</t>

                <t hangText="Direct: ">The Aggregating Process has access to
                the original packet timings from the packets making up the
                original Flow, and uses these to distribute or recalculate the
                counters.</t>

            </list>

            <t>A method for exporting the distribution of counters across
            multiple Aggregated Flows is detailed in <xref
            target="sec-ex-distro"/>. In any case, counters MUST be
            distributed across the multiple Aggregated Flows in such a way
            that the total count is preserved; this property allows data to be
            aggregated and re-aggregated without any loss of original count
            information. To avoid confusion in interpretation of the
            aggregated data, all the counters for a set of given original
            Flows SHOULD be distributed via the same method.</t>

        </section>
        
        <section title="Counting Original Flows" anchor="sec-flowcount">

            <t>When aggregating multiple original Flows into an Aggregated
            Flow, it is often useful to know how many original Flows are
            present in the Aggregated Flow. This document introduces four new
            information elements in <xref target="sec-ex-flowcount"/> to
            export these counters.</t>
            
             <t>There are two possible ways to count original Flows, which we
            call here conservative and non-conservative. Conservative flow
            counting has the property that each original Flow contributes
            exactly one to the total flow count within a set of aggregated
            Flows. In other words, conservative flow counters are distributed
            just as any other counter, except each original Flow is assumed to
            have a flow count of one. When a count for an original Flow must
            be distributed across a set of Aggregated Flows, and a
            distribution method is used which does not account for that
            original Flow completely within a single Aggregated Flow,
            conservative flow counting requires a fractional
            representation.</t>

            <t>By contrast, non-conservative flow counting is used to count
            how many flows are represented in an Aggregated Flow. Flow
            counters are not distributed in this case. An original Flow which
            is present within N Aggregated Flows would add N to the total
            non-conservative flow count, one to each Aggregated Flow.</t>

        </section>

    </section>

    <section title="Aggregation in the IPFIX Architecture" anchor="sec-arch">
    
      <t>[TODO: describe these diagrams]</t>
      
      <figure title="Potential Aggregation Locations" anchor="loc-fig">
        <artwork><![CDATA[
+==========================================+
| Exporting Process                        |
+==========================================+
  |                                      |
  |             (Aggregated Flow Export) |
  V                                      |
+=============================+          |
| Mediator                    |          |
+=============================+          |
  |                                      |
  | (Aggregating Mediator)               |
  V                                      V
+==========================================+
| Collecting Process                       |
+==========================================+
        |
        | (Aggregation for Storage)
        V
+--------------------+
| IPFIX File Storage |
+--------------------+
        ]]></artwork>
      </figure>

      <figure title="Data flows through the aggregation process" anchor="iap-dataflows">
                <artwork><![CDATA[
packets --+                     +- IPFIX Messages -+
          |                     |                  |
          V                     V                  V
+==================+ +====================+ +=============+
| Metering Process | | Collecting Process | | File Reader |
+==================+ +====================+ +=============+
          |            original | Flows            |
          V                     V                  V
+=========================================================+
|           Intermediate Aggregation Process (IAP)        |
+=========================================================+
          | Aggregated                  Aggregated |
          | Flows                            Flows |
          V                                        V
+===================+                       +=============+
| Exporting Process |                       | File Writer |
+===================+                       +=============+
          |                                        |
          +------------> IPFIX Messages <----------+
        ]]></artwork>
              </figure>

       <figure title="Possible aggregation arrangements in the IPFIX architecture" anchor="iap-arrangements">
                <artwork><![CDATA[
         +----+  +-----+  +----+
 pkts -> | MP |->| IAP |->| EP |-> Export of Aggregated Flows
         +----+  +-----+  +----+
         +----+  +-----+  +----+
IPFIX -> | CP |->| IAP |->| EP |-> Aggregating Mediator
         +----+  +-----+  +----+
         +----+  +-----+  +----+
IPFIX -> | CP |->| IAP |->| FW |-> Aggregation for storage in IPFIX Files
         +----+  +-----+  +----+
         +----+  +-----+  +----+
IPFIX -> | FR |->| IAP |->| FW |-> Aggregation for analysis
 File    +----+  +-----+  +----+   (aggregating File manipulator)
                ]]></artwork>
              </figure>


    </section>

    <section title="Export of Aggregated IP Flows using IPFIX">

        <t>[TODO: frontmatter: here we talk about export specifics.]</t>

        <section title="Flow Count Export" anchor="sec-ex-flowcount">

          <t>[TODO: describe these four IEs. for now, see <xref target="sec-flowcount"/>]</t>

          <section title="originalFlowsPresent Information Element" anchor="ie-noncon-flowcount">
            <list style="hanging">
              <t hangText="Description: ">
                The non-conservative count of original Flows containing packets from which a this Aggregated Flow was aggregated.</t>
              <t hangText="Abstract Data Type: ">unsigned64</t>
              <t hangText="ElementId: ">TBD1</t>
              <t hangText="Status: ">Proposed</t>
            </list>
          </section>      

          <section title="originalFlowsInitiated InformationElement" anchor="ie-con-flowstartcount">
            <list style="hanging">
              <t hangText="Description: ">
                The conservative count of original Flows whose first packet is represented within this Aggregated Flow.</t>
              <t hangText="Abstract Data Type: ">unsigned64</t>
              <t hangText="ElementId: ">TBD2</t>
              <t hangText="Status: ">Proposed</t>
            </list>
          </section>      

          <section title="originalFlowsCompleted InformationElement" anchor="ie-con-flowendcount">
            <list style="hanging">
              <t hangText="Description: ">
                The conservative count of original Flows whose last packet is represented within this Aggregated Flow.</t>
              <t hangText="Abstract Data Type: ">unsigned64</t>
              <t hangText="ElementId: ">TBD3</t>
              <t hangText="Status: ">Proposed</t>
            </list>
          </section>      

          <section title="originalFlows InformationElement" anchor="ie-con-flowcount">
            <list style="hanging">
              <t hangText="Description: ">
                The conservative count of original Flows represented within this Aggregated Flow; may be distributed via any of the methods described in <xref target="sec-distro"/></t>
              <t hangText="Abstract Data Type: ">float64</t>
              <t hangText="ElementId: ">TBD4</t>
              <t hangText="Status: ">Proposed</t>
            </list>
          </section>      

        </section>
        
        <section title="Aggregate Counter Distibution Export" anchor="sec-ex-distro">

            <t>[TODO: options template and IE for representing the
            distribution methods in <xref target="sec-distro"/>]</t>

        </section>        

        <section title="Time Interval Export">

            <t>Since an Aggregated Flow is simply a Flow, the existing
            timestamp Information Elements in the IPFIX Information Model
            (e.g., flowStartMilliseconds, flowEndNanoseconds) are sufficient
            to specify the time interval for aggregation. Therefore, this
            document specifies no new aggregation-specific Information
            Elements for exporting time interval information.</t>

            <t>Each Aggregated Flow SHOULD contain both an interval start and
            interval end timestamp. If an exporter of Aggregated Flows omits
            the interval end timestamp from each Aggregated Flow, the time
            interval for Aggregated Flows within an Observation Domain and
            Transport Session MUST be regular and constant, and SHOULD be
            exported using the Options Record described in this section.
            However, note that this approach might lead to interoperability
            problems when exporting Aggregated Flows to non-aggregation-aware
            Collecting Processes and downstream analysis tasks; therefore, an
            Exporting Process capable of exporting only interval start
            timestamps MUST provide a configuration option to export interval
            end timestamps as well.</t>

            <t>[TODO: options template and IE for binding a time interval to a
            session.]</t>

        </section>


    </section>

    <section title="Examples">

        <t>[TODO]</t>

    </section>

    <section title="Security Considerations">

        <t>[TODO]</t>

    </section>

    <section title="IANA Considerations">

        <t>[TODO: add all IEs defined in Section 6.]</t>

    </section>

  </middle>
  
  <back>

  <references title="Normative References">
    &rfc5101;
    &rfc5102;
    &rfc5610;
    &rfc5655;
    &rfc3330;
    &rfc5156;
  </references>

  <references title="Informative References">
    &rfc5103;
    &rfc5472;
    &rfc5470;
    &draftIpfixMedframe;
    &draftIpfixMedps;
    &rfc5153;
    &rfc3917;
    &rfc2119;
  </references>

</back>
</rfc>

PAFTECH AB 2003-20262026-04-24 05:42:29