<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes" ?>
<?rfc compact="yes"?>
<?rfc subcompact="yes"?>
<?rfc text-list-symbols="o*+-"?>
<rfc obsoletes="" updates="" category="std" ipr="trust200902"
docName="draft-bagnulo-lmap-http-02">
<front>
<title abbrev="LMAP Protocol">
Large MeAsurement Platform Protocol
</title>
<author initials="M." surname="Bagnulo" fullname="Marcelo Bagnulo">
<organization abbrev="UC3M">
Universidad Carlos III de Madrid
</organization>
<address>
<postal>
<street>Av. Universidad 30</street>
<city>Leganes</city>
<region>Madrid</region>
<code>28911</code>
<country>SPAIN</country>
</postal>
<phone>34 91 6249500</phone>
<email>marcelo@it.uc3m.es</email>
<uri>http://www.it.uc3m.es</uri>
</address>
</author>
<author initials="T." surname="Burbridge" fullname="Trevor Burbridge">
<organization abbrev="BT">
British Telecom
</organization>
<address>
<postal>
<street>Adastral Park, Martlesham Heath</street>
<city>IPswitch</city>
<country>ENGLAND</country>
</postal>
<email>trevor.burbridge@bt.com</email>
</address>
</author>
<author initials="S." surname="Crawford" fullname="Sam Crawford">
<organization abbrev="SamKnows">
SamKnows
</organization>
<address>
<email>sam@samknows.com</email>
</address>
</author>
<author fullname="Juergen Schoenwaelder" initials="J."
surname="Schoenwaelder">
<organization>Jacobs University Bremen</organization>
<address>
<postal>
<street>Campus Ring 1</street>
<city>28759 Bremen</city>
<country>Germany</country>
</postal>
<email>j.schoenwaelder@jacobs-university.de</email>
<uri></uri>
</address>
</author>
<author fullname="Vaibhav Bajpai" initials="V."
surname="Bajpai">
<organization>Jacobs University Bremen</organization>
<address>
<postal>
<street>Campus Ring 1</street>
<city>28759 Bremen</city>
<country>Germany</country>
</postal>
<email>v.bajpai@jacobs-university.de</email>
<uri></uri>
</address>
</author>
<date month="July" year="2014"/>
<area>Operations and Management</area>
<abstract>
<t>
This document specifies the LMAP protocol, based on HTTP, for Control
and Report in Large Scale Measurement Platforms.
</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>A Large MeAsurement Platform (LMAP) is an infrastructure deployed in the
Internet that enables performing measurements from a very large number of
vantage points.</t>
<t>The main components of a LMAP are the following:
<list style="symbols">
<t>The Measurement Agents (MAs): these are the processes that perform the
measurements. The measurements can be either active or passive.</t>
<t>The Controller: this is the element that controls the MAs. In
particular it provides configuration information and it instructs the MA
to perform a set of measurements.</t>
<t>The Collector: this is the repository where the MAs send the results of
the measurements that they have performed.</t>
</list>
</t>
<t>These and other terms used in this document are defined in <xref
target="I-D.ietf-lmap-framework" />. We only include the definition of the
main elements in this document so it is self-contained and can be read
without the need to consult other documents. The reader is referred to the
terminology draft for further details.</t>
<t>In order for a LMAP to work, the following protocols are required:
<list style="symbols">
<t>Measurement protocols: These are the protocols used between the MA and
the Measurement Peer in active measurements. These are the actual
packets being used for the measurement operations.</t>
<t>Control Protocol. This is the protocol between the Controller and the
MAs. This protocol is used to convey measurement Instruction(s) from the
Controller to the MA as well as logging, failure and capabilities
information from the MA to the Controller.</t>
<t>Report Protocol. This is the protocol between the MAs and the
Collector. This protocol conveys information about the results of the
measurements performed by the MA to the Collector.</t>
</list>
</t>
<t>Both the Control protocol and the Report protocol have essentially two
parts: a transport and a data model. The data model represents the
information about measurement instructions and logging/failure/capabilities
(in the Control protocol) and the information about measurement results (in
the Report protocol) that is being exchanged between the parties. The
transport is the underlying protocol used to exchange that information. This
document specifies the use of HTTP 1.1 <xref target='RFC7230'/> <xref
target='RFC7231'/> <xref target='RFC7232'/> <xref target='RFC7233'/> <xref
target='RFC7234'/> <xref target='RFC7235'/> as a transport for the Control
and the Report protocol. This document also defines the data model for the
Control and Report protocols. The data model described in this document
follows the information model described in <xref
target="I-D.ietf-lmap-information-model" />. The Measurement protocols
are out of scope for this document.</t>
<t>At this stage, the goal of this document is to explore different options
that can be envisioned to use the HTTP protocol to exchange LMAP information
and to foster discussion about which one to use (if any). Because of that,
the document contains several discussion paragraphs that explore different
alternative approaches to perform the same function.</t>
</section>
<section title="Overview">
<t>This section provides an overview of the architecture envisioned for a LMAP
using HTTP as transport protocol. As we described in the previous section,
a LMAP is formed by a large number of MAs, one or more Controllers and one
or more Collectors. We assume that before the MAs are deployed, it is
possible to pre-configure some information in them. Typically this includes
information about the MA itself (like its identifier), security information
(like some certificates) and information about the Controller(s) available
in the measurement platform. Once the MA is deployed, it retrieves
additional configuration information from the pre-configured Controller.
After obtaining the configuration information, the MA is ready to receive
Instructions from the Controller and initiate measurement tasks. The MA
will perform the following operations:
<list style="symbols">
<t>It will obtain Instructions from one of the configured Controllers.
These Instructions include information about the set of measurement
tasks to be performed, a schedule for the execution of the measurements
as well as a set of report channels. This information is downloaded by
the MA from the Controller. The MA will periodically check whether there
are new Instructions available from the Controller. This document
specifies how the MA uses the HTTP protocol to retrieve information from
the Controller. </t>
<t>The MA will execute measurement tasks either by passively listening to
traffic or by actively sending and receiving measurement packets. How
this is done is out of the scope of this document.</t>
<t>After one or more measurements have been performed, the MA reports the
results to the Collector. The timing of these uploads is specified in
the measurement Instruction, i.e. each measurement specified in a
measurement Instruction contains report information defining when the
MA should report the results back to the Collector. This document
specifies how the MA uses the HTTP protocol to upload the measurement
results to the Collector.</t>
<t>In addition, the MA will periodically report back to the Controller
information about its capabilities (like the number of interfaces it
has, the corresponding IP addresses, the set of measurement methods it
supports, etc.) and also logging information (whether some of the
requested measurement tasks failed and related information).</t>
</list>
</t>
</section>
<section title="Naming Considerations">
<t>In this section we define how the different elements of the LMAP
architecture are identified and named.</t>
<t>The Controller and the Collectors can be assumed to have both an IP address
and a Fully Qualified Domain Name (FQDN). It is natural to use these as
identifiers for these elements. In this document we will use FQDNs, but IP
addresses can be used as well.</t>
<t>The MAs, on the other hand, are likely to be executed in devices located in
the end user premises and are likely to be located behind a NAT box. It is
reasonable to assume they have neither a public IP address nor an FQDN. We
propose then that the MAs are identified using a Universally Unique
IDentifier URN as defined in <xref target='RFC4122'>RFC 4122</xref>. In
particular, each MA has a version 4 UUID, which is randomly or
pseudo-randomly generated.</t>
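<t>As an illustration only, the following Python sketch shows how an MA
identifier of this form could be produced with the standard uuid module.
The function name new_ma_id is hypothetical and is not part of this
specification.</t>
<figure>
<artwork><![CDATA[
```python
import uuid


def new_ma_id() -> str:
    """Return a fresh version 4 UUID in URN form (RFC 4122)."""
    return uuid.uuid4().urn


ma_id = new_ma_id()
# uuid.UUID() accepts the "urn:uuid:" prefix when parsing.
parsed = uuid.UUID(ma_id)
print(ma_id, "version", parsed.version)
```
]]></artwork>
</figure>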
<t>DISCUSSION:
<list>
<t>MA ID Configuration: Some open issues related to this are: a) whether
the MA ID is configured before or after the MA is deployed, b) if
configured after deployment, whether the MA ID is generated locally and
posted to the Controller or fetched from it, and c) whether this is
within the scope of this (or another) specification at all. These issues
also seem to be related to the nature of the MA platform (whether the MA
is software downloaded into a general purpose device or a special
purpose hardware box). Consider the case in which the MA is located in a
special purpose hardware box: having the MA ID pre-configured before
deployment requires per-device customization, which is expensive. It
would be more cost-efficient to reuse an existing (hopefully) unique
identifier available in the hardware (such as a MAC address) as a
one-time pre-configured identifier, used to fetch the MA ID from the
Controller (or to post a self-generated one) once the MA is deployed.
The requirement for such a one-time identifier is that it must be unique
(which is not always true for MAC addresses). Regarding local generation
of the MA ID (as opposed to fetching it from the Controller), the
generation process performed in the MA MUST be idempotent, i.e. if the
MA is factory-reset, the server would still see it with the same MA ID
when it comes back up. This is probably easier to achieve if the MA ID
is generated in the Controller and then fetched by the MA. Finally, it
is not clear at this stage whether this needs to be specified in this
document, in the information model document, or left open to the
implementers.</t>
<t>Group identifiers. In some cases, such as measurements on mobile
devices, privacy considerations may make it important for the MA not to
have a unique identifier. It is then possible to assign "Group
identifiers" to a set of devices that share relevant characteristics
from the measurement perspective (e.g. devices from the same operator,
with the same type of contract or another relevant feature). In this
case, the MAs within the same group would retrieve common measurement
Instructions from the Controller by presenting the same Group ID and
would include the Group ID in their reports. This implies that the
platform cannot correlate specific measurement data with any given MA.
The downside is that some MAs may be over-represented and others
under-represented in the measurement data, and it would not be possible
to detect this (for instance, a given MA may have reported 20 results
while another reported only one). To deal with this issue, the MA
behaviour must be programmed accordingly (e.g. the MA should not perform
more than one measurement in any given period of time). In addition, it
should be noted that privacy is only achieved in a holistic way: real
anonymity of the MA is incompatible with strong authentication. In
particular, if a measurement platform's goal is to keep MAs anonymous,
it cannot require any form of strong authentication (only weak group
authentication, e.g. a password shared by a group), which has security
implications. In particular, the threat of report forgery (i.e. an
attacker submitting forged reports, as discussed in the security
considerations) increases.</t>
</list>
</t>
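<t>The idempotence requirement discussed above could, for instance, be
met by deriving the MA ID deterministically from the one-time hardware
identifier. The following Python sketch uses a name-based (version 5)
UUID for this purpose. This is a substitution made for the sake of the
example and is not the version 4 UUID proposed earlier in this section.
The namespace value and the function name are hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
import uuid

# Hypothetical per-platform namespace. Any fixed UUID works as long
# as every MA of the platform uses the same one.
PLATFORM_NS = uuid.uuid5(uuid.NAMESPACE_DNS, "controller.example.org")


def ma_id_from_mac(mac: str) -> str:
    """Derive a stable MA ID from a MAC address (name-based UUID)."""
    normalized = mac.lower().replace("-", ":")
    return uuid.uuid5(PLATFORM_NS, normalized).urn


# The same hardware identifier always maps to the same MA ID,
# so a factory reset does not change the identity of the MA.
first = ma_id_from_mac("00-00-5E-00-53-01")
again = ma_id_from_mac("00:00:5e:00:53:01")
print(first == again)
```
]]></artwork>
</figure>
<t>Note that this only helps if the hardware identifier is itself unique,
which, as stated above, is not always true for MAC addresses.</t>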
<t>There are additional naming considerations related to:
<list style="symbols">
<t>The measurements. In order to enable a Controller to properly convey a
measurement schedule, it must be possible for the Controller to specify
a measurement to be performed while providing the needed input
parameters. While this is critical, it is out of the scope of this
document. There is a proposed registry for metrics/measurements in <xref
target="I-D.bagnulo-ippm-new-registry-independent" />.</t>
<t>The resources being exchanged, namely, the configuration information,
the measurement Instructions and the reports. These are being discussed
in the upcoming sections.</t>
</list>
</t>
</section>
<section title="Information model">
<t>The information model for LMAP is described in <xref
target="I-D.ietf-lmap-information-model" />. It basically contains two
models: one for the control information (i.e. the Instructions from the
Controller to the MA) and one for the Report information. We briefly
describe their overall structure here.</t>
<t>The control information (or Instruction) has the following elements:
<list style="symbols">
<t>The Set of Measurement Task Configurations: This element defines the
measurements/tests that the MA will perform, without defining the schedule
on which they will be performed.</t>
<t>The Set of Report Channels: This element defines the set of Collectors
as well as the reporting schedules for the reports.</t>
<t>The Set of Measurement Schedules for Repeated Tasks: This element defines
the schedules for the repeated measurements by referencing the measurement
tasks defined in the Set of Measurement Task Configurations.</t>
<t>Suppression information</t>
</list></t>
<t>Summary of Report information model here.</t>
<t>Summary of Capability and Status information model here.</t>
<t>Summary of Logging information model here.</t>
</section>
<!--
<section title="Example">
<t>Before describing the actual data models and the options for using the HTTP protocol for conveying control and report information, we will describe a simple example that hopefully will provide an overview of the proposed LMAP protocol.</t>
<t>Consider a simple scenario with these elements: a Controller with FQDN controller.example.org, a Collector with FQDN collector.example.org and a MA.</t>
<t>Suppose we want to instruct the MA to perform the following measurement and the following reports:
<list style="symbols">
<t>A UDP latency test, without cross-traffic, that reports the 99th percentile mean of a burst of packets sent following a Poisson distribution that lasts for 30 seconds and with rate 5 packets per second. The destination address is 192.0.2.1 and the destination and source port are 50000. We want to repeat this test for 7 days every hour. Report the results every hour.</t>
<t>An ICMP packet-loss ratio test, without cross traffic, that reports the ratio between packets lost and packet sent of periodic streams that send one packet every 1 second for 30 seconds. We want to perform this test once per day for the next month.</t>
</list></t>
<t>Assume that both the Controller and the Collector are deployed. Before deploying the MA, the MA must be configured with a UUID. Let's suppose the UUID for this particular MA is f47ac10b-58cc-4372-a567-0e02b2c3d479. In addition to its UUID, the MA must be configured with the certificate of the CA used to generate the certificates for the Controller (i.e. controller.example.org) and the collector (i.e. collector.example.org). In addition, the URL for the Instruction information must be configured in the MA. This URL is composed by the FQDN of the Controller plus a well-known path prefix (as defined in <xref target='RFC5785'>RFC 5785</xref>), namely /.well-known/lmap/ma-info. For this particular example, the URL for the Instruction is: http://controller.example.org/.well-known/lmap/ma-info/</t>
<t>Once the MA is deployed, it uses the POST method to retrieve the Agent Information element of the Instruction from the controller as follows
<list>
<t>POST /.well-known/lmap/ma-info/ HTTP/1.1</t>
<t>Host: controller.example.com</t>
<t>Content-Type: application/lmap-maid+json</t>
<t>Accept: application/lmap-config+json</t>
<t>{</t>
<t> "ma-id" : "f47ac10b-58cc-4372-a567-0e02b2c3d479",</t>
<t>}</t>
</list></t>
<t>The Controller then returns the Agent Information for this specific agent which contains basically the URLs for the remaining Control elements. For this particular example, the Agent information returned looks like this:</t>
<figure>
<artwork><![CDATA[
{
"ma-id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"version": "1.0",
"measurement-set": "http://controller.example.org
/measurements/",
"report-channel-set": "http://controller.example.org
/channels/",
"repeated-schedule-set": "http://controller.example.org
/schedules/"
"status-set": "http://controller.example.org
/status/"
"logging-set": "http://controller.example.org
/logging/"
}
]]></artwork>
</figure>
<t>The Agent Information retrieved by the MA contains the URL for the remaining elements of the Instruction. In order to retrieve them, the MA executes the POST method on the retrieved URLs. This approach containing one level of indirections allows that the different components (measurements, report channels and measurement schedules) are updated with a different frequency. We expect that the report channels will be fairly static, the measurements updated a bit more frequently and the schedules to be updated frequently. This would imply that the schedule resource will be retrieved frequently while the other two not so much. In addition, the Agent Information retrieved contain the URLs to deliver both status and logging information. The MA will execute the POST method for conveying this information to the Controller.</t>
<t>The first thing that the MA does is to send its status information to the Controller. this contains information about the MA capabilities and local configuration, such as interfaces' information and supported measurement task. For this particular example, the MA would execute the following method:</t>
<figure>
<artwork><![CDATA[
POST /.well-known/lmap/ma-info/ HTTP/1.1
Host: controller.example.com
Content-Type: application/lmap-maid+json
Accept: application/lmap-config+json
{
"ma-id" : "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"ma-interfaces: [
{"if-name" : eth0
"if-type" : ethernetCsmacd
"ip-adddress" :{
"protocol": v4
"address": 10.1.1.1
}
}]
"supported-measurements":[
{"metric": UDP_Latency}]
}
]]></artwork>
</figure>
<t>At this point, the Controller has information about the interfaces in the MA and the supported measurement methods. (Please keep in mind that this is kept very simple to facilitate the description, but in normal operation, it is likely that the configuration will contain much more information).</t>
<t>The MA will next retrieve the set of Instructions from the Controller. The POST for the Instructions will result in the following information:</t>
A UDP latency test, without cross-traffic, that reports the 99th percentile mean of a bust of packets sent following a Poisson distribution that last for 30 seconds and with rate 5 packets per second. We want to repeat this test for 7 days every hour.
<figure>
<artwork><![CDATA[
{
"name": "standard tests",
"version": "1.0",
"tests": [
{
"name": "latency",
"description": "UDP round trip latency",
"metric": "UDP_Latency",
"options": [
{
"environment": "No-cross-traffic",
"Output-type": "Xth-percentile-mean",
"X": "99",
"Scheduling": "Poisson",
"rate": "5",
"duration": "30.000",
"interface": "eth0"
"destination-ip": {
"version": "4",
"value": "192.0.2.1"
},
"destination-port": "50000",
"source-port": "50000"
}
]
}
]
}
]]></artwork>
</figure>
<t>The values for "metric", "environment", "Output-type" and "scheduling" are defined in the registry specified in <xref target="I-D.bagnulo-ippm-new-registry-independent" />. Many of the values under the 'options' field will be dependent on
the metric itself, and they can therefore use their respective units, naming or values.</t>
<t>The POST for the report channels retrieves the following:</t>
<figure>
<artwork><![CDATA[
{
"name": "internal channels",
"version": "1.0",
"description": "hourly report to main database collector",
"reports": {
"name": "hourly report",
"description": "hourly report to main database",
"collector": "http://collector.example.org/results/f47ac10b-58cc-4372-a567-0e02b2c3d479",
"timing": {
"timing_type": "calendar",
"timing-config": {
"minutes": ["22"],
"seconds": ["40"]
}
}
}
}
]]></artwork>
</figure>
<t>The POST for the schedule retrieves the following:</t>
<figure>
<artwork><![CDATA[
{
"name": "hourly measurements",
"version": "1.0",
"schedules": [
{
"name": "Hourly",
"tests": ["latency"],
"reports" :["hourly report"],
"timing": {
"timing_type": "calendar",
"timing-config": {
"minutes": ["05"],
"seconds": ["30"]
}
}
}
]
}
]]></artwork>
</figure>
<t>At this point, the MA has obtained the information about the measurement it has been instructed to perform and it is now ready to do it. It then sends the first batch of UDP packets for 30 seconds. Once that it has finished doing this, it calculates the 99th percentile mean of the round trip time, let's say that it was 10 milliseconds. Since there are no other measurements performed in the next hour, it will report only this result to the Collector. In order to do that, the MA will execute the POST method to the URL retrieved in the report channel resource (i.e. http://collector.example.org/results/ in this example) and it will send the following information:</t>
<figure>
<artwork><![CDATA[
{
"report-date": "utc-milliseconds",
"reporting-agent": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"results": {
"test-name": "latency",
"test-agent": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"test-parameters": {
"name": "latency",
"description": "UDP round trip latency",
"metric": "UDP_Latency",
"options": [
{
"environment": "No-cross-traffic",
"Output-type": "Xth-percentile-mean",
"X": "99",
"Scheduling": "Poisson",
"rate": "5",
"duration": "30.000",
"destination-ip": {
"version": "4",
"value": "192.0.2.1"
},
"source-IP-address": {
"version": "4",
"value": "198.151.100.34"
},
"destination-port": "50000",
"source-port": "50000",
"start-time": "utc-milliseconds",
"end-time": "utc-milliseconds"
}
]
},
"test-results": {
"Xth-percentile-mean": "10"
}
}
}
]]></artwork>
</figure>
<t>Finally, the MA will use the POST method to convey the logging information to the Controller. We take the approach that only errors are logged, so, in this example, the MA would post an empty log to the Controller showing that there were no errors.</t>
</section>
-->
<section title="Transport protocol">
<section title="Pre-configured information">
<t>As we mentioned earlier, the MAs contain pre-configured information before
being deployed. The pre-configured information is the following:
<list style="symbols">
<t>The UUID for the MA. This should be pre-configured so that the Controller
is aware of the MA and can feed configuration information and measurement
Instructions to it.</t>
<t>Information about one or more Controllers. The MA MUST have enough
information to create the URL for the Instruction resources. This includes
the FQDN or the IP address of each Controller, as well as the well-known
path prefix and the MA identifier.</t>
<t>The certificate for the Certification authority that is used in the
platform to generate the certificates for the Controller and the
Collector. See the Security considerations section below.</t>
<t>The security related information for the MA (it can be a certificate for
the MA and the corresponding private key, or simply a key/password
depending on the security method used, see the security considerations
section below).</t>
</list>
</t>
</section>
<section title="Control Protocol">
<t>The Control protocol is used by the MA to retrieve Instruction information
from the Controller. In this section we describe how to use HTTP to
transport Instructions. The Instruction information is structured as defined
in the LMAP Information model <xref target="I-D.ietf-lmap-information-model"
/> as described in the previous section. The MA uses the Control protocol
to retrieve all the resources described above, namely, the Agent
information, the Set of Measurement Task Configurations, the Set of Report
Channels, the Set of Measurement Schedules for Repeated Tasks and the Set of
Measurement Schedules for Isolated Tasks. The main difference from the HTTP
perspective is that the MA MUST have the URL for the Agent Information
resource pre-configured as described in the previous section, while the URLs
for all the other resources are contained in the Agent Information resource
itself.</t>
<section title="Retrieving Instructions">
<t>In order to retrieve the Instruction resources from the Controller the MA
can use either the GET or the POST method using the corresponding URL. </t>
<section title="Using the GET method">
<t>One way of using the GET method to retrieve configuration information is to
explicitly name the configuration information resources and then apply the
GET method. The MA retrieves its Instruction when it is first connected to
the network and periodically after that. The frequency of the periodic
retrieval is contained in the Agent Information (???).</t>
<t>The URL for the Agent Information resource is formed as the FQDN of the
Controller, a well-known path prefix and the MA UUID. The well-known path
prefix is /.well-known/lmap/ma-info. The URLs for the remaining resources
that compose the Instruction are contained in the Agent Information.</t>
<t>Agent Information retrieval: In order to retrieve the Agent Information, the
MA uses the HTTP GET method as follows:
<list>
<t>GET /.well-known/lmap/ma-info/&lt;ma-iid&gt; HTTP/1.1</t>
<t>Host: FQDN or IP of the Controller</t>
<t>Accept: application/json (as per <xref target="RFC7159" />)</t>
</list>
The Agent Information should contain the Configuration Retrieval Schedule
(i.e. how often the MA should retrieve configuration information) and also
the Measurement Instruction Retrieval Schedule (i.e. how often the MA should
retrieve the Measurement Instruction from the Controller). COMMENT: this is
missing from the Data Model</t>
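<t>A minimal Python sketch of how an MA could assemble this request from
its pre-configured information is shown below. It only builds the request
and does not send it. The function name, the use of the "http" scheme and
the example Controller name and UUID are assumptions made for the
example.</t>
<figure>
<artwork><![CDATA[
```python
import urllib.request

# Well-known path prefix given in this section.
WELL_KNOWN_PREFIX = "/.well-known/lmap/ma-info"


def agent_info_request(controller: str, ma_id: str):
    """Build (but do not send) the GET for the Agent Information."""
    url = f"http://{controller}{WELL_KNOWN_PREFIX}/{ma_id}"
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET")


req = agent_info_request(
    "controller.example.org",
    "f47ac10b-58cc-4372-a567-0e02b2c3d479")
print(req.get_method(), req.full_url)
```
]]></artwork>
</figure>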
<t>The retrieval of the remaining resources of the Instruction using the GET
method is analogous, only that the URL is extracted from the Agent
Information file rather than constructed with pre-configured
information.</t>
<t>The format for the response should be described here</t>
<t>Periodic Instruction retrieval: After having downloaded the initial
Instruction information, the MA will periodically look for updated
Instruction information. The frequency with which the MA polls for the new
Instructions from the Controller is contained in the last Agent Information
downloaded. In order to retrieve the Agent Information, the MA uses the GET
method as follows:
<list>
<t>GET /.well-known/lmap/ma-info/ma-iid/ HTTP/1.1</t>
<t>Host: FQDN or IP of the Controller</t>
<t>Accept: application/json (as per <xref target="RFC7159" />)</t>
<t>If-None-Match: the ETag of the last retrieved Agent Information (an
alternative option here is to use If-Modified-Since; it is not yet clear
which one is best)</t>
</list>
</t>
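<t>The conditional retrieval could be sketched in Python as follows,
assuming that the Controller answers with 304 (Not Modified) when the
ETag still matches. The function names and the injectable opener
parameter (used so that the logic can be exercised without a network)
are hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
import urllib.error
import urllib.request
from typing import Optional


def conditional_get(url: str, etag: Optional[str]):
    """Build a GET that only asks for data newer than `etag`."""
    req = urllib.request.Request(
        url, headers={"Accept": "application/json"})
    if etag is not None:
        req.add_header("If-None-Match", etag)
    return req


def fetch_if_changed(url, etag, opener=urllib.request.urlopen):
    """Return (body, etag); body is None if nothing changed."""
    try:
        with opener(conditional_get(url, etag)) as resp:
            return resp.read(), resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the cached copy
            return None, etag
        raise
```
]]></artwork>
</figure>
<t>The MA would store the returned ETag together with the Agent
Information and present it again at the next polling interval.</t>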
<t>For the other Instruction resources, the GET method is applied in the same
way, except that the URLs used are the ones retrieved in the last Agent
Information.</t>
<t>The format for the response should be described here</t>
<t>Alternatively, instead of explicitly naming the Instruction resources for
each MA, it is possible to perform a query using the GET method as well. In
this case, the MA could perform a GET for the following URI:
http://controller.example.org/?ma=maid&amp;q=ma-info (similar queries can
be constructed for the other Instruction resources). (It is not yet clear how
to express in this case the condition that the MA wishes to retrieve the
configuration only if it is newer than the last one it downloaded.)</t>
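<t>For illustration, such a query URI could be assembled as in the
following Python sketch. The parameter names "ma" and "q" are taken from
the example URI above, while the function name and the example UUID are
assumptions.</t>
<figure>
<artwork><![CDATA[
```python
from urllib.parse import urlencode


def query_url(controller: str, ma_id: str, resource: str) -> str:
    """Build the query-style URL for one Instruction resource."""
    # urlencode() percent-encodes the values where needed.
    query = urlencode({"ma": ma_id, "q": resource})
    return f"http://{controller}/?{query}"


url = query_url("controller.example.org",
                "f47ac10b-58cc-4372-a567-0e02b2c3d479",
                "ma-info")
print(url)
```
]]></artwork>
</figure>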
</section>
<section title="Using the POST method">
<t>An alternative way to retrieve Instruction resources is to use the POST
method to perform a query (similar to the query using GET). In this case there
is no explicit naming of the Instruction information of each MA, but a general
Instruction resource, and the POST method is used to convey a query for the
Instruction information of a particular MA. For the case of the Agent
Information resource, this would look as follows:
<list>
<t>POST /.well-known/lmap/ma-info/ma-iid/ HTTP/1.1</t>
<t>Host: controller.example.org</t>
<t>Content-Type: application/lmap-maid+json</t>
<t>Accept: application/lmap-config+json</t>
<t>{</t>
<t> "ma-id" : "550e8400-e29b-11d4-a716-446655440000"</t>
<t>}</t>
</list></t>
<t>The reply for this query would contain the actual configuration information
as follows:
<list>
<t>HTTP/1.1 200 OK</t>
<t>Content-Length: xxx</t>
<t>Content-Type: application/lmap-config+json</t>
<t>{</t>
<t> // whatever config goes here</t>
<t>}</t>
</list></t>
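<t>A Python sketch of how the MA could build this POST query is shown
below. The media types are the ones used in the example above. The
function name and the URL passed to it are assumptions, and the request
is built but not sent.</t>
<figure>
<artwork><![CDATA[
```python
import json
import urllib.request


def agent_info_query(url: str, ma_id: str):
    """Build (but do not send) the POST query for one MA."""
    body = json.dumps({"ma-id": ma_id}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={
            "Content-Type": "application/lmap-maid+json",
            "Accept": "application/lmap-config+json",
        })


req = agent_info_query(
    "http://controller.example.org/.well-known/lmap/ma-info/",
    "550e8400-e29b-11d4-a716-446655440000")
print(req.get_method(), req.data.decode("utf-8"))
```
]]></artwork>
</figure>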
<t>In this case, the URLs contained in the Agent information can be generic
and not MA specific, since the MA will use the POST method including its own
identifier when retrieving the Instruction resources.</t>
<t>The argument for this approach is that this is much more extensible since
the POST can carry complex information and there is no need to "press"
arguments into the strict hierarchy of URIs.</t>
<t>We need to describe how to use this to retrieve newer information in the
periodic case.</t>
</section>
</section>
<section title="Handling communication failures">
<t>The cases in which the MA is unable to retrieve the Instructions are
handled as follows:
<list style="symbols">
<t>The MA will use a timeout of TIMEOUT seconds for the communication.
The value of TIMEOUT MUST be configurable via the aforementioned
Configuration Information retrieval protocol. The default value of
TIMEOUT is 3 seconds. If communication with the Controller has not been
established when the timeout expires, the MA will retry using
exponential backoff, cycling round-robin through the Controllers it has
available.</t>
<t>If an HTTP server error (5xx) is received from the Controller in
response to the GET request, the MA will retry using exponential
backoff, cycling round-robin through the Controllers it has available.
The 5xx error codes indicate that this Controller is currently
incapable of performing the requested operation.</t>
</list>
</t>
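<t>The retry behaviour described above could be sketched in Python as
follows. This document does not fix the backoff parameters, so the base
delay, the cap and the number of attempts are assumptions, as are the
function names. The fetch and sleep callables are injected so that the
logic can be exercised without a network, and fetch is assumed to raise
OSError on a timeout or a 5xx response.</t>
<figure>
<artwork><![CDATA[
```python
import itertools

DEFAULT_TIMEOUT = 3.0  # seconds, the default TIMEOUT given above


def retry_plan(controllers, base_delay=1.0, max_delay=300.0,
               attempts=6):
    """Yield (controller, wait_seconds) for successive attempts."""
    ring = itertools.cycle(controllers)  # round robin
    for attempt in range(attempts):
        # Exponential backoff: the wait doubles after each failure.
        delay = min(base_delay * 2 ** attempt, max_delay)
        yield next(ring), delay


def fetch_with_retry(controllers, fetch, sleep, **plan):
    """Try the Controllers in turn until one of them answers."""
    for controller, delay in retry_plan(controllers, **plan):
        try:
            return fetch(controller, DEFAULT_TIMEOUT)
        except OSError:
            sleep(delay)  # wait before trying the next Controller
    raise RuntimeError("all Controllers unreachable")
```
]]></artwork>
</figure>
<t>In a real MA, sleep would typically be time.sleep and fetch would
perform the GET or POST request described in the previous sections.</t>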
</section>
<section title="Pushing Information from the Controller to the MA">
<t>The previous sections described how the MA periodically polls the
Controller to retrieve Instruction information. The frequency of the
downloads is configurable. The question is whether this is enough or whether
a mechanism for pushing Instruction information is needed. Such a mechanism
would allow the Controller to contact the MA at any moment and take actions
such as triggering a measurement right away or stopping an ongoing
measurement (e.g. because it is disturbing the network). The need for such a
mechanism is likely to depend on the use case of the platform. The ISP use
case is probably more likely to require this feature than the
regulator/benchmarking use case. It is therefore probably useful to provide
this as an optional feature.</t>
<t>The main challenge in order to provide this feature is that the MAs are
likely to be placed behind NATs, so it is not possible for the Controller to
initiate a communication with the MA unless there is a binding in the NAT to
forward the packets to the MA. There are several options that can be
considered to enable this communication:
<list style="symbols">
<t>The MA can use one of the NAT control protocols, such as PCP or UPnP.
If this approach is used, the MA creates a binding in the NAT,
opening a hole. After that, the MA should inform the Controller of
the IP address and port available for communication. It would
be possible to re-use existing protocols to convey this information.
The problem with this approach is that the NAT may not support these
protocols, or they may not be activated. In any case, a solution should
use them when they are available.</t>
<t>If it is not possible to use a NAT control protocol, then the MA can
open a hole in the NAT by establishing a connection to the Controller
and keeping it open. This allows the Controller to push information to
the MA through that connection. One concern with this approach is that
the MA plays the role of the client and the Controller plays the role
of the server (the MA initiates the TCP connection), but it would be
the Controller that uses the PUT method towards the MA, reversing the
roles. An alternative approach is for the MA to keep a long-running GET
pending, which the server answers when the measurement Instruction
changes (or when the server times out, in which case the MA restarts
the long-running GET). More discussion is needed about whether either
of these options is acceptable. In addition, keeping these connections
open implies that the Controller must maintain as many open sessions as
the MAs it manages, which places an additional burden on the Controller.
There are security considerations as well, which are covered in the
Security Considerations section below.</t>
</list>
</t>
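<t>The long-running GET alternative can be sketched as follows. This is a
minimal illustration in Python and not part of the protocol: the names
long_poll, fetch and apply_instruction, the opaque version token, and the
convention that fetch returns None when the server times out are all
assumptions made for the example. The fetch callable stands in for the
actual HTTPS request.</t>
<figure>
<artwork><![CDATA[
```python
from typing import Callable, Optional, Tuple

# Hypothetical result of one long-running GET: (version, instruction),
# or None when the server timed out without a change.
Reply = Optional[Tuple[str, dict]]


def long_poll(fetch: Callable[[Optional[str]], Reply],
              apply_instruction: Callable[[dict], None],
              max_requests: int) -> Optional[str]:
    """Issue up to max_requests long-running GETs, one at a time.

    fetch(version) stands in for an HTTPS GET that the Controller
    holds open until the Instruction differs from `version` or the
    Controller times out. Returns the last version seen.
    """
    version: Optional[str] = None
    for _ in range(max_requests):
        reply = fetch(version)
        if reply is None:
            # Server timed out: simply restart the long-running GET.
            continue
        version, instruction = reply
        apply_instruction(instruction)
    return version
```
]]></artwork>
</figure>
<t>In this sketch only one request is outstanding at any time, so the
Controller holds one pending session per MA, which is the burden noted
above.</t>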
</section>
</section>
<section title="Report protocol">
<t>After performing the measurements, the MA reports the results to a
Collector. There can be more than one Collector within an LMAP framework.
Each Collector is identified by its FQDN or IP address, which is retrieved
as part of the Agent Information from a pre-configured Controller as
previously discussed. The number of Collectors to which the MA uploads the
results, as well as the schedule on which it does so, is defined in the
measurement Instruction previously downloaded from the Controller. The MAs
themselves are identified by a UUID.</t>
<t>Two options can be considered for the MA to upload reports to the
Collector: the PUT method or the POST method.</t>
<t>If the PUT method is used, the MA needs to perform the PUT using an
explicit name for the report resource it is transferring to the
Collector. The name of the resource is contained in the Agent
Information previously retrieved by the MA.</t>
<t>The other option is for the MA to use the POST method to upload the
measurement reports to one or more Collectors. In this case, the POST
message body can contain the identifier of the MA and additional
information describing the report, in addition to the report itself.</t>
<t>One argument to consider is that PUT is idempotent. This means that if
the network is unreliable at some point and the MA is not sure whether its
request got through, it can send the request a second (or nth) time, and
the repeated request is guaranteed to have exactly the same effect as the
first one. POST does not by itself guarantee this. With POST, the same
effect can be achieved by having the Collector verify the report data
and compare it with the data already stored in the Collector database.</t>
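<t>One way a Collector could check a report submitted with POST against
stored data is sketched below in Python. It is an illustration only: the
ReportStore class, the in-memory dictionary standing in for the Collector
database, and the choice of a SHA-256 digest over a canonical JSON
serialisation as the comparison key are assumptions of the example, not
requirements of this document.</t>
<figure>
<artwork><![CDATA[
```python
import hashlib
import json


class ReportStore:
    """In-memory stand-in for the Collector database (hypothetical)."""

    def __init__(self) -> None:
        self._reports = {}

    def submit(self, report: dict) -> bool:
        """Store the report unless an identical one is already stored.

        Returns True if the report was new, False if it was a duplicate
        (for example a retransmission after a lost response).
        """
        # Serialise with sorted keys so that equal reports always
        # produce the same bytes, and therefore the same digest.
        canonical = json.dumps(report, sort_keys=True,
                               separators=(",", ":")).encode("utf-8")
        key = hashlib.sha256(canonical).hexdigest()
        if key in self._reports:
            return False
        self._reports[key] = report
        return True

    def __len__(self) -> int:
        return len(self._reports)
```
]]></artwork>
</figure>
<t>With such a check, submitting the same report twice leaves a single
stored copy, which gives a repeated POST the same effect as a repeated
PUT.</t>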
<section title="Handling communication failures">
<t>The MA will use a timeout for the communication with the Collector of
TIMEOUT seconds. The value of TIMEOUT MUST be configurable via the
aforementioned Configuration Information retrieval protocol. The default
value for the TIMEOUT is 3 seconds.</t>
<t>If the MA is uploading the report to several Collectors and it manages
to establish communication within TIMEOUT seconds with at least one of
them, but not with one or more of the others, then the MA gives up on
those Collectors after TIMEOUT seconds and it MAY issue an alarm. How the
alarm is issued is out of the scope of this document.</t>
<t>If the MA is uploading the report to only one Collector and it does not
manage to establish communication within TIMEOUT seconds, then it retries
using exponential backoff, cycling round robin through the Collectors it
has available.</t>
<t>Similarly, if an HTTP error message (5xx) is received from the Collector
in response to the PUT request, the MA will retry using exponential
backoff, cycling round robin through the Collectors it has available. A
5xx error code indicates that this Collector is currently unable to
perform the requested operation.</t>
<t>In order to support this, the information model must distinguish
between a report sent to multiple Collectors and multiple Collectors used
for fallback.</t>
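<t>The retry behaviour for fallback Collectors can be sketched as follows.
This Python fragment is illustrative only. The function names, the initial
backoff of 3 seconds, the doubling factor, the 300 second cap and the limit
on the number of attempts are assumptions of the example; this document
does not specify them. The try_upload callable stands in for the HTTPS
upload and is expected to return False on a timeout or a 5xx response.</t>
<figure>
<artwork><![CDATA[
```python
import itertools
from typing import Callable, Iterator, Optional, Sequence, Tuple


def retry_schedule(collectors: Sequence[str],
                   base_delay: float = 3.0,
                   max_delay: float = 300.0
                   ) -> Iterator[Tuple[str, float]]:
    """Yield (collector, delay_before_attempt) pairs indefinitely.

    Collectors are visited round robin; the delay doubles after each
    failed attempt and is capped at max_delay. The first attempt is
    made without waiting.
    """
    delay = 0.0
    for collector in itertools.cycle(collectors):
        yield collector, delay
        delay = base_delay if delay == 0.0 else min(delay * 2, max_delay)


def upload_with_fallback(collectors: Sequence[str],
                         try_upload: Callable[[str], bool],
                         sleep: Callable[[float], None],
                         max_attempts: int) -> Optional[str]:
    """Return the Collector that accepted the report, or None."""
    if not collectors:
        return None
    schedule = retry_schedule(collectors)
    for _ in range(max_attempts):
        collector, delay = next(schedule)
        sleep(delay)
        # try_upload returns False on a timeout or a 5xx response.
        if try_upload(collector):
            return collector
    return None
```
]]></artwork>
</figure>
<t>Passing time.sleep as the sleep argument gives real waiting; a test can
pass a function that only records the requested delays.</t>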
</section>
</section>
</section>
<section title="LMAP Data Model">
<t>This section will contain the data model in JSON.</t>
<section title="Timing Information">
<t> An example immediate timing object with no defined randomness is shown
below:</t>
<figure>
<artwork>
-- ma_timing_obj
{
"id": 1
, "ma_timing_option": "IMMEDIATE"
, "ma_randomness_option": null
, "ma_timing_name": null
}
</artwork>
</figure>
</section>
<section title="Channels">
<t> An example channel object using the aforementioned timing object is shown
below:</t>
<figure>
<artwork>
-- ma_channel_obj
{
"id": 1
, "ma_channel_timing_obj_id": 1
, "ma_channel_connect_always": "true"
, "ma_channel_target": "controller.example.org"
, "ma_channel_certificate": "MIIFEzCCAvsCAQEwDQYJ"
, "ma_channel_interface_name": "eth0"
, "ma_channel_name": "INSTRUCTION"
}
-- ma_channel_obj
{
"id": 2
, "ma_channel_timing_obj_id": 1
, "ma_channel_connect_always": "true"
, "ma_channel_target": "controller.example.org"
, "ma_channel_certificate": "VtAKQhFM89kOIxn5g..."
, "ma_channel_interface_name": "eth0"
, "ma_channel_name": "MA-TO-CONTROLLER"
}
-- ma_channel_obj
{
"id": 3
, "ma_channel_timing_obj_id": 1
, "ma_channel_connect_always": "true"
, "ma_channel_target": "collector.example.org"
, "ma_channel_certificate": "X1Ow9+Grmkb9EmVPfqH0..."
, "ma_channel_interface_name": "eth0"
, "ma_channel_name": "REPORT"
}
</artwork>
</figure>
</section>
<section title="Pre-Configuration">
<t> An example pre-config object using the aforementioned channel objects is
shown below:</t>
<figure>
<artwork>
-- ma_preconfig_obj
{
"id": 1
, "ma_instruction_channel_obj_id": 1
, "ma_ma_to_controller_channel_obj_id": 2
, "ma_device_id": "01:23:45:67:89:ab"
, "ma_agent_id": null
}
</artwork>
</figure>
</section>
<section title="Configuration">
<t> An example config object using the aforementioned channel objects is shown
below:</t>
<figure>
<artwork>
-- ma_config_obj
{
"id": 1
, "ma_agent_id": "c54c284a01ee11e48dd310ddb1bd23b5"
, "ma_group_id": "d7d63d7a01ee11e49b2210ddb1bd23b5"
, "ma_instruction_channel_obj_id": 1
, "ma_report_ma_id_flag": "false"
, "ma_instruction_channel_failure_threshold": 10
}
</artwork>
</figure>
</section>
<section title="Instruction">
<t> An example instruction object is shown below:</t>
<figure>
<artwork>
-- ma_instruction_obj
{
"id": 1
, "ma_supression_obj_id": 1
}
</artwork>
</figure>
<section title="Measurement Suppression">
<t> An example suppression object used by the aforementioned instruction object
is shown below:</t>
<figure>
<artwork>
-- ma_supression_obj
{
"id": 1
, "ma_supression_enabled": "true"
, "ma_supression_start": 1404309159
, "ma_supression_end": 1404309193
}
</artwork>
</figure>
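<t>A Measurement Agent could evaluate such an object as sketched below in
Python. The sketch assumes that the start and end values are seconds since
the Unix epoch, that the window is inclusive at both ends, and that the
enabled flag is carried as the string "true" as in the example; none of
these points is specified by this document. The function name
is_suppressed is hypothetical.</t>
<figure>
<artwork><![CDATA[
```python
import time
from typing import Optional


def is_suppressed(supression_obj: dict,
                  now: Optional[int] = None) -> bool:
    """Return True if measurements are suppressed at time `now`.

    `now` is seconds since the Unix epoch; it defaults to the
    current time. Field names are taken from the example object.
    """
    if now is None:
        now = int(time.time())
    # The example encodes the flag as the string "true".
    flag = str(supression_obj["ma_supression_enabled"]).lower()
    enabled = flag == "true"
    start = supression_obj["ma_supression_start"]
    end = supression_obj["ma_supression_end"]
    # Assumption: the window is inclusive at both ends.
    return enabled and start <= now <= end
```
]]></artwork>
</figure>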
</section>
<section title="Measurement Task Configurations">
<t> An example task object used by the aforementioned instruction object is
shown below:</t>
<figure>
<artwork>
-- ma_task_obj
{
"id": 1
, "instruction_obj_id": 1
, "supression_obj_id": 1
, "ma_task_name": "UDP latency"
, "ma_task_registry": "urn:ietf:ippm..."
, "ma_task_options": "..." # omitted for brevity reasons
, "ma_task_cycle_id": "1"
}
</artwork>
</figure>
</section>
<section title="Measurement Schedules">
<t> An example schedule object used by the aforementioned instruction object
is shown below:</t>
<figure>
<artwork>
-- ma_schedule_obj
{
"id": 1
, "instruction_obj_id": 1
, "supression_obj_id": 1
, "timing_obj_id": 1
, "ma_schedule_name": "A schedule with immediate timing"
}
</artwork>
</figure>
<figure>
<artwork>
-- ma_sched_task_obj
{
"id": 1
, "schedule_obj_id": 1
, "ma_schedule_task_name": "A schedule for UDP latency task"
}
</artwork>
</figure>
</section>
<section title="Report Channels">
<t> An example schedule report object used by the aforementioned instruction
object is shown below:</t>
<figure>
<artwork>
-- ma_sched_report_obj
{
"id": 1
, "ma_schedule_obj_id": 1
, "channel_obj_id": 3
, "ma_schedule_task_report_channel_name": "A report channel"
, "ma_schedule_task_filter": null
}
</artwork>
</figure>
</section>
</section>
<section title="MA to Controller">
<t> An example log object is shown below:</t>
<figure>
<artwork>
-- ma_log_obj
{
"id": 1
, "ma_log_agent_id": "0e49b32b01fa11e4bcaf10ddb1bd23b5"
, "ma_log_event_time": 1404313752
, "ma_log_code": "200"
, "ma_log_description": "OK"
}
</artwork>
</figure>
</section>
<section title="Capability and Status">
<t> An example status object is shown below:</t>
<figure>
<artwork>
-- ma_status_obj
{
"id": 1
, "ma_agent_id": "c54c284a01ee11e48dd310ddb1bd23b5"
, "ma_device_id": "01:23:45:67:89:ab"
, "ma_hardware": "TL-MR3020"
, "ma_software": "Busybox"
, "ma_firmware": "4560"
, "ma_last_measurement": 1404315031
, "ma_last_report": 1404315053
, "ma_last_instruction": 140431312
, "ma_last_configuration": 140423245
}
</artwork>
</figure>
<t> An example capability object is shown below:</t>
<figure>
<artwork>
-- ma_capability_obj
{
"id": 1
, "ma_status_obj_id": 1
, "ma_measurement_id": "c56cb44a028c11e495d910ddb1bd23b5"
, "ma_measurement_version": "v1.0"
}
</artwork>
</figure>
<t> An example interface object is shown below:</t>
<figure>
<artwork>
-- ma_interface_obj
{
"id": 1
, "ma_status_obj_id": 1
, "ma_interface_name": "eth0"
, "ma_interface_type": "100baseTX"
, "ma_interface_speed": "100Mbps"
, "ma_link_layer_address": "01:23:45:67:89:ab"
}
</artwork>
</figure>
<t> An example IP address object used by the aforementioned interface object
is shown below:</t>
<figure>
<artwork>
-- ip_address
{
"id": 1
, "interface_obj_id": 1
, "value": "192.168.1.10"
, "ma_interface_if_ip_address": 1
, "ma_interface_if_dns_server": 0
, "ma_interface_if_gateway": 0
}
</artwork>
</figure>
</section>
<section title="Reporting">
<t> An example report object is shown below:</t>
<figure>
<artwork>
-- ma_report_obj
{
"id": 1
, "ma_report_agent_id": "c54c284a01ee11e48dd310ddb1bd23b5"
, "ma_report_group_id": "d7d63d7a01ee11e49b2210ddb1bd23b5"
, "ma_report_date": 1404316528
}
</artwork>
</figure>
<figure>
<artwork>
-- ma_report_task_obj
{
"id": 1
, "ma_task_obj_id": 1
, "ma_report_obj_id": 1
, "ma_report_task_column_headers": "...,...,..."
}
</artwork>
</figure>
<figure>
<artwork>
-- ma_result_row_obj
{
"id": 1
, "ma_report_task_obj_id": 1
, "ma_report_result_time": 1404317298
, "ma_report_result_cross_traffic": null
, "ma_report_result_values": "...,...,"
}
</artwork>
</figure>
</section>
</section>
<section title="Security considerations">
<t>Large Measurement Platforms can become a security hazard if they are not
properly secured, because they encompass a large number of MAs that can be
managed and coordinated easily to generate traffic. They could therefore
be used to launch DDoS attacks or other forms of attack.</t>
<t>From the perspective of the protocols described in this document, we can
identify the following threats:
<list style="symbols">
<t>Hijacking: Probably the worst threat is that an attacker takes
control of one or more MAs. In this case the attacker would be able to
instruct the MAs to generate traffic or to eavesdrop on traffic at their
location. It is therefore critical that the MA is able to strongly
authenticate the Controller. An alternative way to mount this attack
is to alter the communication between the Controller and the MAs. In
order to prevent this form of attack, integrity protection of the
communication between the Controller and the MAs is required. </t>
<t>Polluting: Another type of attack is for an attacker to
pollute the Collector's database by providing false results. In this
case, the attacker would attempt to impersonate one or more MAs and
upload fake results to the Collector. In order to prevent this,
authentication of the MAs to the Collector is needed. An alternative
way to mount this attack is to alter the communication
between the MA and the Collector. In order to prevent this form of
attack, integrity protection of the communication between the MA and the
Collector is needed.</t>
<t>Disclosure: Another threat is that an attacker may gather information
about the MAs, their configuration and the Measurement schedules. To
do so, the attacker would connect to the Controller and download the
information about one or more MAs. This can be prevented by
authenticating the MA to the Controller. An alternative means to achieve
this would be for the attacker to eavesdrop on the communication between
the MA and the Controller. In order to prevent this, confidentiality of
the communication between the MA and the Controller is required.
Similarly, an attacker may wish to obtain measurement results by
eavesdropping on the communication between the MA and the Collector. In
order to prevent this, confidentiality of the communication between the
MA and the Collector is needed.</t>
</list></t>
<t>In order to address all the identified threats, the HTTPS protocol
(i.e. HTTP over TLS) must be used for LMAP. HTTPS provides confidentiality,
integrity protection and authentication, satisfying all the aforementioned
needs. Ideally, mutual authentication should be used. In any case,
server-side authentication MUST be used. To that end, both the
Controller and the Collector MUST have certificates. The certificate of the
CA used to issue the certificates for the Controller and the Collector MUST
be pre-configured in the MAs, so that they can properly authenticate them.
Mutual authentication, however, implies that certificates for the MAs are
needed, and certificate management for a large number of MAs may be
expensive and cumbersome. Moreover, the major threats identified are those
related to hijacking of the MAs, which are prevented by authenticating the
Controller. MA authentication is needed to prevent the Polluting and
Disclosure threats, which are less severe. So, in this case, alternative
(cheaper) methods for authenticating MAs can be considered. The simplest
method would be to use the MA UUID as a token to retrieve information.
Since the MA UUID is 128 bits long, it is hard to guess if it is generated
randomly; note, however, that RFC 4122 cautions against assuming that UUIDs
are hard to guess. It would also be possible to use a password with HTTP
authentication. It is not obvious, though, that managing passwords for a
large number of MAs is easier than managing certificates.</t>
<t>An additional security consideration is posed by the mechanism to push
information from the Controller to the MAs. If this mechanism is used, an
attacker could try to abuse it to control the MAs. This threat is
prevented by the use of HTTPS. If HTTPS is used on the established
connection between the MA and the Controller, the only effect that a packet
sent by an external attacker to the MA or the Controller could have is to
reset the HTTPS connection, requiring the connection to be
re-established.</t>
<t>This document requires that both the Controller and the
Collector are authenticated using digital certificates. The current
specification allows the MA to hold the certificate of
the certification authority used for generating the Controller and Collector
certificates, while the actual certificates are exchanged in band using TLS.
Another (more secure) option is to perform certificate pinning, i.e. to
configure in the MAs the actual certificates rather than the certification
authority certificate. Another measure to increase security would be to
limit the domains to which the FQDNs of the Controller and/or the Collector
may belong (e.g. only names in the example.org domain).</t>
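<t>The two additional checks, pinning and domain restriction, can be
sketched as follows in Python. The sketch is illustrative: the function
names are hypothetical, and it pins a SHA-256 digest of the DER-encoded
certificate instead of storing the full certificate, which is an assumption
of the example. In Python the DER bytes of the peer certificate would
typically be obtained with ssl.SSLSocket.getpeercert(binary_form=True).</t>
<figure>
<artwork><![CDATA[
```python
import hashlib
import hmac


def matches_pin(der_certificate: bytes, pinned_sha256_hex: str) -> bool:
    """Compare a certificate (DER bytes) with a pinned SHA-256 digest."""
    actual = hashlib.sha256(der_certificate).hexdigest()
    # compare_digest avoids leaking the mismatch position via timing.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())


def in_allowed_domain(fqdn: str, domain: str) -> bool:
    """Return True if fqdn is `domain` itself or a name below it."""
    name = fqdn.rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    # Matching on ".domain" keeps look-alike names such as
    # "badexample.org" from passing a check for "example.org".
    return name == domain or name.endswith("." + domain)
```
]]></artwork>
</figure>
<t>An MA applying both checks would refuse to talk to a Controller or
Collector whose certificate does not match the pin or whose name falls
outside the configured domain, in addition to the usual TLS
validation.</t>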
<t>Large scale measurements can have privacy implications, especially in
scenarios such as mobile devices performing measurements. In this memo we
have considered assigning Group IDs to the MAs in order to prevent the
platform from tracking each individual MA that is feeding results.</t>
</section>
<section title="IANA Considerations">
<t>Registration of the well-known URL</t>
</section>
<section title="Acknowledgments">
</section>
</middle>
<back>
<references title='Normative References'>
<?rfc include='reference.RFC.4122'?>
<?rfc include='reference.RFC.7159'?>
<?rfc include='reference.RFC.7230'?>
<?rfc include='reference.RFC.7231'?>
<?rfc include='reference.RFC.7232'?>
<?rfc include='reference.RFC.7233'?>
<?rfc include='reference.RFC.7234'?>
<?rfc include='reference.RFC.7235'?>
<!--<?rfc include='reference.RFC.5785'?>-->
<?rfc include='reference.I-D.ietf-lmap-information-model'?>
</references>
<references title='Informative References'>
<?rfc include='reference.I-D.bagnulo-ippm-new-registry-independent'?>
<?rfc include='reference.I-D.ietf-lmap-framework'?>
</references>
</back>
</rfc>