<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC4119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4119.xml">
<!ENTITY RFC4776 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4776.xml">
<!ENTITY RFC5139 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5139.xml">
<!ENTITY RFC5222 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5222.xml">
<!ENTITY RFC5646 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5646.xml">
<!ENTITY RFC5774 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5774.xml">
<!ENTITY I-D.ietf-ecrit-rough-loc PUBLIC "" "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ecrit-rough-loc.xml">
<!ENTITY I-D.ietf-ecrit-lost-sync PUBLIC "" "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-ecrit-lost-sync.xml">
]>
<?rfc symrefs="yes"?>
<?rfc sortrefs="no"?>
<?rfc iprnotified="yes"?>
<?rfc strict="yes"?>
<?rfc toc="yes"?>
<?rfc tocindent="yes"?>
<?rfc rfcedstyle="no"?>
<rfc category="info" ipr="trust200902">
<front>
<title abbrev="Civic Boundaries">
Describing Boundaries for Civic Addresses
</title>
<author initials="M." surname="Thomson" fullname="Martin Thomson">
<organization>Andrew Corporation</organization>
<address>
<postal>
<street>Andrew Building (39)</street>
<street>Wollongong University Campus</street>
<street>Northfields Avenue</street>
<city>Wollongong</city>
<region>NSW</region>
<code>2522</code>
<country>Australia</country>
</postal>
<phone>+61 2 4221 2915</phone>
<email>martin.thomson@andrew.com</email>
</address>
</author>
<author initials="K.H." surname="Wolf" fullname="Karl Heinz Wolf">
<organization>nic.at GmbH</organization>
<address>
<postal>
<street>Karlsplatz 1/2/9</street>
<city>Wien</city>
<code>A-1010</code>
<country>Austria</country>
</postal>
<phone>+43 1 5056416 37</phone>
<email>karlheinz.wolf@nic.at</email>
<uri>http://www.nic.at/</uri>
</address>
</author>
<date month="February" year="2011"/>
<area>RAI</area>
<workgroup>ECRIT</workgroup>
<keyword>Internet-Draft</keyword>
<keyword>Civic</keyword>
<keyword>Location</keyword>
<keyword>Address</keyword>
<keyword>Boundary</keyword>
<abstract>
<t>Algorithms for decision-making based on civic address inputs are described. This includes an algorithm for determining whether one civic address is entirely contained within another. Other algorithms and supplementary discussions relating to the use of civic addresses in describing boundaries are included.
</t>
</abstract>
</front>
<middle>
<section anchor="intro" title="Introduction">
<t>Civic address information (<xref target="RFC4776"/>, <xref target="RFC5139"/>) can be used to describe the location of an entity in terms of the human-constructed environment. This description can be made more precise by adding labels, or less precise by removing them.
</t>
<t>A less precise civic address can be used to describe a zone or region by only including sufficient labels to identify the region. This method is used in the <xref target="RFC5222">Location-to-Service Translation (LoST) protocol</xref>, to convey information to a client about the extents of a particular region where service is guaranteed to be consistent.
</t>
<t>This information, called a service boundary, allows a client to make decisions about civic addresses other than the one used to query. If another civic address is determined to be "within" the service boundary, the client does not need to request service information from the LoST server.
</t>
<t>LoST does not provide a definition of "within" for civic addresses. This document describes an algorithm that provides a definition for whether a civic address is contained within another address.
</t>
<t>Other operations on civic addresses are described, allowing client software to make decisions about the intersection and union of two civic addresses.
</t>
</section>
<section anchor="model" title="Civic Address Model">
<t>The simplest model of a civic address is that which considers it as an unordered set of labels. Each label is assigned zero or more values; each value has an associated language (and script).
</t>
<t>The format in <xref target="RFC4776">RFC 4776</xref> allows for values to be given to labels with different languages or scripts. No special considerations apply in applying this model.
</t>
<t>The format in <xref target="RFC5139">RFC 5139</xref> uses multiple <spanx style="verb">civicAddress</spanx> elements to form a single address if labels are provided in multiple languages. Thus, when extracting address information from a <xref target="RFC4119">PIDF-LO</xref> document, a civic address in this model is formed from all <spanx style="verb">civicAddress</spanx> elements in the same tuple.
</t>
<t>A civic address describes a series of spatial partitions or regions. Every address includes an implied partition that identifies the habited portion of the Earth. Every label with a value describes a partition of space at a specific scale. The intersection of the spaces described by all the included labels is the resulting location.
</t>
<t>The algorithm described in this document relies on the following rule:
<list style="empty">
<t>The location described by a set of civic address labels is entirely contained within the location described by any subset of those labels.
</t>
</list>
</t>
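<t>As an illustration only, the rule above can be sketched in Python. The representation is an assumption of this sketch and is not defined by this document: an address is a dictionary that maps each label to a dictionary of language tag to value. The names <spanx style="verb">contains</spanx> and <spanx style="verb">sub_boundaries</spanx> are hypothetical. The sketch confirms that an address is contained in the location described by every subset of its own labels.
</t>
<figure>
<artwork><![CDATA[
```python
from itertools import combinations

# Assumed representation: label -> {language tag: value}.
ADDRESS = {
    "country": {"en": "AU"},
    "A1": {"en": "NSW"},
    "A3": {"en": "Wollongong"},
}


def contains(boundary, address):
    """True if each valued boundary label has an equal value, in
    some shared language, in the address (after case folding)."""
    for label, wanted in boundary.items():
        found = address.get(label, {})
        if wanted and not any(
            lang in found
            and value.casefold() == found[lang].casefold()
            for lang, value in wanted.items()
        ):
            return False
    return True


def sub_boundaries(address):
    """Yield every boundary formed from a subset of the labels."""
    labels = sorted(address)
    for size in range(len(labels) + 1):
        for chosen in combinations(labels, size):
            yield {label: address[label] for label in chosen}


# Every subset of labels describes a region containing the address.
assert all(contains(b, ADDRESS) for b in sub_boundaries(ADDRESS))
```
]]></artwork>
</figure>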
<t>The following characteristics of civic addresses have no bearing on the algorithms described:
<list style="numbers">
<t>The civic address formats of <xref target="RFC4776"/> and <xref target="RFC5139"/> include a limited set of hierarchical elements. The <spanx style="verb">country</spanx> and <spanx style="verb">A1</spanx> through <spanx style="verb">A6</spanx> labels follow a strict hierarchy. The algorithms described do not rely on this hierarchy.
</t>
<t>The physical region described by a civic address is not necessarily contiguous. For instance, an address might omit a thoroughfare name, but include a house number of 23. Such an address identifies every house at number 23 within the area described by other labels.
</t>
</list>
</t>
<t>More sophisticated models and algorithms are possible in the presence of additional information about the address data. If this sort of information is present, many more options are available for processing addresses. The simple algorithms in this document operate on the address information only, but do not preclude use of outside information.
</t>
</section>
<section anchor="boundary" title="Civic Address Boundaries">
<t>A civic address boundary has the same format as a civic address.
</t>
<t>A civic address boundary describes a region by containing fewer labels than the addresses of locations contained within the boundary. In that respect, the boundary might be considered an incomplete address, although a boundary is actually a valid civic address that simply describes a larger location.
</t>
<t>A larger region is described by including fewer labels; a smaller region might be described by including more labels, or labels that are more specific.
</t>
<t>For example, if the described region is the province of Zeeland in the Netherlands, only two labels are required: a country of <spanx style="verb">NL</spanx> and an A1 field of <spanx style="verb">ZE</spanx>.
</t>
<t>A label that is omitted from the civic address boundary indicates that civic addresses within the boundary may have any value for the label.
</t>
<t>This process does not provide any assurance that a civic address exists, only that if it does exist, it is contained entirely within the described boundary. Determining whether such an address actually exists usually requires additional information, and is therefore not considered by this document.
</t>
<section title="Determining if an Address is Within a Boundary">
<t>A civic address is entirely enclosed within a boundary if every label of the boundary that has a value has an equivalent value in the address. A civic address boundary can entirely enclose another civic address boundary.
</t>
<t>Case folding is performed on values before comparison.
</t>
<t>A label is considered equivalent if at least one value from the boundary has the same value in the address for the same language (and script). If values are provided in multiple languages, any language that is present in both boundary and address can be used.
</t>
<t>[[TBD: If the label contains different values for the same language, does this override the above - in light of the Austrian example below, it's probably better that the more lenient equivalence test is used.]]
</t>
<t>Labels that have the same value, but a different language, are not equivalent. Without information on different translations of the label, the label must be considered to be different.
</t>
</section>
<section anchor="algorithm" title="Algorithm Summary">
<t>The algorithm for determining whether an address is contained entirely within a given boundary can be summarized by the following pseudocode:
</t>
<figure>
<artwork><![CDATA[
SET iswithin = true
FOR EACH label IN boundary DO:
  IF boundary[label] exists THEN:
    SET equivalent = false
    FOR lang IN boundary[label] DO:
      IF boundary[label][lang] == address[label][lang] THEN:
        SET equivalent = true
      END
    END
    IF NOT equivalent THEN SET iswithin = false
  END
END
RETURN iswithin
]]></artwork>
</figure>
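<t>The pseudocode can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation. It assumes that an address or boundary is a dictionary mapping each label to a dictionary of language tag to value, that language tags are compared as exact strings, and that values are compared after case folding. The function names are hypothetical.
</t>
<figure>
<artwork><![CDATA[
```python
def label_equivalent(boundary_values, address_values):
    """True if some language present in both maps carries the same
    value after case folding."""
    for lang, value in boundary_values.items():
        other = address_values.get(lang)
        if other is not None and value.casefold() == other.casefold():
            return True
    return False


def is_within(address, boundary):
    """True if every boundary label that has a value has an
    equivalent value in the address."""
    for label, boundary_values in boundary.items():
        if not boundary_values:
            continue  # a label without a value imposes no constraint
        address_values = address.get(label, {})
        if not label_equivalent(boundary_values, address_values):
            return False
    return True


# Boundary for the province of Zeeland, as described in the text.
ZEELAND = {"country": {"nl": "NL"}, "A1": {"nl": "ZE"}}
MIDDELBURG = {
    "country": {"nl": "nl"},
    "A1": {"nl": "ze"},
    "A3": {"nl": "Middelburg"},
}

print(is_within(MIDDELBURG, ZEELAND))  # True
print(is_within(ZEELAND, MIDDELBURG))  # False
```
]]></artwork>
</figure>
<t>The second call returns false because the more specific address, used as a boundary, carries an A3 label that the Zeeland boundary does not have.
</t>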
</section>
<section title="False Negatives">
<t>This test can produce false negatives for a number of reasons:
<list style="numbers">
<t>A particular label might be specified with different languages in the boundary and the address. This label might be considered equivalent if the two values have the same meaning.
<vspace blankLines="1"/>
For instance, the German city Muenchen is known as Munich in English - knowledge of this translation is required to determine that these two values are equivalent.
</t>
<t>A label might have equivalent values, but subtly different <xref target="RFC5646">language tags</xref> that result in a failed comparison.
<vspace blankLines="1"/>
For instance, in many cases, a language tag of <spanx style="verb">en</spanx> is not significantly different from variants that use the same primary language subtag. Identical values with <spanx style="verb">en</spanx> and <spanx style="verb">en-US</spanx> or <spanx style="verb">en-Latn</spanx> would compare as different, even though the latter two tags are simply more specific than the first. Even <spanx style="verb">en-GB</spanx> is rarely different to these for text that is used in addressing.
<vspace blankLines="1"/>
For creators of civic addresses and boundaries, the guidance in <xref target="RFC5646"/>, Section 4.1 recommends that subtags are only added if they include useful distinguishing information. This is intended to avoid processing errors such as this. This guidance is particularly relevant in relation to use of CAtype 128 (script) in the binary encoding of <xref target="RFC4776"/>, which is often unnecessarily specified.
</t>
<t>The address might use different labels than the boundary to produce the same result.
<vspace blankLines="1"/>
For instance, a boundary might use labels <spanx style="verb">A1</spanx> through <spanx style="verb">A6</spanx> to describe a location, whereas the same location is described using a postal code in place of these elements. The address is within the described boundary, but this cannot be determined without knowing that the postal code and A-labels are equivalent.
<vspace blankLines="1"/>
In some countries, specific address codes can be used to replace some or all of the other address labels. In some instances, an address consisting of the country and the <spanx style="verb">ADDCODE</spanx> label can be sufficiently descriptive for an application; however, this would not be identified as being within a boundary that was specified using other address labels.
</t>
<t>There are many cases where a value can be expressed in different ways. This includes abbreviations, commonly accepted misspellings, and generally recognized variations in addresses.
<vspace blankLines="1"/>
For instance, abbreviations are common for thoroughfare suffixes, like <spanx style="verb">Street</spanx> (<spanx style="verb">St.</spanx> or <spanx style="verb">St</spanx>) or <spanx style="verb">Road</spanx> (<spanx style="verb">Rd.</spanx> or <spanx style="verb">Rd</spanx>).
<vspace blankLines="1"/>
In another example, the <xref target="RFC5774">Austrian addressing recommendations</xref> let certain labels contain either a code or a descriptive name. Without knowing that <spanx style="verb">Oberbaumgarten</spanx> and <spanx style="verb">Oberbaumgarten;1208</spanx> refer to the same Katastralgemeindenamen, these values must be considered to be different.
</t>
</list>
By using additional information, a system might be able to identify more equivalent labels than the basic algorithm. This can remove some, if not all such false negatives. However, a system should not rely on another system having and employing such knowledge.
</t>
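<t>As a hedged illustration of using additional information, the following Python sketch relaxes the comparison in two ways: it compares language tags on their primary subtag only, and it expands a small set of thoroughfare abbreviations. The alias table and all function names are hypothetical, and the relaxed comparison is not part of the algorithm defined in this document. Any such relaxation trades false negatives for a risk of false positives.
</t>
<figure>
<artwork><![CDATA[
```python
# Hypothetical alias table; a real system would need local data.
ALIASES = {"st": "street", "st.": "street",
           "rd": "road", "rd.": "road"}


def primary_subtag(tag):
    """Return the primary language subtag, e.g. 'en' for 'en-US'."""
    return tag.split("-", 1)[0].lower()


def normalize(value):
    """Case fold, trim, and expand known abbreviations."""
    folded = value.strip().casefold()
    return ALIASES.get(folded, folded)


def lenient_equivalent(boundary_values, address_values):
    """Compare on primary language subtag and normalized value."""
    seen = {
        (primary_subtag(lang), normalize(value))
        for lang, value in address_values.items()
    }
    return any(
        (primary_subtag(lang), normalize(value)) in seen
        for lang, value in boundary_values.items()
    )


print(lenient_equivalent({"en": "Street"}, {"en-US": "St."}))  # True
print(lenient_equivalent({"de": "Muenchen"}, {"en": "Munich"}))  # False
```
]]></artwork>
</figure>
<t>The second comparison still fails, because recognizing a translation requires knowledge that this sketch does not have.
</t>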
</section>
<section title="False Positives">
<t>This algorithm guarantees that a civic address that exists, and that is identified as being within a boundary, is entirely contained within that boundary.
</t>
<t>No allowance is made for addresses that do not exist. It is trivially possible to construct a nonsensical or non-existent civic address that is considered "within" a boundary using this algorithm. This can be done by starting with the civic address boundary and adding arbitrary values to labels that do not already have values.
</t>
</section>
<section title="Address Boundary Limitations">
<t>The address format allows a limited expression for address boundaries. This representation can only be used in limited applications. This simple boundary expression is not suitable for any application that is sensitive to false negatives.
</t>
<t><xref target="I-D.ietf-ecrit-lost-sync">Boundary interchange between LoST servers</xref> would likely require multiple specific boundaries to describe a single boundary. A number of well-known cases would generate a very large number of such boundaries. For instance, if a boundary runs up the middle of a street that places odd and even house numbers on opposite sides of the street, each house on that street would require an individual address.
</t>
<t>Concatenation of address data can introduce other limitations. The limited set of address labels can mean that each field can hold several discrete units of data. An <xref target="RFC5774">address mapping</xref> might specify that underlying data be concatenated and mapped to a single label.
</t>
<t>The algorithm described here operates on entire labels only. Neither the algorithm nor the boundary expression allows only part of a label to change. If concatenated data is included in one label, a generic processor cannot know this, and so cannot distinguish between parts that must match and parts that do not matter. To do so would require knowledge of the modes of concatenation and of the delimiters used, as well as syntax that distinguishes important parts from unimportant parts.
</t>
</section>
</section>
<section title="Boundary Combining Algorithms">
<t>In some cases, it might be necessary to combine boundaries. This section describes three algorithms, including simple union and intersection.
</t>
<section title="Boundary Unions">
<t>The union of two civic address boundaries (or addresses) is a single boundary that contains all civic addresses that are contained within either original boundary.
</t>
<t>The boundary that forms the union of two other boundaries is formed of all labels that are equivalent in both boundaries.
</t>
<t>If labels differ in the two boundaries, then the resulting union might also include addresses that are in neither boundary. Only use this algorithm if false positives are acceptable.
</t>
<figure><artwork><![CDATA[
SET union = empty address
FOR EACH label IN boundary1,boundary2 DO:
  IF boundary1[label] EQUIV boundary2[label] THEN
    SET union[label] = boundary1[label]
  END
END
RETURN union
]]></artwork></figure>
<t>The value of any label in a union of boundaries should include a value for all languages that are present in both boundaries.
</t>
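<t>A minimal Python sketch of the union follows. It assumes the same dictionary representation (label to language tag to value) and a hypothetical <spanx style="verb">equivalent</spanx> helper that compares case-folded values in a shared language. Keeping only the languages found in both inputs, with the value taken from the first boundary, is one reading of the guidance above and is an assumption of this sketch.
</t>
<figure>
<artwork><![CDATA[
```python
def equivalent(values1, values2):
    """True if a shared language has the same case-folded value."""
    return any(
        lang in values2
        and value.casefold() == values2[lang].casefold()
        for lang, value in values1.items()
    )


def union(boundary1, boundary2):
    """Keep only labels that are equivalent in both boundaries."""
    result = {}
    for label in sorted(boundary1.keys() & boundary2.keys()):
        values1, values2 = boundary1[label], boundary2[label]
        if equivalent(values1, values2):
            # Assumption: keep the languages found in both inputs,
            # taking the value from the first boundary.
            result[label] = {
                lang: values1[lang] for lang in values1
                if lang in values2
            }
    return result


ANAHEIM = {"country": {"en": "US"}, "A1": {"en": "CA"},
           "A3": {"en": "Anaheim"}}
IRVINE = {"country": {"en": "US"}, "A1": {"en": "CA"},
          "A3": {"en": "Irvine"}}

print(union(ANAHEIM, IRVINE))
```
]]></artwork>
</figure>
<t>In this example the union keeps the country and A1 labels and drops A3. The result therefore also contains addresses that are in neither input, as noted above.
</t>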
</section>
<section title="Boundary Intersections">
<t>The intersection of two boundaries (or addresses) is a single boundary that contains all addresses that are found in both original boundaries.
</t>
<t>The boundary that forms the intersection of two other boundaries is formed of the combined value of the labels from both boundaries.
</t>
<t>If the value of any label is present in both boundaries, but not equivalent, the two boundaries do not intersect for the purposes of this algorithm. In practice, while it is possible that two such boundaries could overlap, this algorithm cannot detect this. Furthermore, the civic address representation does not provide a way to express such an overlap.
</t>
<figure><artwork><![CDATA[
SET intersection = empty address
FOR EACH label IN boundary1,boundary2 DO:
  IF boundary1[label] exists AND boundary2[label] exists
     AND boundary1[label] NOT EQUIV boundary2[label] THEN
    ERROR No overlap
  END
  IF boundary1[label] exists THEN:
    SET intersection[label] = boundary1[label]
  ELSE
    SET intersection[label] = boundary2[label]
  END
END
RETURN intersection
]]></artwork></figure>
<t>The value of any label in an intersection of boundaries may include a value for all languages that are present in both boundaries.
</t>
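<t>The intersection can be sketched in Python in the same way. The dictionary representation, the <spanx style="verb">equivalent</spanx> helper, and the <spanx style="verb">NoOverlap</spanx> exception are assumptions of this sketch. As in the pseudocode, the value from the first boundary is used when both boundaries carry a label.
</t>
<figure>
<artwork><![CDATA[
```python
class NoOverlap(Exception):
    """Raised when a label has non-equivalent values."""


def equivalent(values1, values2):
    """True if a shared language has the same case-folded value."""
    return any(
        lang in values2
        and value.casefold() == values2[lang].casefold()
        for lang, value in values1.items()
    )


def intersection(boundary1, boundary2):
    """Combine the labels of both boundaries, or raise NoOverlap."""
    result = {}
    for label in sorted(boundary1.keys() | boundary2.keys()):
        values1 = boundary1.get(label) or {}
        values2 = boundary2.get(label) or {}
        if values1 and values2 and not equivalent(values1, values2):
            raise NoOverlap(label)
        if values1 or values2:
            # As in the pseudocode, prefer the first boundary.
            result[label] = dict(values1 or values2)
    return result


STATE = {"country": {"en": "US"}, "A1": {"en": "CA"}}
POSTAL = {"country": {"en": "us"}, "PC": {"en": "92802"}}

print(intersection(STATE, POSTAL))
```
]]></artwork>
</figure>
<t>Here the result carries the country, A1, and PC labels. Two boundaries with different A1 values would raise the exception instead.
</t>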
</section>
<section title="Avoiding False Positives">
<t>The service boundaries in <xref target="RFC5222">LoST</xref> rely on the absence of false positives when determining if an address is within a boundary. False negatives are tolerated, because this only results in the LoST client making another request to discover an unchanged service. A false positive is not desirable because a device could retain a service mapping that is likely to be invalid.
</t>
<t><xref target="I-D.ietf-ecrit-rough-loc"/> describes an algorithm where an address is formed of the intersection of multiple boundaries. If the boundaries do not intersect, the algorithm fails and usable location information cannot be provided to the LoST client.
</t>
<t>For a LoST service boundary, the goal is to provide a boundary that contains as few labels as possible. For a rough location, the goal is to provide an address with as few labels as possible. False positives are not desirable in either case.
</t>
<t>Thus, both cases have a similar goal, and both have a more precise address to operate on. In LoST, this is the address used to query the server; for rough locations, this is the address that is to be hidden.
</t>
<t>To avoid false positives in determining whether an address falls within a boundary, labels from the more precise address are added. Any label where the inputs to the union or intersection disagree is given a value from the precise address. This ensures that the resulting address or boundary entirely contains the precise address.
</t>
<t>Alternatively, given a precise address and a number of boundaries that are to be combined by either union or intersection, an address or boundary can be formed by removing all labels from the precise address that do not have a value in any boundary. This algorithm produces an identical result.
</t>
<figure><artwork><![CDATA[
SET result = precise address
FOR EACH label IN result DO:
  SET found = false
  FOR EACH boundary IN boundaries DO:
    IF boundary[label] exists THEN SET found = true
  END
  IF not found THEN CLEAR result[label]
END
RETURN result
]]></artwork></figure>
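<t>A short Python sketch of this procedure follows, again assuming a dictionary of label to language tag to value. The function name <spanx style="verb">restrict_to_boundaries</spanx> is hypothetical.
</t>
<figure>
<artwork><![CDATA[
```python
def restrict_to_boundaries(precise, boundaries):
    """Drop labels of the precise address that have no value in
    any of the boundaries being combined."""
    return {
        label: dict(values)
        for label, values in precise.items()
        if any(boundary.get(label) for boundary in boundaries)
    }


PRECISE = {
    "country": {"en": "US"},
    "A1": {"en": "CA"},
    "A3": {"en": "Anaheim"},
    "RD": {"en": "Harbor"},
    "HNO": {"en": "1313"},
}
BOUNDARIES = [
    {"country": {"en": "US"}, "A1": {"en": "CA"}},
    {"country": {"en": "US"}, "A3": {"en": "Irvine"}},
]

print(restrict_to_boundaries(PRECISE, BOUNDARIES))
```
]]></artwork>
</figure>
<t>The result keeps the country, A1, and A3 labels. The A3 value comes from the precise address, so the result always contains that address even where the boundaries disagree.
</t>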
</section>
</section>
<section title="Example">
<t>The following civic address boundary (shown in <xref target="RFC5139">XML form</xref>) describes a region of Los Angeles, USA.</t>
<figure><artwork><![CDATA[
<civicAddress
    xmlns="urn:ietf:params:xml:ns:pidf:geopriv10:civicAddr">
  <country>US</country>
  <A1>CA</A1>
  <A2>Orange</A2>
  <A3>Los Angeles</A3>
  <A4>Anaheim</A4>
  <PC>92802</PC>
</civicAddress>
]]></artwork></figure>
<t>The following address includes additional labels; it does not change the value of any label.
</t>
<figure><artwork><![CDATA[
<civicAddress
    xmlns="urn:ietf:params:xml:ns:pidf:geopriv10:civicAddr">
  <country>US</country>
  <A1>CA</A1>
  <A2>Orange</A2>
  <A3>Los Angeles</A3>
  <A4>Anaheim</A4>
  <PRD>South</PRD><RD>Harbor</RD><STS>Boulevard</STS>
  <HNO>1313</HNO><NAM>Disneyland</NAM>
  <PC>92802</PC>
</civicAddress>
]]></artwork></figure>
<t>An equivalent address that would not be considered "within" the boundary by the generic algorithm would be one that specified <spanx style="verb">Orange County</spanx> for the <spanx style="verb">A2</spanx> label. It's also possible to unambiguously describe the location without fields <spanx style="verb">A1</spanx> through <spanx style="verb">A4</spanx>, since a ZIP (postal) code of <spanx style="verb">92802</spanx> provides sufficient information.
</t>
<t>Thus, the following variant is not considered "within" the boundary, even if it is the same address. That is, without context-specific knowledge, the algorithm produces a false negative on this address.</t>
<figure><artwork><![CDATA[
<civicAddress
    xmlns="urn:ietf:params:xml:ns:pidf:geopriv10:civicAddr">
  <country>US</country>
  <PRD>South</PRD><RD>Harbor</RD><STS>Boulevard</STS>
  <HNO>1313</HNO><NAM>Disneyland</NAM>
  <PC>92802</PC>
</civicAddress>
]]></artwork></figure>
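<t>The example can be checked with a short Python sketch that parses the label elements using the standard library and applies the containment test. The sketch makes several assumptions that this document does not state: a missing language attribute is keyed by the empty string, surrounding whitespace in values is ignored, and the function names are hypothetical.
</t>
<figure>
<artwork><![CDATA[
```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
NS = "urn:ietf:params:xml:ns:pidf:geopriv10:civicAddr"


def parse(body):
    """Parse label elements into label -> {language: value}."""
    root = ET.fromstring(
        '<civicAddress xmlns="%s">%s</civicAddress>' % (NS, body)
    )
    # Assumption: an absent xml:lang is keyed by the empty string.
    lang = root.get(XML_LANG, "")
    labels = {}
    for child in root:
        name = child.tag.split("}", 1)[-1]  # strip the namespace
        # Assumption: surrounding whitespace is not significant.
        labels.setdefault(name, {})[lang] = (child.text or "").strip()
    return labels


def is_within(address, boundary):
    """True if every boundary label has an equal value, in a shared
    language, in the address (compared after case folding)."""
    for label, wanted in boundary.items():
        found = address.get(label, {})
        if wanted and not any(
            lang in found
            and value.casefold() == found[lang].casefold()
            for lang, value in wanted.items()
        ):
            return False
    return True


REGION = ("<country>US</country><A1>CA</A1><A2>Orange</A2>"
          "<A3>Los Angeles</A3><A4>Anaheim</A4>")
STREET = ("<PRD>South</PRD><RD>Harbor</RD><STS>Boulevard</STS>"
          "<HNO>1313</HNO><NAM>Disneyland</NAM>")
POSTCODE = "<PC>92802</PC>"

BOUNDARY = parse(REGION + POSTCODE)
FULL = parse(REGION + STREET + POSTCODE)
SHORT = parse("<country>US</country>" + STREET + POSTCODE)

print(is_within(FULL, BOUNDARY))   # True
print(is_within(SHORT, BOUNDARY))  # False
```
]]></artwork>
</figure>
<t>The second result is the false negative described above: the shorter address omits the A1 through A4 labels that the boundary requires.
</t>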
</section>
<section anchor="iana" title="IANA Considerations">
<t>This document has no IANA actions.</t>
<t>[[RFC Editor: please remove this section prior to publication.]]</t>
</section>
<section anchor="security" title="Security Considerations">
<t>This document describes a civic address model and algorithms for manipulating civic addresses and boundaries in the same format. There are no known security considerations arising from the described application of these algorithms.
</t>
</section>
<!--
<section title="Acknowledgements">
<t>
</t>
</section>
<appendix title="Change Log">
<t>[[The RFC Editor is requested to remove this section at publication.]]</t>
<t>Changes since -0-1:
<list style="symbols">
<t>Document created.</t>
</list>
</t>
</appendix>
-->
</middle>
<back>
<!--
<references title="Normative References">
&RFC3688;
</references>
-->
<references title="Informative References">
&RFC4119;
&RFC4776;
&RFC5139;
&RFC5222;
&RFC5646;
&RFC5774;
&I-D.ietf-ecrit-rough-loc;
&I-D.ietf-ecrit-lost-sync;
</references>
</back>
</rfc>