Skip to content

Latest commit

 

History

History
150 lines (100 loc) · 8.47 KB

existing-standards.md

File metadata and controls

150 lines (100 loc) · 8.47 KB

existing standards

TIGER

tiger

The US census bureau have produced an export of street centerline coverage in the United States, Puerto Rica and the Island Areas since 1989, new versions of the data are released periodically on their website.

The TIGER/Line files contain a schema for describing address ranges:

5.12.4 Address Ranges

Linear address range features and attributes are available in the following layer: 'Address Range Feature County-based Shapefile'

The address range feature shapefile contains the geospatial edge geometry and attributes of all unsuppressed address ranges for a county or county equivalent area. The term "address range" refers to the collection of all possible structure house numbers between the first structure house number to the last structure house number of a specified parity along an edge side relative to the direction in which the edge is coded. All of the TIGER/Line address range files contain potential address ranges, not individual addresses. Potential ranges include the full range of possible structure numbers even though the actual structures may not exist. Single-address address ranges are suppressed to maintain the confidentiality of the addresses they describe.

The most relevant properties include (but may not be limited to):

property description
LFROMHN From House Number associated with the address range on the left side of the edge; SIDE=L
LTOHN To House Number associated with the address range on the left side of the edge; SIDE=L
RFROMHN From House Number associated with the address range on the left side of the edge; SIDE=R
RTOHN To House Number associated with the address range on the left side of the edge; SIDE=R
PARITYL Left side Address Range Parity
PARITYR Right side Address Range Parity

For reference, this is how they define terms in their glossary:

4.1 Edge

A linear object (topological primitive) that extends from a designated start node (From node) and continues to an end node (To node). An edge’s geometry can be described by the coordinates of its two nodes, plus possible additional coordinates that are ordered and serve as vertices (or "shape" points) between these nodes. The order of the nodes determines the From-To orientation and left/right sides of the edge. Each edge is uniquely identified by a TLID. A TLID is defined as a permanent edge identifier that never changes. If the edge is split, merged or deleted its TLID is retired.

4.10 Parity

Parity is an attribute field in the addrfeat.shp used to indicate whether address house numbers within an address range are Odd (O), Even (E), or Both (B) (both odd and even).

note: for a full list of properties see page 79 of the TIGER technical documentation.


Openstreetmap

osm

A failed import of TIGER@2005 was attempted during 2005/2006, it was aborted and the data purged due to data integrity problems.

The first successful import was made in 2007/2008 using a some new ruby scripts, there is still some confusion around whether the importer Dave Hansen imported TIGER@2005 or TIGER@2006.

The TIGER import still has a long list of issues and the wiki clearly says "It is unlikely that the TIGER data ever will be imported again."

The TIGER import laid the foundations for the US road network in OSM and due to its lineage; inherited some of the TIGER/Line attributes mentioned above, a discussion of how those properties were mapped from one dataset to another can be found in the OSM wiki

===

tiger:*

For a full list of tiger tags in use see: taginfo, the most common metadata tags are:

total tag description
~13M tiger:cfcc Census Feature Classification Code (CFCC). replaced by MTFCC in 2007
~13M tiger:county Name of County followed by the abbreviated state name
~12M tiger:reviewed no was set on all ways in the TIGER import. It is intended as a way of tracking TIGER fixup progress. However it does not succeed in this aim and is largely pointless
~6M tiger:tlid TIGER permanent edge ID

Additionally there are tags which provide a more granular representation of street names:

OSM Key TIGER Field Example
tiger:name "#{fedirp} #{fename} #{fetype} #{fedirs}".strip "NW Chester St S"
tiger:name_direction_prefix fedirp "NW", "Southwest"
tiger:name_base fename "Chester"
tiger:name_type fetype "Street", "Ave"
tiger:name_direction_suffix fedirs "S"

And tags which represent address ranges:

OSM Key TIGER Field
from_address_right fraddr
to_address_right toaddr
from_address_left fraddl
to_address_left toaddl
zip_left zipl
zip_right zipr

For each additional address range:

OSM Key TIGER Field
from_address_right_1 fraddr
to_address_right_1 toaddr
from_address_left_1 fraddl
to_address_left_1 toaddl
zip_left_1 zipl
zip_right_1 zipr

et cetera

note: there are only ~1600 occurrences of from_address_right and from_address_left in OSM. see taginfo.

===

addr:interpolation:*

This is a more global/general approach towards defining house number ranges in OSM.

The wiki states that the tag value should be a numeric offset which indicates "the value to increment between house numbers" (so generally 2 in western countries which follow the 'zigzag' schema).

There are also special cases where the tag value is either the string odd or even (there is also all but it is not clear what this means; presumably both).

In this case odd is equivalent to having an offset 2 but also implies that the range starts at an odd number. The inverse is true for even.

For a full list of addr:interpolation tags in use see: taginfo, the most common tag values are:

total tag
~1M addr:interpolation=even
~1M addr:interpolation=odd
~18k addr:interpolation=all
~1.6k addr:interpolation=alphabetic
~200 addr:interpolation=4
~150 addr:interpolation=1
~100 addr:interpolation=2

It's natural to assume that the addr:interpolation tag is commonly added to ways which also contain a highway:* tag, looking at taginfo shows us that only 643 of the ~2M interpolation tags actually belong to road network segments.

The majority of addr:interpolation tags are 'invisible ways' which represent the path along which the houses will sit, on the base map they look like this:

basemap interpolation

In the example above the mapper has entered the first node and last note of the sequence, they tagged each with the addr:street and addr:housenumber tags and then joined the two with a way which contains a single tag addr:interpolation:odd, the way is represented by a dashed line.

This seems to be the most common use of the tag, in some cases the mapper is simply trying to cover a large amount of ground quickly, maybe they got tired of drawing the individual building outline for each house or maybe the area is still under construction, the wiki contains some additional tags for defining the confidence of the survey.

In some areas the interpolation values have been replaced by individual building outlines, in other areas it's more common, such as:

basemap interpolation

I would be interested in doing more research in to the amount of house numbers available in OSM using addr:street and addr:housenumber vs. using addr:interpolation:*. It would take some time to compute but would make a great blog post.