Geocoding
An Introduction
Prof. David Bernstein
James Madison University
Computer Science Department
bernstdh@jmu.edu
Definitions
Geocoding:
The process of finding the coordinates of an object on the surface of the Earth (or other celestial body)
Digitizing:
The process of creating an electronic/digital representation (usually a vector representation) of a geometric shape
References Commonly Used in Geocoding
Street addresses
Telephone numbers (for land-line telephones)
Internet Protocol (IP) Addresses (for static addresses)
Geocoding Street Addresses
(Courtesy of xkcd)
Geocoding Street Addresses (cont.)
One Approach:
Digitize street intersections
Determine the collection of addresses on each segment
Interpolate individual addresses
Collections of Addresses:
Actual low and high addresses
Potential low and high addresses
List of actual addresses
Geocoding Street Addresses (cont.)
Using Potential Addresses and Directed Segments
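The interpolation step can be sketched as a linear interpolation along a directed segment. This is only a minimal illustration, not the method of any particular geocoder: the Segment class, its field names, and the use of planar (x, y) coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A directed street segment between two digitized intersections (hypothetical type)."""
    start: tuple[float, float]  # (x, y) of the intersection where the range begins
    end: tuple[float, float]    # (x, y) of the intersection where the range ends
    low: int                    # potential low address, assumed to sit at `start`
    high: int                   # potential high address, assumed to sit at `end`


def interpolate_address(segment: Segment, number: int) -> tuple[float, float]:
    """Estimate the coordinates of a house number by linear interpolation."""
    if not segment.low <= number <= segment.high:
        raise ValueError(f"{number} is outside {segment.low}-{segment.high}")
    if segment.high == segment.low:
        fraction = 0.5  # degenerate range: fall back to the midpoint
    else:
        fraction = (number - segment.low) / (segment.high - segment.low)
    x = segment.start[0] + fraction * (segment.end[0] - segment.start[0])
    y = segment.start[1] + fraction * (segment.end[1] - segment.start[1])
    return (x, y)


block = Segment(start=(0.0, 0.0), end=(100.0, 0.0), low=100, high=200)
print(interpolate_address(block, 150))  # (50.0, 0.0)
```

Because the segment is directed, the low address is tied to the start point. Reversing the direction without swapping the range would mirror every estimate along the segment.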
Common Problems when Geocoding Street Addresses
Digitizing Problems:
Intersections with incorrect coordinates
Address Collection Problems:
Missing collections
Gaps in ranges
Confusion about odd and even sides
Duplicate names
Interpolation Problems:
Words for numbers (e.g., "One" for "1")
Spelling mistakes/differences
"Route number" and name for a single segment
Multiple names for a single segment
Abbreviations (e.g., "Port" for "Port Republic", "3rd" for "third")
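Several of the name-related problems above (words for numbers, abbreviations, route numbers, spelling differences) are often reduced by normalizing street names before matching. The sketch below shows one possible way to do that. The lookup tables, the alias entry, and the function names are hypothetical, and a real system would need much larger tables and more careful rules.

```python
import difflib
import re

# Hypothetical lookup tables, kept tiny for illustration.
NUMBER_WORDS = {"one": "1", "two": "2", "first": "1st", "second": "2nd", "third": "3rd"}
# Note: "st" is ambiguous ("street" or "saint"); this sketch ignores that.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue", "rt": "route", "rte": "route"}
# Hypothetical alias: a route number and a name that refer to the same segment.
ALIASES = {"route 9": "main street"}


def normalize_street_name(name: str) -> str:
    """Lowercase, strip punctuation, and expand number words and abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    expanded = []
    for token in tokens:
        token = NUMBER_WORDS.get(token, token)
        token = ABBREVIATIONS.get(token, token)
        expanded.append(token)
    return " ".join(expanded)


def match_street(name: str, known_names: list[str], cutoff: float = 0.8):
    """Return the closest known (already normalized) name, or None if nothing is close."""
    key = normalize_street_name(name)
    key = ALIASES.get(key, key)
    candidates = difflib.get_close_matches(key, known_names, n=1, cutoff=cutoff)
    return candidates[0] if candidates else None


known = ["port republic road", "main street", "3rd street"]
print(normalize_street_name("Third St."))        # 3rd street
print(match_street("Port Republc Rd.", known))   # port republic road
print(match_street("Rt. 9", known))              # main street
```

Fuzzy matching tolerates small spelling mistakes but cannot recover a truncation such as "Port" for "Port Republic". Cases like that need an explicit alias or human review.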
Geocoding (Land-Line) Telephone Numbers
Direct Approach:
Digitize telephone poles or conduits or street intersections
Determine the collection of phone numbers on each segment
Interpolate individual telephone numbers
Indirect Approach:
Associate each phone number with a street address
Geocode the street address
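The indirect approach is a two-step lookup, which might be sketched as follows. The directory contents, the stand-in geocoder, and the coordinates are invented for illustration. In practice the second step would call whatever street-address geocoder is available.

```python
from typing import Callable, Optional

Coordinates = tuple[float, float]

# Hypothetical directory that associates each phone number with a street address.
DIRECTORY = {
    "5405550100": "100 Main Street",
    "5405550101": "150 Main Street",
}

# Stand-in for a real street-address geocoder (coordinates are made up).
KNOWN_ADDRESSES = {
    "100 Main Street": (0.0, 0.0),
    "150 Main Street": (50.0, 0.0),
}


def geocode_address(address: str) -> Optional[Coordinates]:
    """Return coordinates for a street address, or None if it is unknown."""
    return KNOWN_ADDRESSES.get(address)


def geocode_phone(
    phone: str,
    directory: dict[str, str] = DIRECTORY,
    geocoder: Callable[[str], Optional[Coordinates]] = geocode_address,
) -> Optional[Coordinates]:
    """Step 1: phone number -> street address. Step 2: street address -> coordinates."""
    digits = "".join(ch for ch in phone if ch.isdigit())  # ignore punctuation
    address = directory.get(digits)
    if address is None:
        return None
    return geocoder(address)


print(geocode_phone("(540) 555-0101"))  # (50.0, 0.0)
```

Any error in the street-address geocoder carries over directly to the phone number, so this approach is only as accurate as its second step.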
Geocoding (Static) Internet Addresses
One Approach:
Collect the coordinates for each IP address
Another Approach:
Collect the coordinates of "large" routers/servers
Collect the set of "nearby" IP addresses
Interpolate individual IP addresses
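One crude reading of the second approach is to treat each "large" router as owning a block of "nearby" addresses and to assign every address in the block the router's coordinates. The sketch below makes that assumption explicit. The table, the coordinates, and the use of network prefixes to model "nearby" are all hypothetical, and the address blocks are ranges reserved for documentation.

```python
import ipaddress
from typing import Optional

# Hypothetical table: (block of "nearby" addresses, coordinates of the router serving it).
ROUTERS = [
    (ipaddress.ip_network("192.0.2.0/24"), (10.0, 20.0)),
    (ipaddress.ip_network("192.0.2.128/25"), (11.0, 21.0)),  # more specific block
    (ipaddress.ip_network("198.51.100.0/24"), (30.0, 40.0)),
]


def geocode_ip(ip: str) -> Optional[tuple[float, float]]:
    """Return the coordinates of the most specific router block containing `ip`."""
    address = ipaddress.ip_address(ip)
    matches = [(network, xy) for network, xy in ROUTERS if address in network]
    if not matches:
        return None
    # Prefer the longest prefix, i.e. the smallest block that contains the address.
    _, coordinates = max(matches, key=lambda match: match[0].prefixlen)
    return coordinates


print(geocode_ip("192.0.2.10"))   # (10.0, 20.0)
print(geocode_ip("192.0.2.200"))  # (11.0, 21.0)
print(geocode_ip("203.0.113.5"))  # None
```

Unlike street addresses, numerically adjacent IP addresses need not be physically adjacent, so the result is at best the location of the serving router, not of the device itself.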
There's Always More to Learn