Geocoding with Web APIs in Postgres

Jacob Coblentz

3 min read

Geocoding is the process of taking addresses or location information and getting the coordinates for that location. Anytime you route a new location or look up a zip code, the back end is geocoding the location and then using the geocode inside other PostGIS functions to give you the routes, locations, and other data you asked for.

PostGIS comes equipped with an easy way to use the US Census data with the Tiger geocoder. Using the Tiger geocoder requires downloading large amounts of census data and in space-limited databases, this may not be ideal. Using a geocoding web API service can be a space saving solution in these cases.

I am going to show you how to set up a really quick function using plpython3u to hit a web service geocoder every time that we get a new row in the database.

Installing plpython3u

The plpython3u extension comes with Crunchy Bridge or you can add it to your database. To get started run the following:

CREATE EXTENSION  plpython3u;

Creating a function to geocode addresses

In this example, I'll use the US census geocoding API as our web service, and build a function to geocode addresses based on that.

The function puts together parameters to hit the census geocoding API and then parses the resulting object, and returns a geometry:

CREATE OR REPLACE FUNCTION geocode(address text)
RETURNS geometry
AS $$
	import requests
	try:
		payload = {'address' : address , 'benchmark' : 2020, 'format' : 'json'}
		base_geocode = 'https://geocoding.geo.census.gov/geocoder/locations/onelineaddress'
		r = requests.get(base_geocode, params = payload)
		coords = r.json()['result']['addressMatches'][0]['coordinates']
		lon = coords['x']
		lat = coords['y']
		geom = f'SRID=4326;POINT({lon} {lat})'
	except Exception as e:
		plpy.notice(f'address failed: {address}')
		plpy.notice(f'error: {e.message}')
		geom = None
	return geom
$$
LANGUAGE 'plpython3u';

Using this function to geocode Crunchy Data's headquarters:

SELECT ST_AsText(geocode('162 Seven Farms Drive Charleston, SC 29492'));

Deploying this function for new data

But what if we want to automatically run this every time an address is inserted into a table? Let's say we have a table with a field ID, an address, and a point that we want to auto-populate on inserts.

CREATE TABLE addresses (
	fid SERIAL PRIMARY KEY,
	address VARCHAR,
	geom GEOMETRY(POINT, 4326)
);

We can make use of a Postgres trigger to add the geocode before every insert! Triggers are a very powerful way to leverage built in functions to automatically transform your data as it enters the database, and this particular case is a great demo for them!

CREATE OR REPLACE FUNCTION add_geocode()
RETURNS trigger AS
$$
DECLARE
    loc geometry;
BEGIN
    loc := geocode(NEW.address);
    NEW.geom = loc;
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER update_geocode BEFORE INSERT ON addresses
    FOR EACH ROW EXECUTE FUNCTION add_geocode();

Now when running an insert, the value is automatically geocoded!

INSERT INTO addresses(address) VALUES ('415 Mission St, San Francisco, CA 94105');

postgres=# SELECT fid, address, ST_AsText(geom) FROM addresses;
 fid |                 address                 |                        geom
-----+-----------------------------------------+----------------------------------------------------
   1 | 415 Mission St, San Francisco, CA 94105 | 0101000020E610000097CD0E2B66995EC0BB004B2729E54240

Summary

If you’re space limited, using a web API based geocoder might be the way to go. Using a geocoder function with triggers on new row inserts will get you geocoded addresses in a snap.

Avatar for Jacob Coblentz

Written by

Jacob Coblentz

March 2, 2023 More by this author