PostGIS is a "geospatial database" and the "geo" in "geospatial" is an important qualifier: it means that all the coordinates in PostGIS point, lines, and polygons can be located somewhere on the earth.
Because it is (mostly) spherical, we don't use cartesian coordinates (x, y) to describe locations on the earth, we use spherical coordinates. If you were ever introduced to spherical coordinates in math class, you would know the two coordinates as "theta" and "phi". If you were introduced to them in an astronomy class, you might know them as "bearing" and "elevation". If you are talking to a geographer, they will use the terms "longitude" and "latitude".
The PostGIS "geography" type is a geospatial type that understands coordinates as spherical coordinates, in latitude and longitude.
Here is a small table of world airports, loaded into a geography column.
-- Airports table CREATE TABLE airports ( code text primary key, geog geography(point) ); -- Los Angeles INSERT INTO airports (code, geog) VALUES ('LAX', 'POINT(-118.4079 33.9434)'); -- Paris INSERT INTO airports (code, geog) VALUES ('CDG', 'POINT(2.5559 49.0083)'); -- Tokyo INSERT INTO airports (code, geog) VALUES ('NRT', 'POINT(139.733 35.567)'); -- New York INSERT INTO airports (code, geog) VALUES ('JFK', 'POINT(-73.77813 40.641311)'); -- Spatial index on geography CREATE INDEX airports_geog_x ON airports USING GIST (geog);
We can calculate distances between airports with a simple self-join:
SELECT a.code AS orig, b.code AS dest, ST_Distance(a.geog, b.geog) FROM airports a CROSS JOIN airports b WHERE a.code = 'LAX';
The result might look a little funny, because the distances are in meters:
orig | dest | st_distance ------+------+------------------ LAX | LAX | 0 LAX | CDG | 9124665.27317673 LAX | NRT | 8833954.77277118 LAX | KEF | 6941795.61535381
Suppose I run a flight from LAX (Los Angeles) to CDG (Paris), does it make economic sense to stop over in New York?
Well, what's the shortest distance between the flight and JFK (New York) airport?
-- LAX-CGD route line WITH lax_cdg AS ( SELECT ST_MakeLine(a.geog::geometry, b.geog::geometry)::geography AS geog FROM airports a, airports b WHERE a.code = 'LAX' and b.code = 'CDG' ) -- Distance from route line to JFK SELECT ST_Distance(a.geog, lax_cdg.geog) AS distance FROM airports a, lax_cdg WHERE a.code = 'JFK';
1962 kilometers! Quite a long way. But wait a minute, the route looked really close on the map. Is the system wrong?
No, I was just fooling with the map. The system is correct, the map is wrong. On the sphere, the shortest distance between any two points is a great circle. The straight line on the map wasn't reflective of the actual route.
Incidentally, this kind of problem is one that people working with geography grapple with frequently: our intuition about places and routes and shapes in the world is strongly conditioned to a use a cartesian, flat view of the world.
So, should you use the "geography" type all the time?
The benefits of geography we have already shown:
- It is a global coordinate space.
- It supports longitude/latitude coordinates without any special code.
- It indexes and calculates over the dateline and poles automatically.
The drawbacks of geography are visible in our final query:
- Most PostGIS functions are built to use the planar "geometry" type, and you will have to cast your data to use them.
- Geography calculations on the sphere are quite expensive, for big queries you'll find a noticeable performance penalty.
In general, if your working area is geographically small enough (a state, a region, a country) to fit in a projected system, using a projected system and the "geometry" type for all your calculations will result in simpler SQL and faster queries.
But if you have global data, you'll find your system is simpler to code and reason about if you use the "geography" type.
You can learn even more about the geography data type through this interactive course.
August 4, 2020 •More by this author