PostGIS Performance: Indexing and EXPLAIN

Paul Ramsey

3 min readMore by this author

I am kicking off a short blog series on PostGIS performance fundamentals. For this first example, we will cover fundamental indexing.

We will explore performance using the Natural Earth “admin0” (countries) data (258 polygons) and their “populated places” (7342 points).

A classic spatial query is the “spatial join”, finding the relationships between objects using a spatial contain.

“How many populated places are there within each country?”

SELECT Count(*), a.name
FROM admin0 a
JOIN popplaces p
  ON ST_Intersects(a.geom, p.geom)
GROUP BY a.name ORDER BY 1 DESC;

This returns an answer, but it takes 2200 milliseconds! For two such small tables, that seems like a long time. Why?

The first stop in any performance evaluation should be the “EXPLAIN” command, which returns a detailed explanation of how the query is executed by the database.

EXPLAIN SELECT Count(*), a.name
FROM admin0 a
JOIN popplaces p
  ON ST_Intersects(a.geom, p.geom)
GROUP BY a.name;

Explain output looks complicated, but a good practice is to start from the middle (the most deeply nested) and work your way out.

                              QUERY PLAN
-------------------------------------------------------------------------
 GroupAggregate  (cost=23702129.78..23702145.38 rows=258 width=18)
   Group Key: a.name
   ->  Sort  (cost=23702129.78..23702134.12 rows=1737 width=10)
         Sort Key: a.name
         ->  Nested Loop  (cost=0.00..23702036.30 rows=1737 width=10)
               Join Filter: st_intersects(a.geom, p.geom)
               ->  Seq Scan on admin0 a  (cost=0.00..98.58 rows=258 width=34320)
               ->  Materialize  (cost=0.00..328.13 rows=7342 width=32)
                     ->  Seq Scan on popplaces p  (cost=0.00..291.42 rows=7342 width=32)

The query plan includes a minimum and maximum potential cost for each step in the plan. Steps with large differences are potential bottlenecks. Our bottleneck is in the “nested loop” join, which is performing the spatial join.

  • For each geometry in the admin0 table:
    • Check every geometry in the popplaces table
      • If it passes the join filter, keep it in the join

This pattern of checking every potential intersection is a lot of work, even for our small tables. For 258 countries and 7342 places, it runs 1.8 million intersection tests!

Just as for a non-spatial join, the key to making this query efficient is adding an index. In this case, an index on the populated places geometry.

CREATE INDEX popplaces_geom_x ON popplaces USING GIST (geom);

The PostGIS spatial index is implemented as an “r-tree” using the GIST “access method”. The “r-tree” index algorithm is auto-tuning, so you do not need to fiddle with parameters to get the best index for your data.

Important! Do not forget to specify the GIST access method with the USING GIST keywords in your index creation. If you leave them out, you will build a default PostgreSQL b-tree index instead, and that will provide no speed-up at all for your spatial join.

Running with the index in place, the SQL is exactly the same.

SELECT Count(*), a.name
FROM admin0 a
JOIN popplaces p
  ON ST_Intersects(a.geom, p.geom)
GROUP BY a.name;

But now it takes 200 milliseconds, 10 times faster! Why? The SQL has not changed, but thanks to the index, the query plan has changed.

                                QUERY PLAN
-----------------------------------------------------------------------------
 HashAggregate  (cost=4185.41..4187.99 rows=258 width=18)
   Group Key: a.name
   ->  Nested Loop  (cost=0.15..4176.73 rows=1737 width=10)
         ->  Seq Scan on admin0 a  (cost=0.00..98.58 rows=258 width=34320)
         ->  Index Scan using popplaces_geom_x on popplaces p  (cost=0.15..15.80 rows=1 width=32)
               Index Cond: (geom && a.geom)
               Filter: st_intersects(a.geom, geom)

The join is still a nested loop on the admin0 geometry, but instead of a sequence scan on the populated places, costing as much as 300, the inner loop is an index scan, costing only 15. As much as 20 times cheaper, resulting in our overall 10 times faster query time.



Join us this year on November 20 for PostGIS Day 2025, a free, virtual, community event about open source geospatial!