Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Latest posts from Paul Ramsey

  • XKCD Bad Map Projection with PostGIS

    Paul Ramsey

    Last week, Randall Munroe dropped his latest XKCD "Bad Map Projection", number six, "ABS(Longitude)", which looks like this:

    ABS(Longitude)

    Truly this is a bad map projection, on a par with the previous five:

    Read More
  • Remote Access Anything from Postgres

    Paul Ramsey

    In my last blog post, I showed four ways to access a remotely hosted CSV file from inside PostgreSQL:

    • Using the COPY command with the PROGRAM option,
    • Using the http extension
    Read More
  • Holy Sheet! Remote Access CSV Files from Postgres

    Paul Ramsey

    An extremely common problem in fast-moving data architectures is providing a way to feed ad hoc user data into an existing analytical data system.

    Do you have time to whip up a web app? No! You have a database to feed, and events are spiraling out of control... what to do?

    How about a Google Sheet? The data layout is obvious, you can even enforce things like data types and required columns using locking and protecting, and unlike an Excel or LibreOffice document, it's always online, so you can hook the data into your system directly.

    Access Sheets Data Remotely

    Read More
  • Tags and Postgres Arrays, a Purrrfect Combination

    Paul Ramsey

    In a previous life, I worked on a CRM system that really loved the idea of tags. Everything could be tagged, users could create new tags, tags were a key organizing principle of searching and filtering.

    The trouble was, modeled traditionally, tags can really make for some ugly tables and equally ugly queries. Fortunately, and as usual, Postgres has an answer.

    Today I’m going to walk through working with tags in Postgres with a sample database of 🐈 cats and their attributes

    • First, I’ll look at a traditional relational model
    • Second, I’ll look at using an integer array to store tags
    • Lastly, I’ll test text arrays directly embedding the tags alongside the feline information
    Read More
  • Easy PostgreSQL Time Bins

    Paul Ramsey

    It's the easiest thing in the world to put a timestamp on a column and track when events like new records or recent changes happen, but what about reporting?

    Binning data for large data sets like time series is a great way to let you group data sets by obvious groups and then use SQL to pull out a query that easily works in a graph.

    Here's some PostgreSQL secrets that you can use to build up complete reports of time-based data.

    Earthquake Data

    Read More
  • Postgres Raster Query Basics

    Paul Ramsey

    In geospatial terminology, a "raster" is a cover of an area divided into a uniform gridding, with one or more values assigned to each grid cell.

    rows and columns showing how a raster works with a pixel

    A "raster" in which the values are associated with red, green and blue bands might be a visual image. The rasters that come off the Landsat 7

    Read More
  • SQL Tricks for More Effective CRUD

    Paul Ramsey

    Over and over when I look at applications for performance, the lesson I learn and re-learn is, do more things right inside the database.

    What is CRUD?

    Create, read, update, delete! All the things you do to a table or collection of tables to work with your ever-changing data.

    Most CRUD examples, and most CRUD thinking, tend to focus on one table at a time. That's easy to understand. It's also unrealistic. Even the simplest application will be working with several interlinked normalized tables.

    Here's our working example tables.

    sample schema

    Hands On Tutorial

    Read More
  • Postgres Strings to Arrays and Back Again

    Paul Ramsey

    One of my favourite (in an ironic sense) data formats is the "CSV in the CSV", a CSV file in which one or more of the column is itself structured as CSV.

    Putting CSV-formatted columns in your CSV file is a low tech approach to shipping a multi-table relational data structure in a single file. The file can be read by anything that can read CSV (which is everything?) and ships around the related data in a very readable form.

    Station North,"-1,-4,-14,-15,-16,-15,-12,-9,-3,0,1,2"
    Station West,"2,4,5,6,9,10,15,16,13,12,10,9,5,3,1"
    Station East,"5,3,2,4,5,6,9,10,15,16,13,12,10,9,5,4,2,1"
    Station South,"12,18,22,25,29,30,33,31,30,29,28,25,24,23,14"
    
    Read More
  • Percentage Calculations Using Postgres Window Functions

    Paul Ramsey

    Back when I first learned SQL, calculating percentages over a set of individual contributions was an ungainly business:

    • First calculate the denominator of the percentage,
    • Then join that denominator back to the original table to calculate the percentage.

    This requires two passes of the table: once for the denominator and once for the percentage. For BI queries over large tables (that is, for most BI queries) more passes over the table slow performance significantly.

    Also, the SQL was really ugly!

    With modern PostgreSQL, you can calculate complex percentages over different groups in a single pass

    Read More
  • Postgres Insider Terminology

    Paul Ramsey

    Last week Craig Kerstiens published a great introduction to Postgres terminology, covering some of the basics you might run into when just getting started.

    The funny thing about jargon is how quickly we get used to it, and forget we are even using it. It becomes part of the secret handshake, the way we signal to other members of our tribe that we're part of the group.

    When I first started going to Postgres conferences and listening to talks by Postgres core developers I suddenly found myself at sea. What were these strange words and phrases they were using?

    It turns out that a lot of them are taken from the Postgres code base, which in turn naturally uses them because they are part of Codd's relational model

    Read More