Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more
Marco Slot
Marco Slot
We are excited to release Crunchy Data Warehouse, a modern data warehouse for Postgres. Crunchy Data Warehouse combines Postgres with Iceberg, Parquet, and data lake formats for fast analytics queries and cost efficient storage.
Craig Kerstiens
Craig Kerstiens
Citus is in a small class of the most advanced Postgres extensions that exist. While there are many Postgres extensions out there, few have as many hooks into Postgres or change the storage and query behavior in such a dramatic way. Most that come to Citus have very wrong assumptions. Citus turns Postgres into a sharded, distributed, horizontally scalable database (that's a mouthful), but it does so for very specific purposes. Citus, in general, is fit for these type of applications and only the...
Read MorePaul Ramsey
Paul Ramsey
What's your favourite infinite sequence of non-repeating digits? There are some people who make a case for e , but to my mind nothing beats the transcendental and curvy utility of π, the ratio of a circle's circumference to its diameter. Drawing circles is a simple thing to do in PostGIS -- take a point, and buffer it. The result is circular, and we can calculate an estimate of pi just by measuring the perimeter of the unit circle. Except, look a little more closely -- this "circle" seems to...
Read MoreAykut Bozkurt
Aykut Bozkurt
Seamless integration of dbt with Crunchy Data Warehouse automates data movement between Postgres and Apache Iceberg. dbt’s modular SQL approach, combined with Iceberg’s scalable storage, and Postgres’ query engine means you can build fast, efficient, and reliable analytics—with minimal complexity. Today let’s dig into an example of using dbt with Postgres and Iceberg. The steps will be: 1. Set up Iceberg tables in Crunchy Data Warehouse using real-world real-time data from GitHub events 2. Confi...
Read MoreElizabeth Christensen
Elizabeth Christensen
Working on big data loads and or data type changes can be tricky - especially finding and correcting individual errors across a large data set. Postgres versions, 16, 17, and newer have a new function to help with data validation: . is a sql function that can be queried that will determine if a given input can be parsed into a specific type like numeric, date, JSON, etc. Here’s a super basic query to ask if ‘123’ is a valid integer. This function gives a t-true and f-false response. So if I as...
Read MoreMarco Slot
Marco Slot
PostgreSQL is commonly used to store event data coming from various kinds of devices. The data often arrives as individual events or small batches, which requires an operational database to capture. Features like time partitioning help optimize the storage layout for time range filtering and efficient deletion of old data. The PostgreSQL feature set gives you a lot of flexibility for handling a variety of IoT scenarios, but there are certain scenarios for it is less suitable, namely: • Long-te...
Read MoreBrian Pace
Brian Pace
Postgres' ability to execute queries in parallel is a powerful feature that can significantly improve query performance, especially on large datasets. However, like all resources, parallel workers are finite. When there aren't enough available workers, Postgres may downgrade a parallel query to a serial (non-parallel) execution. This sounds reasonable unless the performance of the downgraded query is well beyond the required response times needed by the application. While helping our clients w...
Read MorePaul Ramsey
Paul Ramsey
With the extension, it is possible to access gigabytes of raster data from the cloud, without ever downloading the data . How? The venerable extension (released 13 years ago ) already has the critical core support built-in! Rasters can be stored inside the database, or outside the database, on a local file system or anywhere it can be accessed by the underlying GDAL raster support library. The storage options include S3, Azure, Google, Alibaba, and any HTTP server that supports RANG...
Read MoreElizabeth Christensen
Elizabeth Christensen
Materialized views are widely used in Postgres today. Many of us are working with using connected systems through foreign data wrappers, separate analytics systems like data warehouses , and merging data from different locations with Postgres queries. Materialized views let you precompile a query or partial table, for both local and remote data. Materialized views are static and have to be refreshed. One of the things that can be really important for using materialized views efficiently is inde...
Read MoreGreg Sabino Mullane
Greg Sabino Mullane
It is rare that a Postgres table keeps the exact same structure year after year. New columns get added. Old columns get dropped. Column data types need to change. Those are all done with the ALTER TABLE command. One big drawback to these changes is that they may force a complete table rewrite. A rewrite means a completely new copy of the table is created, and then the old one is dropped. This can take a very long time for large tables. Worse, everything else is blocked/locked from using the tabl...
Read MoreKaren Jex
Karen Jex
Your database is configured for the needs of your day-to-day OLTP (online transaction processing) application workload, but what if you need to run analytics queries against your application data? How can you do that without compromising the performance of your application? Application data gradually builds up in your database over time, and at some point the business wants to glean insights from it by running analytics queries. Analytics activity, sometimes called OLAP (online analytical proces...
Read MorePaul Ramsey
Paul Ramsey
The number of cool things you can do with the http extension is large, but putting those things into production raises an important problem. The amount of time an HTTP request takes, 100s of milliseconds, is 10- to 20-times longer that the amount of time a normal database query takes. This means that potentially an HTTP call could jam up a query for a long time. I recently ran an HTTP function in an update against a relatively small 1000 record table. The query took 5 minutes to run, and durin...
Read More