Marco Slot
We are excited to release Crunchy Data Warehouse, a modern data warehouse for Postgres. Crunchy Data Warehouse combines Postgres with Iceberg, Parquet, and data lake formats for fast analytics queries and cost-efficient storage.
Marco Slot
Operational and analytical workloads have historically been handled by separate database systems, though they are starting to converge. We built Crunchy Data Warehouse to put PostgreSQL at the frontier of analytics systems, using modern technologies like Iceberg and a hybrid query engine. Combining operational and analytical capabilities is extremely useful, but it is not meant to drive all your workloads into a single system. In most organizations, application developers and analysts work...
Louise Grandjonc Leinweber
Postgres does a great job of making queries efficient. By gathering data in internal statistics tables, Postgres estimates many things before a query runs, such as whether an index scan will be better than a sequential scan, or how to pull data for the WHERE clause. What Postgres doesn’t know is how your columns are related to each other. Postgres isn’t a machine learning algorithm. It is not going to learn over time, as you query things, what is related and what isn't. It uses the same stat...
Andrew L'Ecuyer
In today's landscape of complex systems with numerous observability options, OpenTelemetry has emerged as the standard for collecting logs and metrics. It creates a vendor-agnostic platform that works with almost any source and destination by taking in logs and metrics from all components, standardizing them, and routing them where needed. Though setup requires effort, the payoff is substantial: a unified view of your entire system, even across distributed environments. With Crunchy Postgr...
Elizabeth Christensen, Christopher Winslett
Histograms were first used in a lecture in 1892 by Karl Pearson, the godfather of mathematical statistics. With so many data presentation tools available today, it’s hard to think that representing data as a graphic was classified as “innovation”, but it was. Histograms are a graphic presentation of the distribution and frequency of data. If you haven’t seen one recently, or don’t know the word histogram off the top of your head, it is a bar chart in which each bar represents the count of data with a defined...
Greg Nokes
Today's release of Crunchy Postgres for Kubernetes, version 5.8, is a substantial update that introduces a range of features designed to revolutionize your data infrastructure and observability. Whether you are a seasoned DevOps engineer, a database administrator, or an application developer seeking a reliable Postgres environment, this version offers enhancements that streamline your workflows and improve the overall efficiency of your deployments. The developer-focused improvements are: • Enhanc...
Craig Kerstiens
Today I'm excited to announce the release of Crunchy Data Warehouse on premises, which provides one of the easiest and yet richest ways to work with your data lake in the environment of your choosing. Built on top of Crunchy Postgres for Kubernetes, Crunchy Data Warehouse extends Postgres with a modern data warehouse solution, giving you: • The ability to easily query data where it resides in S3 or S3-compatible storage (like MinIO), with a variety of data formats supported including CSV, JSO...
Craig Kerstiens
As a database service provider, we store a number of logs internally to audit and oversee what is happening within our systems. When we started out, the volume of these logs was predictably low, but with scale it grew rapidly. Given the number of databases we run for users on Crunchy Bridge, the volume of these logs has grown to a sizable amount. Until last week, we retained those logs in AWS CloudWatch. Spoiler alert: this is expensive. While we have a number of strategies to drive efficiency...
Elizabeth Christensen, Doug Hunley
The Center for Internet Security (CIS) releases security benchmarks to cover a wide variety of infrastructure used in modern applications, including databases, operating systems, cloud services, containerized services, and even networking. Since 2016 Crunchy Data has collaborated with CIS to provide this security resource for those deploying Postgres. The output of this collaboration is a checklist for folks to follow and improve the security posture of Postgres deployments. The PostgreSQL CIS...
Önder Kalacı
Today we're excited to announce built-in maintenance for Iceberg in Crunchy Data Warehouse. This enhancement to Crunchy Data Warehouse brings PostgreSQL-style maintenance directly to Iceberg. The warehouse autovacuum workers continuously optimize Iceberg tables by compacting data and cleaning up expired files. In this post, we'll explore how we handle cleanup, and in follow-up posts, we'll take a deeper dive into compaction. If you use Postgres, you are probably familiar with tables and ro...
Greg Sabino Mullane
Someone recently asked on the Postgres mailing lists about how to remove unwanted duplicate rows from their table. They are “unwanted” in the sense that the same value appears more than once in a column designated as a primary key. We’ve seen an uptick in this problem since glibc was kind enough to change the way it sorts things. This can lead to invalid indexes when one upgrades their OS and modifies the underlying glibc library. One of the main effects of a corrupted unique index is allowi...
Craig Kerstiens
Citus is in a small class of the most advanced Postgres extensions that exist. While there are many Postgres extensions out there, few have as many hooks into Postgres or change the storage and query behavior in such a dramatic way. Most who come to Citus have very wrong assumptions. Citus turns Postgres into a sharded, distributed, horizontally scalable database (that's a mouthful), but it does so for very specific purposes. Citus, in general, is fit for these types of applications and only the...