Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more
Paul Ramsey
Paul Ramsey
Generating random numbers is a surprisingly common task in programs, whether it's to create test data or to provide a user with a random entry from a list of items. PostgreSQL comes with just a few simple foundational functions that can be used to fulfill most needs for randomness. Almost all your random-ness needs will be met with the function. The function returns a double precision float in a continuous uniform distribution between 0.0 and 1.0. What does that mean? It means that you c...
Read MoreChristopher Winslett
Christopher Winslett
Postgres has been steadily building on the JSON functionality initially released more than 10 years ago . With Postgres 16, working with JSON has gotten a couple nice improvements. Primarily, this release added features that ease the manipulation of data into JSON and improve the standard SQL functionality using JSON. TL;DR: • A SQL/JSON data-type check. For instance, this lets you ask with SQL if something • Addition of SQL-standard JSON functions: , , , and A SQL/JSON data-type check. For...
Read MoreBrian Pace
Brian Pace
Support for logical replication arrived at Postgres just over five years ago with Postgres 10. Since then it's had a steady stream of improvements, but mostly logical replication has been limited to migrating data or unidirectional change data capture workflows. With Postgres 16 freshly released today, Postgres now has a better foundation to leverage logical replication for active-active setups. If you're unfamiliar with the concepts of logical replication or what does active-active mean we've g...
Read MorePaul Ramsey
Paul Ramsey
A user on the postgis-users had an interesting question today: how to generate a geometry column in PostGIS with random points, linestrings, or polygons? Random data is important for validating processing chains, analyses and reports. The best way to test a process is to feed it inputs! Random points is pretty easy -- define an area of interest and then use the PostgreSQL function to create the X and Y values in that area. Filling a target shape with random points is a common use case, and...
Read MoreChristopher Winslett
Christopher Winslett
Postgres’ pgvector extension recently added HNSW as a new index type for vector data. This levels up the database for vector-based embeddings output by AI models. A few months ago, we had written about approximate nearest neighbor pgvector performance using the available list-based indexes . Now, with the addition of HNSW, pgvector can use the latest graph based algorithms to approximate nearest neighbor queries. As with all things databases, there are trade-offs, so don’t throw away the list...
Read MoreElizabeth Christensen
Elizabeth Christensen
Postgres databases are very compliant, they do what you tell them until you tell them to stop. It is really common for a runaway process, query, or even something a co-worker runs to accidentally start a never ending transaction in your database. This potentially uses up memory, i/o, or other resources. Postgres has no preset default for this. To find out your current setting: A good rule of thumb can be a minute or a couple minutes. This is a connection-specific setting, so you’ll need to rec...
Read MoreChristopher Winslett
Christopher Winslett
Note: We have additional articles in this Postgres AI series . Vector data has made its way into Postgres and I’m seeing more and more folks using it by the day. As I’ve seen use cases trickle in, I have been thinking a lot about scaling data and how to set yourself up for performance success from the beginning. The two primary trade-offs are performance versus accuracy. When seeking performance with vector data, we are using nearest neighbor algorithms, and those algorithms are built around p...
Read MoreBob Pacheco
Bob Pacheco
When working with containers you always have to be mindful of the age of the containers. Every day new CVEs are being discovered and are turning up in image scans. One benefit of having a CI/CD pipeline is the ability to implement security automation. Let's assume you release a monthly update of your containers that are built on the latest version of the base image and all of the most recent patches have been applied. This ensures that each month you can remediate any CVEs that might have popped...
Read MoreElizabeth Christensen
Elizabeth Christensen
Beyond a basic query with a join or two, many queries require extracting subsets of data for comparison, conditionals, or aggregation. Postgres’ use of the SQL language is standards compliant and SQL has a world of tools for subqueries. This post will look at many of the different subquery tools. We’ll talk about the advantages and use cases of each, and provide further reading and tutorials to dig in more. I’ll take a broad definition of “subquery”. Why am I calling all of these subqueries? The...
Read MoreGreg Sabino Mullane
Greg Sabino Mullane
We've also loaded a tutorial for Day 19's challenge if you want to try it with a pre-loaded data set. This article will contain spoilers both on how I solved 2022 Day 19's challenge "Not Enough Minerals" using SQL, as well as general ideas on how to approach the problem. I recommend trying to solve it yourself first, using your favorite language. Day 19 tasks us with creating lots and lots of mini robots to gather resources and feed our herd of elephants (rescued a few days back ). We'll do...
Read More