
  • 6 min read

    Musings of a PostgreSQL Data Pontiff Episode 1

    Joe Conway

    This is the first in a series of blogs on the topic of using PostgreSQL for "data science". I put that in quotes because I would not consider myself to be a practicing data scientist, per se. Of course I'm not sure there is a universally accepted definition of data scientist. This article provides a nice illustration of my point. I do believe my credentials are such that no one can accuse me of term appropriation. Toward establishment of that end, this first installment is a walk down memory l...

  • 6 min read

    Announcing the Crunchy Postgres Operator 4.6.0 with rolling updates, pod tolerations, node affinity and more

    Jonathan S. Katz

    Please Note: This post references an older version of Crunchy Postgres for Kubernetes. See PGO Documentation for the latest version. The Crunchy Data team announced the latest release of our open source PostgreSQL Operator for Kubernetes 4.6 a few weeks back. So let's take a whirlwind tour of how we make it easy to run production-quality Postgres on Kubernetes. With this release, we included features to streamline management of the Operator, added security features, and extra system metric...

  • 4 min read

    Enhancing PostgreSQL 13 Security with the CIS Benchmark

    Douglas Hunley

    Crunchy Data has recently announced an update to the CIS PostgreSQL Benchmark by the Center for Internet Security, a nonprofit organization that provides publications around standards and best practices for securing technology systems. This newly published CIS PostgreSQL 13 Benchmark joins the existing CIS Benchmarks for PostgreSQL 9.5, 9.6, 10, 11, and 12 while continuing to build upon the PostgreSQL Security Technical Implementation Guide (PostgreSQL STIG). A CIS Benchmark is a set...

  • 4 min read

    Helm, GitOps and the Postgres Operator

    Jonathan S. Katz

    This post provides guidance for v4x. For the latest on PGO, GitOps and the Helm installer, please see: https://github.com/CrunchyData/postgres-operator-examples/tree/main/helm In the previous article, we explored GitOps and how to apply GitOps concepts to PostgreSQL in a Kubernetes environment with the Postgres Operator and custom resources. The article went on to mention additional tooling that has been created to help employ GitOps principles within an environment, including Helm. While the m...

  • 6 min read

    Fuzzy Name Matching in Postgres

    Paul Ramsey

    A surprisingly common problem in both application development and analysis is: given an input name, find the database record it most likely refers to. It's common because databases of names and people are common, and it's a problem because names are a very irregular identifying token. The page "Falsehoods Programmers Believe About Names" covers some of the ways names are hard to deal with in programming. This post will ignore most of those complexities, and deal with the problem of matching up...

  • 9 min read

    Using PostgreSQL to Shape and Prepare Scientific Data

    Steve Pousty

    Today we are going to walk through some of the preliminary data shaping steps in data science using SQL in Postgres. I have a long history of working in data science, including my Master's degree (in Forestry) and Ph.D. (in Ecology), and during this work I would often get raw data files that I had to get into shape to run analysis. Whenever you start to do something new there is always some uncomfortableness. That "why is this so hard" feeling often stops me from trying something new, but...

  • Query Optimization in Postgres with pg_stat_statements

    Kat Batuigas

    "I want to work on optimizing all my queries all day long because it will definitely be worth the time and effort," is a statement that has hopefully never been said. So when it comes to query optimizing, how should you pick your battles? Luckily, in PostgreSQL we have a way to take a system-wide look at database queries:

      • Which ones have taken up the most amount of time cumulatively to execute
      • Which ones are run the most frequently
      • And how long on average they take to execute...

  • 9 min read

    Deep PostgreSQL Thoughts: Resistance to Containers is Futile

    Joe Conway

    Recently I ran across grand sweeping statements that suggest containers are not ready for prime time as a vehicle for deploying your databases. The definition of "futile" is something like "serving no useful purpose; completely ineffective". See why I say this below, but in short, you probably are already, for all intents and purposes, running your database in a "container". Therefore, your resistance is futile. And I'm here to tell you that, at least in so far as PostgreSQL is concerned, those...

  • 8 min read

    ArcGIS Feature Service to PostGIS: The QGIS Way

    Kat Batuigas

    As a GIS newbie, I've been trying to use local open data for my own learning projects. I recently relocated to Tampa, Florida and was browsing through the City of Tampa open data portal and saw that they have a Public Art map. That looked like a cool dataset to work with but I couldn't find the data source anywhere in the portal. I reached out to the nice folks on the city's GIS team and they gave me an ArcGIS-hosted URL. To get the public art features into PostGIS I decided to use the "ArcG...

  • 5 min read

    Kubernetes Pod Tolerations and Postgres Deployment Strategies

    Jonathan S. Katz

    The desire to use Pod tolerations to schedule Postgres instances sometimes comes up around complex Kubernetes deployments. To address this feedback, we added support for tolerations to the 4.6 release of the Postgres Operator along with improvements to using node affinity. To use tolerations with PostgreSQL deployments, it helps to understand some of the mechanics behind several Kubernetes features to get the desired result of deploying PostgreSQL to a specific node group. Let's take a loo...
