  • Using PostgreSQL and SQL to Randomly Sample Data

    7 min read

    Steve Pousty

    In the last post of this series we introduced trying to model fire probability in Northern California based on weather data. We showed how to use SQL to do data shaping and preparation. We ended with a data set that was ready with all the fire occurrences and weather data in a single table almost prepped for logistic regression. There is now one more step: sample the data. If you have worked with logistic regression before you know you should try to balance the number of occurrences (1) with a...
  • Online Upgrades in Postgres

    James Chanco Jr.

    In our previous blog post, we talked about upgrading a PostgreSQL cluster to a new major version with pg_upgrade. This can be fast and with little downtime even for large databases, but in some instances zero downtime may be essential for doing a major upgrade. This method is called an "online upgrade" and can be achieved through logical replication. While logical replication can help to achieve a zero-downtime, online upgrade, there are still some things to consider. For some hands-on exp...
  • Tuning Your Postgres Database for High Write Loads

    Tom Swartz

    As a database grows and scales up from a proof of concept to a full-fledged production instance, there are always a variety of growing pains that database administrators and systems administrators will run into. Very often, the engineers on the Crunchy Data support team help support enterprise projects which start out as small, proof of concept systems, and are then promoted to large scale production uses. As these systems receive increased traffic load beyond their original proof-of-concept s...
  • PostgreSQL Monitoring for App Developers: Alerts & Troubleshooting

    Jonathan S. Katz

    We've seen an example of how to set up PostgreSQL monitoring in Kubernetes. We've looked at two sets of statistics to keep track of in your PostgreSQL cluster: your vitals (CPU/memory/disk/network) and your DBA fundamentals. While staring at these charts should help you to anticipate, diagnose, and respond to issues with your Postgres cluster, the odds are that you are not staring at your monitor 24 hours a day. This is where alerts come in: a properly set up alerting system will let...
  • PostgreSQL Monitoring for Application Developers: The DBA Fundamentals

    Jonathan S. Katz

    I am an accidental DBA, with a huge emphasis on "accidental." I came to PostgreSQL as an application developer who really liked to program with SQL and use the database to help solve my problems. Nonetheless, these systems would enter into production, and as such I had to learn to support them. PostgreSQL monitoring and performance optimization is a vast topic. In fact, I'll read content like what my colleague Greg Smith wrote on benchmarking PostgreSQL 13 on Ubuntu and be reminded that I h...
  • PostgreSQL Monitoring for Application Developers: The Vitals

    Jonathan S. Katz

    My professional background has been in application development with a strong affinity for developing with PostgreSQL (which I hope comes through in previous articles). However, in many of my roles, I found myself as the "accidental" systems administrator, where I would troubleshoot issues in production and do my best to keep things running and safe. When it came to monitoring my Postgres databases, I initially took what I knew about monitoring a web application itself, i.e. looking at CPU, m...
  • How to Setup PostgreSQL Monitoring in Kubernetes

    Jonathan S. Katz

    You don't need monitoring until you need it. But if you're running anything in production, you always need it. This is particularly true if you are managing databases. You need to be able to answer questions like "am I running out of disk?" or "why does my application have degraded performance?" to be able to troubleshoot or mitigate problems before they occur. When I first made a foray into how to monitor PostgreSQL in Kubernetes, let alone in a containerized environment, I learned that a l...
  • Synchronous Replication in PostgreSQL

    David Youatt

    PostgreSQL has supported streaming replication and hot standbys since version 9.0 (2010), and synchronous replication since version 9.1 (2011). Streaming replication (and in this case we're referring to "binary" streaming replication, not "logical") sends the PostgreSQL WAL stream over a network connection from primary to a replica. By default, streaming replication is asynchronous: the primary does not wait for a replica to indicate that it wrote the data. With synchronous replication, the...
  • PostgreSQL 13 Upgrade and Performance Check on Ubuntu/Debian: 1.6GB/s random reads

    Greg Smith

    PostgreSQL 13 was released last week. I'm excited about this one, as the more mature partitioning plus logical replication features allow some long-requested deployment architectures. I ran 13 through my usual 144 test quick spin to see if everything was working as expected. Mainly boring stuff, but I was pleased to see that with the simple 128 client/4X RAM benchmark workload, Postgres 13 is driving 1.6GB/s of random read traffic requests to my PCI-e 4.0 NVM-e SSD. It keeps up with a whole RAI...
  • Using Postgres and pgRouting To Explore The Smooth Waves of Yacht Rock

    14 min read

    John Porvaznik

    pgRouting is a powerful routing tool, usually used for pathfinding/mapping/direction applications. (See Paul Ramsey's introduction to pgRouting here). It is, however, also a robust graph db implementation, and can be used for much more than just finding the directions to your great aunt Tildy’s. Yacht Rock (as if you didn’t know) is a music genre created well after its active era. It’s characterized by smooth dulcet sounds that bring to mind wavy blond-haired waspy men in boat shoes, and ult...