Please Note: This post references an older version of Crunchy Postgres for Kubernetes. See PGO Documentation for the latest version.
The Crunchy Data team announced the latest release of our open source PostgreSQL Operator for Kubernetes, version 4.6, a few weeks back. So let's take a whirlwind tour of how we make it easy to run production-quality Postgres on Kubernetes. With this release, we streamlined management of the Operator and added security features and extra system metrics to enhance your high availability Kubernetes Postgres cluster.
Let's take a look at what's new in the Postgres Operator 4.6!
- Rolling Updates
- Pod Tolerations
- Node Affinity Enhancements
- TLS for pgBouncer
- Enable/Disable Sidecars with Rolling Updates
- Streamlined container images
Rolling restart and update policies are important for high availability environments. With rolling restarts, changes and configurations can be managed with minimal to zero disruption.
This release introduces a mechanism for the PostgreSQL Operator to perform rolling updates, both implicitly on certain operations that change the deployment templates and explicitly through the pgo restart command with the --rolling flag. Some of the operations that will trigger a rolling update include:
- Memory resource adjustments
- CPU resource adjustments
- Custom annotation changes
- Tablespace additions
- Adding/removing the metrics sidecar to a PostgreSQL cluster
For example, let's say I want to resize the memory of my HA Postgres cluster from 4Gi to 8Gi. If I execute the following command:
pgo update cluster hippo --memory=8Gi
The Postgres Operator will roll out the change in the following way:
- The PostgreSQL Operator will first apply the memory changes to the replicas one-at-a-time and ensure they are healthy before proceeding.
- Once the change has rolled out to all the replicas, the Postgres Operator will perform a "switchover" (aka a "controlled failover") to the best replica candidate. This will promote a replica to become a new primary.
- The previous primary is then updated with the new memory settings, and the cycle is complete!
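The ordering above can be sketched in a few lines of Python. This is only an illustration of the sequence, not the Operator's actual code: the Instance class, the rolling_update function, the is_healthy callback, and the instance names hippo-a, hippo-b, and hippo-c are all hypothetical, and the sketch simply promotes the first replica where the real Operator picks the best candidate.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    role: str    # "primary" or "replica"
    memory: str


def rolling_update(instances, new_memory, is_healthy=lambda inst: True):
    """Update replicas first, switch over, then update the former primary.

    Returns the ordered list of steps taken, for illustration only.
    """
    steps = []
    primary = next(i for i in instances if i.role == "primary")
    replicas = [i for i in instances if i.role == "replica"]

    # 1. Update replicas one at a time, checking health before moving on.
    for replica in replicas:
        replica.memory = new_memory
        if not is_healthy(replica):
            raise RuntimeError(f"{replica.name} is unhealthy; stopping rollout")
        steps.append(f"update {replica.name}")

    # 2. Controlled failover: promote a replica (assumption: the first one).
    if replicas:
        candidate = replicas[0]
        candidate.role, primary.role = "primary", "replica"
        steps.append(f"switchover {primary.name} -> {candidate.name}")

    # 3. The former primary is updated last.
    primary.memory = new_memory
    steps.append(f"update {primary.name}")
    return steps


cluster = [
    Instance("hippo-a", "primary", "4Gi"),
    Instance("hippo-b", "replica", "4Gi"),
    Instance("hippo-c", "replica", "4Gi"),
]
for step in rolling_update(cluster, "8Gi"):
    print(step)
```

Because the primary is only touched after a replica has already been promoted, the cluster always has an updated, healthy instance ready to accept writes.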
Taints and tolerations are important Kubernetes features for managing Pod scheduling. Tolerations help schedule Pods onto appropriate Nodes based upon the taints set on those Nodes. For example, a Kubernetes administrator may set taints on Nodes to restrict scheduling to the database workload, and as such, tolerations must be assigned to Pods to ensure they can actually be scheduled on those Nodes.
This release introduces the ability to assign tolerations to PostgreSQL clusters managed by the PostgreSQL Operator. Tolerations can be assigned to every instance in the cluster via the tolerations attribute on a pgclusters.crunchydata.com custom resource, or to individual instances using the tolerations attribute on a pgreplicas.crunchydata.com custom resource.
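To make the taint and toleration relationship concrete, here is a simplified Python sketch of the matching rules Kubernetes applies when deciding whether a Pod may be scheduled on a tainted Node. It is a model for illustration, not scheduler code: the function names tolerates and can_schedule are made up, and the workload=database taint is a hypothetical example rather than anything the Operator sets.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # A toleration with no effect matches taints of every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint.
        return key == "" or key == taint["key"]
    # Equal: both key and value must match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(tolerations, taints):
    """A Pod can be placed only if every blocking taint is tolerated."""
    blocking = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


# Hypothetical taint an administrator might place on database Nodes.
node_taints = [{"key": "workload", "value": "database", "effect": "NoSchedule"}]
db_toleration = {
    "key": "workload",
    "operator": "Equal",
    "value": "database",
    "effect": "NoSchedule",
}

print(can_schedule([db_toleration], node_taints))  # Pod with the toleration
print(can_schedule([], node_taints))               # Pod without it
```

The dictionaries use the same field names (key, operator, value, effect) as the standard Kubernetes toleration specification, so the same shape is what one would expect to supply in the tolerations attribute.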
Node affinity has been a feature of the PostgreSQL Operator for a long time but has received some significant improvements in this release.
It is now possible to control the node affinity across an entire PostgreSQL cluster as well as individual PostgreSQL instances from a custom resource attribute on the pgclusters.crunchydata.com and pgreplicas.crunchydata.com CRDs. These attributes use the standard Kubernetes specifications for node affinity and should be familiar to users who have had to set this in applications.
Additionally, this release adds support for both “preferred” and “required” node affinity definitions. Previously, one could achieve required node affinity by modifying a template in the pgo-config ConfigMap, but this release makes this process more straightforward.
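The difference between the two kinds of node affinity can be sketched as follows. This is a deliberately simplified model, assuming only the In operator and ignoring everything else the real Kubernetes scheduler weighs; the helper names matches and rank_nodes, the node names, and the region label are hypothetical.

```python
def matches(node_labels, term):
    """True if every expression in the term holds (only `In` is modelled)."""
    return all(
        node_labels.get(expr["key"]) in expr["values"]
        for expr in term["matchExpressions"]
    )


def rank_nodes(nodes, required_terms=(), preferred_terms=()):
    """Filter nodes by required terms (ORed), then order by preferred weight."""
    # "Required" is a hard filter: non-matching nodes are never eligible.
    eligible = {
        name: labels
        for name, labels in nodes.items()
        if not required_terms or any(matches(labels, t) for t in required_terms)
    }

    # "Preferred" only adds weight: non-matching nodes remain eligible.
    def score(labels):
        return sum(
            p["weight"] for p in preferred_terms if matches(labels, p["preference"])
        )

    return sorted(eligible, key=lambda name: score(eligible[name]), reverse=True)


nodes = {
    "node-1": {"region": "us-east-1"},
    "node-2": {"region": "us-west-2"},
}
west = {"matchExpressions": [{"key": "region", "operator": "In", "values": ["us-west-2"]}]}

print(rank_nodes(nodes, preferred_terms=[{"weight": 10, "preference": west}]))
print(rank_nodes(nodes, required_terms=[west]))
```

With the preferred rule, node-2 is ranked first but node-1 stays available as a fallback; with the required rule, node-1 is excluded entirely, which matters if the preferred Nodes run out of capacity.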
Since 4.3.0, the PostgreSQL Operator has supported TLS connections to PostgreSQL clusters and has had an improved integration with pgBouncer, which is used for connection pooling and state management. However, the pgBouncer integration did not support TLS directly; it could only be achieved by modifying the pgBouncer Deployment template.
This release brings TLS support for pgBouncer to the PostgreSQL Operator, allowing for communication over TLS between a client and pgBouncer, and pgBouncer and a PostgreSQL server. In other words, the following is now supported:
Client <= TLS => pgBouncer <= TLS => PostgreSQL
Note that to use TLS with pgBouncer, all connections from a client to pgBouncer and from pgBouncer to PostgreSQL must be over TLS. Effectively, this is “TLS only” mode when connecting via pgBouncer.
A common case is that one creates a PostgreSQL cluster with the Postgres Operator and forgets to enable it for monitoring with the --metrics flag. I've definitely done this myself. Before 4.6, adding the metrics collection sidecar (crunchy-postgres-exporter) to an already running PostgreSQL cluster was challenging.
This release introduces the --enable-metrics and --disable-metrics flags to the pgo update cluster command, which allow monitoring to be enabled or disabled on a running PostgreSQL cluster. As this involves modifying deployment templates, this action triggers a rolling update, as described earlier, to limit downtime.
Metrics can also be enabled/disabled using the exporter attribute on the pgclusters.crunchydata.com custom resource.
This release also changes the management of the PostgreSQL user used to collect the metrics. Similar to pgBouncer, the PostgreSQL Operator fully manages the credentials for the metrics collection user. The --exporter-rotate-password flag on pgo update cluster can be used to rotate the metric collection user’s credentials.
We didn't forget about pgBadger either. Postgres Operator 4.6 also brings the --enable-pgbadger and --disable-pgbadger flags to the pgo update cluster command, so you can choose when you want to perform query analysis.
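For anyone scripting these changes, the sidecar flags mentioned above compose into a single pgo update cluster invocation. The small Python helper below is a hypothetical convenience wrapper, not part of the pgo client; it only assembles the argument list from the flags named in this post and assumes they can be combined in one call.

```python
def pgo_update_cluster_args(cluster, metrics=None, pgbadger=None,
                            rotate_exporter_password=False):
    """Build the argv for `pgo update cluster`; None means "leave unchanged"."""
    args = ["pgo", "update", "cluster", cluster]
    if metrics is not None:
        args.append("--enable-metrics" if metrics else "--disable-metrics")
    if pgbadger is not None:
        args.append("--enable-pgbadger" if pgbadger else "--disable-pgbadger")
    if rotate_exporter_password:
        args.append("--exporter-rotate-password")
    return args


# Enable the metrics sidecar and turn off pgBadger on the "hippo" cluster.
print(" ".join(pgo_update_cluster_args("hippo", metrics=True, pgbadger=False)))
```

The resulting list can be handed to subprocess.run on a machine where the pgo client is installed and configured.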
Less is more: advances in Postgres Operator functionality have allowed for a culling of the number of required container images. For example, functionality broken out into individual container images (e.g. crunchy-pgdump) is now consolidated within the crunchy-postgres container. In fact, we eliminated ten container images, reducing the number of containers required to run the Postgres Operator.
In addition to improving the organization and build performance of the container suite, this reduces the amount of storage required to keep the Postgres Operator images in your registry.
This is just the tip of the iceberg on all the new features in Postgres Operator 4.6. For more information, check out the release notes.
We’re excited to release Crunchy Postgres Operator 4.6, helping you manage your cloud native Postgres. If you give Operator 4.6 a test, let us know how it's going on Twitter or LinkedIn. And if you’re already using our Operator, consider starring it on GitHub.
Jonathan S. Katz
March 16, 2021