"Who is in charge of this database?"
Everyone on the DBA team shook their head and someone asked, "Is it Oracle or SQL server?"
"I think it is called My SQL," the development manager said.
During my 20 years as a database administrator, that is often how I ended up learning new RDBMS systems. As a result, I know first-hand how challenging it can be to learn another system when you get thrown into the deep end.
New databases are added to a production environment in different ways. A new software application is purchased and the installation requires (or builds) a new database. A developer builds a prototype application for a business unit. When they start using it, it quickly turns into an important system and the database gets handed to the production administrators to maintain and fix when it breaks. Sometimes a company determines that they need to move to a standard database system and away from another. This can require upskilling a whole team of administrators to learn the new system.
You might have been told, "Congratulations! You are now in charge of this database. I know you are an Oracle DBA, but PostgreSQL is almost the same thing so you should be able to handle it."
Yes, they are both relational database management systems (RDBMS) and you use (mostly) standard SQL to interact with them. But there are some differences that you need to understand.
I am going to start with the architecture basics, a single database on a single machine. I’m going to cover some of the similarities and differences between Oracle and PostgreSQL to help you get up to speed quickly. In this article, I am also going to focus on explaining the architecture at a high level and some of the key aspects to check to make sure that your new database does not have any glaring problems.
In Oracle, the service consists of multiple background processes and memory segments (called the instance) and the files that contain configuration, transaction (or archive) logs, data, and index information (called the database).
In PostgreSQL, there is one process called
postgres that manages the database
file, accepts connections from the client applications, and performs actions on
behalf of those clients. PostgreSQL also has data files and archive log files
(called WAL files).
Here is a table to compare some of the common functions of these 2 database systems:
|Frequently used in Oracle for load balancing. Not as commonly used in PostgreSQL
|CREATE USER in PostgreSQL creates a role with login privilege.
|In Oracle, only a collection of privileges. In PostgreSQL see the prior line.
|Extract a copy of data from the database
|Restore data into the database
|Open source model
|Oracle Enterprise Manager (OEM)
|Graphic administration utility
|An array of third parties
So once you understand the basics of PostgreSQL, the next thing you will want to do is perform a health check. Most DBAs have developed some form of this. It usually starts by learning what is easier to prevent happening than to fix later. As someone new to Postgres, here’s a quick primer on some of the areas you’ll want to focus on:
There are a few main approaches to backing up a PostgreSQL database. They each have strengths and weaknesses, so I will cover them here briefly and provide links where you can get more information if you would like to go deeper.
The pg_dump utility will create a file of SQL commands to be able to recreate the database in the same state as when the utility was run. You can read more about it in the pg_dump documentation.
If you can stop your database, you can make a copy of all the files. This is often impractical for systems with high availability requirements, but it can be a useful method to make sure you have a known good backup. When doing significant changes to a database, this is often a smart extra measure to recover if anything goes wrong. More information on file system backups in PostgreSQL can be found file system backup section of the Crunchy Data documentation.
PostgreSQL uses logging to provide crash consistency and to be able to restore the database to a consistent state in the event of a system crash. These logs are called write-ahead log or WAL. PostgreSQL can use the WAL files to restore any modifications made to the data in your database. You can read more about continuous archiving in Crunchy’s docs.
pgBackRest is a full solution to backup and restore even the largest PostgreSQL databases in a simple and easy-to-use way. With pgBackRest, you can customize your backup process to include parallelism to accelerate backup and recoveries, checksum verification to verify integrity, as well as many other features. (A future post will take a more detailed look at backup options in PostgreSQL and how they compare to Oracle. Stay tuned.)
To get started in checking your Postgres backups:
- Step 1: Ensure you’re taking backups. If you don’t have any backups at all currently this article is a good place to get started with pgBackRest.
- Step 2: Document the process for performing a recovery. If you have a failure in the middle of the night you don’t want to trust your gut, but rather have a very clear step by step playbook. Even better if you go one step further and automate the process of performing recovery.
- Finally, test the process regularly. If you perform these tests using the documentation from the previous step (and you should) you will be able to continue to improve the documented process when you hit unexpected bumps in the road (and you will). These bumps in the road are an inconvenience if you are doing testing which is better than discovering them when you are attempting to resolve data loss when it will be potentially a business (and career) impacting event. We at Crunchy Data recommend at least quarterly recovery tests and more often for sites with high availability requirements.
Now it is time to make sure the backup strategy meets the expectation of the business. Taking a backup once a week may "check the box" of backing up the database, but there are a lot of other factors to consider.
Security should be high on the list of everyone working in IT, but especially for database administrators because of the value of the data contained and the negative impact losing control of that information could have on the business.
First of all, check to see what version you are on. You do not need to be on the latest major version, but keeping close to current on the latest minor version will make sure that you are protected from most of the CVE’s that could put you at risk. If you’re out of date or close to it, make plans to update.
Two ways you can find the PostgreSQL version are:
- At a command line on the database node, as a user that has access to run PostgreSQL run:
- While logged in to the database, run the following command:
Review the users that have access and check to see if they are all valid and have proper access controls (i.e. passwords) in place.
A good place to start this is with a simple
postgres-# \du command and get
validation that all the users listed have a place in your database. Often users
get created but when the employee leaves the company, their accounts are not
fully removed from all the systems they used.
Looking at the “Member of” field will highlight some of the users that may be in groups that no longer apply to them. For example an employee that was in the Sales group but moved to Marketing may belong to both “Sales” and “Marketing” groups when they should be removed from Sales.
Make sure that you are using a secure version of tools where appropriate and have disabled the insecure version (ie scp, ssh, etc).
File transfer has long been done with the file transfer protocol (FTP) and in some cases, it may still be the right tool for the job as it is faster than alternatives such as SFTP. However, FTP is lacking in security and by using SFTP, you gain the additional advantage of encrypted communication.
The same considerations apply to other utilities that are used in the database environment so make sure you have addressed the security requirements for your organization. These will vary based on the needs of the business, but best practices will usually involve using encrypted options over non-encrypted ones.
There are several ways to approach that investigation. One is to use a benchmark like the one from the Center for Internet Security. If security is of utmost importance for your enterprise, then you may want to investigate the additional protection available with a secure implementation of PostgreSQL. A solution such as Crunchy Hardened can provide the higher level of protection that some companies require.
Once you have backups running and you’re on a secure release of Postgres, make sure the rest of the foundation matches your organization's requirements. Crunchy Data’s learning portal has several beginner PostgreSQL administration courses for you to learn more.
January 12, 2022 •More by this author