Understanding the Postgres Hackers Mailing List Language
The Postgres hackers mailing list (pgsql-hackers@postgresql.org) is an invaluable resource for anyone wanting to contribute to the PostgreSQL code. The Postgres project does not use PRs (pull requests) or GitHub issues, so if you want to contribute an idea, or help with code reviews, the hackers mailing list is the canonical way to do so. More information on contributing is on the Postgres wiki at: https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer? My colleague Elizabeth Christensen has a short primer talk on contributing to Postgres as well.
HOWEVER...the -hackers mailing list is busy. Very, very busy. And dense. And filled with jargon and obscure terms. In short, it does not make for light reading. But here's the most important rule: you don't have to read everything. You don't even have to read every subject line. Just do your best.
Do not worry if you do not understand everything you read. Very few people (hello to my colleague Tom!) know the code so well they can easily figure out what every post on -hackers is talking about. I am not one of those people. You will not be either, and that is perfectly okay. Focus on the parts of Postgres that are of interest to you. Over time, you will be able to understand more and more.
Here are my rules for staying sane while reading the Postgres hackers mailing list:
- Do not try to read everything; skim the subject lines
- Use a good email client that does proper threading
- Set a routine - if you are reading even a small portion of the messages, you likely need to do it at least once a day
- Be willing to let threads go
- If your backlog gets too big, just "mark all read"!
Jargon
The mailing lists are full of acronyms and jargon that might not be familiar to younger people who did not grow up on email (although text messages have inherited many of the abbreviations). If you are a non-native English speaker, or under the age of 30, or not steeped in the world of tech, I offer some solutions below.
To do this, I downloaded the last year's worth of hackers email, wrote a program to strip out all the non-human stuff (headers, code blocks, attachments, etc.), and then did some data analysis on the results.
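The counting step can be sketched in a few lines of Python. This is only an illustration of the idea, not the program actually used for this post: the sample messages, the rule for dropping quoted lines, and the "three or more capitals" pattern are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stand-in for a year of cleaned-up list email.
SAMPLE_MESSAGES = [
    "IMO there should be some simple test cases. FWIW I also ran the patch.",
    "> IMO this quoted line should be ignored\nLGTM, and IMO it is ready.",
]

# Crude assumption: an acronym is a standalone run of three or more
# capital letters or digits that starts with a capital letter.
ACRONYM = re.compile(r"\b[A-Z][A-Z0-9]{2,}\b")


def strip_quoted(body):
    """Drop lines quoted from earlier emails (those starting with '>')."""
    kept = [line for line in body.splitlines() if not line.lstrip().startswith(">")]
    return "\n".join(kept)


def count_acronyms(messages):
    """Count acronym-like tokens across all message bodies."""
    counts = Counter()
    for body in messages:
        counts.update(ACRONYM.findall(strip_quoted(body)))
    return counts


counts = count_acronyms(SAMPLE_MESSAGES)
for acronym, total in counts.most_common():
    print(acronym, total)
```

A pattern this simple will miss terms like 2PC and will pick up shouted words such as HOWEVER, so the raw counts still need a human pass before they turn into a list like the ones below.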
VERSION CONTROL
Postgres uses the git version control system, which means there is quite a bit of assumed git knowledge and git-related phrases, such as:
- diff = the output of a "diff" or "git diff" command, showing what has changed. "Looking at the diffs of these, I wonder if …"
- 1f47afd8 = a reference to a git object / commit ID. Always at least a 7-digit hexadecimal number, often 9 or 12 digits, and sometimes longer. "30e7c175b81 removed support for dumping..."
- HEAD = the tip of the currently checked-out branch in git; on the list, it usually means the latest commit on the main development branch. "patch was not applying on top of HEAD because"
- rebase = the process of using "git rebase" to create a new version of a patch that takes into account code changes made since the patch was created. "Please find attached v2, mandatory rebase due to cd312adc56"
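If you want to spot commit IDs automatically (say, to turn them into links), a regular expression gets you most of the way. The sketch below is a heuristic, not anything git itself provides: it assumes a commit reference is a standalone run of 7 to 40 lowercase hex digits containing at least one digit, and the function name is made up for the example.

```python
import re

# Assumption: a commit reference is 7 to 40 lowercase hex characters with at
# least one digit, so that hex-looking words such as "defaced" are skipped.
COMMIT_REF = re.compile(r"\b(?=[0-9a-f]*[0-9])[0-9a-f]{7,40}\b")


def find_commit_refs(text):
    """Return every substring of text that looks like an abbreviated commit ID."""
    return COMMIT_REF.findall(text)


print(find_commit_refs("Please find attached v2, mandatory rebase due to cd312adc56"))
print(find_commit_refs("30e7c175b81 removed support for dumping..."))
```

A match only means "this looks like a hash". Whether it names a real commit can only be confirmed against the repository itself.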
BUILDING, TESTING, and PATCHING
Contributing a patch means compiling Postgres with your changes, submitting one or more patches to the mailing list, entering them in the commitfest app, and having them go through the CI (continuous integration) system. Here are some related terms:
- +1 = a general stamp of approval for someone's idea or patch. Disapproval is sometimes expressed as -1
- meson = a program used to build Postgres. The successor to the old "configure" system
- gcc, clang = two main compilers used to build Postgres. There are others.
- backpatching = the process of taking a patch and applying the changes to older (but still supported) major versions of Postgres. Not everything can be backpatched - the default is to NOT backpatch unless it is deemed important enough.
- 0001, 0002 = references to the numbered patch files attached in the email thread. Most people create patches using "git format-patch", which creates one file per commit, and numbers them as 0001, 0002, 0003, etc.
- commitfest - An external site that lives here: https://commitfest.postgresql.org/ Its job is to store the current state of all patches, and it contains many links back to the mailing lists.
- CF = commitfest
- CFM = commitfest manager
- RMT = release management team
- buildfarm - An external site that lives here: https://buildfarm.postgresql.org/ This is a distributed CI (continuous integration) system, run by volunteers all over the world, that tests Postgres after each commit.
- curculio, wrasse, lorikeet - every server in the buildfarm gets assigned a unique animal name, so you may see some obscure animals mentioned on the lists.
- cfbot - an automated system that creates git branches based on the commitfest entries, and gathers the results of CI testing (http://cfbot.cputube.org/)
- Coverity = an external static analysis tool used to detect problems in the code
- s/This/That/ = someone is suggesting changing the text to replace "This" with "That"
- OOM = out of memory
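The s/This/That/ shorthand comes from the substitute command in sed and similar tools. To show what a reviewer means by it, here is a minimal sketch that applies such a suggestion to a string. The function name is hypothetical, and unlike sed it treats the "old" part as literal text rather than as a regular expression.

```python
def apply_substitution(suggestion, text):
    """Apply a simple 's/old/new/' suggestion; a trailing 'g' replaces every match."""
    parts = suggestion.split("/")
    if len(parts) != 4 or parts[0] != "s":
        raise ValueError(f"not a simple s/old/new/ suggestion: {suggestion!r}")
    _, old, new, flags = parts
    # Without the 'g' flag only the first occurrence is replaced, as in sed.
    limit = -1 if "g" in flags else 1
    return text.replace(old, new, limit)


print(apply_substitution("s/This/That/", "This is the comment. This is the code."))
print(apply_substitution("s/This/That/g", "This is the comment. This is the code."))
```

On the list nobody runs these through a program, of course. The notation is just a compact way to say "I think you meant That here".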
Acronyms Used in Postgres Development
These are acronyms that are often used by people on the mailing list. The more common ones are at the top. Each has a sample usage as actually used on the list in the last year. The most common acronym is IMO, aka "in my opinion". This proves that we hackers are a very opinionated bunch!
- IMO = in my opinion "IMO there should be some simple test cases"
- FWIW = for what it's worth "FWIW I think it's a fairly serious issue"
- IIUC = if I understand correctly "IIUC what you're saying is that we should"
- BTW = by the way "BTW some drivers also send Describe even before Bind"
- IMHO = in my humble opinion. A slightly politer way to say IMO "IMHO, this is not a good direction"
- LGTM = looks good to me (especially in regards to a patch) "Did a quick check and still LGTM"
- PFA / PSA = please find attached / please see attached "PFA the small patch that implements this"
- AFAICT = as far as I can tell "AFAICT, this world target doesn't include the man target"
- OTOH = on the other hand "OTOH, maybe the current code is more readable"
- IIRC = if I recall correctly "IIRC, we have some similar issues in other hooks"
- AFAICS = as far as I can see "AFAICS, only assert-enabled LLVM builds crash"
- WIP = work in progress "Here's a new WIP version of the patch"
- ISTM = it seems to me "ISTM that the fix here is to not use a spinlock"
- AFAIK = as far as I know "AFAIK that guarantees it happens after"
- TBH = to be honest "TBH, I am wondering what is the purpose of this sentence"
- FYI = for your information "FYI, I also ran the patch"
- WFM = works for me "Both suggestions WFM"
- IMV = in my view "You really need diagrams for something like this IMV"
- IOW = in other words "IOW, search path is a bandaid for this kind of thing"
- POV = point of view "From my POV the idea seems reasonable"
- OP = original poster (the person who started the email thread) "The OP suggests archiving the timeline history file"
- AFAIU = as far as I understand "AFAIU currently we do not add Memoize nodes"
- YMMV = your mileage may vary "I find it easy to read if the GUC parameters are quoted, but YMMV"
- FTR = for the record. Also JFTR = just for the record "FTR this has been discussed in the past"
- IME = in my experience "often that's hard, and IME is rarely done"
- AFAIR = as far as I recall "AFAIR, we don't prevent similar invalidations"
- ASAP = as soon as possible "something merged that fixes the bug ASAP"
- AIUI = as I understand it "which AIUI has never been committed"
- TBD = to be determined "something TBD at time of implementation"
- WRT = with regard to "discussion about this WRT renaming macros"
- WDYT? = what do you think? "I expanded that into the following. WDYT?"
- IDK = I don't know "IDK, to me something like this seems to promise more than we can actually use"
Common Postgres Acronyms
While this is far from a canonical list, here are some of the more common Postgres-related terms you might run across. Again, these are in order of frequency:
- WAL = write-ahead log
- GUC = grand unified configuration. Basically, a configuration variable
- LSN = log sequence number, a specific address in the WAL stream
- API = application programming interface
- OID = object identifier
- TOAST = the oversized attribute storage technique
- FSM = free space map
- SAOP = scalar array operator (note: not a misspelling of SOAP!)
- ABI = application binary interface
- RLS = row-level security
- DSM = dynamic shared memory
- TPS = transactions per second
- TLI = timeline ID
- EOL = end of life
- 2PC = two-phase commit
- LRU = least recently used strategy
- PITR = point-in-time recovery
- CTAS = create table as select
- CIC = CREATE INDEX CONCURRENTLY (concurrent index creation)
- TAM = table access method
- RNG = random number generator
- LTO = link time optimization
- POLA = principle of least astonishment
- LR = logical replication
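Of the terms above, LSN is the one you will most often see as a raw value in emails and log output. It is normally written as two hexadecimal numbers separated by a slash (for example 16/B374D848), which together form a 64-bit byte position in the WAL. Here is a minimal sketch of turning that text into an integer so that two LSNs can be compared or subtracted; parse_lsn is a name made up for this example, not a Postgres function.

```python
def parse_lsn(text):
    """Convert an LSN written as 'HIGH/LOW' (both hexadecimal) into an integer."""
    high, low = text.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32.
    return (int(high, 16) << 32) | int(low, 16)


start = parse_lsn("16/B374D840")
end = parse_lsn("16/B374D848")
print(end - start, "bytes of WAL between the two positions")
```

Inside the database you would not do this by hand, since the pg_lsn data type supports comparison and subtraction directly.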
FUNNY PHRASES
Finally, there are some phrases that might not have a direct translation from English, but are common on the list:
- spitballing = throwing ideas out without much concern for feasibility or correctness.
- footgun = something dangerous that might cause you to shoot your own foot off, in other words, something that may cause serious problems too easily
- paint into a corner = getting into a difficult situation with no way out
- bikeshedding = focusing on minor details of something (such as the color of a bike shed) rather than focusing on the larger, more important details (such as the material to use, how sturdy the shed is, etc.)
- firehose = a strong, excessive amount of information that can be hard to consume. Such as the pgsql-hackers mailing list!
That’s my collection! Again, this is not meant to be a canonical list, but I hope it is useful for those that may not be familiar with some of the terms above. If you are posting to the list, consider limiting the use of acronyms and jargon to only the most popular ones, or eschew them completely. TIA! (thanks in advance)