Composite Primary Keys, PostgreSQL and Django
Today’s blog post is going to be a nice little adventure of learning how to use composite primary keys in a PostgreSQL many-to-many relationship table while building a Django application. Along the way we will talk about some basics of Django and some workarounds you need to use. Let’s dig in and get started.
Here on the developer relations team at Crunchy Data, we have started building a demo application that manages Dungeon and Dragons (D&D) players, characters, dungeon masters, and campaigns (5th Edition if you must know). We decided to use Django because it’s in Python, a language that we know and love, it has great support for PostgreSQL, and it has a built in ORM.
Now you have to understand, I learned good database design (normal-form and all that stuff) very early on in my developer journey. So when it comes time for me to design an application I ALWAYS design the database first.
For our application there was a many-to-many relationship between “person” (the people playing the game), and the campaigns (the story they are working through). A person can play in multiple campaigns and we wanted a nice way to look up the contact information for all players in a campaign. We also wanted to recognize that a normal player may also be a dungeon master - the person in charge in the game - for one or more campaigns. All of these facts required us to make a many-to-many relationship through a middle “join table”, named person_campaign.
Since the pairing of the person foreign key and the campaign foreign are guaranteed to be unique, there is no reason to make a separate primary key in the person_campaign table. PostgreSQL allows you to specify a composite primary key (a primary key that uses multiple columns) and we took advantage of it.
Getting the Database into Django
With our database design done, we exported our model to a running PostgreSQL instance and then built our Django application. We did the normal steps to start a new Django application, and then when it came time to build our models we used the recommended step for creating the object relational mapping files:
python manage.py inspectdb > models.py
The manage.py command inspectdb, looks at the DB defined in the settings.py file and then proceeds to do its best to automatically create the right mapping from tables and columns to Django model objects. We pipe the output from that command to models.py which is the expected file name for the object relational mappings in a Django application.
Now we made the naive decision that Django would “do the right thing” when coming up with the mappings. Note: The documentation even states that you should inspect the mappings and correct any mistakes, this was not a wise decision on our part.
So we just went on developing and tried to create the form for people to register for campaigns and that’s when the fun started.
Things Start Going Wrong
When we tried to look at the standard “administration view” of the person_campaign table we noticed some problems. So we decided to look at what Django had created for its mapping for the table in models.py, and this is what we saw:
class PersonCampaign(models.Model): is_dm = models.BooleanField(blank=True, null=True) campaign_id_campaign = models.OneToOneField(Campaign, models.DO_NOTHING, db_column='campaign_id_campaign', primary_key=True) notes = models.TextField(blank=True, null=True) user_id_person = models.ForeignKey(Person, models.DO_NOTHING, db_column='user_id_person') class Meta: managed = False db_table = 'person_campaign' unique_together = (('campaign_id_campaign', 'user_id_person'),)
The two column mappings we are going to focus on are campaign_id_campaign, which is the foreign key to the campaign table, and user_id_person, the foreign key to the person table.
The user_id_person column mapping is correct given our database schema. Django mapped this as a foreign key as indicated by the models.ForeignKey, though it did miss that it was supposed to be part of the primary key. But Django didn’t get the campaign_id_campaign column correct. First, it said this column is actually part of a one-to-one relationship, denoted by the model.OneToOneField relationship field. I also said this column was the primary key, denoted by method parameter primary_key = True.
At this point we said - “Well that’s not good”. And proceeded to search the internet for what happened and learned:
Django does not support composite primary keys. In fact, there is a 15 year old ticket open for this.
We Don’t Need No Stinkin’ Composite Keys
Right, we know how to fix this one. We will just make a new primary key column that is an auto-incrementing integer and let the other two columns just be foreign keys.
Now we have heard from several Django advanced users that once you import your database structure into your models.py you shouldn’t really make manual changes to the underlying DB. The recommended method is to:
- Make changes in your models.py
- Run manage.py makemigrations
- Run manage.py migrate
So we change our model for person_campaign to look like this. (here are the two fields highlighted)
class PersonCampaign(models.Model): person_campaign_id = models.AutoField(primary_key=True) is_dm = models.BooleanField(blank=True, null=True) campaign_id_campaign = models.ForeignKey(Campaign, models.DO_NOTHING, db_column='campaign_id_campaign') notes = models.TextField(blank=True, null=True) user_id_person = models.ForeignKey(Person, models.DO_NOTHING, db_column='user_id_person') class Meta: managed = False db_table = 'person_campaign' unique_together = (('campaign_id_campaign', 'user_id_person'),)
The changes were quite simple: just make campaign_id_campaign into a foreign key (rather than one-to-one) and remove the primary key designation. Then we created a new column:
- Named the column person_campaign_id
- Made it an auto-generated integer field with model.Autofield
- Designated it as the primary key
The Part of the Plot Where Incomplete Doc Wastes Half a Day
Great, looks straightforward. Things are going to go great, right? Well, we try to create a new migration, and we see that it doesn't detect any changes to be applied. Back to the internet we go.
We suspect it may have something to do with class Meta with managed = False. We go to read the doc:
If False, no database table creation or deletion operations will be performed for this model.
Well that doesn’t say anything about altering a table. Guess the problem is somewhere else… Snip out half a day of trying other solutions and undoing them when they don’t work. Finally we see this auto-generated comment line at the top of the models.py:
managed = Falselines if you wish to allow Django to create, modify, and delete the table
MODIFY, MODIFY… Now you tell us. That line in the doc needs to read:
… no database creation, modification, or deletion operations…
We change to managed = True and it now generates a migration file. VICTORY!
Not So Fast - There is More Work Here
When we go to apply our migration we run into an error - can you guess what it is? That’s right, there is still the primary key defined in PostgreSQL, so the Django migration can’t create a new primary key on the table. This was an easy fix: log into the PostgreSQL database as a super user and run.
alter table person_campaign drop constraint person_campaign_pk;
We ran the migrate operation again and we are prompted to come up with a default value for the new primary key field because a primary key value can’t = null. The problem is we set the field in PostgreSQL to an auto-incrementing integer, thereby not needing a default value. If we try applying this migration, Django gives us an exception because Postgres does not like the default value field. We actually went into the migration file and deleted the default values from the operations. After deleting them we ran the migration which applied all the other changes. Then we ran makemigration again which now generates the same steps trying to create default value. BUT this time we run migrate --fake which doesn’t really run the SQL steps but updates Django’s migration table.
End of the Road and Summary
With that final hurdle cleared, we actually got our database updated and matching our Django models. Now we have a table handling our many-to-many connection without the composite keys.
Let’s recap the lesson learned today:
- If you are going to use Django with your PostgreSQL database DO NOT use composite primary keys.
- Remember, in the Meta inner class for a model in Django, managed = False means Django can not make any changes to the underlying DB schema.
- The internet is your friend for looking up issues related to Django - one of the benefits of using a project backed by a large and friendly community.
As always, we would love to hear your experience using Django with PostgreSQL. As we in the developer relations group dig deeper into building our demo application, expect to see more ways you can combine the power of Django with PostgreSQL. Feel free to leave comments and suggestions here or on Twitter. Happy coding!
June 26, 2020 •More by this author