This weekend Dr Nic shared with us a well written article discussing the shortcomings of auto-incrementing (serial) keys and an alternative approach. While discussing the article the question came up about how to go about using UUID keys in our applications and what we have to do within PostgreSQL in order to use them. PostgreSQL out of the box defines a UUID Data Type which is a great start. We then have three options for generating UUID's,

  1. Within your application code
  2. Within the database using the extension uuid-ossp
  3. Within the database using the extension pgcrypto

Generating them within our application will work quite nicely and is a good way to go, unless you want the database to handle creating these automatically for you which is usually where our point of view.

Only use uuid-ossp if you have a real and specific need for the functions it provides If your only need is to generate and index them then you do not need uid-ossp. (Update: See below.) For generating UUID's within the database an easy place to start is to use the gen_random_uuid () function from the pgcrypto extension.

So how do we use this function? First we need to create the extension in the database we wish to use it with,

CREATE EXTENSION pgcrypto;  

This is how you load pre-compiled shared library code which adds functionality into your PostgreSQL database.

Note specifically that the extension must be created (loaded) once for each database in which you wish to use it. Once it has been loaded into a running instance of the database server it will be there for use from then on spanning restarts.

Also be aware that if you have dumped and restored the database from one server to another running instance then depending on the method of the dump/restore you may need to load it into this new instance after restoring.

Once you have done this you can generate a random id,

SELECT gen_random_uuid();  

Which returns a UUID Data Type.

Let's create a table with a UUID primary key and see how we use the gen_random_uuid() function to populate our IDs for us,

CREATE SCHEMA IF NOT EXISTS snw;  
CREATE TABLE snw.contacts(  
   id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
   name TEXT,
   email TEXT
);

We can now add entries into our newly created Stark & Wayne Contacts table,

INSERT INTO snw.contacts (name,email) VALUES  
('Dr Nic Williams','drnic'),
('Brian Mattal','brian'),
('Wayne E. Seguin','wayneeseguin'),
('Long Nguyen','long'),
('Bill Chapman','bill'),
('Chris Weibel','chris'),
('Jeremey Budnack','jrbudnack'),
('Ruben Koster','rkoster'),
('Jamie Van Dyke','jamie'),
('Quintessence Anx','qanx'),
('McGowan','mcg'),
('高,秀娇 (XJ)','xj'),
('Geoff Franks','geoff'),
('Van Nguyen','vnguyen'),
('John Longanecker','jlonganecker')
;
INSERT 0 15  

Let's examine our table now,

SELECT * FROM snw.contacts;  
                  id                  |       name       |    email
--------------------------------------+------------------+---------------
 0d60a85e-0b90-4482-a14c-108aea2557aa | Dr Nic Williams  | drnic
 39240e9f-ae09-4e95-9fd0-a712035c8ad7 | Brian Mattal     | brian
 9e4de779-d6a0-44bc-a531-20cdb97178d2 | Wayne E. Seguin  | wayneeseguin
 66a45c1b-19af-4ab5-8747-1b0e2d79339d | Long Nguyen      | long
 bc8250bb-f7eb-4adc-925c-2af315cc4a55 | Bill Chapman     | bill
 200393bc-8aaa-45a8-9093-80c4792348cd | Chris Weibel     | chris
 cd881764-bea1-4249-b86d-f8fb8182eec1 | Jeremey Budnack  | jrbudnack
 970972dd-dce8-4c65-a85b-63735ada0fc9 | Ruben Koster     | rkoster
 1c225a3a-2c70-4d95-b87f-f086cbd20366 | Jamie Van Dyke   | jamie
 9f0bb16e-fc25-47f3-b60a-635b6224225a | Quintessence Anx | qanx
 9788c636-936e-4dd6-b9d5-f340329142bd | McGowan          | mcg
 3d8b664f-ef5f-4587-a45c-f2991a1fc029 | 高,秀娇 (XJ)    | xj
 f618f5a8-380e-44fb-9b4e-b3286f29dcc8 | Geoff Franks     | geoff
 35cb41dd-3edb-4483-a1fe-fc315243d2f8 | Van Nguyen       | vnguyen
 74b37cd4-75aa-4871-b17b-a5160428e589 | John Longanecker | jlonganecker
(15 rows)

We see that each row has a UUID id field which is the primary key and was automatically generated for us.

Discussion

It was pointed out by @drewblas (thank you Drew!) that using gen_random_uuid() from pgcrypto has a negative side effect with respect to keyspace fragmentation on disk for the tables. Drew told us that:

Random produces very fragmented inserts that destroy tables. Use uuid_generate_v1mc() [instead] ... the keys are seq because they're time based. So all inserts go to the same data page without random io.

This makes sense due to the random probability distribution of the keys, it should be fragemented. This fragmentation however is not so good for efficiency of the database system itself. In order to gain the benefits of using UUID primary keys with lower keyspace fragmentation perhaps Drew points out that it is better to use uuid_generate_v1mc() from the uuid-ossp extension because underneath it is using a time based seq algorithm which you can read about in the postgresql documentation.