Generating Custom Sequences in Postgres

Postgres, unlike MySQL, uses sequences for auto-incrementing IDs. It’s a very powerful feature, and gives you a lot of control over how to store a primary key.

But say you want to generate a unique value for a field in a table, only this value is visible to the end user, and always increments by a set amount, regardless of what the primary key’s value. And, because this is a multi-tenant database, the primary key is guaranteed not to be sequential. You might also want to prefix this value with a short string, indicating what its value is.

For example, you have an ecommerce platform and every order that is placed gets its own number. You don’t want to just use a field like order_id as the order number, because you want to prefix each order number with the string ORD. A new account may be the millionth on the platform, and the order_id for their first order is 1484002193, or a UUID, and not something you would generally want to expose.

You can’t easily take advantage of Postgres’ native sequence handling to get this to work. To use a native sequence, you would have to create a new one for each new account, and ostensibly for each type of object you wanted to have a sequence for (orders, customer numbers, invoice numbers, etc). Postgres sequences are also not bound by a transaction which ensures they are globally unique (a sequence incremented by nextval() in a transaction is still incremented even if the transaction is rolled back). And, finally, your userland code would be responsible for generating the next sequence, not the database.

Clearly, using Postgres sequences is not ideal. Lets look at how we can create a small table and stored procedure to generate a sequence.

To begin, we’ll need a table to track sequence names, the account they are associated with, a prefix, and their next value.

CREATE TABLE account_sequence (
  sequence_id serial NOT NULL,
  account_id integer NOT NULL REFERENCES account (account_id) ON DELETE CASCADE,
  sequence text NOT NULL,
  prefix text NOT NULL DEFAULT ''::text,
  next_value integer NOT NULL DEFAULT 1,
  zero_pad integer,
  CONSTRAINT account_sequence_pkey PRIMARY KEY (sequence_id)

CREATE UNIQUE INDEX account_sequence_account_id_sequence_idx ON account_sequence (account_id, sequence);

This table forces each sequence name to be unique for each account. Thus, you could have one sequence named customer, one named order, and one named invoice. Additionally, it allows you to zero-pad the sequence value so you could generate sequences like 0001, 0002, 0003, 0004, etc.

You will notice we store the next value of the sequence rather than the current value. This makes it very easy to calculate the next sequence value, and allows you to easily query the table to see what it will be without having to run the stored procedure to generate the sequence. It also allows you to start the sequence at a value other than 1. A customer migrating to your platform could start their new order numbers at a much higher value to easily indicate where an order was placed.

Next, we’ll write a short stored procedure named generate_sequence() to generate the next value in a sequence.

CREATE OR REPLACE FUNCTION generate_sequence(_account_id integer, _sequence text) RETURNS TEXT AS $$
  _prefix text;
  _next_value text;
  _zero_pad integer;
  _value text;
  SELECT asq.prefix, asq.next_value::text, asq.zero_pad
  INTO _prefix, _next_value, _zero_pad
  FROM account_sequence asq
  WHERE asq.account_id = _account_id
    AND asq.sequence = _sequence;

  _value := _prefix || _next_value;

  IF _zero_pad IS NOT NULL THEN
    _value := lpad(_value, _zero_pad, '0');

  UPDATE account_sequence SET
    next_value = next_value + 1
  WHERE account_id = _account_id
    AND sequence = _sequence;

  RETURN _value;
$$ LANGUAGE plpgsql;

The generate_sequence() stored procedure takes two arguments: an account_id and sequence. My preference is to prefix variable names in stored procedures with an underscore so you can easily differentiate them from column names.

The stored procedure begins by loading in the necessary fields of the account_sequence record into several predefined variables.

Next, the return value is calculated and zero padded if necessary. The return value, stored in _value, is generated by concatenating the prefix of the account_sequence record with the next_value field.

Finally, the next value of the sequence is calculated and saved.

Using this sequence is simple.

SELECT generate_sequence(1039, 'invoice');
(1 row)

Another nice property of the generate_sequence() stored procedure is that it takes advantage of MVCC. If you require that the sequence be truly incremental, you can generate a sequence in a transaction and block other sequences from being generated until the transaction is committed or rolled back.

This stored procedure simply returns NULL if the account or sequence could not be found. You could easily update it to RAISE an error if the sequence did not exist.

And finally, because this code lives as a stored procedure, you can easily set it up as a pre-insert trigger on a table to generate sequence before a new row is inserted.

Building your own sequence generator may seem clunky, but it gives you a great deal of flexibility when dealing with records that need a human-visible unique identifier.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s