26 Feb 2009

Datastore: use entity groups and transactions to implement a sequence

Most RDBMSs offer an auto-increment mechanism, such as SERIAL or SEQUENCE, that generates unique numeric IDs as new rows are inserted.  Unfortunately, there is no such thing in Datastore, and it is understandably difficult to provide such a mechanism in a distributed repository environment.  However, we sometimes need unique IDs that can be used in a composite index.  In that case, we have to implement our own sequence.

First, we define a class for the unique index counter, similar to a SEQUENCE in Oracle.

from google.appengine.ext import db

class Sequence(db.Model):
    nextval = db.IntegerProperty(default=0)

Then, we define a kind that makes use of this sequence.

class Entry(db.Model):
    index = db.IntegerProperty(required=True)
    name = db.StringProperty()
    ... ...

When a new Entry entity is created, it must be placed in the same entity group as the Sequence entity (by making the Sequence entity its parent), and its index must be the incremented nextval of that Sequence entity.  Because both entities live in one entity group, we can put these two operations into a single transaction.

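def trans(seqkey):
    # Look the counter up by key name and create it on first use.
    s = Sequence.get_by_key_name(seqkey)
    if s is None:
        s = Sequence(key_name=seqkey)
    s.nextval += 1
    s.put()
    # The parent must be the Sequence instance, not the class, so that
    # the new Entry joins the counter's entity group.
    entry = Entry(parent=s, index=s.nextval)
    entry.put()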

seqkey = 'my_egroup_key'
db.run_in_transaction(trans, seqkey)
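
To see why the increment and the insert have to happen as one unit, here is a minimal in-memory sketch that runs without App Engine. It is only an illustration under stated assumptions: InMemorySequence is a hypothetical stand-in for the Sequence entity and its children, and a threading.Lock plays the role that the Datastore transaction plays in trans().

```python
import threading


class InMemorySequence(object):
    """Hypothetical stand-in for a Sequence entity and its child entries."""

    def __init__(self):
        self.nextval = 0
        self.entries = []               # stands in for the entity group's children
        self._lock = threading.Lock()   # stands in for the transaction

    def add_entry(self, name):
        # Increment the counter and store the new entry as one atomic step.
        with self._lock:
            self.nextval += 1
            entry = {'index': self.nextval, 'name': name}
            self.entries.append(entry)
            return entry


seq = InMemorySequence()
threads = [threading.Thread(target=seq.add_entry, args=('entry-%d' % i,))
           for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every entry received a distinct index, with no gaps or duplicates.
indexes = sorted(e['index'] for e in seq.entries)
```

Without the lock, two threads could read the same nextval and produce duplicate indexes, which is the same race the transaction is meant to prevent.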

Entity groups should be kept small so that the data can be distributed evenly.  Transactions that update the same entity group also contend with one another, so a single sequence counter shared by every Entry becomes a bottleneck.  It is better to separate the Sequence entities into multiple groups along some natural boundary, with each Entry placed under the Sequence entity for its boundary.  This requires careful design of the data model.
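
As a rough sketch of that partitioning idea, the following pure-Python model keeps one independent counter per boundary. The names SequenceRegistry and next_index are hypothetical, as is the choice of a user name for the boundary. In Datastore, each boundary would correspond to its own Sequence entity and entity group.

```python
from collections import defaultdict


class SequenceRegistry(object):
    """Hypothetical in-memory model: one independent counter per boundary."""

    def __init__(self):
        self._nextval = defaultdict(int)

    def next_index(self, boundary):
        # Only the counter for this boundary is touched, so unrelated
        # boundaries never wait on each other.
        self._nextval[boundary] += 1
        return self._nextval[boundary]


registry = SequenceRegistry()
alice_first = registry.next_index('alice')
alice_second = registry.next_index('alice')
bob_first = registry.next_index('bob')

# An index is unique only within its boundary, so the pair
# (boundary, index) is what identifies an entry globally.
ids = [('alice', alice_first), ('alice', alice_second), ('bob', bob_first)]
```

The trade-off is that index values repeat across boundaries, so any composite index built on them has to include the boundary as well.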
