Counting on Dynamo

Inspired by a useful and succinct answer on StackOverflow

See my DynamoCounter repository for runnable samples of the code in this post.

Leading note: DynamoDB’s Global Tables work a little too differently for the information here to be directly applicable. We’ll discuss why at the end, but if you’re looking for a solution there, this is not it.

Auto-incrementing numbers are common in database engines - usually provided as types/objects/functions - and are often useful for safe and automatic primary key values (e.g. postgres serial types, rowid default primary key in sqlite). These are reliable ways to guarantee a uniqueness to any resulting value as the database engine will guarantee no two processes will receive the same value. DynamoDB has no such type, but there are techniques you can use to achieve something to the same effect.

Two ways to achieve this are with an atomic counter or with optimistic locking. The articles on both are great and led me to my first real-world implementation, but below I’ll talk a bit about both and point to code from my samples. The code below is all C# but since it is very plain DynamoDB SDK usage the concepts should read plainly and be easily transferred.

The examples below assumes a schema where pk is the partition key and there is no sort key. The full counter item will be intialised as follows:

{
  "pk": "counter",
  "count_value": 0
}

Atomic Counters

Atomic counters are numbers which will always be incremented and returned as a result of a simple isolated operation, achieved here with an Update request. The Update request must use an expression declaring the increment, and specify ALL_NEW as the return value parameter. Dynamo will ensure that the item is read, modified and returned in a single operation such that two concurrent processes will always receive different return values and update a different value.

var count = await this.dynamo.UpdateItemAsync(new UpdateItemRequest
{
    TableName = tableName,
    Key = new Dictionary<string, AttributeValue> {["pk"] = new() {S = "counter"}},
    UpdateExpression = "ADD #count :increment",
    ExpressionAttributeNames =
        new Dictionary<string, string> {["#count"] = "count_value"},
    ExpressionAttributeValues =
        new Dictionary<string, AttributeValue> {[":increment"] = new() {N = "1"}},
    ReturnValues = ReturnValue.ALL_NEW
});

Note the UpdateExpression and ReturnValues here. Once this is executed the count response will include an attribute for each updated attribute here (count_value) with the new value (the number we just incremented to). If we were to use this to set a value we can do so like this:

await this.dynamo.PutItemAsync(new PutItemRequest
{
    TableName = tableName,
    Item = new Dictionary<string, AttributeValue>
    {
        ["pk"] = new() {S = count.Attributes["count_value"].N},
        ["some_attribute"] =
            new() {S = "This item guaranteed to have unique PK! Woohoo!"}
    }
});

Some considerations with this approach:

The number will be incremented, always, before you can use it. It is no good for strictly sequential numbers, and an application failure can cause increments to go unused. This might not matter to you because numbers can be really big and data can be really small, but consider if your data is bigger than numbers or if you cannot tolerate gaps in your sequence.
The way that DynamoDB transactions work means it is not possible to use an atomic counter in the same transaction as the item update that uses the value. This is only really a concern if you have reasons to not arbitrarily update the number.

Atomic counters are a very lightweight and safe choice if you really don’t mind “wasting” numbers as they do not require retry mechanisms to guarantee exclusive use of the value. Safety, uniqueness and simplicity matter more to the use cases where I’ve needed this than sequence gaps do, so atomic counters are what I’d prefer in a vacuum.

Optimistic Locking

The optimistic locking strategy involves performing an increment in application code and optimistically writing that value back to the source of truth, with a condition that fails if it has been modified since you read it. Essentially, “Read the number 1, try to write 2, fail if the source value is not still 1”. This will prevent application code ever actually using a value unless it has exclusivity over it though to be effective you will need a retry mechanism around it as locking failures should be common.

var currentCountItem = await this.dynamo.GetItemAsync(new GetItemRequest
{
    TableName = tableName,
    Key = new Dictionary<string, AttributeValue> {["pk"] = new("counter")},
    AttributesToGet = new List<string> {"count_value"}
});
var currentCount = currentCountItem.Item["count_value"].N;

await this.dynamo.UpdateItemAsync(new UpdateItemRequest
{
    TableName = tableName,
    Key = new Dictionary<string, AttributeValue> {["pk"] = new() {S = "counter"}},
    UpdateExpression = "ADD #count :increment",
    ConditionExpression = "#count = :current_count",
    ExpressionAttributeNames =
        new Dictionary<string, string> {["#count"] = "count_value"},
    ExpressionAttributeValues = new Dictionary<string, AttributeValue>
    {
        [":increment"] = new() {N = "1"},
        [":current_count"] = new() {N = currentCount}
    },
});

The simple sample above illustrates two operations to achieve this - first we get the value with no particular fuss or flair. Next, we try to write that back, incremented. If we succeed, we have exclusive use of the currentCount we read out before reading, and no other process can retrieve that same value. Pay particular attention to the ConditionExpression as that is the expression which guarantees the operation will fail if the count_value was modified since we read it.

On success, we can safely use the value:

await this.dynamo.PutItemAsync(new PutItemRequest
{
    TableName = tableName,
    Item = new Dictionary<string, AttributeValue>
    {
        ["pk"] = new() {S = currentCount},
        ["some_attribute"] =
            new() {S = "This item guaranteed to have unique PK! Woohoo!"}
    }
});

This approach is more complex than atomic counters and has some other considerations:

Retry mechanisms must be used as optimistic locking failures should be expected in the ordinary course of use.
With “wasted requests” being ordinary with optimistic locking, this can incur “wasted” cost.
Contention over a single counter item can have a very adverse effect on your application depending on scale.
Updating the counter and using the value in another item can be done in a transaction, writing an item using the value and updating the counter if locking was successful, or aborting the transaction if locking failed. This can achieve no “wasted values” as no chance of a client application failure between the counter update and other writes can occur. Note, transactions cost more WCU than ordinary writes.
A PutItem request can be used instead of an UpdateItem request easy enough if desirable. See the code samples for more detail.

Final notes

I have implemented an atomic counter in production code so far. Needing to do so led me to a stack overflow post linking to AWS' two fantastic articles on these techniques. I wanted to tie them both together with runnable samples so that others may benefit from a definitive implementation and notes of additional considerations.

In no particular order, here’s some additional context and thoughts on the topics/post above:

Code samples

The repository linked above expands with additional use cases demonstrating how concurrent operations behave and how you would achieve transactional optimistic locking, as well as optimistic locking via Put/Update. All samples should be runnable and inspectable.

Here’s the link to the repository again for convenience.

Consistency

DynamoDB thrives on some eventual consistency and one subtlety in the implementations above is that the optimistic locking sample does not use a strongly consistent read for brevity (as it’s not strictly necessary with a retry mechanism in place). It’s worth noting that conditional writes are strongly consistent by default in dynamo which is what gives all of these techniques some certainty, the condition expression will not falsely pass due to eventual consistency of other writes (Although, see global tables below). The official docs cover this with an example.

Optimistic locking, the easy way

My samples demonstrate the raw techniques and simple SDK usage, but AWS does provide some different DynamoDB usage models that have some inbuilt optimistic locking using a “version” attribute. The .NET object persistence model and the Java DynamoDBMapper both support this innately. I tend to use the raw SDK or the .NET Document Model, so haven’t used these myself.

Versus traditional database engine implementations

In both strategies here, the onus really is on the application code to do the right thing as nothing is enforced at a database schema level, but at an individual request level, meaning that it is rather easy for a rogue client to misuse your counter and have your well-behaved clients suffer. In the worst case, if some client was to reset a counter to 0, the above techniques would both behave as if everything was fine.

Using auto-incrementing numbers as partition keys

Actually, using auto-incrementing numbers as partition keys doesn’t often lean into a good way to use DynamoDB. Information in Dynamo is cheapest to access by direct key lookup rather than iterations and queries, making uniquely generated numbers a fairly odd choice for a partition key. That said, sometimes it is fine and counters can also have utility outside of partition key generation.

Recommended partition key design suggests using high-cardinality attributes from your item, and perhaps even a composite key of attributes. As always, your access patterns should be at the forefront of your key design, and rarely is that “I want to get item 12”.

DynamoDB Global Tables

The nature of global tables makes them even more eventually consistent than a traditional DynamoDB table as they replicate data globally. The strategies used to achieve this include a “Last Writer Wins” conflict resolution strategy. This does not play well with the techniques above - both of which rely on a successful write being a guarantee of correctness, and both of which rely on update expressions to be strongly consistent. In a global table, the first writer may increment a number, be confident it has exclusive use of the value it holds, only for another writer to operate on an older version of the number and retrieve the same value.