Vertical partitioning

Vertical partitioning is a design pattern where data belonging to a single entity or object is divided among multiple DynamoDB items. These items are logically linked by a shared partition key, making it easy to retrieve them collectively.

A typical design uses two items in the same table: one for static (slowly changing) attributes and one for warm (frequently changing) attributes. For example, for a Product, the PK could be ProductID; the sort key could be “!” for the item holding the static attributes and “inventory” for the item holding the inventory, which changes continuously.
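As a rough illustration, the split described above could be sketched in plain Python as follows. The attribute names `PK` and `SK`, the helper `split_product`, and the sample product data are assumptions made for this example; the sketch only builds the item dictionaries and does not call DynamoDB.

```python
def split_product(product_id, static_attrs, inventory_attrs):
    """Split one logical product into two items that share a partition key.

    PK and SK are hypothetical attribute names for the partition and sort key.
    """
    # Item holding the slowly changing attributes, stored under sort key "!"
    static_item = {"PK": product_id, "SK": "!", **static_attrs}
    # Item holding the frequently changing inventory data
    inventory_item = {"PK": product_id, "SK": "inventory", **inventory_attrs}
    return [static_item, inventory_item]


# Example usage with made-up product data
items = split_product(
    "product-42",
    {"Name": "Desk lamp", "Brand": "Acme"},
    {"Quantity": 17},
)
for item in items:
    print(item)
```

Because both items share the same partition key, a single query on that key returns them together, while an inventory update rewrites only the small inventory item.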

Root entity

Sparse indexes

One method to improve performance with sparse data is to implement a sparse index. A sparse index contains only a subset of the items from the base table. In DynamoDB, sparse indexes are often implemented using Global Secondary Indexes (GSIs). DynamoDB writes an item to a GSI only if the index's partition key attribute, and its sort key attribute if one is defined, are present in the base table item. To implement a sparse index with a GSI, choose a key attribute that is present only on the items you want to query.

Tip

In some cases, you may need to add new attributes to the base table to support this.
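The filtering behaviour of a sparse index can be mimicked in memory to see which items would be indexed. This is only a simulation under assumed names: `gsi_view`, the `Status` and `OpenOrderDate` attributes, and the sample orders are hypothetical, and no DynamoDB API is involved.

```python
def gsi_view(items, partition_key, sort_key=None):
    """Return the items that carry every key attribute of the index.

    Mimics the rule that an item is indexed only when the index's
    key attributes are present on the base table item.
    """
    key_attrs = [partition_key] + ([sort_key] if sort_key else [])
    return [item for item in items if all(attr in item for attr in key_attrs)]


# Only open orders carry the hypothetical OpenOrderDate attribute
orders = [
    {"OrderId": "o-1", "Status": "SHIPPED"},
    {"OrderId": "o-2", "Status": "OPEN", "OpenOrderDate": "2024-05-01"},
    {"OrderId": "o-3", "Status": "OPEN", "OpenOrderDate": "2024-05-03"},
]

open_orders = gsi_view(orders, partition_key="Status", sort_key="OpenOrderDate")
print([order["OrderId"] for order in open_orders])
```

In this sketch the shipped order never reaches the index because it lacks `OpenOrderDate`. Removing that attribute when an order ships would drop the item from the index in the same way.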

Write sharding

Write sharding is a mechanism to distribute an item collection across a DynamoDB table’s partitions. It increases write throughput for a logical partition key by spreading that key's write operations across multiple partitions. Write throughput for an individual logical key can therefore exceed the capacity of a single underlying partition, which minimizes capacity errors at the DynamoDB partition level.

You can implement write sharding on the client side by appending a simple suffix to your partition key. Write sharding through suffixing is effective because even a one-byte change to the partition key produces a different output from the internal hash function, which can place the item on a different partition.
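On the write side, suffixing could look like the minimal sketch below. The helper name `sharded_partition_key`, the `#` separator, and the default of five shards are assumptions chosen for illustration; a random suffix is one possible strategy, and deriving the suffix from a hash of another attribute is a common alternative.

```python
import random


def sharded_partition_key(customer_id, shard_count=5):
    """Append a random shard suffix to a logical partition key.

    shard_count is an assumed tuning value; readers must use the same
    count to find every shard.
    """
    suffix = random.randrange(shard_count)  # integer in [0, shard_count)
    return f"{customer_id}#{suffix}"


# Generate many keys to see how writes for one customer spread over shards
keys = {sharded_partition_key("customer123") for _ in range(1000)}
print(sorted(keys))
```

The trade-off of a random suffix is that a reader cannot know which shard holds a given item, so reads have to fan out across all shards.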

At runtime, you need to query each shard individually and combine the results, like so:

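import boto3

# Assumption: 'Orders' is a placeholder table name; replace with your own
table = boto3.resource('dynamodb').Table('Orders')


def query_orders_by_customer(customer_id, shard_count=5):
    """Query every shard for a customer and merge the results."""
    results = []
    for shard_suffix in range(shard_count):
        shard_key = f"{customer_id}#{shard_suffix}"

        # Query the GSI for this shard key
        query_args = {
            'IndexName': 'ShardKey-OrderDate-index',  # Replace with your GSI name
            'KeyConditionExpression': 'ShardKey = :shard_key',
            'ExpressionAttributeValues': {':shard_key': shard_key},
        }

        # A single Query call returns at most one page, so follow
        # LastEvaluatedKey until this shard is exhausted
        while True:
            response = table.query(**query_args)
            results.extend(response['Items'])
            last_key = response.get('LastEvaluatedKey')
            if not last_key:
                break
            query_args['ExclusiveStartKey'] = last_key

    return results


# Example usage
orders = query_orders_by_customer('customer123')
print(orders)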