Distributing Inverted Indexes

Inverted indexes map terms to document IDs and are used in full-text search systems like Elasticsearch or Lucene. Distribution can be achieved by partitioning either by term or by document.

  • Term Partitioning: Assign terms to shards based on a hash of the term. For example, “cat” might go to shard 1 and “dog” to shard 2. This allows queries for specific terms to be routed to a single shard.
  • Document Partitioning: Each shard maintains an index for a subset of documents. A query for multiple terms requires querying all shards and merging results.

Drawbacks

  • Term Partitioning: Common terms (e.g., “the”) can lead to hotspots.
  • Document Partitioning: Merging results from all shards introduces overhead for complex queries.

Distributing B-Tree Indexes

B-trees store keys in sorted order, making them suitable for range queries. However, their hierarchical structure complicates distribution.

  • Range Partitioning: Divide the key space into ranges and assign each range to a node. For example, keys 1–100 could go to node 1, 101–200 to node 2, and so on. Range queries are efficient as they target only the relevant nodes.
  • Distributed Leaf Nodes: Keep the tree’s root and internal nodes replicated across nodes, but distribute the leaf nodes. This reduces contention on higher levels of the tree.

Drawbacks

  • Range skew: Uneven key distribution leads to load imbalances.
  • Coordination overhead: Maintaining tree structure consistency across nodes is challenging.

Distributing Hash Indexes

Hash indexes map keys directly to values, making them simpler to distribute. Each key is assigned to a node based on a hash function.

  • Consistent Hashing: A popular method for distributing hash indexes. Nodes and keys are mapped to a circular hash space. Each key is assigned to the closest node in the clockwise direction. Virtual nodes are often used to improve load balancing.

Distributing indexes is not without its challenges:

  1. Multi-Shard Queries: Queries may require data from multiple shards, leading to higher latency.
  2. Data Rebalancing: Adding or removing nodes requires redistributing index entries.
  3. Consistency: Updates to indexes must be propagated across nodes while ensuring correctness under concurrent operations.

SSTables distribution

Cassandra and DynamoDB both use SSTables (Sorted String Tables) to store data within partitions, which are not distributed across nodes. This design ensures that data within a partition is sorted by specific columns—clustering keys in Cassandra and sort keys in DynamoDB. However, since SSTables are local to a single node, range queries are only supported within individual partitions, scoped by the partition key.

Tip

In Cassandra, clustering keys must be explicitly defined as part of the PRIMARY KEY, and their order dictates how rows are physically sorted while DynamoDB requires a sort key to be specified in its schema

Both systems rely on consistent hashing to distribute partitions across nodes, enabling scalability and high availability, but their reliance on local sorting limits range queries to the data stored in a single partition.