It is a class of concurrency control algorithms based on an optimistic approach that assumes conflicts are rare. Instead of using locking it uses timestamps to determine the serializability order.

Each transaction is assigned a unique fixed timestamp that is monotonically increasing and different schemes can assign timestamps at different times during the transaction or even multiple timestamps per transaction. If two transactions have < then the DBMS must ensure that the execution schedule is equivalent to the **serial schedule where appears before **

To allocate the timestamp, the DBMS can use the system clock (but what about synchronizations?), a logical counter (but what about overflows?) and hybrid approaches.

Optimistic concurrency control

OCC creates a private workspace for each transactions, performs all modifications and read there, and then before committing to the global workspace perform a validation step.

  • During the read phase anyread is copied into the workspace, and any object written is copied into the workspace and modified there, so that no other transactions can read the changes made by one transaction in its private workspace.
  • During the validation phase, the DBMS checks whether there are conflicts by looking at the transaction write set
  • During the write phase the changes from the private workspace are applied to the the database, otherwise the transaction is aborted and restarted

There are two approaches to validation: backward (from younger to older transactions) and forward. They share a similar approach in different order, by verifying among transactions in validation phase the following properties for two transactions and with <

  • completes all its phases before starts (serial ordering)
  • completes writes before starts writing, and doesn’t write any value read by
  • completes reads before complete reads, and doesn’t write to any object that is read or write by

Dynamic databases and the phantom problems

When transactions are allowed to add or delete data, a new class of problem emerge, since locking typically only locks existing records, not one that have been create. The following strategies exists to solve that:

  • Re-executing scans at commit time to verify they return the same result as a previous read in a transaction
  • Predictive locking: Predicate of queries are used to acquire locks so that no lock can be created or deleted if satisfy predicates of running queries (rarely used)
  • Index Locking: Index keys can be used to prevent phantoms by ensuring no new data can be fall within locked ranges:
    • Key-Value locks are individual key-values in an index, and can use virtual keys for non existing values
    • Gap locks locks on the gap following a key value
    • Key-Range locks locks on a range of keys from one existing key to the following one
    • Hierarchical locks allow a transaction to hold a broader key-range locks with different modesl reducing lock manager overhead

Important

In absence of a suitable index, transactions must lock every page in the table or the entire table to prevent changes that could lead to phantoms

Isolation levels

Serializability is useful because it allows programmers to ignore concurrency issues but enforcing it may allow too little parallelism and limit performance, so we might accept a weaker level of consistency to improve scalability.

Isolation levels control the extent that a transaction is exposed to the actions of other concurrent transactions, and in particular which anomalies can occur:

  • Dirty Read: Reading uncommitted data.
  • Unrepeatable Reads: Redoing a read retrieves a different result.
  • Lost Updates: Transaction overwrites data of another concurrent transaction.
  • Phantom Reads: Insertion or deletions result in different results for the same range scan queries.

Isolation Levels (Strongest to Weakest):

  1. SERIALIZABLE: No Phantoms, all reads repeatable, and no dirty reads. • Possible implementation: Strict 2PL + Phantom Protection (e.g., index locks).
  2. REPEATABLE READS: Phantoms may happen. • Possible implementation: Strict 2PL.
  3. READ-COMMITTED: Phantoms, unrepeatable reads, and lost updates may happen. • Possible implementation: Strict 2PL for exclusive locks, immediate release of the shared lock after a read.
  4. READ-UNCOMMITTED: All anomalies may happen. • Possible implementation: Strict 2PL for exclusive locks, no shared locks for reads.