Introduction to databases

OLTP

Advanced databases CMU

  • Roaring Bitmaps Review
  • Bitweaving
  • Gandiva
  • CodecDB
  • Apache CarbonData
  • Arrow Feather
  • SBoost Parquet
  • Review of Parquet official website ✅ 2025-03-15

Follow-up on the columnar object store paper

  • Review why R/D (repetition/definition levels) is more expensive than List Offsets ✅ 2025-03-07
  • CodecDB paper: encoding selection when dictionary encoding fails
  • Column indexes and range filters
  • Zarr
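
Sketch for the R/D vs. List Offsets item. This assumes R/D means Dremel/Parquet-style repetition/definition levels and List Offsets means an Arrow-style offsets layout; the schema (an optional list of required ints) and all function names are made up for illustration. It only shows one plausible source of the extra cost: offsets give direct access to row i, while levels have to be scanned from the start.

```python
def to_offsets(rows):
    """Arrow-style list layout: validity flags, offsets, flat values."""
    validity, offsets, values = [], [0], []
    for row in rows:
        validity.append(row is not None)
        values.extend(row or [])
        offsets.append(len(values))
    return validity, offsets, values


def to_rep_def(rows):
    """Dremel/Parquet-style levels for an optional list of required ints.

    def level: 0 = list is null, 1 = list is empty, 2 = element present.
    rep level: 0 = entry starts a new row, 1 = entry continues the list.
    """
    rep, defn, values = [], [], []
    for row in rows:
        if row is None:
            rep.append(0)
            defn.append(0)
        elif not row:
            rep.append(0)
            defn.append(1)
        else:
            for i, v in enumerate(row):
                rep.append(0 if i == 0 else 1)
                defn.append(2)
                values.append(v)
    return rep, defn, values


def row_from_offsets(validity, offsets, values, i):
    # Direct access: two offset lookups and one slice.
    if not validity[i]:
        return None
    return values[offsets[i]:offsets[i + 1]]


def row_from_levels(rep, defn, values, i):
    # No direct access: walk the levels from the start to locate row i.
    row, pos, out = -1, 0, None
    for r, d in zip(rep, defn):
        if r == 0:
            row += 1
            if row > i:
                break
            if row == i:
                out = None if d == 0 else []
        if d == 2:
            if row == i:
                out.append(values[pos])
            pos += 1
    return out


rows = [[1, 2], [], None, [3]]
offsets_layout = to_offsets(rows)
levels_layout = to_rep_def(rows)
```

Both layouts store the same flat values; the difference is in how row boundaries and nulls are recovered.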

Open stuff

AWS re:Invent talk

  • Pick DSQL and do not worry about scale
  • Almost all your Postgres code will work
  • Controversial decisions
  • MVCC
  • ACID is a bad pun from an old paper, but you didn’t say anything about OLTP
  • The log is the database
  • The storage layer is not responsible for durability or concurrency control (the adjudicator handles conflicts)
    • The storage layer can be built in a much simpler way, since it only does part of the work
  • Internal log service built at Amazon (looks like the Meta one)
  • Initial version: ADI, not ACID.
    • The adjudicator manages conflicts between transactions
    • Adjudicators are scaled out, and there is a distributed commit protocol
    • The key space for the adjudicators is partitioned differently from how it is partitioned in storage, to reduce how often they need to work together
    • Distributed commit protocol: a variant of two-phase commit that is much more fault tolerant
  • Architecture:
    • Transaction and session router
    • Query processor within Firecracker (up to 10,000s, on different machines in the fleet)
    • PgBouncer
    • Using Firecracker is important because SQL is an arbitrary language, so each query processor needs isolation
  • Custom time distribution network in the US, exposed via the AWS sync server
    • Used in DSQL, but available on EC2 instances
    • Based on atomic clocks
  • Check Firecracker, the open source microVM hypervisor
  • Add note on Bitcask and WiscKey
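
Toy sketch of the adjudicator idea from the talk notes above, to check my understanding. This is not DSQL's actual protocol: the hash partitioning, the integer clock standing in for synchronized time, the write-write-only conflict rule, and all class and method names are assumptions for illustration. Each adjudicator owns a slice of the key space, a commit needs a yes vote from every adjudicator involved (two-phase-commit flavour, without the fault tolerance), and an accepted transaction is appended to a log.

```python
from dataclasses import dataclass, field


@dataclass
class Adjudicator:
    """Remembers the last commit timestamp of each key it owns."""
    last_commit: dict = field(default_factory=dict)

    def can_commit(self, keys, start_ts):
        # Conflict if any key was committed after the transaction started.
        return all(self.last_commit.get(k, 0) <= start_ts for k in keys)

    def record(self, keys, commit_ts):
        for k in keys:
            self.last_commit[k] = commit_ts


class Cluster:
    def __init__(self, n_adjudicators):
        self.adjudicators = [Adjudicator() for _ in range(n_adjudicators)]
        self.log = []   # committed transactions, in commit order
        self.clock = 0  # stand-in for a synchronized clock

    def _owner(self, key):
        # Hypothetical partitioning: deterministic hash of the key bytes.
        return sum(key.encode()) % len(self.adjudicators)

    def begin(self):
        self.clock += 1
        return self.clock  # start timestamp of the transaction

    def commit(self, start_ts, writes):
        # Group the written keys by the adjudicator that owns them.
        groups = {}
        for key in writes:
            groups.setdefault(self._owner(key), []).append(key)
        # Phase 1: every involved adjudicator must vote yes.
        for idx, keys in groups.items():
            if not self.adjudicators[idx].can_commit(keys, start_ts):
                return None  # conflict: transaction is rejected
        # Phase 2: pick a commit timestamp, record it, append to the log.
        self.clock += 1
        commit_ts = self.clock
        for idx, keys in groups.items():
            self.adjudicators[idx].record(keys, commit_ts)
        self.log.append((commit_ts, dict(writes)))
        return commit_ts


cluster = Cluster(n_adjudicators=2)
t1 = cluster.begin()
t2 = cluster.begin()
first = cluster.commit(t1, {"a": 1})   # accepted
second = cluster.commit(t2, {"a": 2})  # rejected: "a" changed after t2 began
```

In this sketch a transaction that touches keys owned by a single adjudicator needs only one vote, which is the reason given in the talk for partitioning the adjudicator key space so that cross-adjudicator commits are rare.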