MySQL vs Postgres
A factual mapping for someone who thinks in Postgres and needs to work with MySQL daily.
Storage architecture
The fundamental difference between the two engines is how data lives on disk.
Postgres uses heap storage: table data sits in an unordered heap, and every index (including the primary key index) points to a physical tuple ID (TID) in the heap. A secondary index lookup is one hop: index entry → TID → heap page. Postgres uses HOT (Heap-Only Tuple) updates to avoid index maintenance when only non-indexed columns change. Dead tuples — rows that have been updated or deleted but whose old versions remain physically on disk because active transactions might still need to see them — accumulate and are cleaned by VACUUM (Postgres’s background maintenance process that reclaims space from dead tuples and updates statistics).
InnoDB uses a clustered index: the table data itself is stored in the primary key’s B+tree. Rows are physically ordered by PK. There is no separate heap. Secondary indexes store the primary key value (not a physical pointer), so a secondary index lookup is two hops: secondary index → PK value → clustered index → row. There is no VACUUM. Old row versions live in the undo tablespace and are cleaned by a purge thread (InnoDB’s background worker that scans the undo log and discards row versions no longer needed by any active transaction). See InnoDB Architecture.
Covering indexes matter more in MySQL
Because secondary index lookups require a second B+tree traversal (the “bookmark lookup”), covering indexes that include all needed columns in the index itself avoid the second hop entirely. In Postgres, the cost difference is smaller because heap access is already a single direct lookup.
Type system differences
| Concept | Postgres | MySQL (InnoDB) |
|---|---|---|
| Auto-incrementing PK | SERIAL / BIGSERIAL (sequence-backed) | AUTO_INCREMENT (per-table counter, stored in engine) |
| Variable-length text | text and varchar(N) are identical in performance | VARCHAR(N) is inline (up to ~8KB row limit). TEXT is stored off-page (overflow). Prefer VARCHAR in MySQL when values fit. |
| Boolean | Native boolean type | TINYINT(1) — just an integer with a display hint |
| Enum | Stored as string internally. Sortable alphabetically. | Stored as integer internally. Sorts by definition order, not alphabetically. Adding values requires ALTER TABLE. |
| JSON | jsonb — binary format, GIN-indexable, fast operators (@>, ?, ->>). | JSON — validated on write but stored as text internally. Indexing requires creating a generated column and indexing that. |
| UUID | Native uuid type. gen_random_uuid() for generation. | No native type. Store as CHAR(36) or BINARY(16). UUID() returns a string. UUID_TO_BIN() (8.0+) converts to binary. BIN_TO_UUID() reverses. |
MySQL ENUM ordering
ORDER BY enum_columnsorts by the position in the enum definition, not alphabetically. If you add a new value at the end, it sorts last regardless of its name. This surprises every Postgres user.
Query differences for Postgres users
RETURNING clause — Postgres supports INSERT ... RETURNING *, UPDATE ... RETURNING *, DELETE ... RETURNING *. MySQL has no equivalent. Use LAST_INSERT_ID() after INSERT, or issue a separate SELECT.
CTEs (Common Table Expressions) — Both support CTEs, the named subquery blocks introduced with WITH. Materializing a CTE means computing it once, storing the result in a temporary buffer, and reusing that buffer — as opposed to inlining it (substituting the CTE definition wherever it is referenced, letting the optimizer fold it into the outer query). Postgres (since v12) inlines CTEs by default unless MATERIALIZED is specified. MySQL materializes CTEs by default, which can hurt performance when the CTE is used once and filtering happens outside it.
UPSERT — Postgres: INSERT ... ON CONFLICT (column) DO UPDATE SET .... MySQL: INSERT ... ON DUPLICATE KEY UPDATE .... MySQL also has REPLACE INTO, but that performs DELETE + INSERT (fires delete triggers, resets auto-increment), so it is not a true upsert.
Window functions — Both support them. MySQL added window functions in 8.0.
Partial indexes — Postgres: CREATE INDEX ... WHERE condition. MySQL does not support partial indexes. The workaround is a generated column with a conditional value, then index that. See Index Types for Postgres-specific index types.
Expression indexes — Postgres: CREATE INDEX ... ON (expression). MySQL requires creating a generated (virtual or stored) column and indexing that column.
EXPLAIN — Postgres returns a tree structure with estimated costs, rows, and width. MySQL returns a tabular format (one row per table in the join). EXPLAIN ANALYZE (MySQL 8.0.18+) adds actual execution timing similar to Postgres, but the output format is an iterator-based tree rather than the traditional tabular format.
Concurrency model
Both use MVCC, but the implementation differs significantly.
| Aspect | Postgres | MySQL (InnoDB) |
|---|---|---|
| Default isolation | READ COMMITTED | REPEATABLE READ |
| Where old versions live | In the heap (dead tuples) | In undo tablespace |
| Cleanup mechanism | VACUUM (background autovacuum) | Purge thread (undo log cleanup) |
| Bloat risk | Table bloat from dead tuples if VACUUM falls behind | History list length growth if purge falls behind |
| Phantom protection | Predicate locks (locks on a range of rows matching a predicate, not individual rows — part of SSI: Serializable Snapshot Isolation) | Gap locks at REPEATABLE READ |
| Deadlock detection | Periodic check (configurable interval, default 1s) | Active wait-for graph (near-instant detection) |
Default isolation level matters
MySQL’s default
REPEATABLE READmeans a transaction sees a snapshot from its first read, for the entire transaction. Postgres’sREAD COMMITTEDmeans each statement sees the latest committed data. This causes subtle behavior differences with long-running transactions.
Multi-tenancy
Postgres has Row-Level Security (RLS) — engine-enforced row filtering that cannot be bypassed by application code. Set a session variable (set_config('app.current_tenant', $1, true)), define a policy, and the engine adds the filter to every query automatically.
MySQL has no RLS. Multi-tenancy must be enforced in application code by adding WHERE tenant_id = ? to every query. Common strategies:
- Application-level filtering — every query includes the tenant predicate. One missed WHERE clause is a data leak.
- Separate database per tenant — strong isolation but operational nightmare at scale (connection management, schema migrations across N databases, monitoring).
- Query rewriting via ProxySQL — ProxySQL is an open-source MySQL proxy that can intercept and rewrite queries before they reach the server. Used here to inject tenant filters. Fragile and hard to audit.
No safety net in MySQL
In Postgres, forgetting the tenant filter is harmless if RLS is enabled — the engine filters for you. In MySQL, forgetting the tenant filter returns all tenants’ data. Defense-in-depth requires framework-level middleware or ORM hooks to inject the filter, but these are not engine-enforced.
Extensions vs plugins
Postgres has a rich extension ecosystem. CREATE EXTENSION registers an extension into the database catalog, giving access to new types, functions, operators, and index methods. Examples: pgvector, PostGIS, pg_trgm, pg_stat_statements, ParadeDB. See Extensions and Grants for the lifecycle and trust model.
MySQL’s plugin system is more limited. Plugins cover storage engines, authentication, audit logging, and a few other categories. There is no equivalent of CREATE EXTENSION that adds new types and operators to the SQL layer. Advanced features often require:
- Percona Server — a MySQL-compatible fork by Percona that adds a thread pool (serving many connections with fewer OS threads), enhanced audit plugin, and encryption improvements.
- MariaDB — a community MySQL fork with additional storage engines: ColumnStore (a columnar store for analytics, separate process integrated via MariaDB connector) and Spider (a storage engine that shards data across multiple MySQL/MariaDB servers, acting as a transparent federation layer). Also adds temporal tables (system-versioned tables that track row history automatically) and other SQL extensions.
- ProxySQL — external MySQL-aware proxy for query routing, load balancing, caching, and rewriting. The standard choice for adding a connection pool and query manipulation layer in front of MySQL clusters.
Replication
| Aspect | Postgres | MySQL |
|---|---|---|
| Mechanism | WAL-based streaming replication | Binlog-based replication |
| Logical replication | Supported (Postgres 10+), table-level granularity | Supported via binlog filtering |
| Replication format | WAL is physical (byte-level) for streaming; logical decoding for logical rep | Row-based, statement-based, or mixed (configurable) |
| Multi-source | Not supported natively | Replica can read from multiple primaries |
| Multi-primary | Not built-in (use Citus — a Postgres extension by Microsoft that shards tables across multiple Postgres nodes for horizontal write scaling — or BDR — Bi-Directional Replication, a Postgres extension by EDB that enables active-active multi-primary clusters) | Group Replication (built-in since 5.7) |
MySQL’s replication is more mature and flexible, with decades of production use at scale. Postgres is catching up with logical replication but still lacks multi-source and native multi-primary.
Operational differences that bite daily
Online DDL (Data Definition Language) — MySQL ALTER TABLE supports ALGORITHM=INSTANT (metadata-only, nearly instant) and ALGORITHM=INPLACE (rebuilds table but allows concurrent DML — Data Manipulation Language: INSERT/UPDATE/DELETE) for many operations. Postgres has strong online DDL: ADD COLUMN with a non-volatile DEFAULT is instant since v11. Both databases still require caution with operations that rewrite the table (changing column types, adding indexes on large tables).
Connection model — Postgres forks a process per connection. Each connection consumes ~10MB. Connection pooling (PgBouncer, pgcat — a newer Postgres pooler written in Rust with multi-tenant features) is nearly mandatory in production. MySQL uses a thread per connection, which is lighter. Percona Server and MySQL Enterprise include a built-in thread pool.
Backup — Postgres: pg_dump (logical — SQL dump, portable), pg_basebackup (physical — binary copy of the data directory), Barman (Backup and Recovery Manager — a backup orchestration tool for Postgres from EnterpriseDB), pgBackRest (an open-source Postgres backup tool with WAL archiving, incremental backups, and parallel restore). MySQL: mysqldump (logical, locks tables by default), xtrabackup (Percona XtraBackup — a hot physical backup tool for MySQL/InnoDB that copies data files while the server runs, without locking), MySQL Enterprise Backup. xtrabackup is the standard for production MySQL backups without downtime.
Connection pooling is table stakes for Postgres
If you come from MySQL and set
max_connections = 500in Postgres, you will run out of memory. Postgres connections are heavyweight OS processes. Use PgBouncer in transaction mode or pgcat, and keepmax_connectionsmuch lower (50-200 typical).
See also
- InnoDB Architecture — MySQL storage engine internals
- Index Types and Storage Engines — MySQL index types and storage engine comparison
- Row-Level Security — Postgres’s engine-enforced row filtering
- Extensions and Grants — Postgres extension ecosystem
- VACUUM — Postgres dead tuple cleanup