Apache Iceberg Metadata Tables

Apache Iceberg expose metadata as traditional SQL tables, making easier to investigate issues, proactively identified improvements, etc. The important tables are:

History
Metadata log entries
Snapshots
Files
Manifest

Note

This is a significant improvement compared to formats such as Hive, you’d have to depend on a particular engine implementing one-off commands, such as SHOW PARTITIONS

History table

The history metadata table records the table’s evolution. Each row is composed of four fields:

made_current_at represents the exact timestamp when the corresponding snapshot was made the current snapshot.
snapshot_id field serves as a unique identifier for each snapshot.
parent_idthe unique ID of the parent snapshot of the current snapshot.
is_current_ancestor: indicates whether a snapshot is an ancestor of the table’s current snapshot. It differentiates snapshots that are part of the table’s present state lineage and those that have been invalidated from table rollbacks. The the history metadata table is useful for data recovery and version control as well as to identify table rollbacks.

-- Spark SQL
SELECT * FROM my_catalog.table.history;
-- Dremio
SELECT * FROM TABLE(table_history('catalog.table'))
-- Trino
SELECT * FROM "table$history"

Metadata log entries

The metadata_log_entries metadata table keeps track of the evolution of the table by logging the metadata files generated during table updates:

The timestamp field records the exact date and time when the metadata was updated.
the file field indicates the location of the datafile that corresponds to that
The latest_snapshot_id field provides the identifier of the most recent snapshot at the time of the metadata update.
the latest_schema_id field contains the ID of the schema being used when the metadata log entry was created.
the latest_sequence_number field signifies the order of the metadata updates. It’s an incrementing count that helps track the sequence of metadata changes over time.

Example query: find the latest snapshot per each schema

WITH Ranked_Entries AS (
	SELECT
		latest_snapshot_id,
		latest_schema_id,
		timestamp,
		ROW_NUMBER() OVER(PARTITION BY latest_schema_id ORDER BY timestamp
	DESC) as row_num
	FROM
		catalog.table.metadata_log_entries
	WHERE
		latest_schema_id IS NOT NULL
)
SELECT
	latest_snapshot_id,
	latest_schema_id,
	timestamp AS latest_timestamp
FROM
	Ranked_Entries
WHERE
	row_num = 1
ORDER BY
	latest_schema_id DESC;

Snapshot

The snapshots metadata table is essential for tracking dataset versions and histories:

the committed_at field signifies the precise timestamp when the snapshot was created
the snapshot_id field is a unique identifier for each snapshot.
the operation field lists a string of the types of operations that occurred, such as APPEND and OVERWRITE.
The parent_id field links to the snapshot ID of the snapshot’s parent
the manifest_list field offers detailed insights into the files comprising the snapshot.
the summary field holds metrics about the snapshot, such as the number of added or deleted files, number of records, and other statistical data

Example query: Summarize operations by day

SELECT
	operation,
	COUNT(*) AS operation_count,
	DATE(committed_at) AS date
FROM
	catalog.table.snapshots
GROUP BY
	operation,
	DATE(committed_at)
ORDER BY
	date;

Snapshots committed_at vs history made_current_at

When new data is added or new files are added to the table, a snapshot is committed and immediately made the current state of the table. However, there are scenarios where the two diverge:

Rollbacks would modify the history table made_current_at, which wouldn’t correspond anymore to any committed_at of a snapshot
Concurrent operations: one snapshot can be committed without ever making it current

Files

The files metadata table showcases the current datafiles within a table:

content, represents the type of content in the file, with a 0 signifying a datafile, 1 a position delete file, and 2 an equality delete file.
file_path gives the exact location of each file.
file_format field indicates the format of the datafile (Parquet, Avro, ORC)
thespec_id field corresponds to the partition spec ID that the file adheres to
the partition field is a struct that describes the partition key value (i.e. {1999-01-01, 01})
the record_count field reports the number of rNaNecords contained within each file, giving a measure of the file’s data volume.
thefile_size_in_bytes field provides the total size of the file in bytes whilecolumn_sizes furnishes the sizes of the individual columns.
the value_counts, null_value_counts, and nan_value_counts fields provide the count of non-null, null, and NaN (Not a Number) values, respectively, in each column.
lower_bounds and upper_bounds fields hold the minimum and maximum values in each column, providing essential insights into the data range within each file.
The key_metadata field contains implementation-specific metadata, if any exists.
The split_offsets field provides the offsets at which the file is split into smaller segments for parallel processing.
The equality_ids and sort_order_id fields correspond to the IDs relating to equality delete files, if any exist, and the IDs of the table’s sort order, if it has one.

Note

column_sizes, value_counts, null_value_counts, nan_value_counts, lower_bounds and upper_bounds columns are all encoded as struct where the key represents the column index

Example query: Identifying candidate for compaction

SELECT
	partition,
	COUNT(*) AS num_files,
	AVG(file_size_in_bytes) AS avg_file_size
FROM
	catalog.table.files
GROUP BY
	partition
ORDER BY
	num_files DESC,
	avg_file_size ASC

Example query: Identifying partitions with data quality problems

SELECT
	partition, file_path
FROM
	catalog.table.files
WHERE
	null_value_counts['3'] > 0
GROUP BY
	partition

Example query: Listing files from previous snapshots

SELECT file_path, file_size_in_bytes
FROM catalog.table.files
VERSION AS OF <snapshot_id>;

Manifests

The manifests metadata table details each of the table’s current manifest files.

The path field provides the filepath where the manifest is stored
The length field shows the size of the manifest file.
The partition_spec_id field indicates the specification ID of the partition that the manifest file is associated with
Theadded_snapshot_id field provides the ID of the snapshot that added this manifest file
added_data_files_count, existing_data_files_count, and deleted_data_files_count are self explanatory
the partition_summaries field is an array of field_summary structs that summarize partition-level statistics: contains_null, contains_nan, lower_bound, and upper_bound.

Warning

contains_nan could return null when the information isn’t available from the file’s metadata, which usually occurs when reading from a V1 Table

Tip

Statistics by default are collected on 32 columns

Example query: Searching manifests file shorter than average

WITH avg_length AS (
	SELECT AVG(length) as average_manifest_length
	FROM catalog.table.manifests
)
SELECT
	path,
	length
FROM
	catalog.table.manifests
WHERE
	length < (SELECT average_manifest_length FROM avg_length);

Example query: Count of many files are added per snapshot

SELECT
	added_snapshot_id,
	SUM(added_data_files_count) AS total_added_data_files
FROM
	catalog.table.manifests
GROUP BY
	added_snapshot_id

Edmondo's Vault

Explorer

Apache Iceberg Metadata Tables

History table

Metadata log entries

Snapshot

Snapshots committed_at vs history made_current_at

Files

Manifests

Graph View

Table of Contents

Backlinks