Iceberg: parquet footer cold-start latency on first query after metadata refresh #73401

@eshishki

Problem

After CachingIcebergCatalog refreshes a table's metadata (SQL
REFRESH EXTERNAL TABLE, the periodic background refresh, or the
catalog's own refresh hooks), the data files that the new snapshot
references are not yet present in BE/CN block_cache. The first user
query on the table therefore pays an S3 round-trip (~50–150 ms,
sometimes more depending on region and bucket warmup) per file
just to read the parquet/orc footer. Multiplied across the files in
a scan plan, this dominates first-query latency on cold tables and
shows up as a visible "first run is slow" cliff that disappears on
subsequent runs.
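
To make the scale of the cliff concrete, here is a rough back-of-the-envelope sketch. It assumes one storage round-trip per file and a fixed number of footer reads in flight at once; the file count (2000) and parallelism (64) are illustrative numbers, not measurements from a real cluster.

```python
import math


def cold_footer_latency_ms(num_files: int, rtt_ms: float, parallelism: int) -> float:
    """Rough lower bound on wall-clock time spent on cold footer reads.

    Simplifying assumptions (not taken from the issue): exactly one storage
    round-trip per file, and `parallelism` footer reads in flight at a time.
    """
    if num_files < 0 or rtt_ms < 0 or parallelism < 1:
        raise ValueError("num_files and rtt_ms must be >= 0, parallelism >= 1")
    waves = math.ceil(num_files / parallelism)  # batches of concurrent reads
    return waves * rtt_ms


# Hypothetical scan plan: 2000 files, 64 concurrent footer reads.
for rtt in (50, 150):
    print(f"rtt={rtt} ms -> ~{cold_footer_latency_ms(2000, rtt, 64)} ms of footer reads")
```

With these assumed inputs the footer reads alone cost on the order of seconds on the first run and nothing on later runs, which matches the cliff described above.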

Why it bites

The CACHE SELECT path already exists and is the documented way to
warm block_cache against a table, but:

  1. It is a user-driven SQL statement — there is no built-in hook that
    makes the warm-up happen automatically when metadata changes.
  2. Even when issued manually, the CACHE SELECT scanner today does not
    wrap the underlying _file in a populating CacheInputStream, so
    the parquet reader->init footer read goes straight to raw storage
    on every CACHE SELECT — only the column ranges explicitly fed to
    CacheSelectInputStream via _write_disk_ranges end up in
    block_cache. The footer is exactly the part that drives the
    per-file S3 round-trip in subsequent real queries.
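
The second point can be illustrated with a toy model. This is not BE code: `RawStorage`, `BlockCache`, `PopulatingStream` and `cache_select` are hypothetical stand-ins, and the cache is keyed by exact (offset, size) for simplicity. It only shows why a footer read that bypasses the populating wrapper leaves the footer cold for the next real query.

```python
class RawStorage:
    """Stand-in for remote object storage; counts round-trips."""

    def __init__(self, data: bytes):
        self.data = data
        self.round_trips = 0

    def read(self, offset: int, size: int) -> bytes:
        self.round_trips += 1
        return self.data[offset:offset + size]


class BlockCache:
    """Toy cache keyed by exact (offset, size); real caches use fixed blocks."""

    def __init__(self):
        self.blocks = {}

    def get(self, offset: int, size: int):
        return self.blocks.get((offset, size))

    def put(self, offset: int, size: int, buf: bytes) -> None:
        self.blocks[(offset, size)] = buf


class PopulatingStream:
    """Read-through wrapper: a miss goes to storage and is written back."""

    def __init__(self, storage: RawStorage, cache: BlockCache):
        self.storage = storage
        self.cache = cache

    def read(self, offset: int, size: int) -> bytes:
        hit = self.cache.get(offset, size)
        if hit is not None:
            return hit
        buf = self.storage.read(offset, size)
        self.cache.put(offset, size, buf)
        return buf


def cache_select(storage, cache, footer_range, column_ranges, wrap_footer_read):
    """Toy warm-up: read the footer, then cache the explicit column ranges."""
    footer_reader = PopulatingStream(storage, cache) if wrap_footer_read else storage
    footer_reader.read(*footer_range)  # analogous to the reader init step
    for offset, size in column_ranges:
        cache.put(offset, size, storage.read(offset, size))


def first_query_footer_round_trips(storage, cache, footer_range) -> int:
    """Round-trips a later real query pays just to read the footer."""
    before = storage.round_trips
    PopulatingStream(storage, cache).read(*footer_range)
    return storage.round_trips - before


FOOTER = (90, 10)
COLUMNS = [(0, 40), (40, 50)]

for wrap in (False, True):
    storage, cache = RawStorage(bytes(100)), BlockCache()
    cache_select(storage, cache, FOOTER, COLUMNS, wrap_footer_read=wrap)
    trips = first_query_footer_round_trips(storage, cache, FOOTER)
    print(f"wrap_footer_read={wrap}: footer round-trips on first query = {trips}")
```

In the unwrapped case the column ranges are cached but the footer still costs one round-trip per file on the first real query; in the wrapped case it costs none.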

The net effect is that footer reads are paid for at user-query time,
even on warehouses where operators are happy to spend background IO to
hide that latency from interactive workloads.

Proposed direction

  • Add an opt-in (default false) hook in CachingIcebergCatalog that
    fires a CACHE SELECT against the freshly-refreshed table on a
    per-catalog background executor, with duplicate-trigger coalescing.
  • Teach HdfsScanner::create_random_access_file to wrap _file in
    a regular CacheInputStream for cache_select mode too, so the
    footer read populates block_cache during CACHE SELECT (this is a
    generic fix, not feature-gated).
  • Add a footer-only mode to CacheSelectScanner so the new hook can
    stop scanning after reader->init (footer is already warmed) and
    skip column data and Iceberg delete-file fetches. Exposed only via
    an internal INVISIBLE session variable so user-issued CACHE
    SELECTs do not silently degrade to footer-only.
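
As a minimal sketch of the first bullet's "background executor with duplicate-trigger coalescing", the following shows one possible shape of the idea. It is written in Python for brevity rather than in the FE's language, and `CoalescingWarmer`, `on_refresh` and `warm_fn` are hypothetical names, not existing StarRocks APIs. `warm_fn` stands in for whatever issues the CACHE SELECT.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CoalescingWarmer:
    """Runs at most one warm-up per table at a time on a background pool."""

    def __init__(self, warm_fn, max_workers: int = 1):
        self._warm_fn = warm_fn  # hypothetical: issues the warm-up for a table
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._in_flight = set()  # tables with a queued or running warm-up

    def on_refresh(self, table: str) -> bool:
        """Called after a metadata refresh; returns True if a task was scheduled."""
        with self._lock:
            if table in self._in_flight:
                return False  # duplicate trigger, coalesced into the pending one
            self._in_flight.add(table)
        self._executor.submit(self._run, table)
        return True

    def _run(self, table: str) -> None:
        try:
            self._warm_fn(table)
        finally:
            with self._lock:
                self._in_flight.discard(table)

    def shutdown(self) -> None:
        self._executor.shutdown(wait=True)


# Usage: three refresh triggers arrive while the first warm-up is still running.
release = threading.Event()
warmed = []


def fake_warm(table: str) -> None:
    release.wait(timeout=5)  # simulate a slow warm-up
    warmed.append(table)


warmer = CoalescingWarmer(fake_warm)
scheduled = [warmer.on_refresh("db.events") for _ in range(3)]
release.set()
warmer.shutdown()
print(scheduled, warmed)  # [True, False, False] ['db.events']
```

One design caveat of this sketch: a refresh that arrives while a warm-up is already running is dropped, so files added by that later snapshot are not warmed until the next trigger. A fuller version might record a "rerun once" flag instead of discarding the duplicate.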
