Iceberg: read PartitionStatisticsFile so last_updated_at / last_updated_snapshot_id survive expire_snapshots

## Enhancement

### Problem

For Iceberg tables, `last_updated_at` and `last_updated_snapshot_id` per partition are not stored in any manifest — Iceberg derives them on the fly by walking `ManifestEntry.snapshot_id()` for every live data/delete file in a partition and looking up `Snapshot.timestampMillis()` from `TableMetadata.snapshots()`.

This works only as long as the snapshots that originally wrote each partition are still in `TableMetadata.snapshots()`. After `expire_snapshots` (or `expire_older_than`) drops those snapshots:

- `Snapshot snap = table.snapshot(entry.snapshotId())` returns `null` for the expired writers
- `last_updated_at` / `last_updated_snapshot_id` for those partitions become unrecoverable from manifests alone

Iceberg introduced [`PartitionStatisticsFile`](https://iceberg.apache.org/spec/#partition-statistics) (referenced from `TableMetadata.partition-statistics`) precisely to persist these per-partition counters — including `last_updated_at` and `last_updated_snapshot_id` — across snapshot expiration. Engines that read it can keep reporting correct values; engines that don't, lose data after the first `expire_snapshots`.

### Impact in StarRocks today

StarRocks currently never reads `PartitionStatisticsFile`. After `expire_snapshots` on an Iceberg table:

1. **`SELECT * FROM <iceberg_table>$partitions`** — `last_updated_at` and `last_updated_snapshot_id` columns become `NULL` (or 0 / -1 depending on path) for any partition whose writing snapshot has been expired. Users querying this metadata table for ops/audit see incomplete data.

2. **MV auto-refresh** — `IcebergCatalog.getPartitions` populates `Partition.modifiedTime` from the same manifest walk. `DefaultTraits.getUpdatedPartitionNames` compares `modifiedTime` against `MV.BasePartitionInfo.lastRefreshTime` to decide which partitions need refresh. After expire:
   - Partitions with expired writers get `modifiedTime = 0` (or earlier wall-clock than the MV refresh) → MV treats them as "not updated" and skips them even if their data actually changed.
   - In the opposite direction, any refresh strategy that hashes manifest contents (e.g. `getPartitionVersion`) sees `dvCount` / `position_delete_*` shift after table maintenance (`rewrite_files`, DV compaction) and over-refreshes partitions whose logical data is unchanged.

Combined effect: on long-lived Iceberg tables with periodic `expire_snapshots`, the MV refresh decision is wrong in both directions, and `$partitions` shows wrong audit data.

### Proposed solution

Teach the FE Iceberg connector to read `PartitionStatisticsFile`:

- **`$iceberg_partitions_table` (BE scanner)** — when the target snapshot is reachable from `partition-statistics`, build the partition row from `PartitionStats` (which carries persisted `last_updated_at` / `last_updated_snapshot_id`) merged with manifests added between the stats snapshot and the target snapshot.
- **`IcebergCatalog.getPartitions` (FE)** — same fast path so MV refresh, predicate listing, and pinned-snapshot reads use the persisted counters. Fingerprint must stay bit-equivalent with the manifest-scan path so cached partition entries from either path remain interchangeable.

Both paths should share one implementation (stats file reader + incremental manifest merge) since they consume the same Iceberg API.

A kill switch (session variable) lets operators disable the fast path if a stats file turns out to be corrupted or out of sync with the table.

### Why now

Without this, the only way to keep `$partitions` and MV refresh correct on Iceberg tables is to never expire snapshots — which defeats the purpose of `expire_snapshots` (storage reclamation, metadata size control). Tables that rely on Iceberg's own [`compute_partition_stats` action](https://iceberg.apache.org/javadoc/latest/org/apache/iceberg/spark/actions/ComputePartitionStats.html) to populate the stats file already pay the write cost but get no read-side benefit in StarRocks.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Iceberg: read PartitionStatisticsFile so last_updated_at / last_updated_snapshot_id survive expire_snapshots #73399

Enhancement

Problem

Impact in StarRocks today

Proposed solution

Why now

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Iceberg: read PartitionStatisticsFile so last_updated_at / last_updated_snapshot_id survive expire_snapshots #73399

Description

Enhancement

Problem

Impact in StarRocks today

Proposed solution

Why now

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions