You ship a daily feature pipeline. It computes 8 new columns — a couple embeddings, a churn score, some engagement metrics — and writes them back to a table with 200+ columns and 3 billion rows. The job takes four hours because the merge-on-read strategy rewrites every row in every affected file, even though 192 columns didn't change. Your cloud bill reflects this enthusiasm.

Last week's Iceberg Summit in San Francisco put a concrete proposal on the table to fix this. Here's what it means for teams running ML pipelines on top of lakehouse storage.

The Write Amplification Problem Nobody Talks About

Iceberg v2 and v3 operate at row granularity. Copy-on-Write (CoW) rewrites entire data files when any column in a row changes. Merge-on-Read (MoR) defers the pain to query time, but compaction still eventually rewrites everything. For OLTP-style updates — change a customer's address, flip a status flag — this works fine. The update touches a small fraction of rows across a handful of files.

ML feature tables break this model. You're not updating rows selectively. You're computing a new value for every single row across 5-10 columns while leaving the other 190 columns untouched. A full-column update. The format treats it identically to a full-row update.

The math gets ugly fast. A 2 TB table in Parquet with Snappy compression: update 5% of columns, rewrite close to 2 TB. Daily. The S3 PUT costs alone stack up, and the Spark executors you're burning to produce identical bytes for 95% of the output aren't cheap either.
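The arithmetic is worth making explicit. A quick sketch with the numbers from this scenario (assuming, for simplicity, that all columns compress to roughly equal sizes, which real Parquet files won't):

```python
def rewrite_cost_gb(table_gb, cols_total, cols_updated, column_level=False):
    """Approximate bytes written per update run.

    Assumes columns compress to roughly equal sizes -- a simplification,
    since embeddings and scores compress very differently in practice.
    """
    if column_level:
        # Proposed: write only the changed columns to separate files.
        return table_gb * cols_updated / cols_total
    # Today: the affected files are rewritten in full.
    return table_gb

TABLE_GB = 2048  # ~2 TB feature table
print(rewrite_cost_gb(TABLE_GB, 200, 8))                     # 2048 GB per run today
print(rewrite_cost_gb(TABLE_GB, 200, 8, column_level=True))  # 81.92 GB proposed
```

That ratio, 2048 / 81.92, is exactly the 25x write-volume reduction discussed below.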

This is why some ML teams gave up on Iceberg-backed feature stores entirely. The workarounds are familiar and depressing: split columns into separate tables, maintain sidecar files by hand, or run narrow tables that get JOINed back together at read time. All of them create their own maintenance nightmares when schemas evolve or queries span the full feature set.

Two Competing Approaches from the Summit

At the Summit (April 8-9), Péter Váry and Anurag Mantripragada presented a proposal that's been circulating on the dev mailing list since March: efficient column-level updates, targeted for the V4 spec. The core idea is simple enough — write only the changed columns to separate files, stitch them back together at read time.

Where it gets interesting is the implementation split:

Iceberg-native column files. The table format itself tracks which column files belong to which base files. Metadata carries full visibility into what got updated and when. Non-overlapping column updates can run concurrently — your embedding pipeline and your churn-score pipeline don't block each other. The downside, as Anton Okolnychyi flagged on the mailing list, is reader complexity. Split planning across base and column files isn't trivial, and writers must keep row groups aligned.

Parquet-native column chunks. Push the problem into Parquet's internal column chunk structure. The table format barely changes — metadata stays minimal. Simpler to implement, but you're now coupled to Parquet's roadmap and lose some isolation guarantees.
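In spirit, the Iceberg-native option boils down to: write the changed columns as a small positional file, and record in table metadata which base file it patches. A toy sketch of that bookkeeping (hypothetical structures for illustration, not the proposed spec's metadata layout):

```python
from dataclasses import dataclass, field

@dataclass
class ColumnFile:
    base_path: str      # the base data file this patches
    columns: list[str]  # only the updated columns
    values: list[dict]  # one dict per row position in the base file

@dataclass
class TableMeta:
    # base file path -> column files layered on top of it
    column_files: dict[str, list[ColumnFile]] = field(default_factory=dict)

def commit_column_update(meta: TableMeta, base_path: str,
                         columns: list[str], values: list[dict]) -> ColumnFile:
    """Record a column-level update. The base file is never rewritten;
    readers stitch base and column files together by row position."""
    cf = ColumnFile(base_path, columns, values)
    meta.column_files.setdefault(base_path, []).append(cf)
    return cf
```

Note that two pipelines updating disjoint column sets simply append independent ColumnFile entries against the same base file, which is the concurrency win the proposal describes.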

Gábor Kaszab's proof-of-concept PR (#15445) suggests the metadata surface for the Iceberg-native path is manageable. But nobody's published read-path benchmarks yet, which is where the real skepticism should land.

Back-of-Napkin Cost Savings

The write-side economics are hard to argue with:

                         Full-Row Rewrite (Today)   Column-Level Update (Proposed)
Table size               2 TB (Parquet, Snappy)     2 TB
Columns updated          8 of 200                   8 of 200
Data written per run     ~2 TB                      ~80 GB
Spark executor time      ~45 min (i3.2xl × 20)      ~5 min (estimated)
Monthly S3 write cost    ~$300                      ~$12

A 25x reduction in write volume. The compute savings scale even harder if you're running this across dozens of feature tables, which most ML platforms do. One team I know runs 40+ feature pipelines nightly against wide tables — they'd go from a 6-hour window to under an hour.

The Read-Side Tax

Column-level updates aren't free at query time. When a scan hits the table, the engine needs to locate base files, find associated column files, align row groups, and merge them. Conceptually similar to how MoR handles deletion vectors today, but across columns instead of rows.

For wide analytical queries, the overhead could be noticeable. For narrow ones — "give me user_id and churn_score" — the engine might actually read fewer bytes, since it can grab the column file directly without touching the base file for updated columns. The performance profile depends heavily on query patterns and how engines implement the stitching logic.
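A toy model of that stitching logic makes the two paths concrete. Positional alignment and the plain-Python layout here are assumptions for illustration, not how any engine implements it:

```python
def stitch(base_rows, column_files, projection):
    """Merge base rows with column-file overrides by row position.

    If every projected column lives in a column file, the base rows
    never need to be read -- the narrow-query fast path.
    """
    # Collect overrides: column name -> list of values, one per row position.
    overrides = {}
    for cf in column_files:
        for col in cf["columns"]:
            overrides[col] = [v[col] for v in cf["values"]]

    if overrides and all(col in overrides for col in projection):
        # Fast path: serve entirely from column files.
        n = len(next(iter(overrides.values())))
        return [{col: overrides[col][i] for col in projection} for i in range(n)]

    # Slow path: read base rows and patch in the updated columns.
    return [
        {col: (overrides[col][i] if col in overrides else row[col])
         for col in projection}
        for i, row in enumerate(base_rows)
    ]
```

The fast path is why a "give me user_id and churn_score" query could get cheaper, while a SELECT * pays the full merge cost.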

The V4 proposal explicitly preserves column statistics and pruning capabilities in column files, so predicate pushdown should still work. But "should" and "does" are different conversations, and we won't know until Spark, Trino, and Flink ship their implementations.
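Pruning a column file is the usual min/max range check, just applied to the column file's own statistics. A minimal sketch (hypothetical stats layout):

```python
def can_skip(column_stats, column, lower, upper):
    """Return True if the file's [min, max] range for `column`
    cannot overlap the predicate range [lower, upper]."""
    stats = column_stats.get(column)
    if stats is None:
        return False  # no stats recorded: must read the file
    file_min, file_max = stats
    return file_max < lower or file_min > upper

# A column file whose churn_score values span [0.0, 0.3] can be
# skipped entirely for the predicate churn_score > 0.5:
stats = {"churn_score": (0.0, 0.3)}
print(can_skip(stats, "churn_score", 0.5, float("inf")))  # True
```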

What Else Landed at the Summit

The column updates proposal didn't happen in isolation. Two related developments from the same week:

Parquet ALP encoding got accepted. Adaptive Lossless floating-Point compression splits exponents and mantissas, improving compression ratios for float-heavy data. If your feature table is packed with float64 embeddings and model scores, this directly reduces the size of those column files the V4 proposal would generate.
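Why does separating exponents from mantissas help? Model scores and normalized embedding values tend to cluster in a narrow magnitude range, so their exponent bits are nearly constant and compress extremely well on their own. A toy decomposition of the IEEE 754 bit layout illustrates the general idea (this is not the ALP algorithm itself, just the intuition behind splitting the streams):

```python
import struct

def split_float64(x):
    """Decompose an IEEE 754 double into (sign, exponent, mantissa) bit fields.
    Toy illustration of exponent/mantissa separation -- not ALP itself."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa

# Scores in [0.5, 1.0) all share one biased exponent, so a separated
# exponent stream is a constant run -- trivially compressible.
exps = {split_float64(x)[1] for x in (0.5, 0.62, 0.75, 0.9, 0.999)}
print(exps)  # {1022}
```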

V4 single-file commits advanced. Today, every Iceberg commit writes a new metadata.json file. The single-file commit proposal would fold a commit's metadata into one file, reducing commit latency and storage overhead. This matters for column-level updates because frequent partial-column writes mean more commits.

Together, column-level updates + ALP compression + single-file commits could reshape the economics of Iceberg-backed feature stores. Separately, each is a modest improvement. Stacked, they address the three pain points ML teams actually complain about: write cost, storage bloat, and commit overhead.

The Pragmatic Take

The V4 spec isn't shipping tomorrow. The community hasn't even settled on the Iceberg-native vs. Parquet-native approach yet, and engine support will lag the spec by months. If you're building a new feature store this quarter, don't architect around column-level updates.

What you can do: follow issue #15146 and the mailing list thread. If you're stuck with the rewrite problem today, the least-bad workaround is still partitioning your wide table by column group — keep your slowly-changing dimensions in one table, your daily-computed features in another, and JOIN at read time. It's ugly, it complicates schema management, and it defeats some of the purpose of having a unified feature table. But it cuts your nightly write volume by an order of magnitude right now, without waiting for a spec revision.
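The workaround in miniature, with plain dicts standing in for the two narrow tables (illustrative names and values only):

```python
# Two narrow tables keyed by user_id: slowly-changing dimensions in one,
# daily-computed features in the other. Only the second is rewritten nightly,
# so the nightly write volume shrinks to the daily table's size.
dims = {1: {"country": "US", "plan": "pro"},
        2: {"country": "DE", "plan": "free"}}
daily = {1: {"churn_score": 0.12},
         2: {"churn_score": 0.87}}

def read_wide(user_id):
    """Reassemble the 'wide' row at read time -- the JOIN described
    above, in miniature."""
    return {**dims[user_id], **daily[user_id], "user_id": user_id}

print(read_wide(2))  # {'country': 'DE', 'plan': 'free', 'churn_score': 0.87, 'user_id': 2}
```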

The real question is whether column-level updates will be a V4 launch feature or a "V4.1, eventually" addition. Based on the Summit energy and the active POC work, I'd bet on the former — but I've been wrong about Iceberg timelines before.