Every conference slide deck from the last six months has featured the same number: 5 milliseconds. That's the end-to-end latency Spark's new Real-Time Mode delivers for stateless structured streaming queries. It's a real number, measured on real hardware, and it represents genuine engineering work from the Spark community. It also has almost nothing to do with how most data engineers actually use Spark Structured Streaming.

The Gap Between the Demo and the Lakehouse

Spark 4.1 shipped Real-Time Mode (RTM) as a first-class trigger type for Structured Streaming. The pitch: swap Trigger.processingTime("10 seconds") for Trigger.realTime() and watch your latency drop from seconds to single-digit milliseconds. For Kafka-to-Kafka transformations in Scala, this works. The numbers are legitimate.

But most production streaming pipelines don't terminate at Kafka. They land data in a lakehouse — overwhelmingly Delta Lake or Iceberg. And here's the problem: RTM maintains a hardcoded list of supported sinks, and Delta Lake isn't on it.

The open-source Delta Lake connector implements only the addBatch method, which is the microbatch API. RTM requires a different interface entirely. Until someone rewrites the Delta sink to support the continuous processing model, every pipeline that writes to Delta Lake is stuck in microbatch mode regardless of what trigger you specify.
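The interface mismatch is easier to see as code. This is an illustrative sketch in Python, not Spark's actual Scala/Java interfaces (the real microbatch contract is `Sink.addBatch`; the class and method names below are stand-ins):

```python
from abc import ABC, abstractmethod

class MicrobatchSink(ABC):
    """The shape of what the open-source Delta connector implements:
    one call per completed batch. Latency is bounded below by the batch
    interval, because every record waits for the next commit."""
    @abstractmethod
    def add_batch(self, batch_id: int, rows: list) -> None: ...

class ContinuousWriter(ABC):
    """The shape of what a low-latency mode needs instead: per-record
    writes plus an explicit epoch commit for fault tolerance. The Delta
    sink would have to implement something like this to leave
    microbatch territory."""
    @abstractmethod
    def write(self, row: dict) -> None: ...
    @abstractmethod
    def commit(self, epoch_id: int) -> None: ...

class DeltaLogSink(MicrobatchSink):
    """Toy stand-in for the Delta connector: every add_batch() call is
    one transaction appended to the log."""
    def __init__(self):
        self.log = []

    def add_batch(self, batch_id: int, rows: list) -> None:
        self.log.append((batch_id, len(rows)))  # one commit per batch

sink = DeltaLogSink()
sink.add_batch(0, [{"id": 1}, {"id": 2}])
sink.add_batch(1, [{"id": 3}])
print(sink.log)  # [(0, 2), (1, 1)]
```

The point of the sketch: a sink that only knows how to accept whole batches can't participate in per-record execution, no matter what trigger the query declares.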

What the Benchmark Actually Shows

Raki Rahman ran a clean head-to-head: Spark 4.2 (preview) vs. Flink 1.16, both streaming JSON from Kafka into Delta Lake. Same producer, same message format, roughly 180,000 messages per second.

Metric               Flink 1.16   Spark 4.2            Spark 3.5
Avg latency          0.81 s       7.15 s               6.59 s
Latency pattern      Stable       Saw-tooth (2–15 s)   Saw-tooth
Throughput ceiling   Not hit      Not hit              Not hit

Flink beat Spark by nearly 9x on average latency. More telling: Spark 4.2 was actually slower than Spark 3.5 for this particular workload. The saw-tooth pattern — latency cycling between 2 and 15 seconds — is the signature of microbatch processing. Each cycle is a batch boundary where Spark commits a transaction to the Delta log.

Neither engine broke a sweat at 180K msg/s, so this isn't a throughput story. It's a latency story, and for lakehouse sinks, Flink owns it.
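The saw-tooth falls out of simple arithmetic: a record's end-to-end latency is its wait until the next batch boundary plus the commit cost. A toy model (the interval and commit numbers here are assumed for illustration, not taken from the benchmark):

```python
BATCH_INTERVAL = 10.0   # seconds between commits (assumed)
COMMIT_COST = 2.0       # seconds to write files + commit to the Delta log (assumed)

def latency(arrival_time: float) -> float:
    """A record arriving mid-batch waits for the boundary, then pays the commit."""
    wait_for_boundary = BATCH_INTERVAL - (arrival_time % BATCH_INTERVAL)
    return wait_for_boundary + COMMIT_COST

arrivals = [i * 0.5 for i in range(40)]  # records spread over two batch cycles
lats = [latency(t) for t in arrivals]
print(f"min={min(lats):.1f}s max={max(lats):.1f}s avg={sum(lats)/len(lats):.2f}s")
# min=2.5s max=12.0s avg=7.25s
```

Latency cycles between roughly the commit cost and the full interval plus commit cost, and the average sits near interval/2 plus commit — the same shape and rough magnitude as the measured 2–15 second saw-tooth.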

Where RTM Actually Delivers

Strip away the lakehouse requirement and RTM becomes genuinely interesting. The architecture differs fundamentally from microbatch: instead of discrete batches, it uses extended batch windows (defaulting to five minutes) that process records as they arrive, with simultaneous scheduling of all query stages and streaming shuffle for inter-stage data transfer.
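The scheduling difference can be sketched as an analogy (this is not Spark's implementation, just the execution shape): microbatch materializes a full batch between stages, while RTM-style execution streams each record straight through all stages.

```python
def stage1(r: int) -> int:
    return r * 2

def stage2(r: int) -> int:
    return r + 1

def microbatch(records: list) -> list:
    """Stage 2 starts only after stage 1 has finished the whole batch."""
    intermediate = [stage1(r) for r in records]  # full batch materialized
    return [stage2(r) for r in intermediate]

def streamed(records: list) -> list:
    """Each record flows through both stages as it arrives —
    the idea behind simultaneous stage scheduling + streaming shuffle."""
    return [stage2(stage1(r)) for r in records]

assert microbatch([1, 2, 3]) == streamed([1, 2, 3]) == [3, 5, 7]
```

Same results either way; what changes is when each individual record's result becomes available, which is the whole latency story.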

The workloads where this matters today:

  • Kafka-to-Kafka enrichment — stateless transformations, filtering, routing

  • Fraud scoring — score a transaction and push the result to a downstream topic, all sub-10ms

  • Real-time feature serving — compute a feature and write it to Redis or another key-value store

Notice what these have in common: none of them touch a table format. The moment you need ACID transactions, partition management, or file compaction — the things that make a lakehouse a lakehouse — you're back in microbatch territory.
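"Stateless" has a precise meaning here, worth pinning down with a sketch (function names are illustrative): RTM-eligible work is a pure function of a single record, while anything that remembers prior records — aggregations, joins, deduplication — is stateful and falls outside RTM's current support.

```python
def enrich(event: dict) -> dict:
    """Stateless: the output depends only on this one event.
    This is the kind of transform RTM can run."""
    return {**event, "risk": "high" if event["amount"] > 1000 else "low"}

class RunningCount:
    """Stateful: the output depends on every event seen so far.
    This kind of operator is out of scope for RTM in Spark 4.1."""
    def __init__(self):
        self.count = 0

    def update(self, event: dict) -> int:
        self.count += 1
        return self.count

print(enrich({"amount": 2500}))  # {'amount': 2500, 'risk': 'high'}
```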

The Python Problem

Here's the constraint that'll stop most teams cold. RTM in Spark 4.1 supports Scala only, and only for stateless workloads. No Python. If your organization — like the majority of data engineering teams — writes PySpark, the feature doesn't exist for you yet. SPARK-53736 tracks Python support as future work, but there's no committed timeline.

This matters more than it sounds. Spark's value proposition for data engineering has always been "write Python, run it at scale." An optimization restricted to Scala stateless Kafka-to-Kafka pipelines serves a narrow slice of the user base. The teams most likely to benefit already have Flink in production.

Checkpoints and the Recovery Tax

RTM's extended batch windows create a subtle operational risk that rarely makes it into the launch blog posts. In microbatch mode, the framework checkpoints at every batch boundary — typically every few seconds. If the job dies, you replay from the last checkpoint. Recovery stays fast because you lost at most one batch of progress.

RTM defaults to five-minute windows. Kill the job at minute four and you replay four minutes of data. For a stream doing 180K msg/s, that's north of 40 million messages to reprocess. You can tune the window shorter, but Databricks' own documentation warns that shorter windows "may impact latency" — which undermines the whole point.
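The recovery tax is back-of-envelope arithmetic from the article's own numbers — worst-case replay is window length times message rate (the 10-second microbatch interval below is an assumed typical value for comparison):

```python
RATE = 180_000        # messages per second, the benchmark's rate
WINDOW = 5 * 60       # RTM default batch window, in seconds
MICRO = 10            # a typical microbatch trigger interval, seconds (assumed)

replay_4min = RATE * 4 * 60  # job dies at minute four of a window
worst_case_rtm = RATE * WINDOW
worst_case_micro = RATE * MICRO

print(f"{replay_4min:,}")       # 43,200,000 — "north of 40 million"
print(f"{worst_case_rtm:,}")    # 54,000,000
print(f"{worst_case_micro:,}")  # 1,800,000
```

A thirty-fold difference in worst-case replay volume is the operational price of those extended windows.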

Flink takes a different approach with incremental checkpointing. State snapshots happen asynchronously without blocking the processing pipeline. Flink has checkpoint overhead too, but it doesn't correlate with processing latency the way Spark's batch boundaries do. You get fast recovery without paying for it in the hot path.
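A toy contrast shows why incremental checkpointing decouples checkpoint cost from state size (this is a dictionary-diff sketch, not Flink's RocksDB-based implementation):

```python
def full_checkpoint(state: dict) -> dict:
    """Copy everything: cost grows with total state size."""
    return dict(state)

def incremental_checkpoint(state: dict, dirty: set) -> dict:
    """Record only keys changed since the last checkpoint:
    cost grows with the update rate, not the state size."""
    delta = {k: state[k] for k in dirty}
    dirty.clear()
    return delta

state, dirty = {}, set()
for k, v in [("a", 1), ("b", 2)]:
    state[k] = v
    dirty.add(k)
ckpt1 = incremental_checkpoint(state, dirty)  # {'a': 1, 'b': 2}

state["a"] = 10
dirty.add("a")
ckpt2 = incremental_checkpoint(state, dirty)  # {'a': 10} — only the change

print(len(full_checkpoint(state)), len(ckpt2))  # 2 1
```

Run the snapshots asynchronously, as Flink does, and checkpoint frequency stops being a tax on processing latency — which is exactly the trade-off Spark's batch boundaries can't escape.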

What Should You Actually Do?

If you're building a new streaming pipeline that lands in Delta Lake or Iceberg and latency matters, Flink is still the better engine. The 0.81s vs. 7.15s gap isn't subtle — it's the difference between "near real-time" and "eventually consistent with a lag you can feel in the dashboard."

If you already run Spark everywhere and your latency tolerance is 10–30 seconds, microbatch with a short trigger interval remains fine. It's battle-tested, works with every sink, and the operational tooling is mature. Don't let RTM marketing make you feel behind.

If you genuinely need sub-second latency and you're committed to Spark, RTM works — provided your sink is Kafka or a custom foreach writer, you write Scala, and your transformation is stateless. That's a real but narrow set of constraints.

The 5ms number isn't wrong. It's just not the number that matters for the workload you're probably running.