Dataflow Gen2 Benchmarks: What They Actually Mean for Your Fabric Bill

June 6, 2026 · 6 min read

A bulk copy job that ran for 1 hour 42 minutes on Dataflow Gen1 finished in 7 minutes 43 seconds on Gen2. An ELT pattern dropped from 2 hours 42 minutes to just under six. Those are Microsoft's own numbers, published in a benchmark post alongside the Build 2026 announcements, and the headline claim is an order-of-magnitude improvement in both performance and cost. The performance half is easy to read off a chart. The cost half deserves the closer look, because since late 2025 the two are coupled in a way they weren't before.

The four scenarios, side by side

Microsoft benchmarked four canonical workloads, each with a different Gen2 capability enabled, comparing execution time against Dataflow Gen1:

| Scenario | Workload | Gen2 lever | Gen1 (h:mm:ss) | Gen2 (CI/CD) (h:mm:ss) |
|---|---|---|---|---|
| Copy data | Bulk-load consolidated Parquet from ADLS Gen2 into a Lakehouse, no transformations | Fast Copy | 1:42:18 | 0:07:43 |
| Heavy data shaping | Non-foldable filtering, derivations and cleansing before load | Modern Evaluator | 1:13:44 | 0:46:15 |
| Combine files | Combine and transform partitioned Parquet files in parallel | Partitioned compute | 1:40:57 | 0:04:48 |
| ELT pattern | Stage once, then run downstream referenced transformations | Staging + Fast Copy | 2:42:44 | 0:05:53 |

Divide the Gen1 time by the Gen2 time for each row and you get roughly 13x, 1.6x, 21x and 28x. The distribution is the interesting part. The three spectacular multipliers come from scenarios where the win is architectural: a copy-optimized ingestion backend, parallel evaluation across file partitions, or materializing data once instead of re-reading the source for every downstream query. The one scenario where the engine genuinely has to chew through non-foldable transformation logic cuts its run time by a little over a third (about 37%). That is still a real gain, attributed to the Modern Query Evaluation Service that targets row-by-row operations, but it shows where the ceiling sits: if a dataflow is slow because of dense per-row M logic, expect meaningfully faster, not 20x faster.
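If you want to check the arithmetic, or rerun it later with your own timings, a minimal Python sketch is enough. The durations are copied from the table above; the function names are mine and purely illustrative.

```python
# Durations copied from the benchmark table (h:mm:ss): Gen1 vs Gen2 (CI/CD).
BENCHMARKS = {
    "Copy data": ("1:42:18", "0:07:43"),
    "Heavy data shaping": ("1:13:44", "0:46:15"),
    "Combine files": ("1:40:57", "0:04:48"),
    "ELT pattern": ("2:42:44", "0:05:53"),
}


def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(gen1: str, gen2: str) -> float:
    """How many times faster the Gen2 run was than the Gen1 run."""
    return to_seconds(gen1) / to_seconds(gen2)


def time_saved_pct(gen1: str, gen2: str) -> float:
    """Share of the Gen1 run time that Gen2 removed, in percent."""
    return 100 * (1 - to_seconds(gen2) / to_seconds(gen1))


if __name__ == "__main__":
    for scenario, (gen1, gen2) in BENCHMARKS.items():
        print(
            f"{scenario}: {speedup(gen1, gen2):.1f}x faster, "
            f"{time_saved_pct(gen1, gen2):.0f}% less run time"
        )
```

Printing both figures side by side is deliberate: a multiplier and a percentage reduction describe the same result, but "1.6x" and "37% less time" feel very different in a budget conversation.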

Two caveats. First, these are vendor benchmarks. Second, the post publishes execution times but not data volumes or the CU consumption of the individual runs. Treat the multipliers as directional until you have reproduced something similar on your own workload.

Why faster now also means cheaper

For most of Dataflow's history, speed and cost were separate conversations: a faster refresh was nicer to wait for, and the bill was whatever it was. The benchmark post connects the two explicitly. Following the Dataflow Gen2 pricing improvements Microsoft released at FabCon Europe in September 2025, the same gains that reduce refresh time also reduce the CU consumption of the workload. Finishing the same transformation faster on Gen2 (CI/CD) now means paying less for it relative to Gen1, not just waiting less.

On a shared Fabric capacity the effect compounds. Long-running dataflow refreshes occupy capacity units that reports, notebooks and every other workload draw from, and the practical trigger for moving up an F-SKU is rarely one expensive job; it is the pile-up of background refreshes around month-end, precisely when the finance team needs the capacity for reporting. An ELT window of 2 hours 42 minutes shrinking to six minutes does more than trim a cost line; it hands capacity back at the point in the month when reporting needs it most.

Four levers, not a faster engine

The structural change in Gen2 is that performance no longer comes from a single faster engine. The post is explicit about this and recommends a progression instead of switching everything on at once: start from the defaults, then pull whichever of the four levers matches the bottleneck:

  • Defaults first. Moving existing Power Query logic to Gen2 (CI/CD) with default settings already brings a meaningful improvement in most cases, largely thanks to the modern M evaluation engine.
  • Modern Evaluator when transformation work itself is the bottleneck: non-foldable or partially foldable queries and row-by-row operations. It is enabled by default in new Gen2 (CI/CD) items, with a fallback to the standard engine if a specific query or connector misbehaves.
  • Fast Copy when ingestion throughput is the constraint. It loads directly to a Lakehouse destination and works best when in-flight shaping stays light (column selection, renames, type changes); heavier transforms belong in a follow-on query against staged data.
  • Partitioned compute (in preview, for Gen2 with CI/CD) when the source can be partitioned, typically many files combined through the Combine Files experience. The partition key column has to stay in the query for the engine to parallelize the work.
  • Staging when ingestion and transformation start competing inside a single query: materialize once, reference downstream, and let multiple outputs build on the same foundation.

The skill this asks of a BI team shifts accordingly. Less time tuning individual queries, more time recognizing which of the four scenarios a workload actually is, and pulling that one lever.
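The progression above can be sketched as a small triage helper. This is my own illustration, not anything from Microsoft's post or the Fabric API: the `Workload` fields and the `suggest_levers` function are hypothetical names, and the rules simply restate the bullets in code.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a dataflow; field names are illustrative."""

    heavy_non_foldable_shaping: bool = False  # dense per-row M logic is the bottleneck
    ingestion_bound: bool = False  # bulk load with light in-flight shaping
    partitionable_source: bool = False  # e.g. many files via Combine Files
    ingestion_competes_with_transformation: bool = False  # one query does both


def suggest_levers(workload: Workload) -> list[str]:
    """Return the levers worth evaluating, always starting from the defaults."""
    levers = ["Defaults (Gen2 CI/CD)"]
    if workload.heavy_non_foldable_shaping:
        levers.append("Modern Evaluator")
    if workload.ingestion_bound:
        levers.append("Fast Copy")
    if workload.partitionable_source:
        levers.append("Partitioned compute (preview)")
    if workload.ingestion_competes_with_transformation:
        levers.append("Staging")
    return levers


if __name__ == "__main__":
    # A month-end folder of per-entity extracts: a Combine Files pattern.
    print(suggest_levers(Workload(partitionable_source=True)))
```

The point of writing it down is less the code than the inventory: filling in those four booleans for each dataflow you own is most of the analysis.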

Where a typical finance pipeline lands

Map the scenarios onto a mid-market finance stack and the assignment is fairly clean. A month-end folder of per-entity ledger extracts or bank statement files is a Combine Files pattern, which makes it partitioned-compute territory once the feature leaves preview. Landing a full transaction history or vendor master into a Lakehouse before shaping is Fast Copy plus staging. Invoice processing, with its per-line parsing and field derivation, is exactly the transformation-heavy, non-foldable profile the Modern Evaluator targets, and exactly where the benchmark shows the smallest multiplier.

That last category is the one to budget realistic expectations for. At Morgan Stanley I built a global expense dashboard that automated the processing of 600+ monthly invoices arriving as PDF and Excel, and flows like that are almost entirely per-row work; nothing folds, and little of it parallelizes neatly. A third off the refresh time is what the benchmark suggests for this shape of problem. A third is worth having, but it will not turn an overnight job into a coffee break.

One more detail matters for governed environments: the headline features ship in the CI/CD variant of Dataflow Gen2. That happens to be the variant a finance data platform should be on anyway; my own lakehouse work at Syngenta runs BI assets through CI/CD precisely because deployments into reporting infrastructure need the same discipline as deployments into anything else.

Before you migrate off Gen1

The post's recommendation is unambiguous: Dataflow Gen2 (CI/CD) writing to a Lakehouse, with Power BI Direct Lake on top, is positioned as the best setup, and Gen1 or Semantic-Model-only architectures as the thing to upgrade from. For teams with dozens of Gen1 dataflows accumulated over years, a few practical notes temper the enthusiasm. Partitioned compute is still in preview. The Modern Evaluator is on by default but has a documented fallback, which suggests some queries and connectors can still run into compatibility issues. And Fast Copy expects light ingestion-time transformations, so dataflows that mix heavy shaping into the load step will need restructuring into a land-then-transform pattern to benefit fully.

The full scenario descriptions and the pricing discussion are in Benchmarking Dataflow Gen2: Faster data transformation at lower cost on the Fabric Updates Blog, along with links to the Gen2 pricing documentation.

A measured way to act on this

Resist the mass migration. Pick one representative dataflow from each category you actually run: one big copy, one shaping-heavy flow, one multi-file combine. Rebuild them in Gen2 (CI/CD), keep the Power Query logic unchanged, and compare two things against the Gen1 baseline: wall-clock refresh time and CU consumption per refresh from the capacity metrics. If your numbers rhyme with Microsoft's, the migration case writes itself and you will have your own benchmark to show for it. If they don't, you found out on three dataflows instead of fifty.
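A minimal sketch of that comparison, assuming you have noted the refresh duration and the CU consumption per refresh by hand from your own capacity metrics. The `RefreshSample` type, the `min_ratio` threshold of 1.2 and the numbers in the demo are placeholders of mine, not measurements and not values from Microsoft's post.

```python
from dataclasses import dataclass


@dataclass
class RefreshSample:
    """One measured refresh, taken from your own runs."""

    duration_s: float  # wall-clock refresh time in seconds
    cu_seconds: float  # CU consumption for the refresh, from capacity metrics


def compare(gen1: RefreshSample, gen2: RefreshSample) -> dict[str, float]:
    """Ratios above 1 mean Gen2 was faster or cheaper than the Gen1 baseline."""
    return {
        "time_ratio": gen1.duration_s / gen2.duration_s,
        "cu_ratio": gen1.cu_seconds / gen2.cu_seconds,
    }


def worth_migrating(result: dict[str, float], min_ratio: float = 1.2) -> bool:
    """True only if both time and CU improve by at least min_ratio (assumed threshold)."""
    return result["time_ratio"] >= min_ratio and result["cu_ratio"] >= min_ratio


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own measurements.
    baseline = RefreshSample(duration_s=6000, cu_seconds=1200)
    candidate = RefreshSample(duration_s=600, cu_seconds=300)
    result = compare(baseline, candidate)
    print(result, worth_migrating(result))
```

Requiring both ratios to clear the bar is the design choice that matters: a refresh that finishes sooner but consumes more CU is a different decision from one that improves on both.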
