Agent-Authored Power BI Reports: What to Delegate, What to Guard

A report page with four KPI cards, a few slicers and a detail table costs an hour of clicking in Power BI Desktop, and almost none of that hour requires judgment. Microsoft's latest preview hands that hour to an agent. The Power BI authoring plugin, published in Skills for Fabric (a first-party catalog of agent skills for Microsoft Fabric, optimized for GitHub Copilot CLI), lets an AI agent author report definitions from a natural-language brief: it writes schema-correct PBIR files, reloads Power BI Desktop, captures screenshots of the rendered pages, and iterates until the result matches what was asked.
For a finance BI team, the interesting question is not whether the demos work. It is where the line runs between the work you hand over and the work you keep. The preview is broad enough to cover both sides of that line, which is why it deserves a careful read before it gets anywhere near a production workspace.
What the preview actually does
The announcement shows three capability groups. An agent can create a report from scratch: the example prompt describes two pages in a single paragraph (KPI cards for named metrics, slicers, a detail table, consistent branding on both pages) and the agent builds them. It can modify an existing report against a reference image and a logo, restyling the pages to match. And it can modernize a dated report, where a companion skill called powerbi-report-design first produces a structured design debrief (best-practice principles for that report) which is then handed over for implementation. The same plugin family also connects to the Modeling MCP server and a semantic-model authoring skill, so an agent can in principle run the whole chain from model to report.
Two mechanics matter more than the gallery of demos. The agent writes PBIR, the file-based report definition format used by PBIP projects, instead of operating a canvas. And a Desktop bridge lets it reload your already-open Power BI Desktop instance and screenshot the latest pages, so it inspects its own output and corrects course rather than guessing whether a visual rendered correctly.
A report has layers; the agent verifies one of them
The screenshot loop is a genuine step forward for the presentation layer, because the agent finally sees what a user would see. What it verifies, though, is rendering rather than truth. Picture a hypothetical variance bridge where one measure has a flipped sign: it will render beautifully, pass every visual inspection, and mislead everyone who reads it. Whether month-end numbers reconcile to the ledger is decided in the semantic model and the data behind it, and no screenshot can tell the difference.
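A small reconciliation check makes the gap between rendering and truth concrete. This is a minimal sketch under stated assumptions: the account names, the figures and the `reconcile` helper are all hypothetical, and none of it is part of the plugin or of Power BI. It shows a variance with a reversed sign, which would chart without complaint, failing a comparison against ledger-derived values.

```python
from decimal import Decimal

# Hypothetical month-end figures; real values would come from the ledger
# and from the semantic model under test.
LEDGER = {
    "Travel": {"actual": Decimal("120.00"), "budget": Decimal("100.00")},
    "Software": {"actual": Decimal("80.00"), "budget": Decimal("95.00")},
}


def variance_correct(actual: Decimal, budget: Decimal) -> Decimal:
    """Variance as actual minus budget (the convention assumed here)."""
    return actual - budget


def variance_flipped(actual: Decimal, budget: Decimal) -> Decimal:
    """The same measure with the sign accidentally reversed."""
    return budget - actual


def reconcile(measure, ledger, tolerance=Decimal("0.01")):
    """Return the accounts where the measure disagrees with the ledger."""
    mismatches = []
    for account, row in ledger.items():
        expected = row["actual"] - row["budget"]
        got = measure(row["actual"], row["budget"])
        if abs(got - expected) > tolerance:
            mismatches.append(account)
    return mismatches
```

Both measures would produce a perfectly drawable bar for every account; only the comparison against an independent source separates them.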
A report can look perfect and be wrong.
What to delegate
Layout scaffolding is the obvious candidate. Building pages from a written spec, aligning visuals, applying a brand theme from a reference image, adding navigation: this is exactly the work the demos show, all of it is visible on the canvas, and a mistake is embarrassing rather than dangerous.
Modernization is the second candidate, and probably the more valuable one. A typical finance team keeps alive dozens of reports built years ago: they still answer their questions but look their age, and a redesign sprint for them will never be funded. An agent that first produces a reviewable design debrief and then implements it changes the economics of that backlog.
The global expense dashboard I built at Morgan Stanley served 50+ users, with automated variance analysis over 600+ monthly invoices (PDF and Excel) feeding it. The value lived in the variance logic and the data processing behind it; the page layout was simply the price of delivering that value. In hindsight, that is precisely the kind of work I would have handed to a tool like this.
What to guard
The semantic model comes first. The same ecosystem that authors reports can author models through the Modeling MCP server, and that is the point where a finance team should slow down, because the model is where the numbers are made. Generated measures deserve the same treatment as generated code: a human reads every line before it merges, or it does not merge. For core financial logic (allocations, FX, variance definitions), keeping authorship human and delegating only the documentation is a defensible choice.
Row-level security belongs on the guarded list for a different reason: its failure mode is silent. A report with broken RLS looks identical and works normally, while showing someone rows they were never meant to see. Access rules should not be written by a tool whose verification loop is a screenshot.
Publishing is the third gate. The preview's scope runs from design through publishing to Fabric, which is all the more reason to keep that last step behind human approval, the same way a deployment pipeline holds a production release for sign-off.
A working split
None of this is exotic. It is the same review discipline finance applies everywhere else, pointed at a new tool.
| Task | Delegate? | Why |
|---|---|---|
| Page scaffolding from a spec | Yes | Low stakes, fully visible, easy to reject |
| Theming and branding from a reference image | Yes | Cosmetic; errors are obvious on sight |
| Modernizing legacy report layouts | Yes, with review | The design debrief can be approved before implementation |
| New or changed DAX measures | Guard | A wrong number renders just as nicely as a right one |
| Semantic model authoring | Guard | This is where the figures are made; review it like code |
| RLS and access rules | Guard | Failures are silent and invisible in a screenshot |
| Publishing to a production workspace | Guard | Keep a human sign-off, as with any deployment |
Files are what make this governable
The detail that makes the guarded split workable is that the agent's output is files. PBIR definitions inside a PBIP project are text, which means branches, diffs and pull requests apply to them. Part of my current work at Syngenta is CI/CD for BI assets, and that is the frame this preview fits into: agent-written report definitions can sit in a branch and pass a review before any workspace ever sees them, exactly like human-written ones.
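Because the output is files, the delegate/guard split can be checked mechanically on a branch. The sketch below is illustrative, not a feature of the plugin: it assumes the usual PBIP layout, in which a folder ending in `.Report` holds the report definition and a folder ending in `.SemanticModel` holds the model, and the `classify` and `guarded_changes` helpers are hypothetical names. It labels each changed path so that anything touching the model is routed to a line-by-line human review.

```python
from pathlib import PurePosixPath

# Assumed PBIP layout: "<Name>.Report/" holds the PBIR report definition and
# "<Name>.SemanticModel/" holds the model. Adjust to the actual repository.
GUARDED_SUFFIX = ".SemanticModel"
DELEGATED_SUFFIX = ".Report"


def classify(path: str) -> str:
    """Label a changed file as 'guarded', 'delegated', or 'other'."""
    parts = PurePosixPath(path).parts
    if any(part.endswith(GUARDED_SUFFIX) for part in parts):
        return "guarded"
    if any(part.endswith(DELEGATED_SUFFIX) for part in parts):
        return "delegated"
    return "other"


def guarded_changes(changed_paths):
    """Return the changed files that need a line-by-line human review."""
    return [p for p in changed_paths if classify(p) == "guarded"]
```

The list of changed paths could come from something like `git diff --name-only` run against the target branch. A review pipeline might then refuse to auto-approve any pull request for which `guarded_changes` is non-empty.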
Getting started is two commands in a supported AI client:
The full walkthrough, with the example prompts and the before/after demos, is in "AI-Powered Power BI reporting: From design to deployment with agent skills" on the Power BI updates blog. It is a preview, and Microsoft says it is shipping improvements quickly, so expect the details above to move.
A first pilot worth running
Imagine a controller who owns a dozen aging management reports sitting on a stable, already-reviewed semantic model. Hand the agent one of them, on a copy, with a modernization brief. Let it restructure the layout and navigation, then diff the PBIR output against the original and review the changes the way you would review a colleague's pull request. The model and the RLS stay untouched, and publishing stays manual. The worst possible outcome is a rejected branch and an hour lost; the best is a refreshed report estate and a calibrated sense of where the tool's judgment ends and yours has to begin.
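The diff step in that pilot can be approximated with the standard library. This is a rough sketch, assuming each report definition file is JSON: the `normalized` and `definition_diff` helpers are hypothetical, and the sample content in the usage note is made up rather than taken from the real PBIR schema. Re-serializing with sorted keys keeps pure reordering out of the review, so only substantive changes appear.

```python
import difflib
import json


def normalized(json_text: str) -> list[str]:
    """Re-serialize JSON with sorted keys so the diff ignores key order."""
    return json.dumps(json.loads(json_text), indent=2, sort_keys=True).splitlines()


def definition_diff(before: str, after: str) -> list[str]:
    """Unified diff between two versions of one report definition file."""
    return list(
        difflib.unified_diff(
            normalized(before),
            normalized(after),
            fromfile="original",
            tofile="agent",
            lineterm="",
        )
    )
```

For example, comparing `{"name": "Overview", "width": 1280}` with `{"width": 1280, "name": "Summary"}` reports only the changed `name` line, while two files that differ only in key order produce an empty diff.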