AI/BI Genie

Monitor Databricks AI/BI Genie spaces in Monte Carlo for full agent observability

AI/BI Genie is Databricks' natural-language analytics interface: business users ask questions about their data in plain English, and Genie generates SQL, runs it against Unity Catalog, and returns an answer alongside the query it ran. Each Genie space is scoped to a set of tables. Monte Carlo treats a Genie space as a platform agent and surfaces each conversation turn — the question, the generated SQL, a sample of the results, and the final answer — for monitoring.

Setup differs from Agent Bricks and custom agents: because Genie keeps its conversation history behind an API rather than in a Unity Catalog table, Monte Carlo bridges that gap with a managed collector, explained below.

How Monte Carlo collects Genie traces

📘
Prerequisites
An existing Databricks integration in Monte Carlo with a SQL Warehouse connection. If you haven't set this up yet, see our Databricks SQL Warehouse guide. Because a SQL Warehouse connection is scoped to a single Databricks workspace, it must connect to the same workspace your Genie spaces live in — set up one integration per workspace you want to monitor.

A Genie space's conversation history is API-resident: it lives only behind the Genie Conversation API, with no table to query. So instead of asking you to turn on trace export, Monte Carlo installs a lightweight, fully managed collector job into your Databricks workspace. About once an hour, the collector reads new Genie conversations and merges them into a Unity Catalog table named genie_traces, in the catalog and schema you choose during setup, which Monte Carlo then reads through your SQL Warehouse connection like any other trace table.
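The hourly read-and-merge step can be pictured as an idempotent upsert. The sketch below is a minimal in-memory model, not the collector's code: the key columns (`conversation_id`, `message_id`) and the other field names are assumptions made for illustration, since this page does not document the `genie_traces` schema.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical row shape: the real genie_traces columns are not documented here.
Row = Dict[str, object]
Key = Tuple[str, str]


def merge_traces(table: Dict[Key, Row], new_rows: Iterable[Row]) -> Dict[Key, Row]:
    """Upsert rows keyed on (conversation_id, message_id), similar to a SQL MERGE."""
    merged = dict(table)  # copy so the input table is left untouched
    for row in new_rows:
        key = (str(row["conversation_id"]), str(row["message_id"]))
        # Insert unseen turns; update turns that were collected on an earlier run.
        merged[key] = {**merged.get(key, {}), **row}
    return merged


existing = {
    ("c1", "m1"): {"conversation_id": "c1", "message_id": "m1", "question": "Top customers?"},
}
hourly_batch = [
    {"conversation_id": "c1", "message_id": "m1", "question": "Top customers?", "answer": "Acme"},
    {"conversation_id": "c1", "message_id": "m2", "question": "And by region?"},
]
table = merge_traces(existing, hourly_batch)
rerun = merge_traces(table, hourly_batch)  # repeating the same batch changes nothing
```

Because rows are merged on a key rather than appended, a run that re-reads conversations it has already seen does not create duplicates.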

You don't deploy, upgrade, or run the collector — Monte Carlo manages its full lifecycle. You only grant the permissions below. After you add a Genie space, the first collection runs on the collector's next scheduled run.

Permissions

Setting up Genie involves three things Monte Carlo does for you, each needing its own access: it installs the collector, the collector writes Genie data to a trace table, and Monte Carlo reads that table. In the common setup, one identity fills all three roles: the service principal your Databricks SQL Warehouse connection authenticates as in Monte Carlo. We recommend a single dedicated, least-privilege service principal for this; the rest of this section calls it "the service principal."

Everything you need to grant, at a glance:

Grant	Where you grant it	Checked at setup?
Create/import notebooks in `/Shared/monte_carlo/genie_collector/…` (Monte Carlo creates the folder tree)	`CAN MANAGE` on the workspace folder	✅ Yes
Create jobs (run + delete come with ownership)	`Workspace access` entitlement	✅ Yes
`USE CATALOG`, `USE SCHEMA`, `CREATE TABLE` on the target catalog and schema	Unity Catalog `GRANT`	✅ Yes
`CAN MANAGE` on each Genie space you monitor	Genie space Share/Permissions dialog	✅ Yes — gate-checked at registration
`SELECT` + `MODIFY` on `genie_traces` (or its schema) — only if `genie_traces` is pre-created or owned by a different principal	Unity Catalog `GRANT`	❌ No — the collector's first run fails (permission error)

Only the last grant (which applies only when you pre-create the trace table) isn't checked at setup, so grant it up front. Each grant in detail:

1. Install the collector — checked at setup. For the principal that installs the collector:

Workspace folder access — CAN MANAGE on /Shared/monte_carlo/ (or the genie_collector subfolder), which Monte Carlo needs to create the collector's folder tree and import its notebook. Scope it to that folder; no workspace-wide grant is needed.
Jobs: Workspace access entitlement — to create the collector job (one per catalog and schema). This is a workspace-level entitlement and can't be scoped to a path. The principal owns the job it creates, so running and deleting it need nothing more, and the collector runs on serverless compute, so no cluster entitlement is required.

Grant both from the principal's entitlements and the relevant Permissions dialogs in Databricks — they aren't Unity Catalog GRANTs.

2. Collect Genie data — Unity Catalog write and Genie space access are both checked at setup. On the catalog and schema where the trace table will live, grant the service principal (which the collector runs as):

GRANT USE CATALOG ON CATALOG <catalog> TO `<service-principal>`;
GRANT USE SCHEMA ON SCHEMA <catalog>.<schema> TO `<service-principal>`;
GRANT CREATE TABLE ON SCHEMA <catalog>.<schema> TO `<service-principal>`;
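If you manage several workspaces, the three statements above can be rendered from a small helper rather than edited by hand. This is a sketch only: the function names and the example catalog, schema, and principal are hypothetical, and the output still has to be run in Databricks by someone allowed to grant these privileges.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def trace_table_grants(catalog: str, schema: str, principal: str) -> List[str]:
    """Render the three Unity Catalog grants for the trace-table location."""
    cat = quote_ident(catalog)
    sch = f"{cat}.{quote_ident(schema)}"
    who = quote_ident(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who};",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who};",
        f"GRANT CREATE TABLE ON SCHEMA {sch} TO {who};",
    ]


# Hypothetical names, for illustration only.
statements = trace_table_grants("main", "mc_traces", "mc-collector-sp")
print("\n".join(statements))
```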

This catalog and schema are the trace-table location you pick when adding the agent; both must already exist in Unity Catalog — Monte Carlo creates only the genie_traces table, never the catalog or schema, so pointing the registration at a catalog or schema that doesn't exist fails the setup gate. One collector serves every Genie space you register to the same connection, catalog, and schema, multiplexing them all into that single table — so a shared schema dedicated to Monte Carlo trace tables is the intended setup.

The collector creates and owns the genie_traces table on its first run, so it can write and read it with no extra grant. Letting it create the table is the simplest path; if you must pre-create it, see the note below.

Plus Genie space access — CAN MANAGE on each space you want to monitor, granted from the space's Share/Permissions dialog in Databricks. The collector lists conversations with include_all=true so it captures all users' conversations, not just the service principal's own; the Databricks Genie API only returns them with that flag, and the flag requires CAN MANAGE. With only CAN VIEW the collector still runs successfully but collects 0 rows — it sees just the service principal's own conversations (typically none), so the trace table stays silently empty with no error. CAN MANAGE lets the principal edit and delete the Genie space — heavier than read-only access — but it is the minimum scope the Databricks Genie API exposes for cross-user conversation access; there is no narrower read-all grant. The setup gate verifies it at registration, so a missing grant blocks setup with an inline error.
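The silent-empty behavior is easier to see in a toy model. The sketch below simulates the visibility rule described above. It does not call the Genie API, and the class, function, and example names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Conversation:
    conversation_id: str
    owner: str


def visible_conversations(
    all_conversations: Iterable[Conversation], caller: str, permission: str
) -> List[Conversation]:
    """Model which conversations a caller sees when listing with include_all=true."""
    if permission == "CAN MANAGE":
        # The flag is honored: every user's conversations are returned.
        return list(all_conversations)
    # With a lesser permission the caller sees only its own conversations.
    return [c for c in all_conversations if c.owner == caller]


space = [Conversation("c1", "alice"), Conversation("c2", "bob")]
as_viewer = visible_conversations(space, "mc-collector-sp", "CAN VIEW")
as_manager = visible_conversations(space, "mc-collector-sp", "CAN MANAGE")
print(len(as_viewer), len(as_manager))
```

The service principal rarely starts conversations itself, so with `CAN VIEW` the listing is empty without being an error, which is why the run succeeds while collecting no rows.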

3. Read traces in Monte Carlo — not checked at setup. Monte Carlo reads genie_traces over your SQL Warehouse connection as the same service principal that owns it, so no extra grant is needed.

📘
If you pre-create the trace table
If genie_traces already exists and the service principal isn't its owner, grant it SELECT (to read) and MODIFY (to write), or transfer ownership:
GRANT SELECT, MODIFY ON TABLE <catalog>.<schema>.genie_traces TO `<service-principal>`;
A pre-created table must match the schema the collector creates automatically — a column mismatch fails the run (not a permission error). Letting the collector create and own the table is the recommended path, since it always gets the schema right; pre-create only if your governance requires the table to exist beforehand or be owned by a different principal.

🚧
"I connected but the trace table is empty"
An empty trace table almost always comes down to the Genie-space grant: the service principal needs CAN MANAGE (not just CAN VIEW) on the space. The collector lists conversations with include_all=true to capture every user's conversations, and the Databricks Genie API only honors that flag with CAN MANAGE. With only CAN VIEW, each run still succeeds and Status stays healthy — it just sees the service principal's own conversations (typically none), so the table stays silently empty with no error. The setup gate checks CAN MANAGE at registration, so this usually means the grant was reduced or removed afterward; re-grant CAN MANAGE and collection resumes on the next run (it picks up recent conversations within the collector's lookback window).
(If you pre-created genie_traces under a different owner, also confirm the service principal holds SELECT/MODIFY — see the note above. That case fails the run with a permission error rather than silently, and shows up in the Status column.)

Genie chat sharing (capturing results and answers)

If the workspace's Genie chat sharing setting is off, Monte Carlo's collector captures each turn's question and generated SQL, but not the result sample or final answer — the result fetch is denied (PERMISSION_DENIED — "User … does not own conversation …") and the turn appears question-only in Monte Carlo. That's because Genie ties each conversation's answer to the user who asked it and keeps the result private to them — so it isn't exposed to other users, including space managers like the collector, unless the conversation is shared.

To capture results and answers, a workspace admin needs to enable it in Databricks: username menu (top-right) → Previews → turn on Genie chat sharing. With it enabled, new conversations are shared as "Reviewable by space managers", which lets the collector read their results and answers.

This applies going forward only: conversations created before it was enabled stay question-only and can't be backfilled. Individual users can also mark a specific conversation Private, in which case it stays question-only even with the setting on.
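The three conditions in this section combine into a simple rule, sketched below. The function name and the `"full"` / `"question_only"` labels are illustrative assumptions, not values Monte Carlo exposes.

```python
from datetime import datetime
from typing import Optional


def capture_level(
    created_at: datetime,
    sharing_enabled_at: Optional[datetime],
    marked_private: bool,
) -> str:
    """Return "full" when results and answers are readable, else "question_only"."""
    if sharing_enabled_at is None:
        return "question_only"  # Genie chat sharing is off for the workspace
    if created_at < sharing_enabled_at:
        return "question_only"  # created before the setting was on; no backfill
    if marked_private:
        return "question_only"  # the user kept this conversation private
    return "full"


enabled = datetime(2025, 1, 10)
print(capture_level(datetime(2025, 1, 12), enabled, marked_private=False))
```

Here "question_only" follows the page's wording: the question and generated SQL are still captured, but the result sample and final answer are not.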

Adding the agent in Monte Carlo

In Monte Carlo, navigate to Settings → Agent Observability
Click Add
Toggle the Agent Type to Platform Agent
Select a Databricks SQL Warehouse: Choose the warehouse Monte Carlo will use to read traces
Select an Agent: Pick Databricks Genie, then select your Genie space from the discovered list. Registering a space requires your service principal to hold CAN MANAGE on it (see Permissions) — if the grant is missing, the import returns an inline error, and if the space isn't in the list at all, the service principal has no access to it. Either way, granting CAN MANAGE makes the space both discoverable and registrable; grant it and retry
Choose the trace-table location: Select the catalog and schema where the collector creates its genie_traces table — searchable dropdowns over your warehouse's existing Unity Catalog catalogs and schemas, with the schema list scoped to the catalog you pick. Both must already exist; Monte Carlo creates only the table
Click Import to complete the connection

When you import, Monte Carlo runs a synchronous preflight check against the space, catalog, and schema you chose. If a required setup-enforced grant (install, Unity Catalog write, or CAN MANAGE on the Genie space) is missing, the import surfaces an inline error listing exactly what's missing — grant it in Databricks and click Import again. On success, Monte Carlo installs the collector and begins collecting on its schedule.
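Conceptually, the preflight check compares the grants the service principal holds against the setup-enforced list and reports the difference. The sketch below models only that comparison. The grant labels are shorthand for the rows in the Permissions table, and how Monte Carlo actually probes each grant is not described on this page.

```python
from typing import List, Set

# Setup-enforced grants, labelled after the Permissions table (labels are shorthand).
REQUIRED_GRANTS: List[str] = [
    "CAN MANAGE on the collector workspace folder",
    "Workspace access entitlement",
    "USE CATALOG",
    "USE SCHEMA",
    "CREATE TABLE",
    "CAN MANAGE on the Genie space",
]


def missing_grants(held: Set[str]) -> List[str]:
    """Return the required grants that are absent, in table order."""
    return [grant for grant in REQUIRED_GRANTS if grant not in held]


held = {"Workspace access entitlement", "USE CATALOG", "USE SCHEMA", "CREATE TABLE"}
for grant in missing_grants(held):
    print("Missing:", grant)
```

An empty result corresponds to a successful import; anything else maps to the inline error listing what to grant before retrying.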

Once connected, you're ready to create Agent Monitors.

Verifying the connection

After import, open Settings → Agent Observability and find your Genie space in the list. The Status column shows a health badge (Completed, Error, Pending, Permission needed, Install error, or Installing) above a freshness line. A successful collection reads Completed with the date it last ran. A space added since the shared collector's last run reads Awaiting this agent's first collection run until its first collection completes. A problem state (missing permissions, or a failed install or run) shows an inline message describing what to fix. Because the collector runs about once an hour, allow up to roughly an hour between a Genie conversation happening in Databricks and its traces appearing in Monte Carlo.

Limitations & operational notes

Collection runs on a schedule, not in real time: the collector polls Genie about once an hour, so traces can trail a conversation by up to roughly an hour.
Grants can be fixed after setup: if a required grant is revoked, the collector pauses, then resumes automatically on its next run once the grant is back in place; you don't need to re-import. The one exception is the Genie-space grant: reducing it from CAN MANAGE to CAN VIEW produces no error, so the collector keeps running but silently collects nothing. Watch the trace table, not just the Status column.
Removing a Genie agent: deleting a Genie agent in Monte Carlo removes Monte Carlo's collector job and notebook from your workspace once no space is still registered to that catalog and schema; if other Genie spaces are still registered there, the collector keeps running for them. The genie_traces table is left in place. Monte Carlo never drops it, so delete it yourself if you no longer need the collected data.
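The removal rule above amounts to reference counting per trace-table location. The sketch below is a minimal model of that rule, with a hypothetical class name and hypothetical identifiers; it is not how Monte Carlo stores registrations.

```python
from typing import Dict, Set, Tuple

# One shared collector per (connection, catalog, schema).
Location = Tuple[str, str, str]


class CollectorRegistry:
    """Track which Genie spaces share each collector."""

    def __init__(self) -> None:
        self._spaces: Dict[Location, Set[str]] = {}

    def register(self, location: Location, space_id: str) -> None:
        self._spaces.setdefault(location, set()).add(space_id)

    def remove(self, location: Location, space_id: str) -> bool:
        """Remove a space; return True when the collector should be uninstalled."""
        spaces = self._spaces.get(location, set())
        spaces.discard(space_id)
        if spaces:
            return False  # other spaces still depend on this collector
        self._spaces.pop(location, None)
        return True  # last space gone: remove the job and notebook, keep the table


registry = CollectorRegistry()
loc = ("warehouse-conn", "main", "mc_traces")
registry.register(loc, "space-a")
registry.register(loc, "space-b")
first = registry.remove(loc, "space-a")
second = registry.remove(loc, "space-b")
print(first, second)
```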

What you'll see

Monte Carlo lists each Genie space under its friendly space name (not the raw space ID), and automatically groups each conversation by its conversation ID — no configuration needed. Every turn surfaces the user's question, the SQL Genie generated, a bounded sample of the query results, and the final answer, with failed turns flagged so you can monitor reliability. Capturing the result sample and final answer requires the conversation to be shared with space managers — see the Genie chat sharing section above; otherwise turns show the question and generated SQL only.
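To make the grouping and failure flag concrete, the sketch below groups turns by conversation ID and computes a per-conversation failure rate. The field names (`conversation_id`, `failed`) are assumptions for illustration, since the trace schema is not documented on this page. Monte Carlo does this grouping for you.

```python
from collections import defaultdict
from typing import Dict, List


def failure_rate_by_conversation(turns: List[dict]) -> Dict[str, float]:
    """Group turns by conversation ID and return each conversation's failed share."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for turn in turns:
        grouped[turn["conversation_id"]].append(turn)
    return {
        conversation_id: sum(1 for t in conv_turns if t["failed"]) / len(conv_turns)
        for conversation_id, conv_turns in grouped.items()
    }


turns = [
    {"conversation_id": "c1", "question": "Revenue by month?", "failed": False},
    {"conversation_id": "c1", "question": "Split by region", "failed": True},
    {"conversation_id": "c2", "question": "Active users?", "failed": False},
]
rates = failure_rate_by_conversation(turns)
print(rates)
```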
