AI/BI Genie

Monitor Databricks AI/BI Genie spaces in Monte Carlo for full agent observability

AI/BI Genie is Databricks' natural-language analytics interface: business users ask questions about their data in plain English, and Genie generates SQL, runs it against Unity Catalog, and returns an answer alongside the query it ran. Each Genie space is scoped to a set of tables. Monte Carlo treats a Genie space as a platform agent and surfaces each conversation turn (the question, the generated SQL, a sample of the results, and the final answer) for monitoring.

Setup differs from Agent Bricks and custom agents: because Genie keeps its conversation history behind an API rather than in a Unity Catalog table, Monte Carlo bridges that gap with a managed collector, explained below.

How Monte Carlo collects Genie traces

📘

Prerequisites

An existing Databricks integration in Monte Carlo with a SQL Warehouse connection. If you haven't set this up yet, see our Databricks SQL Warehouse guide.

A Genie space's conversation history is API-resident: it lives only behind the Genie Conversation API, with no table to query. So instead of asking you to turn on trace export, Monte Carlo installs a lightweight, fully managed collector job into your Databricks workspace. About once an hour, the collector reads new Genie conversations and merges them into a Unity Catalog table named genie_traces, in the catalog and schema you choose during setup, which Monte Carlo then reads through your SQL Warehouse connection like any other trace table.

You don't deploy, upgrade, or run the collector; Monte Carlo manages its full lifecycle. You only grant the permissions below. After you add a Genie space, the first collection runs on the collector's next scheduled run.

Permissions

Setting up Genie involves three things Monte Carlo does for you, each needing its own access: it installs the collector, the collector collects Genie data into a trace table, and Monte Carlo reads that table. In the common setup, one dedicated, least-privilege service principal (the one attached to your SQL Warehouse connection) fills all three roles, and it's the recommended connection identity.

Monte Carlo checks the install and Unity Catalog write grants synchronously when you add the space (see Adding the agent); the read and Genie API grants are not checked at setup, so grant them up front to avoid a connection that shows no data.

1. Install the collector (checked at setup). For the principal that installs the collector:

  • Workspace import: permission to create and import notebooks in the folder Monte Carlo installs the collector into, /Shared/monte_carlo/genie_collector/…. A folder-scoped grant is enough; no workspace-wide write is needed.
  • Jobs: create and run (plus delete for teardown). Monte Carlo creates one collector job per catalog and schema and triggers it on schedule. Databricks job creation can't be scoped to a path, so this is a workspace-level entitlement.

Grant both from the principal's entitlements and the relevant Permissions dialogs in Databricks; they aren't Unity Catalog GRANTs.

2. Collect Genie data (Unity Catalog write is checked at setup; a missing Genie read grant surfaces on the first run, which then fails). On the catalog and schema where the trace table will live, grant the following to the principal the collector runs as (by default, the same connection principal):

GRANT USE CATALOG ON CATALOG <catalog> TO `<service-principal>`;
GRANT USE SCHEMA ON SCHEMA <catalog>.<schema> TO `<service-principal>`;
GRANT CREATE TABLE ON SCHEMA <catalog>.<schema> TO `<service-principal>`;

The collector creates the genie_traces table on its first run and therefore owns it, so it can merge new conversations into it on later runs without any extra grant. You only need MODIFY if the table is pre-created or owned by a different principal. In that case, grant the collector's run-as principal MODIFY ON SCHEMA <catalog>.<schema> (or transfer table ownership to it).
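The pre-created-table case above can be sketched as follows. This is a minimal example, not a required step: it assumes the principal in question is the one the collector runs as, and it reuses the placeholders from the GRANT statements above. Pick one option, not both.

```sql
-- Option A: allow the collector's run-as principal to write to tables in the schema.
GRANT MODIFY ON SCHEMA <catalog>.<schema> TO `<service-principal>`;

-- Option B: transfer ownership of the existing table to that principal instead.
ALTER TABLE <catalog>.<schema>.genie_traces OWNER TO `<service-principal>`;
```

Option A leaves the current owner in place; Option B matches the default state in which the collector owns the table it writes to.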

Plus Genie API read on each space you want to monitor: grant the run-as principal at least view access on the space, from its sharing/Permissions dialog in Databricks. This also lets the space appear in Monte Carlo's picker. The setup gate doesn't check it, but the collector calls the Genie API on its first run: if this grant is missing, that run fails and the error shows on the collector status badge. Grant it up front to avoid a failed first collection.

3. Read traces in Monte Carlo (not checked at setup). For your SQL Warehouse connection's principal, which reads the trace table:

GRANT SELECT ON SCHEMA <catalog>.<schema> TO `<service-principal>`;

Databricks does not include trace-table SELECT in ALL_PRIVILEGES, so it must be granted explicitly. The collector creates genie_traces on its first run; granting SELECT at the schema level (above) covers the table before it exists. Alternatively, grant it on <catalog>.<schema>.genie_traces once the table exists.
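If you prefer the narrower, table-level grant, it might look like the sketch below. It uses the same placeholders as the schema-level statement and only succeeds after the collector's first run has created the table.

```sql
-- Table-level alternative to the schema-level SELECT grant.
-- Run this only after the collector has created genie_traces.
GRANT SELECT ON TABLE <catalog>.<schema>.genie_traces TO `<service-principal>`;
```

The trade-off is timing: the schema-level grant can be applied before setup, while this one leaves a window after the first collection in which Monte Carlo cannot yet read the table.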

🚧

"I connected but see no data"

Neither the trace-table SELECT nor the Genie-space read is checked by the setup gate, and they fail differently:

  • Missing SELECT: the collector still installs and runs (status badge healthy) and writes genie_traces, but Monte Carlo can't read the table, so no data appears in Monte Carlo.
  • Missing Genie read: the collector's first run fails when it calls the Genie API, so the status badge shows an error.

Grant both up front to avoid either.
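To narrow down a missing Unity Catalog grant, one option is to list what the connection principal currently holds. This is a sketch with the same placeholders as the GRANT statements above, and it assumes you run it as a user who is allowed to view grants on these objects.

```sql
-- Privileges held on the trace-table schema.
-- USE SCHEMA, CREATE TABLE, and SELECT should be listed if the grants above were applied.
SHOW GRANTS `<service-principal>` ON SCHEMA <catalog>.<schema>;

-- USE CATALOG is granted on the catalog, so check it there.
SHOW GRANTS `<service-principal>` ON CATALOG <catalog>;
```

View access on the Genie space is not a Unity Catalog grant, so it will not show up in this output; check it in the space's sharing/Permissions dialog instead.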

Adding the agent in Monte Carlo

  1. In Monte Carlo, navigate to Settings → Agent Observability
  2. Click Add
  3. Toggle the Agent Type to Platform Agent
  4. Select a Databricks SQL Warehouse: Choose the warehouse Monte Carlo will use to read traces
  5. Select an Agent: Pick Databricks Genie, then select your Genie space from the discovered list (if it's not listed, see Permissions above)
  6. Choose the trace-table location: Enter the catalog and schema where Monte Carlo's collector should create the trace table
  7. Click Import to complete the connection

When you import, Monte Carlo runs a synchronous preflight check against the catalog and schema you chose. If a required setup-enforced grant (install, or Unity Catalog write) is missing, the import surfaces an inline error listing exactly what's missing. Grant it in Databricks and click Import again. On success, Monte Carlo installs the collector and begins collecting on its schedule.

Once connected, you're ready to create Agent Monitors.

Verifying the connection

After import, open Settings → Agent Observability and find your Genie space in the list. It shows a collector status badge that reflects the collector's health: healthy once the collector is installed and running, or an error state (with the underlying error and how to fix it) if a grant is still missing. Because the collector runs about once an hour, allow up to roughly an hour between a Genie conversation happening in Databricks and its traces appearing in Monte Carlo.
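You can also check the Databricks side directly. The queries below are a sketch under two assumptions: the collector has already completed at least one run, and genie_traces is a Delta table (the page says the collector merges into it but does not name the table format). Placeholders match the GRANT statements above.

```sql
-- Confirm the collector has created the trace table.
SHOW TABLES IN <catalog>.<schema> LIKE 'genie_traces';

-- Count the rows collected so far.
SELECT COUNT(*) AS trace_rows
FROM <catalog>.<schema>.genie_traces;

-- Assumes a Delta table: list the most recent write operations and their timestamps.
DESCRIBE HISTORY <catalog>.<schema>.genie_traces LIMIT 5;
```

If the table exists and has rows but Monte Carlo still shows nothing, the SELECT grant for the connection principal is the first thing to check.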

Limitations & operational notes

  • Collection runs on a schedule, not in real time: the collector polls Genie about once an hour, so traces can lag a conversation by up to roughly an hour.
  • Grants can be fixed after setup: if a required grant is revoked, the collector pauses; once the grant is restored or added, it resumes automatically on its next run. You don't need to re-import.
  • Keep the connection credential current: the collector runs with the principal attached to your SQL Warehouse connection. If you authenticate with a personal access token (PAT), rotate it before it expires and keep it from being revoked for inactivity, because a revoked or expired token stops collection until the connection is re-authenticated. A dedicated service principal with a managed credential avoids this.

What you'll see

Monte Carlo automatically groups each Genie conversation by its conversation ID, with no configuration needed. Every turn surfaces the user's question, the SQL Genie generated, a bounded sample of the query results, and the final answer, with failed turns flagged so you can monitor reliability.
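This page does not list the columns of genie_traces, so if you want to query the table yourself, it is safer to inspect its schema first than to guess at column names. A minimal sketch, using the same placeholders as above:

```sql
-- Show the column names and types the collector actually wrote.
DESCRIBE TABLE <catalog>.<schema>.genie_traces;
```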