Multi-turn agent interactions can now be viewed as a single, chronological conversation in Trace Explorer. Traces sharing a conversation ID are automatically grouped, displaying user messages and agent responses in order with timestamps, role labels, and turn numbers.

Each turn links directly to its underlying trace, and clicking any message opens a detail panel with the full content. This makes it easy to follow the complete back-and-forth of an agent conversation in one place and drill into any individual trace when you need more detail.
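The grouping behavior can be pictured with a minimal Python sketch. The `Trace` record and its field names are hypothetical stand-ins for illustration, not Monte Carlo's actual trace schema or API, and the sketch assumes each trace corresponds to exactly one turn.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Trace:
    # Hypothetical fields; the real trace schema may differ.
    trace_id: str
    conversation_id: str
    started_at: datetime
    role: str  # e.g. "user" or "agent"
    content: str


def group_into_conversations(traces):
    """Group traces by conversation ID and number the turns chronologically."""
    grouped = defaultdict(list)
    for trace in traces:
        grouped[trace.conversation_id].append(trace)

    conversations = {}
    for conversation_id, items in grouped.items():
        # Order by timestamp, then assign 1-based turn numbers.
        ordered = sorted(items, key=lambda t: t.started_at)
        conversations[conversation_id] = list(enumerate(ordered, start=1))
    return conversations


traces = [
    Trace("t2", "c1", datetime(2025, 1, 1, 9, 0, 5), "agent", "Here is the report."),
    Trace("t1", "c1", datetime(2025, 1, 1, 9, 0, 0), "user", "Show me the report."),
    Trace("t3", "c2", datetime(2025, 1, 1, 9, 1, 0), "user", "Hello"),
]
conversations = group_into_conversations(traces)
for turn, t in conversations["c1"]:
    print(turn, t.started_at.isoformat(), t.role, t.content, f"(trace {t.trace_id})")
```

Keeping the trace ID on every turn is what lets a conversation view link each message back to its underlying trace.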

You can now track a metric over time without triggering alerts.

What's New:

  1. Track metrics before turning on alerts: Metric, Custom SQL, and Comparison monitors now support a Track mode. When defining alert conditions, you can choose "Track" to run the monitor and collect the metric over time without setting thresholds or triggering notifications. NOTE: Save the monitor in "Enabled" status (not as a draft) so that it starts running.
  2. Track metrics on your custom dashboard: You can now add any chart to a custom dashboard using the dashboard icon at the top right of the chart. This lets you keep an eye on the signals you care about in one place. Charts can be added from:
    1. Monitor results
    2. Asset results
    3. Data profiling results (Yes, now you can observe profiling metrics over time)

Why this matters

In some cases, you want to observe a signal (null %, row count, or any custom metric) before deciding what should actually alert. Track mode lets you build confidence in the data before operationalizing it.

With Track mode and dashboards you can:

  • Observe metrics before setting thresholds
  • Validate assumptions without creating alert noise
  • Keep a persistent place to check data health
  • Turn tracked signals into alerts when you’re ready
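
As an illustration of the last point, here is a minimal Python sketch of one common way to turn tracked observations into candidate alert bounds: the mean plus or minus a multiple of the sample standard deviation. This is an assumption chosen for illustration, not something Track mode does for you; `suggest_thresholds` and the sample values are hypothetical.

```python
from statistics import mean, stdev


def suggest_thresholds(tracked_values, k=3.0):
    """Suggest (lower, upper) bounds as mean +/- k sample standard deviations."""
    if len(tracked_values) < 2:
        raise ValueError("need at least two tracked observations")
    center = mean(tracked_values)
    spread = stdev(tracked_values)
    return center - k * spread, center + k * spread


# Hypothetical null-percentage readings collected while the monitor was in Track mode.
null_pct_history = [1.0, 2.0, 3.0, 2.0, 2.0]
lower, upper = suggest_thresholds(null_pct_history)
print(f"candidate bounds: {lower:.2f} to {upper:.2f}")
```

The longer a metric has been tracked, the more representative bounds like these are likely to be; a handful of points, as in this example, is only enough to show the mechanics.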

The troubleshooting agent now presents supporting evidence in a structured timeline format, including PR diffs, making it easier to follow the reasoning behind each recommendation.

You can also point to any specific piece of evidence and ask the agent to reconsider it, with the option to include additional context or clarification. This gives you more control over the troubleshooting process and helps you arrive at the answer you need faster.


Agent evaluation monitors and metric monitor prompt configurations now support custom LLM model names.

You can type any model name directly into the model selector instead of being limited to a predefined list, making it easy to use the latest models as soon as they are available.

The dropdown still shows all predefined options for convenience, with custom values clearly labeled. This gives your team the flexibility to route LLM requests to any model identifier your environment requires.

You can now create production-ready LLM-as-a-judge evaluations by simply describing what you want to measure. Type a short description of the dimension you care about, hit Generate, and get a complete eval prompt ready for production.

Starter templates are included for common evaluation dimensions like answer relevance, helpfulness, task completion, language match, clarity, prompt adherence, and semantic similarity. Advanced controls let you fine-tune scoring criteria and strictness levels to match your specific requirements.


You can now track and manage individual breached rows from Custom SQL and Validation monitors directly in Monte Carlo. When a primary key is configured on a monitor, breached rows are tracked across runs in a new Exceptions tab. From there, you can assign an owner, set a resolution status, add comments, take bulk actions, and track how long each exception has been open.

Learn more here: https://docs.getmontecarlo.com/docs/exception-management
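
To see why a primary key is needed, consider this minimal Python sketch of tracking breached rows across runs. The function names and record layout are hypothetical and do not reflect Monte Carlo's implementation; the sketch also leaves out ownership, status, and what happens when a row stops breaching.

```python
from datetime import datetime


def update_exceptions(tracked, breached_keys, run_time):
    """Merge one run's breached primary keys into the tracked exceptions.

    tracked maps primary key -> {"first_seen": ..., "last_seen": ...}.
    A key seen in an earlier run keeps its original first_seen, so the
    time an exception has been open accumulates across runs.
    """
    for key in breached_keys:
        record = tracked.setdefault(key, {"first_seen": run_time})
        record["last_seen"] = run_time
    return tracked


def time_open(record, now):
    """How long an exception has been open as of `now`."""
    return now - record["first_seen"]


tracked = {}
update_exceptions(tracked, ["order-1", "order-2"], datetime(2025, 1, 1))
update_exceptions(tracked, ["order-2", "order-3"], datetime(2025, 1, 2))
print(time_open(tracked["order-2"], datetime(2025, 1, 3)).days)  # 2
```

Without a stable key, the row for `order-2` in the second run could not be matched to the same row in the first, and every run would look like a fresh set of breaches.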

Agent assets now include out-of-the-box dashboards showing trace volume, latency distributions across P50/P95/P99, token consumption trends, and error rates, all with automatic period-over-period comparisons. No configuration is required; connect your OpenTelemetry traces and the views are ready.

Whether you need to spot spikes in token usage, catch latency drifting upward, or confirm your agents are behaving as expected, these dashboards give your team immediate visibility from day one with a natural path to production-grade alerting as your agents mature.
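
For readers unfamiliar with the terms, the following Python sketch shows what P50/P95/P99 and a period-over-period comparison mean. It uses the standard library's `statistics.quantiles`; the helper names and sample data are hypothetical, and the dashboards may compute these values differently.

```python
from statistics import quantiles


def latency_percentiles(latencies_ms):
    """Return P50, P95, and P99 from a list of latencies in milliseconds."""
    # n=100 yields 99 cut points; cut point i (1-based) is the i-th percentile.
    cuts = quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def period_over_period(current, previous):
    """Percent change from the previous period; None if previous is zero."""
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


latencies_ms = list(range(1, 102))  # synthetic sample: 1..101 ms
current = latency_percentiles(latencies_ms)
previous_p95 = 80.0  # hypothetical P95 from the prior period
change = period_over_period(current["p95"], previous_p95)
print(current, change)
```

A rising P95 or P99 with a flat P50 typically points to a slow tail rather than a general slowdown, which is why the three percentiles are usually read together.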

You can now define FireHydrant Incident Tags on any FireHydrant notification channel within an Audience. Configure key/value or key-only tags, and they are automatically included on every alert sent through that channel.

Incident tags flow through automatically, ensuring incidents arrive in FireHydrant with the right routing and categorization without any extra manual steps.

Learn more here: https://docs.getmontecarlo.com/docs/firehydrant#firehydrant-incident-tags


Agent monitors can now be cloned: duplicate an existing monitor's configuration and point it at a different span.

When you're monitoring multiple spans with similar setups, cloning saves you from repetitive configuration and lets you scale monitor coverage across your agents faster.