Agent Metric Monitors

Agent Metric Monitors alert on unexpected changes in agent performance metrics such as latency, token usage, and error rate, for example a latency spike or token usage that exceeds its budget. Use them to monitor the operational health and stability of your agents and catch reliability issues early, even when outputs appear correct on the surface. They function similarly to Metric Monitors but operate on spans within agent traces.

Creating Agent Metric Monitors

To create an agent metric monitor, navigate to Add monitor → Agent metric.

Choose data

Note: If your agent does not appear for selection in the Choose data → Agent list, Monte Carlo has not yet detected telemetry flowing into your warehouse. Verify that your agent is properly instrumented and that you've configured the agent trace table from the Agent Observability settings page.

Agent: Select an agent to monitor. The agent appears here once telemetry is flowing into your warehouse.
Monitor: Select which spans within the agent trace you want the monitor to track.
- Entire trace — monitor metrics aggregated across all spans in the trace
- Specific spans — monitor metrics for individual spans
Optional filters: Refine which traces or spans are included by filtering on:
- Model name
- Latency or token usage
- Metadata or other custom attributes
Group data: Choose whether to bucket the data hourly or daily.
Segment data: Select up to 5 fields to segment the data by, or compose one with a SQL expression. When segmenting, the monitor will track metrics grouped by the values in that field or SQL expression.

If you select multiple fields, the monitor calculates metrics for each distinct combination of values across those fields.
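As a rough illustration of how bucketing and segmentation combine, the sketch below groups span records by time bucket and by each distinct combination of segment values, then averages latency per group. The record layout, the field names (`model`, `env`, `latency_ms`), and the `average_latency` helper are all hypothetical and chosen for the example; they do not reflect Monte Carlo's schema or implementation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical span records; field names are illustrative only.
SPANS = [
    {"span": "retrieve", "model": "model-a", "env": "prod", "start": "2024-05-01T10:05:00", "latency_ms": 120},
    {"span": "generate", "model": "model-a", "env": "prod", "start": "2024-05-01T10:06:00", "latency_ms": 480},
    {"span": "generate", "model": "model-a", "env": "prod", "start": "2024-05-01T10:50:00", "latency_ms": 600},
    {"span": "generate", "model": "model-b", "env": "prod", "start": "2024-05-01T10:40:00", "latency_ms": 300},
    {"span": "generate", "model": "model-a", "env": "staging", "start": "2024-05-01T11:15:00", "latency_ms": 500},
]


def bucket_start(timestamp, granularity):
    """Truncate an ISO timestamp to the start of its hourly or daily bucket."""
    dt = datetime.fromisoformat(timestamp)
    if granularity == "hourly":
        return dt.replace(minute=0, second=0, microsecond=0)
    if granularity == "daily":
        return dt.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")


def average_latency(spans, granularity, segment_fields):
    """Average latency per (bucket, segment value combination)."""
    groups = defaultdict(list)
    for span in spans:
        # One group per time bucket and per distinct combination of segment values.
        key = (bucket_start(span["start"], granularity),) + tuple(
            span[field] for field in segment_fields
        )
        groups[key].append(span["latency_ms"])
    return {key: mean(values) for key, values in groups.items()}


hourly_by_model_env = average_latency(SPANS, "hourly", ["model", "env"])
daily_unsegmented = average_latency(SPANS, "daily", [])
```

With two segment fields, the five sample spans fall into three groups: `model-a`/`prod` and `model-b`/`prod` in the 10:00 bucket, and `model-a`/`staging` in the 11:00 bucket. With daily bucketing and no segmentation, they collapse into a single group.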

Define alert conditions

Alert conditions determine when Monte Carlo should generate a new Alert based on the metrics you've selected.

To define a condition, choose the metric and field you want to monitor (for example, latency or token usage), and specify the operator and threshold that should trigger an Alert. You can use manual thresholds or automated machine learning-based thresholds.
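Conceptually, a manual threshold condition compares each computed metric value against a fixed threshold using the chosen operator. The following is a minimal sketch under that assumption; the `breaches` helper, the operator symbols, and the sample values are hypothetical, and the automated machine learning-based thresholds are not modeled here.

```python
import operator

# Map of operator symbols to comparison functions (illustrative set).
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def breaches(metric_by_segment, op, threshold):
    """Return the segments whose metric value satisfies the alert condition."""
    compare = OPERATORS[op]
    return {
        segment: value
        for segment, value in metric_by_segment.items()
        if compare(value, threshold)
    }


# Hypothetical average latency (ms) per segment for one time bucket.
avg_latency_ms = {"model-a": 400, "model-b": 300}

# Condition: average latency greater than 350 ms.
alerting_segments = breaches(avg_latency_ms, ">", 350)
```

Here only `model-a` satisfies the condition, so only that segment would be flagged in this sketch.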

Define schedule

Select when the monitor should run. Agent metric monitors support the following options:

On a schedule: Input a regular, periodic schedule. Options for handling daylight saving time are available in the advanced dropdown.
Manual trigger: The monitor is run manually from the monitor details page using the Run button, or programmatically via the runMonitor API call.

Send notifications

Alerts can be routed to all existing notification channels already configured in Monte Carlo, so they fit naturally into your existing incident response workflows.

Select which audiences should receive notifications when the monitor triggers an alert.

Define Notes

Text in the Notes section will be included directly in Alert notifications. The "Show notes tips" dropdown includes details on how to @mention an individual or team if you are sending notifications to Slack.

Notes support rich-text formatting, including bold, italic, underline, strike-through, lists, links, and code blocks. Rich-text channels display these styles, while text-only channels show a plain-text equivalent.

Monitor properties can be dynamically inserted into Notes through variables. Supported variables include Created by, Last updated at, Last updated by, Priority, and Tags.

Additional settings let you customize the description of the monitor, pre-set a priority on any Alerts generated by the monitor, or turn off failure notifications.

