Agent Trajectory Monitor

Overview

Agent trajectory monitors check how an agent's spans occur and relate to one another. They alert when a span happens too often or too rarely, or when spans appear in an unexpected relationship. Unlike metric and evaluation monitors, they do not use alert_conditions; the alerting logic lives in a single agent_span_alert_condition, evaluated over a time_filter window.

Reference scope

This page covers Monitors as Code (MaC) YAML configuration. For how trajectory monitoring works, see Agent Trajectory Monitors.

MaC key: agent_trajectory. Trajectory monitors operate at the agent/trace grain: they are always span-grain (no trace/conversation aggregation). Span-level refinements (workflow/task/span_name) belong in the alert condition, not in agent_span_filters.

Author by export. The agent_span_alert_condition tree is intricate. The most reliable way to get its exact shape is to build the monitor in the Monte Carlo UI and export it with montecarlo monitors export, then manage the exported YAML.

Quick Start

montecarlo:
  agent_trajectory:
    - name: retrieval_workflow_occurs
      description: Alert when the retrieval workflow runs more than once
      agent: my-otel-agent # the agent's service_name
      agent_span_alert_condition:
        operator: OR
        conditions:
          - type: SPAN_OCCURRENCE
            span_field:
              workflow:
                literal: retrieval
            predicate:
              name: occurs
            comparison_operator: MORE_THAN
            count: 1
      time_filter:
        time_field:
          field: ingest_ts
        lookback_in_hrs: 10
      schedule:
        type: fixed
        interval_minutes: 60
        start_time: "2025-01-01T00:00:00+00:00"
      domains:
        - my-domain

Configuration

agent - the agent to monitor

string · required

The agent whose spans the monitor reads. Agent monitors name an agent instead of a table; the warehouse source is derived from it. The value takes one of two forms, told apart by its shape:

  • Warehouse platform agents (e.g. Snowflake Cortex, Databricks): a <database>:<schema>.<name> reference, where <name> is the agent's display name when it has one (otherwise its underlying identifier). No trace_table.
  • OpenTelemetry agents: the agent's bare service_name. By default the OpenTelemetry trace store is resolved automatically; set trace_table only when the spans live in a specific warehouse table.

agent: my-otel-agent

trace_table - warehouse trace table for an OTel agent

string · optional

A <database>:<schema>.<table> reference naming the warehouse trace table that holds an OpenTelemetry agent's spans. Required when the spans live in your warehouse, or when the warehouse holds more than one OpenTelemetry trace table. Omit it for platform agents and for agents resolved from the default OpenTelemetry trace store.

trace_table: "ingest:opentelemetry.traces"
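
To show where trace_table sits relative to agent, the sketch below adapts the Quick Start monitor for an OpenTelemetry agent whose spans live in a warehouse table. The monitor name is a placeholder, and the agent and trace_table values are the sample values used on this page; substitute your own.

```yaml
montecarlo:
  agent_trajectory:
    - name: retrieval_workflow_occurs_warehouse   # placeholder name
      agent: my-otel-agent                        # bare service_name (OpenTelemetry agent)
      trace_table: "ingest:opentelemetry.traces"  # <database>:<schema>.<table>
      agent_span_alert_condition:
        operator: OR
        conditions:
          - type: SPAN_OCCURRENCE
            span_field:
              workflow:
                literal: retrieval
            predicate:
              name: occurs
            comparison_operator: MORE_THAN
            count: 1
      time_filter:
        time_field:
          field: ingest_ts
        lookback_in_hrs: 10
      schedule:
        type: fixed
        interval_minutes: 60
        start_time: "2025-01-01T00:00:00+00:00"
      domains:
        - my-domain
```

For a platform agent, or an OpenTelemetry agent resolved from the default trace store, the trace_table line would be dropped.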

agent_span_alert_condition - the trajectory alerting logic

object · required

Defines what to alert on: a set of span conditions combined by a boolean operator (AND / OR). Each condition is one of:

  • SPAN_OCCURRENCE: a span (identified by span_field) occurs more than, fewer than, or exactly count times. Fields: span_field, predicate, comparison_operator (MORE_THAN · LESS_THAN · EXACTLY), count.
  • SPAN_RELATION: a span relates to one or more other spans (related_span_fields).

span_field targets a span by workflow, task, and/or span_name (each a literal), following the workflow → task → span_name hierarchy.

agent_span_alert_condition:
  operator: OR
  conditions:
    - type: SPAN_OCCURRENCE
      span_field:
        workflow:
          literal: retrieval
      predicate:
        name: occurs
      comparison_operator: MORE_THAN
      count: 1
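
A condition tree can hold more than one entry. The sketch below uses only the fields listed above and combines two SPAN_OCCURRENCE conditions with AND, so the monitor would alert only when both hold within the window. The workflow name generation and both counts are illustrative assumptions; exporting a UI-built monitor remains the way to confirm the exact shape.

```yaml
agent_span_alert_condition:
  operator: AND
  conditions:
    # Retrieval ran more than 3 times in the window...
    - type: SPAN_OCCURRENCE
      span_field:
        workflow:
          literal: retrieval
      predicate:
        name: occurs
      comparison_operator: MORE_THAN
      count: 3
    # ...and generation ran fewer than 2 times.
    - type: SPAN_OCCURRENCE
      span_field:
        workflow:
          literal: generation   # hypothetical workflow name
      predicate:
        name: occurs
      comparison_operator: LESS_THAN
      count: 2
```

Switching operator to OR would alert when either condition holds.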

time_filter - evaluation window

object · required

The lookback window the condition is evaluated over. time_field must be the ingest_ts field; lookback_in_hrs must be at least 1.

time_filter:
  time_field:
    field: ingest_ts
  lookback_in_hrs: 10

filters - additional span filtering

object · optional

A predicate tree (FilterGroup) that further restricts which spans are considered, beyond the agent and the alert condition.

agent_span_filters - agent refinement

array of objects · optional

For trajectory monitors, only the agent dimension is meaningful here; span-level refinements (workflow/task/span_name) are rejected, so express those in agent_span_alert_condition instead. The agent is already set by the agent field, so this is rarely needed.

schedule - when the monitor runs

object · required

Fields: type (fixed or manual), interval_minutes, start_time, and timezone. Dynamic schedules are not supported for trajectory monitors.

schedule:
  type: fixed
  interval_minutes: 60
  start_time: "2025-01-01T00:00:00+00:00"
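
For comparison, a manual schedule might look like the sketch below. It assumes that a manual schedule needs no interval_minutes or start_time, which this page does not state; verify against an exported monitor before relying on it.

```yaml
schedule:
  type: manual   # assumption: no interval_minutes or start_time needed for manual runs
```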

Common fields

The shared rule-monitor envelope:

  • name string · optional: unique identifier (auto-generated if omitted); renaming creates a new monitor.
  • description string · optional: max 512 characters.
  • warehouse string · optional: UUID or name; overrides the montecarlo.yml default.
  • connection_name string · optional: query engine to use within the warehouse.
  • timeout integer · optional: query timeout in seconds.
  • notes string · optional: shown in the UI, not in notifications.
  • audiences / failure_audiences array of strings · optional: notification channels.
  • priority enum · optional: P1–P5.
  • event_rollup_count integer · optional: minimum 2; roll up repeated breaches into one incident.
  • tags array of objects · optional: name (required) + value (optional).
  • data_quality_dimension enum · optional: ACCURACY · COMPLETENESS · CONSISTENCY · TIMELINESS · UNIQUENESS · VALIDITY.
  • domains array of strings · optional (required on accounts created after January 2025).
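
As a rough illustration of how these envelope fields sit on a monitor entry, the fragment below sets several of them together. The audience name, tag names and values, and note text are hypothetical, and the trajectory-specific fields are elided in a comment to keep the fragment short.

```yaml
montecarlo:
  agent_trajectory:
    - name: retrieval_workflow_occurs
      description: Alert when the retrieval workflow runs more than once
      notes: Owned by the search team    # hypothetical text; shown in the UI only
      priority: P2
      audiences:
        - ai-oncall                      # hypothetical audience name
      event_rollup_count: 3              # minimum allowed value is 2
      data_quality_dimension: VALIDITY
      tags:
        - name: team                     # hypothetical tag
          value: search
        - name: critical                 # value is optional
      domains:
        - my-domain
      # agent, agent_span_alert_condition, time_filter, and schedule
      # go here, as in the Quick Start.
```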

Deprecated fields

Field          Use instead
resource       warehouse
domain_uuids   domains
labels         audiences

Examples

OpenTelemetry agent - workflow occurrence

Alert when the retrieval workflow span occurs more than once in the last 10 hours.

montecarlo:
  agent_trajectory:
    - name: retrieval_workflow_occurs
      description: Alert when the retrieval workflow runs more than once
      agent: my-otel-agent
      agent_span_alert_condition:
        operator: OR
        conditions:
          - type: SPAN_OCCURRENCE
            span_field:
              workflow:
                literal: retrieval
            predicate:
              name: occurs
            comparison_operator: MORE_THAN
            count: 1
      time_filter:
        time_field:
          field: ingest_ts
        lookback_in_hrs: 10
      schedule:
        type: fixed
        interval_minutes: 60
        start_time: "2025-01-01T00:00:00+00:00"
      domains:
        - my-domain

Platform agent - trajectory on a Snowflake/Databricks agent

A similar monitor against a warehouse platform agent, referenced by its <database>:<schema>.<name> identity (no trace_table). It alerts when the call_tool task occurs more than 10 times in the last 24 hours.

montecarlo:
  agent_trajectory:
    - name: cortex_tool_call_frequency
      description: Alert when the tool is called more than 10 times
      agent: "ANALYTICS:AGENTS.support_cortex_agent"
      agent_span_alert_condition:
        operator: OR
        conditions:
          - type: SPAN_OCCURRENCE
            span_field:
              workflow:
                literal: support # placeholder: the workflow that contains call_tool
              task:
                literal: call_tool
            predicate:
              name: occurs
            comparison_operator: MORE_THAN
            count: 10
      time_filter:
        time_field:
          field: ingest_ts
        lookback_in_hrs: 24
      schedule:
        type: fixed
        interval_minutes: 120
        start_time: "2025-01-01T00:00:00+00:00"
      domains:
        - my-domain

Troubleshooting

Agent source

  • Omitting agent. Every agent monitor requires an agent. A monitor without one fails validation.
  • Platform reference not found / ambiguous. A <database>:<schema>.<name> reference must match exactly one registered agent in the target account and warehouse. A display name shared by two agents in a schema is ambiguous; use the underlying endpoint identity. Cross-account: register the agent first.
  • Setting trace_table for a default-store agent. Redundant and rejected when the OpenTelemetry trace store resolves automatically. Omit it.

Trajectory specifics

  • Using alert_conditions. Trajectory monitors have no alert_conditions field; the logic goes in agent_span_alert_condition.
  • Span-level agent_span_filters. workflow/task/span_name refinements are rejected here; express them inside agent_span_alert_condition.
  • time_filter field other than ingest_ts. The time_field must be ingest_ts, and lookback_in_hrs must be at least 1.
  • A dynamic schedule. Trajectory monitors don't support schedule.type: dynamic.
  • Span-field hierarchy. When targeting a span_name, its task and workflow must also be set; when targeting a task, its workflow must be set.
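
The span-field hierarchy rule can be sketched as follows: the first condition targets a task and so also names its workflow, and the second targets a span_name and so names both its task and its workflow. The span name vector_search and the pairing of call_tool with the retrieval workflow are hypothetical, chosen only to illustrate the nesting.

```yaml
agent_span_alert_condition:
  operator: OR
  conditions:
    # Task target: workflow is set alongside task.
    - type: SPAN_OCCURRENCE
      span_field:
        workflow:
          literal: retrieval
        task:
          literal: call_tool
      predicate:
        name: occurs
      comparison_operator: MORE_THAN
      count: 10
    # span_name target: workflow and task are both set.
    - type: SPAN_OCCURRENCE
      span_field:
        workflow:
          literal: retrieval
        task:
          literal: call_tool
        span_name:
          literal: vector_search   # hypothetical span name
      predicate:
        name: occurs
      comparison_operator: EXACTLY
      count: 1
```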