Metric Monitor
Track field-level and table-level metrics over time with anomaly detection or fixed thresholds.
Overview
Track field-level and table-level metrics (null rates, row counts, distributions, and more) and get alerted when values breach thresholds or deviate from learned baselines. The metric monitor handles time-series data quality on a single table, with optional segmentation and custom metric expressions.
Reference scope: This page covers MaC YAML configuration. For how metric monitors work and the full metrics list, see Metric Monitors and Available Metrics.
MaC key: metric. Replaces the legacy field_health monitor (blocked from new creation).
Quick Start
```yaml
montecarlo:
  metric:
    - name: orders_row_count
      description: Alert on unexpected row count changes in the orders table
      data_source:
        table: my_database:my_schema.orders
      aggregate_time_field: created_at
      aggregate_by: day
      alert_conditions:
        - metric: ROW_COUNT_CHANGE
          operator: AUTO
      schedule:
        type: fixed
        interval_minutes: 1440
      domains:
        - my-domain
```
Create interactively with create_or_update_metric_monitor(dry_run=True) via the Monte Carlo MCP server.
Configuration
description
string · required
Displayed in the Monte Carlo UI and in incident notifications. Max 512 characters.
```yaml
description: Alert on unexpected row count changes in the orders table
```
data_source
object · required
Provide either table or sql, not both.
```yaml
data_source:
  table: analytics:core.fct_orders
```
Properties
table - fully qualified table name
string · optional · one of table or sql required
Format: database:schema.table. Mutually exclusive with sql.
```yaml
data_source:
  table: analytics:core.fct_orders
```
sql - custom SQL query
string · optional · one of table or sql required
Custom SQL query that returns the dataset to monitor. Mutually exclusive with table. Not compatible with use_partition_clause.
```yaml
data_source:
  sql: "SELECT * FROM analytics.core.fct_orders WHERE region = 'NA'"
```
transforms - AI-powered field transforms
array of objects · optional
AI-powered field transforms applied to the data source before metric computation.
| Property | Type | Required | Description |
|---|---|---|---|
| function | string | yes | Transform function name (e.g., classification, extraction) |
| field | string | no | Column to apply the transform to |
| alias | string | no | Output alias for the transformed column |
| sql_expression | string | no | SQL expression defining the transform |
| prompt | string | no | LLM prompt for AI-powered transforms. Not supported by classification (which uses categories instead) |
| categories | array of objects | no | Category definitions for classification transforms. Each entry has a required label, optional description, and optional examples (array of strings) |
| model_connection_id | string | no | Connection ID for the AI model used by the transform |
| model_name | string | no | Name of the AI model to use |
| output_type | string | no | Expected output data type |
| field_config_list | array of objects | no | Field configuration list for multi-field transforms |
| field_value_range | object | no | Value range constraints with lower_bound and upper_bound |
| id | string | no | Unique identifier for this transform instance |
```yaml
data_source:
  table: analytics:core.fct_orders
  transforms:
    - function: classification
      field: description
      alias: description_category
      categories:
        - label: electronics
          description: Electronic devices
        - label: clothing
          description: Apparel items
```
alert_conditions
array of objects · required
Each entry defines a metric to track and the condition that triggers an alert. Provide either metric (built-in) or custom_metric, not both. Multiple conditions are allowed. Supported alert condition types: threshold (default), noop.
Available operators: AUTO · AUTO_HIGH · AUTO_LOW · GT · GTE · LT · LTE · EQ · NEQ · INSIDE_RANGE · OUTSIDE_RANGE · NOOP
Threshold types:
- Static: a fixed numeric value via threshold_value with an explicit operator (GT, LT, etc.).
- Range: lower_threshold and upper_threshold with INSIDE_RANGE or OUTSIDE_RANGE.
- Anomaly detection (AUTO): ML-based. AUTO catches both high and low deviations. AUTO_HIGH / AUTO_LOW catch one-sided deviations. Control aggressiveness with the monitor-level sensitivity field.
Pipeline metric operator restriction: Pipeline metrics (ROW_COUNT_CHANGE, TIME_SINCE_LAST_ROW_COUNT_CHANGE, RELATIVE_ROW_COUNT) only support AUTO. Using explicit operators (GT, LT, etc.) on these metrics produces an error.
```yaml
alert_conditions:
  - metric: ROW_COUNT_CHANGE
    operator: AUTO
  - metric: NULL_RATE
    operator: GT
    threshold_value: 0.05
    fields:
      - EMAIL
```
Properties
metric - built-in metric name
string · optional · one of metric or custom_metric required
Built-in metric name (e.g., ROW_COUNT_CHANGE, NULL_RATE, NUMERIC_MEAN). See Available Metrics. Mutually exclusive with custom_metric.
```yaml
alert_conditions:
  - metric: NULL_RATE
```
custom_metric - custom SQL-based metric
object · optional · one of metric or custom_metric required
Mutually exclusive with metric.
| Property | Type | Required | Description |
|---|---|---|---|
| uuid | string | no | UUID of an existing custom metric to reuse |
| display_name | string | yes | Name for the metric |
| sql_expression | string | yes | SQL expression that evaluates to a single numeric value (e.g., SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END)) |
```yaml
alert_conditions:
  - custom_metric:
      display_name: Average Order Value
      sql_expression: "AVG(order_total)"
    operator: AUTO
```
fields - columns to compute the metric on
array of strings · optional · required for field-level metrics like NULL_COUNT
Not allowed for table-level metrics (e.g., ROW_COUNT_CHANGE).
```yaml
alert_conditions:
  - metric: NULL_RATE
    fields:
      - EMAIL
      - PHONE_NUMBER
```
field_pattern - pattern-based field selection
object · optional
Selects fields dynamically by name pattern instead of listing them explicitly. Use it instead of fields when column names follow a naming convention.
| Property | Type | Required | Description |
|---|---|---|---|
| operator | string | yes | CONTAINING · ENDING_WITH · MATCHING · STARTING_WITH |
| value | string | yes | Pattern string to match against field names |
| case_sensitive | boolean | no | Whether the match is case-sensitive. Default: false |
| field_type | enum | no | Restrict matching to a specific field type: BOOLEAN · DATE · NUMERIC · TEXT · TIME · TIME_OF_DAY |
```yaml
alert_conditions:
  - metric: NULL_RATE
    operator: GT
    threshold_value: 0.01
    field_pattern:
      operator: ENDING_WITH
      value: _id
      field_type: NUMERIC
```
operator - comparison operator
enum · optional
Accepted values: AUTO · AUTO_HIGH · AUTO_LOW · GT · GTE · LT · LTE · EQ · NEQ · INSIDE_RANGE · OUTSIDE_RANGE · NOOP
AUTO uses ML anomaly detection (both high and low deviations). AUTO_HIGH/AUTO_LOW detect one-sided deviations. NOOP collects data without alerting. Explicit operators (GT, LT, etc.) require threshold_value. INSIDE_RANGE/OUTSIDE_RANGE require lower_threshold and upper_threshold.
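As a minimal sketch of how these operator families can sit side by side in one monitor, the fragment below pairs a one-sided anomaly check, a range check, and a collect-only condition. The column names (EMAIL, ORDER_TOTAL) and the numeric bounds are placeholders for illustration, not values taken from a real table.

```yaml
alert_conditions:
  # One-sided anomaly detection: alert only on unusually high null rates
  - metric: NULL_RATE
    operator: AUTO_HIGH
    fields:
      - EMAIL              # placeholder column
  # Range check: needs both bounds and no threshold_value
  - metric: NUMERIC_MEAN
    operator: OUTSIDE_RANGE
    lower_threshold: 10    # placeholder bound
    upper_threshold: 1000  # placeholder bound
    fields:
      - ORDER_TOTAL        # placeholder column
  # Collect-only: records the metric without alerting
  - metric: NULL_COUNT
    type: noop
    operator: NOOP
    fields:
      - EMAIL
```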
```yaml
alert_conditions:
  - metric: NULL_RATE
    operator: GT
    threshold_value: 0.05
```
threshold_value - static threshold for comparison
number · optional · required when using explicit operators like GT, LT, etc.
Ignored when operator is AUTO, AUTO_HIGH, or AUTO_LOW.
```yaml
alert_conditions:
  - metric: NUMERIC_MEAN
    operator: LT
    threshold_value: 100
```
lower_threshold - lower bound for range operators
number · optional · required for INSIDE_RANGE / OUTSIDE_RANGE
Used with INSIDE_RANGE or OUTSIDE_RANGE operators.
```yaml
alert_conditions:
  - metric: NUMERIC_MEAN
    operator: OUTSIDE_RANGE
    lower_threshold: 10
    upper_threshold: 1000
```
upper_threshold - upper bound for range operators
number · optional · required for INSIDE_RANGE / OUTSIDE_RANGE
Used with INSIDE_RANGE or OUTSIDE_RANGE operators.
```yaml
alert_conditions:
  - metric: NUMERIC_MEAN
    operator: INSIDE_RANGE
    lower_threshold: 0
    upper_threshold: 100
```
type - alert condition type
string · optional · default: threshold
Accepted values: threshold · noop
Use noop to collect data without alerting. When type: noop, you must set operator: NOOP; the CLI rejects a noop condition with no operator.
```yaml
alert_conditions:
  - metric: NULL_RATE
    type: noop
    operator: NOOP
```
baseline_trailing_days - trailing days for ML baseline
integer · optional
Number of trailing days for the ML baseline window. Used with drift metrics like PSI, KS_TEST, JS_DIVERGENCE. Minimum: 1.
```yaml
alert_conditions:
  - metric: PSI
    operator: AUTO
    baseline_trailing_days: 30
```
baseline_start / baseline_end - fixed baseline period
string (ISO 8601 datetime) · optional
Fixed start and end of the baseline period. Used with drift/cardinality metrics. Must be a full ISO 8601 datetime (YYYY-MM-DDThh:mm:ss); a date-only value like "2025-01-01" is rejected with "Not a valid datetime."
```yaml
alert_conditions:
  - metric: KS_TEST
    operator: AUTO
    baseline_start: "2025-01-01T00:00:00"
    baseline_end: "2025-03-31T00:00:00"
```
num_bins - histogram bins for drift metrics
integer · optional
Number of histogram bins for drift metrics (PSI, KS_TEST, JS_DIVERGENCE). Range: 2 to 1000.
```yaml
alert_conditions:
  - metric: PSI
    operator: AUTO
    num_bins: 50
```
id - stable identifier for this alert condition
string · optional
Preserved across updates.
```yaml
alert_conditions:
  - metric: NULL_RATE
    id: null-rate-email
```
schedule
object · optional · default: system-managed schedule
Controls when the monitor runs. Supported modes: fixed, dynamic, manual. Crontab (interval_crontab) is not supported; use interval_minutes instead. Minimum interval_minutes is 60.
When using aggregate_by, interval_minutes must be a multiple of the bucket size:
| aggregate_by | Minimum interval_minutes |
|---|---|
| hour | 60 |
| day | 1440 |
| week | 10080 |
| month | 43200 |
If you omit schedule, Monte Carlo runs the monitor on the default collection cycle (typically every 6 to 12 hours, depending on table activity and your plan).
```yaml
schedule:
  type: fixed
  interval_minutes: 1440
```
Properties
type - schedule type
enum · optional · default: fixed
Accepted values: fixed · dynamic · manual
```yaml
schedule:
  type: dynamic
```
interval_minutes - run interval for fixed schedules
integer · optional
Must align with aggregate_by (e.g., daily aggregation requires a multiple of 1440). Minimum: 60.
```yaml
schedule:
  type: fixed
  interval_minutes: 1440
```
interval_crontab - crontab schedule
array of strings · optional
Not supported for metric monitors. Use interval_minutes instead. Custom SQL and validation monitors support crontab.
start_time - schedule start time
string · optional
ISO 8601 format.
```yaml
schedule:
  type: fixed
  interval_minutes: 1440
  start_time: "2025-01-01T06:00:00Z"
```
timezone - schedule timezone
string · optional
Timezone identifier (e.g., America/New_York).
```yaml
schedule:
  type: fixed
  interval_minutes: 1440
  timezone: America/New_York
```
dynamic_schedule_tables - tables that trigger this monitor
array of strings · optional
Tables whose update events trigger this monitor (for dynamic schedules).
```yaml
schedule:
  type: dynamic
  dynamic_schedule_tables:
    - analytics:core.fct_orders
```
dynamic_schedule_jobs - jobs that trigger this monitor
array of objects · optional
Jobs whose completion triggers this monitor (for dynamic schedules).
| Property | Type | Required | Description |
|---|---|---|---|
| job_type | string | yes | AdfJob · AirflowDag · DatabricksJob · DbtJob |
| job_name | string | yes | Name of the job |
| project_name | string | yes | Project or workspace containing the job |
| task_name | string | no | Specific task within the job |
| mcon | string | no | MCON identifier for the job |
```yaml
schedule:
  type: dynamic
  dynamic_schedule_jobs:
    - job_type: DbtJob
      job_name: daily_build
      project_name: analytics
```
min_interval_minutes - minimum interval between dynamic runs
integer · optional
Minimum interval between runs for dynamic schedules.
```yaml
schedule:
  type: dynamic
  dynamic_schedule_tables:
    - analytics:core.fct_orders
  min_interval_minutes: 60
```
aggregate_time_field
string · optional
Timestamp or date column used to bucket data by time. Omit for whole-table scans on each run. A DATE column limits you to daily (or coarser) aggregation; aggregate_by: hour on a DATE column produces meaningless buckets.
```yaml
aggregate_time_field: created_at
```
aggregate_time_sql
string · optional
SQL expression that evaluates to a timestamp, used instead of aggregate_time_field when the time column needs transformation (e.g., epoch to timestamp).
```yaml
aggregate_time_sql: "TO_TIMESTAMP(epoch_seconds)"
```
aggregate_by
enum · optional
Accepted values: hour · day · week · month
Must align with schedule.interval_minutes (e.g., daily requires a multiple of 1440, hourly requires a multiple of 60).
```yaml
aggregate_by: day
```
aggregate_timezone
string · optional
Timezone identifier (e.g., America/New_York).
```yaml
aggregate_timezone: America/New_York
```
collection_lag
integer · optional · default: 0
Number of hours to offset from the current period to account for late-arriving data (e.g., 24 for daily, 1 for hourly). Negative values are allowed to include one future time bucket (e.g., -24 for daily aggregation). Only valid when aggregate_by is set.
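A hedged sketch of how collection_lag pairs with daily aggregation: assuming rows for a given day can keep arriving for up to 24 hours, the monitor below waits one full day before evaluating a bucket. The monitor name, table, and timestamp column are placeholders.

```yaml
montecarlo:
  metric:
    - name: orders_daily_volume_lagged   # placeholder name
      description: Daily row count, evaluated after late data has landed
      data_source:
        table: analytics:core.fct_orders
      aggregate_time_field: created_at   # placeholder timestamp column
      aggregate_by: day                  # collection_lag is only valid with aggregate_by
      collection_lag: 24                 # hours of offset for late-arriving rows
      alert_conditions:
        - metric: ROW_COUNT_CHANGE
          operator: AUTO
      schedule:
        type: fixed
        interval_minutes: 1440
      domains:
        - my-domain
```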
```yaml
collection_lag: 2
```
where_condition
string · optional
SQL WHERE clause (without the WHERE keyword) to filter rows before metric computation.
```yaml
where_condition: "status != 'deleted'"
```
use_partition_clause
boolean · optional · default: false
Use the table's partition clause for efficient querying. Not allowed when data_source.sql is used.
```yaml
use_partition_clause: true
```
segment_fields
array of strings · optional · default: []
Each unique combination of values creates a separate time series. Maximum 5 segment fields. Use segment_sql when you need to bucket or transform values.
```yaml
segment_fields:
  - region
  - product_category
```
segment_sql
array of strings · optional · default: []
SQL expressions for segmentation when column names alone are insufficient. Use instead of segment_fields when you need to bucket or transform values.
```yaml
segment_sql:
  - "CASE WHEN country IN ('US','CA','MX') THEN 'NA' ELSE 'INTL' END"
```
high_segment_count
boolean · optional · default: false
Enable support for high-cardinality segmentation (more than the default segment limit).
```yaml
high_segment_count: true
```
sensitivity
enum · optional
Accepted values: low · medium · high
Controls how aggressively AUTO thresholds flag anomalies.
| Value | Behavior |
|---|---|
| low | Fewer alerts, only large deviations |
| medium | Balanced |
| high | More alerts, smaller deviations flagged |
```yaml
sensitivity: medium
```
domains
array of strings (exactly one entry) · required on all accounts created after January 2025
Set default_domain in montecarlo.yml to avoid repeating it on every monitor.
```yaml
domains:
  - my-domain
```
sampling_config
object · optional
```yaml
sampling_config:
  percentage: 10
```
Properties
percentage - percentage of rows to sample
number · optional
Percentage of rows to sample (0 to 100).
```yaml
sampling_config:
  percentage: 25
```
count - fixed number of rows to sample
integer · optional
```yaml
sampling_config:
  count: 10000
```
name
string · required
Required for monitors created after Jan 29, 2024 (existing monitors keep working). Changing the name creates a new monitor and deletes the old one; incident history does not transfer.
```yaml
name: orders_row_count
```
warehouse
string · optional · required if multiple warehouses
Warehouse UUID or name. Overrides default_resource from montecarlo.yml.
```yaml
warehouse: my-snowflake
```
connection_name
string · optional
Overrides the default connection.
```yaml
connection_name: snowflake-analytics
```
timeout
integer · optional
Query execution timeout in seconds.
```yaml
timeout: 300
```
tags
array of objects · optional
| Property | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Tag key |
| value | string | no | Tag value |
```yaml
tags:
  - name: team
    value: analytics
  - name: environment
    value: production
```
priority
enum · optional
Accepted values: P1 · P2 · P3 · P4 · P5
```yaml
priority: P2
```
audiences
array of strings · optional
Audience names linking this monitor to channels defined in Notifications as Code. In exported/rendered YAML, appears as labels.
```yaml
audiences:
  - data-engineering
  - platform-alerts
```
failure_audiences
array of strings · optional
Separate audiences for run-failure notifications. Falls back to audiences if not set.
```yaml
failure_audiences:
  - data-engineering-oncall
```
data_quality_dimension
enum · optional
Accepted values: ACCURACY · COMPLETENESS · CONSISTENCY · TIMELINESS · UNIQUENESS · VALIDITY
```yaml
data_quality_dimension: COMPLETENESS
```
notes
string · optional
Visible in the Monte Carlo UI. Not included in notifications.
```yaml
notes: Owned by the analytics team. Reviewed quarterly.
```
is_draft
boolean · optional · default: false
Creates the monitor in a paused state. Omitting this on a later update resets it to false (active) because of PUT semantics, so always include it if you want the monitor to stay in draft.
```yaml
is_draft: true
```
uuid
string · optional
Include the UUID of an existing monitor to update it instead of creating a new one.
```yaml
uuid: 0dae7702-0950-45c7-909c-7e183bddca19
```
Deprecated fields
| Field | Use instead |
|---|---|
| resource | warehouse |
| domain | domains |
| domain_uuids | domains |
| labels | audiences |
| notify_rule_run_failure | notify_run_failure |
API-only fields: Some fields visible in the API (notify_run_failure, disable_look_back_bootstrap, skip_reset, fail_on_reset) are not available in MaC YAML.
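To illustrate the deprecated-field mapping, here is a sketch of a monitor written with the current field names, with comments marking the deprecated key each one replaces. The monitor, warehouse, domain, and audience names are placeholders.

```yaml
montecarlo:
  metric:
    - name: customers_null_rate          # placeholder name
      description: Alert when the email null rate exceeds 5%
      warehouse: my-snowflake            # replaces deprecated: resource
      data_source:
        table: raw:crm.customers
      alert_conditions:
        - metric: NULL_RATE
          operator: GT
          threshold_value: 0.05
          fields:
            - EMAIL
      domains:                           # replaces deprecated: domain, domain_uuids
        - my-domain
      audiences:                         # replaces deprecated: labels
        - crm-data-quality
```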
Examples
Row count anomaly detection with daily aggregation
Detects unexpected changes in daily row volume using ML-based thresholds.
```yaml
montecarlo:
  metric:
    - name: daily_orders_volume
      description: Track daily order volume for anomaly detection
      data_source:
        table: analytics:core.fct_orders
      aggregate_time_field: order_date
      aggregate_by: day
      alert_conditions:
        - metric: ROW_COUNT_CHANGE
          operator: AUTO
      sensitivity: medium
      schedule:
        type: fixed
        interval_minutes: 1440
      audiences:
        - data-engineering-alerts
      priority: P2
      data_quality_dimension: COMPLETENESS
      domains:
        - my-domain
      tags:
        - name: team
          value: analytics
```
Null rate monitoring with explicit threshold
Alerts when null rates on critical fields exceed 5%.
```yaml
montecarlo:
  metric:
    - name: customer_null_rate_check
      description: Alert when null rate on email or phone exceeds 5%
      warehouse: my-snowflake-warehouse
      data_source:
        table: raw:crm.customers
      aggregate_time_field: updated_at
      aggregate_by: day
      alert_conditions:
        - metric: NULL_RATE
          operator: GT
          threshold_value: 0.05
          fields:
            - EMAIL
            - PHONE_NUMBER
      schedule:
        type: dynamic
        dynamic_schedule_tables:
          - raw:crm.customers
      audiences:
        - crm-data-quality
      priority: P3
      data_quality_dimension: COMPLETENESS
      domains:
        - my-domain
```
Segmented metric with custom SQL expression
Monitors average order value per region, using a custom metric and SQL-based segmentation.
```yaml
montecarlo:
  metric:
    - name: avg_order_value_by_region
      description: Track average order value segmented by sales region
      data_source:
        table: analytics:core.fct_orders
      aggregate_time_field: created_at
      aggregate_by: day
      segment_sql:
        - "CASE WHEN country IN ('US','CA','MX') THEN 'NA' WHEN country IN ('GB','DE','FR') THEN 'EU' ELSE 'OTHER' END"
      high_segment_count: false
      alert_conditions:
        - custom_metric:
            display_name: Average Order Value
            sql_expression: "AVG(order_total)"
          operator: AUTO
      schedule:
        type: fixed
        interval_minutes: 1440
      audiences:
        - revenue-monitoring
      priority: P2
      domains:
        - revenue-domain
```
Troubleshooting
Metric names
Use the canonical metric names from Available Metrics. Common mistakes:
- AVG / MEAN: use NUMERIC_MEAN
- MIN: use NUMERIC_MIN
- MAX: use NUMERIC_MAX
- STDDEV: use NUMERIC_STDDEV
- SUM stays SUM; there is no NUMERIC_SUM
- APPROX_DISTINCT_COUNT / COUNT_DISTINCT: use UNIQUE_COUNT
- COUNT_NULL: use NULL_COUNT
- ROW_COUNT is not a column metric; the table-level metric is ROW_COUNT_CHANGE
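A short sketch using the canonical names from the list above. The columns (ORDER_TOTAL, CUSTOMER_ID, EMAIL) and thresholds are placeholders, and whether AUTO suits a given metric on your data is an assumption to check against Available Metrics.

```yaml
alert_conditions:
  - metric: NUMERIC_MEAN        # not AVG or MEAN
    operator: LT
    threshold_value: 100        # placeholder threshold
    fields:
      - ORDER_TOTAL             # placeholder column
  - metric: UNIQUE_COUNT        # not COUNT_DISTINCT or APPROX_DISTINCT_COUNT
    operator: AUTO
    fields:
      - CUSTOMER_ID             # placeholder column
  - metric: NULL_COUNT          # not COUNT_NULL
    operator: GT
    threshold_value: 0
    fields:
      - EMAIL                   # placeholder column
  - metric: ROW_COUNT_CHANGE    # table-level metric, so no fields key
    operator: AUTO
```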
Operators and alert conditions
- Explicit operators on pipeline metrics fail. ROW_COUNT_CHANGE, TIME_SINCE_LAST_ROW_COUNT_CHANGE, and RELATIVE_ROW_COUNT only support AUTO. GT, LT, or any explicit operator produces an error.
- NE is not valid. The inequality operator is NEQ.
- alert_conditions is required. A metric monitor with none fails validation.
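The fragment below sketches an alert_conditions list that respects these operator rules. The EMAIL column and the zero threshold are placeholders chosen for illustration.

```yaml
alert_conditions:
  # Pipeline metric: AUTO is the only supported operator
  - metric: ROW_COUNT_CHANGE
    operator: AUTO
  # Inequality is spelled NEQ (NE is not valid)
  - metric: NULL_COUNT
    operator: NEQ
    threshold_value: 0   # placeholder threshold
    fields:
      - EMAIL            # placeholder column
```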
Fields and data source
- Don't pass fields for table-level metrics. Metrics like ROW_COUNT_CHANGE operate on the whole table; including fields causes a validation error.
- use_partition_clause is table-source only. Combining use_partition_clause: true with data_source.sql causes a validation error.
- Verify column names. They're case-sensitive on most warehouses (Snowflake returns uppercase). Check the actual table schema before writing alert conditions.
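As a sketch of both valid data source shapes, the two monitors below keep use_partition_clause on the table source only and spell the column in the casing the warehouse reports. The monitor names, the region filter, and the assumption that raw:crm.customers is partitioned are illustrative.

```yaml
montecarlo:
  metric:
    # Table source: use_partition_clause is allowed
    - name: customers_partitioned_checks   # placeholder name
      description: Null rate on email, read through the partition clause
      data_source:
        table: raw:crm.customers
      use_partition_clause: true
      alert_conditions:
        - metric: NULL_RATE
          operator: GT
          threshold_value: 0.05
          fields:
            - EMAIL                        # uppercase, as Snowflake returns it
      domains:
        - my-domain
    # SQL source: leave use_partition_clause out
    - name: customers_na_checks            # placeholder name
      description: Null rate on email for one region
      data_source:
        sql: "SELECT * FROM raw.crm.customers WHERE region = 'NA'"
      alert_conditions:
        - metric: NULL_RATE
          operator: GT
          threshold_value: 0.05
          fields:
            - EMAIL
      domains:
        - my-domain
```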
Schedules
- Bucket size must fit the interval. aggregate_by: day with interval_minutes: 60 fails validation; the interval must be at least as large as the bucket size.
- No crontab on metric monitors. Use interval_minutes; only custom SQL, validation, and metric comparison monitors support interval_crontab.
- interval_minutes minimum is 60. A lower value produces: "Metric monitors must have a interval_minutes >= 60."
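For comparison, a fragment that should satisfy these schedule rules, assuming hourly buckets on a placeholder timestamp column: the interval is a multiple of the 60-minute bucket and not below the 60-minute minimum.

```yaml
aggregate_time_field: created_at   # placeholder timestamp column
aggregate_by: hour
schedule:
  type: fixed
  interval_minutes: 120            # multiple of the hourly bucket (60) and >= 60
```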
Updates and deprecated fields
- PUT semantics on updates. When updating a monitor by uuid, every field you omit reverts to its default; it is not left unchanged. Always specify the complete desired configuration.
- Prefer warehouse over resource. The resource field still works but is deprecated.
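A sketch of an update that restates the full configuration, so that nothing silently reverts to a default. The UUID is a placeholder for the existing monitor's identifier, and the remaining values are illustrative.

```yaml
montecarlo:
  metric:
    - uuid: 0dae7702-0950-45c7-909c-7e183bddca19   # placeholder: UUID of the monitor being updated
      name: orders_row_count
      description: Alert on unexpected row count changes in the orders table
      warehouse: my-snowflake                      # current name for the deprecated resource field
      data_source:
        table: my_database:my_schema.orders
      aggregate_time_field: created_at
      aggregate_by: day
      alert_conditions:
        - metric: ROW_COUNT_CHANGE
          operator: AUTO
      sensitivity: medium
      schedule:
        type: fixed
        interval_minutes: 1440
      domains:
        - my-domain
      is_draft: true                               # restate on every update to keep the monitor in draft
```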
