Metric Monitors
Metric monitors detect anomalies in dozens of available metrics, or in custom metrics defined by the user. A single monitor can track many different metrics across many different fields on a given table. Metric monitors can also be segmented, which lets the user isolate and "louden" anomalies that would otherwise be diluted and missed.
Metric monitors can be created from the Create Monitor page or Assets page. The Recommend configurations button gives the option to automate some aspects of configuration. Configuration steps include:
Choose data
Select which table or view to monitor, and how to aggregate, filter, and segment the data. Settings include:
Select table: pick which table or view to monitor.
Aggregation: indicate if rows should be aggregated into hourly, daily, or all-records buckets. When aggregating by hour or day:
- Select a time field (or compose one using SQL) on which the aggregation will be based.
- Select a rolling time window that limits how far back the monitor will query. Limiting how much data the monitor queries makes it more efficient and consumes fewer resources.
WHERE clause: apply any additional filters, for example to restrict the monitor to a specific segment of the data.
Select fields to segment the data: select up to 2 fields to segment the data by. Common fields to segment by include products, regions, event types, versions, and merchants. Read more about segmentation.
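The settings above map roughly onto the clauses of an aggregate SQL query. The sketch below is purely illustrative and covers only hourly and daily aggregation: `build_monitor_query` and its parameters are hypothetical names, the SQL is a generic dialect, and it is not the query Monte Carlo actually runs.

```python
from typing import Optional, Sequence


def build_monitor_query(
    table: str,
    time_field: str,
    bucket: str = "day",            # "hour" or "day"
    lookback_days: int = 7,         # rolling time window
    where: Optional[str] = None,    # optional extra filter
    segments: Sequence[str] = (),   # up to 2 segmentation fields
) -> str:
    """Compose an illustrative aggregate query from the Choose data settings."""
    if bucket not in ("hour", "day"):
        raise ValueError("bucket must be 'hour' or 'day'")
    if len(segments) > 2:
        raise ValueError("at most 2 segmentation fields are supported")

    # Rows are grouped into hourly or daily buckets based on the time field.
    bucket_expr = f"DATE_TRUNC('{bucket}', {time_field})"

    # The rolling window limits how far back the query scans.
    filters = [f"{time_field} >= CURRENT_TIMESTAMP - INTERVAL '{lookback_days}' DAY"]
    if where:
        filters.append(f"({where})")

    select_cols = [f"{bucket_expr} AS bucket", *segments, "COUNT(*) AS row_count"]
    group_cols = [bucket_expr, *segments]

    return (
        f"SELECT {', '.join(select_cols)}\n"
        f"FROM {table}\n"
        f"WHERE {' AND '.join(filters)}\n"
        f"GROUP BY {', '.join(group_cols)}"
    )


if __name__ == "__main__":
    # Hypothetical table and field names, for illustration only.
    print(build_monitor_query(
        "analytics.orders", "created_at",
        where="status = 'complete'", segments=["region"],
    ))
```

Each segmentation field becomes an extra grouping column, so every segment gets its own series of measurements per time bucket.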
Define alert conditions
Select the metrics and fields to monitor, and the threshold that triggers an incident. See the full list of available metrics or create your own custom metric.
There is no limit on how many metrics you can include in a single monitor. Once you select a metric, you can then select specific fields or All supported fields.
Most metrics support both automated and manual thresholds. If All supported fields is selected, then only automated thresholds are available.
The operators for alert conditions are self-explanatory, with the exceptions of:
- Not between: this is exclusive of the bounds. For example, "Alert when mean for sale_price is not between 500 and 1,000" could be rephrased as "Alert when mean for sale_price is <500 or >1,000."
- Between: this is inclusive of the bounds. For example, "Alert when mean for sale_price is between 500 and 1,000" could be rephrased as "Alert when mean for sale_price is >=500 and <=1,000."
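The boundary behavior of these two operators can be expressed as a pair of predicates. This is a minimal sketch of the semantics described above; the function names are hypothetical and are not part of any Monte Carlo API.

```python
def alerts_between(value: float, low: float, high: float) -> bool:
    """Between: inclusive of the bounds (low <= value <= high)."""
    return low <= value <= high


def alerts_not_between(value: float, low: float, high: float) -> bool:
    """Not between: exclusive of the bounds (value < low or value > high)."""
    return value < low or value > high


if __name__ == "__main__":
    # Mean sale_price values checked against the bounds 500 and 1,000.
    for mean_price in (499.99, 500, 750, 1000, 1000.01):
        print(
            mean_price,
            alerts_between(mean_price, 500, 1000),
            alerts_not_between(mean_price, 500, 1000),
        )
```

The two operators are exact complements: a value that sits on a bound (500 or 1,000 here) alerts under Between and never alerts under Not between.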
Define schedule
Select when the monitor should run. There are two options:
- On a schedule: input a regular, periodic schedule. Options for handling daylight saving time are available in the advanced dropdown.
- When the table is updated: the monitor will run when Monte Carlo sees that the table has been updated. This logic uses the history of table updates that Monte Carlo gets through its hourly collection of metadata.
Send notifications
Select which audiences should receive notifications when an anomaly is detected.
Text in the Notes section will be included directly in notifications. The "Show notes tips" dropdown includes details on how to @mention an individual or team if you are sending notifications to Slack.
Additional settings exist for customizing the description of the monitor, pre-setting a severity on any incidents generated by the monitor, or for turning off failure notifications.
Notes about automated thresholds
The automated thresholds for some metrics, such as Null (%) and Unique (%), are optimized for the extremes. In other words, if a column rarely or never sees nulls, then the threshold becomes very sensitive. But if the metric is highly volatile (e.g. a null rate that swings between 30% and 50%), and especially if that volatility sits away from the extremes (e.g. 40-60% rather than 0-1%), then the threshold becomes insensitive. This prevents false positives and noise.
In addition, to mitigate swings in metrics that can result in false positives and noise, Metric monitors require at least 50 rows per metric measurement when using automated thresholds. Measurements with fewer than 50 rows are excluded from the training set and will not generate anomalies. In other words, when configuring the Choose data section, you should aim to aggregate metrics into buckets of at least 50 rows.
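The 50-row rule can be pictured as a filter over per-bucket measurements. The sketch below is an assumption-laden illustration, not Monte Carlo's implementation: the `row_count` key, the dictionary layout, and the function name are all hypothetical.

```python
from typing import Dict, List

MIN_ROWS = 50  # minimum rows per measurement stated for automated thresholds


def usable_measurements(buckets: List[Dict]) -> List[Dict]:
    """Keep only measurements backed by at least MIN_ROWS rows."""
    return [b for b in buckets if b["row_count"] >= MIN_ROWS]


if __name__ == "__main__":
    # Hypothetical hourly measurements of a null-rate metric.
    hourly = [
        {"bucket": "2024-05-01 00:00", "row_count": 12, "null_pct": 8.3},
        {"bucket": "2024-05-01 01:00", "row_count": 50, "null_pct": 2.0},
        {"bucket": "2024-05-01 02:00", "row_count": 340, "null_pct": 1.5},
    ]
    kept = usable_measurements(hourly)
    print([b["bucket"] for b in kept])
```

If many buckets fall below the minimum, a coarser aggregation (daily instead of hourly, or fewer segmentation fields) is one way to reach 50 rows per measurement.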
"Metric - legacy" monitors
Metric monitors combine the functionality from two deprecated monitor types: Field Health and Field Quality Rules. In April 2024, their functionality was brought together in Metric monitors.
Some customers have Metric - legacy monitors in their environment, which are historical Field Quality Rules. These still function, and are simply renamed for design consistency. They were not automatically converted to Metric monitors because they have certain backend differences that result in slightly different behavior.