Tuning thresholds

As described in the overview, users can tune thresholds in two ways: by adjusting training data and by adjusting sensitivity.

By adjusting training data, the user can indicate which periods of time represent normal behavior for a particular metric (for example, the row count of a particular table). The anomaly detection models will then train on this data and produce thresholds, which can then be further widened or narrowed using sensitivity.

Sensitivity

The most common way of tuning a threshold is to change a monitor's sensitivity. All automated thresholds support low, medium (default), and high levels of sensitivity. Low sensitivity will widen thresholds, resulting in fewer anomalies. High sensitivity will narrow ("tighten") thresholds, resulting in more anomalies.
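The relationship between sensitivity and threshold width can be sketched in a few lines. This is an illustration only, not Monte Carlo's actual model: it assumes a simple mean plus or minus k standard deviations band, and the multiplier values and the function names (threshold_band, is_anomaly) are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical multipliers chosen for illustration; the product does not
# publish the values it uses internally.
SENSITIVITY_MULTIPLIER = {"low": 4.0, "medium": 3.0, "high": 2.0}


def threshold_band(history, sensitivity="medium"):
    """Return (lower, upper) thresholds as mean +/- k * standard deviation."""
    k = SENSITIVITY_MULTIPLIER[sensitivity]
    center = mean(history)
    spread = stdev(history)
    return center - k * spread, center + k * spread


def is_anomaly(value, history, sensitivity="medium"):
    """A value is anomalous when it falls outside the threshold band."""
    lower, upper = threshold_band(history, sensitivity)
    return value < lower or value > upper


# Daily row counts for a hypothetical table.
row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]

for level in ("low", "medium", "high"):
    lower, upper = threshold_band(row_counts, level)
    print(f"{level:>6}: {lower:.1f} to {upper:.1f}")
```

With this sample data, a row count of 1040 falls outside the band at high sensitivity but inside it at low sensitivity, which is why raising sensitivity produces more anomalies.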


Training data

There are two key ways to manage which data is included in training models:

  • Managing alerts: by default, anomalies detected by Monte Carlo are not excluded from the set of training data. As a result, thresholds will often widen after an anomaly. To bring a threshold back to normal, mark the status of the alert as "Fixed". The anomaly will then be ignored when training models, restoring the original threshold.
  • Exclusions: using exclusion windows, users can define periods of time that should be ignored when training models. These can be created for an entire warehouse, database, schema, or table. They can be one-off or set for recurring holidays.
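To make the effect of excluding data concrete, the sketch below filters a series through a list of exclusion windows before computing a threshold band. It is a simplified illustration under stated assumptions: the mean plus or minus three standard deviations band stands in for the real models, and the names in_any_window, training_values, and band are hypothetical.

```python
from datetime import date
from statistics import mean, stdev


def in_any_window(day, windows):
    """True if `day` falls inside any (start, end) window, bounds inclusive."""
    return any(start <= day <= end for start, end in windows)


def training_values(points, exclusion_windows):
    """Keep only the values whose date is outside every exclusion window."""
    return [value for day, value in points
            if not in_any_window(day, exclusion_windows)]


def band(values, k=3.0):
    """Stand-in threshold model: mean +/- k * standard deviation."""
    center, spread = mean(values), stdev(values)
    return center - k * spread, center + k * spread


points = [
    (date(2024, 1, 1), 1000),
    (date(2024, 1, 2), 1020),
    (date(2024, 1, 3), 980),
    (date(2024, 1, 4), 5000),  # anomalous load
    (date(2024, 1, 5), 1010),
    (date(2024, 1, 6), 990),
]

# A one-off exclusion window covering the day of the anomaly.
windows = [(date(2024, 1, 4), date(2024, 1, 4))]

print("trained on all data:  ", band([value for _, value in points]))
print("trained with exclusion:", band(training_values(points, windows)))
```

Training on all six points yields a much wider band than training on the five points left after the exclusion, which mirrors how a threshold widens after an anomaly and tightens again once that anomaly is ignored.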

Training data for volume monitors

Some customers have access to a new system of managing training data. It is available only for volume monitors. If successful, this system will be rolled out to all customers and for all monitor types.

The primary change in this new system is that anomalies are excluded by default from the set of training data. As a result, thresholds do not widen after an anomaly is detected.

From the Asset page or from the Alert page, the user has the option to:

  • Mark as normal: when reviewing an alert, users can "mark as normal" to re-introduce that anomaly into the set of training data. The threshold will then widen, and similar anomalies will not trigger alerts in the future.
  • Select training data: by interacting with the chart of the monitor, users can exclude periods of data from training models. This is the same concept as exclusion windows, except that it applies at the monitor level.

After taking either of these actions, it can take several hours before the new thresholds are visible.
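The default-exclusion behavior and the effect of "Mark as normal" can be pictured as a filter over data points. The following is a minimal sketch, not the product's data model: the Point class, its fields, and the training_set function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Point:
    value: float
    is_anomaly: bool = False     # flagged by anomaly detection
    marked_normal: bool = False  # a user chose "Mark as normal"


def training_set(points):
    """Anomalies are left out unless a user has marked them as normal."""
    return [p.value for p in points
            if not p.is_anomaly or p.marked_normal]


points = [Point(1000), Point(1020), Point(5000, is_anomaly=True), Point(990)]

print(training_set(points))  # the anomalous point is excluded by default

points[2].marked_normal = True
print(training_set(points))  # the point is re-introduced after "Mark as normal"
```

Once the point is back in the training set, any model trained on it will see a larger spread, which is why the threshold widens after "Mark as normal".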

If users do not want to be alerted to similar anomalies in the future, they can "Mark as normal". The anomalous data point will then be re-introduced into the training set and thresholds will widen.

After clicking 'Select training data', users can define periods of time to exclude from training models. This gives users control to ensure that bad data does not influence thresholds.
