Validation Monitors
Validation monitors help to identify rows in your tables with bad data ("invalid rows"). They make it easy for users with limited or no SQL knowledge to set up alert conditions on specific fields or a combination of fields. For example, to validate that strings are formatted correctly, timestamps are not in the future, and numerical values are not negative.
Users can choose from a rich library of operators and templated alert conditions, or create their own using custom SQL. See Available Conditions for a full list of operators and templated conditions.
Creating Validation Monitors
Validation monitors can be created from the Create Monitor page or Assets page. Configuration steps include:
1. Choose Data
Users have two options to choose data from.
-
Select a table or view:
-
Write a SQL query:
- Select warehouse
- Write a SQL query and click
Test
to validate the SQL.
Test
will help users preview all the fields the SQL query returns.
2. Define alert conditions
There are 3 types of alert conditions that users can setup
-
Predefined Conditions
Predefined conditions start with
is
operator. Conditions are specific to the data type of field. -
Conditions with values set by user
These conditions have standard operators for data types like
- numerics: equal, greater_than, greater_than_equal, less_than, less_than_equal, in_set, not_in_set
- date, time: equal, greater_than, greater_than_equal, less_than, less_than_equal
- string: equal, contains, starts_with, ends_with, matches_regex, in_set, not_in_set
- boolean: equal
The user needs to provide a field and a value for these conditions.
-
SQL expressions
For greater flexibility users can also pass in a SQL expression as a condition and test it as well.
Users can use all of these above conditions to create more complex alert conditions to alert on.
For example: alert conditions combinations like test_string = ‘qwerty’ AND (test_number > 100 OR test_boolean = True) AND test_date > July 12, 2024
will look like this:
3. Define Schedule
Select when the monitor should run. There are three options:
- On a schedule: input a regular, periodic schedule. Options for handling daylight savings are available in the advanced dropdown.
- CRON based: input a cron expression to run on a schedule specified by the cron. Options for handling daylight savings are available in the advanced dropdown.
- Manual trigger: Select this option when you want to manually trigger using
Run
button on the monitor description page.
4. Send notifications
Select which audiences should receive notifications when an anomaly is detected.
Text in the Notes section will be included directly in notifications. The "Show notes tips" dropdown includes details on how to @mention an individual or team if you are sending notifications to Slack.
Additional settings exist for setting the description of the monitor which will be set as the header for notifications sent to notification channels, also pre-setting a priority on any incidents generated by the monitor.
See here for more information on Audiences and Notifications. Learn more about setting Monitor tags.
Alerts and Notifications
Alerts Page
Alert page shows list of Alerts. Upon clicking the Alert Details page shows:
- Number of rows breached the alert conditions
- Invalid rows
- Graph of previous breaches
Notifications
Validation Monitors alert when the alert conditions passed for the data source yield rows > 0.
The notifications shown to the user contains:
- Description of the Validation
- Number of rows breached the alert condition
- Last breach info
Examples
Validate that field values are not null
In this case we need to alert if test_field is null:
Validate that numerical values are not negative
In this case we need to alert if numeric_field is negative
:
Validate that strings are formatted correctly
In this case we need to alert if string_field
is not a specified format, say, uuid - string_field is not UUID
:
Or alert if us_phone_field is not US phone number
:
See here for more formats.
Validate that timestamps are not in the future
In this case we need to alert if timestamp_field is not in the future
:
Legacy monitors now included in Validation Monitors
Up until September 2024, Cardinality Rules and Referential Integrity Rules were options on the Monitor Menu in Monte Carlo. These were purpose-built monitor creation experiences that produced a SQL Rule. These were for use cases like:
- Alert me if any of the values in [field] are not included in set [value1, value2, value3, etc]
- Alert me if any of the values in [field] are not present in [table > field]
These were made redundant by Validations, so the experiences to create new ones were removed from the UI. Existing Cardinality Rules and Referential Integrity Rules continue to function.
In Validation Monitors, the recommended way to address these use cases in now with the Is in set
and Is not in set
operators, which allow a user to define a set:
- From a list: manually enter the values
- From a field: select a table and field, and the set with be populated with the distinct values in that field. The selected field will be referenced each time the monitor is run, so the set values may change.
- From a query: write SQL to define the values in the set. The query will run each time the monitor is run, so the set values may change.
Updated 3 months ago