Validation Monitors

Validation monitors help to identify rows in your tables with bad data ("invalid rows"). These monitors help maintain the integrity and reliability of data by continuously checking for various quality attributes. They make it easy for users with limited or no SQL knowledge to set up alert conditions on specific fields or a combination of fields.

Users can setup and get notified on predefined or custom defined (SQL expression) alert conditions, such as validating that strings are formatted correctly, timestamps are not in the future, and numerical values are not negative.

For example, users can get alerted if their field test_string is email:

Screenshot 2024-07-01 at 3.55.49 AM.png

Validation monitors can also help users check if the data type of a chosen field is valid for a specified alert condition:

Screenshot 2024-07-01 at 4.00.36 AM.png

Example: test_date should by a numeric for alert condition - test_date > 100

Creating Validation Monitors

To create a Validation monitor in Monte Carlo, visit the monitors page, click the blue Create Monitor in the upper right hand corner, look for Validation monitor on the left hand side, and hit Create.

1. Choose Data

Users have two options to choose data from.

  1. Select a table or view:

    Screenshot 2024-07-01 at 4.12.18 AM.png
  2. Write a SQL query:

    • Select warehouse
    • Write a SQL query and click Test to validate the SQL.
    Screenshot 2024-07-01 at 4.20.47 AM.png

    Test will help users preview all the fields the SQL query returns.

    Screenshot 2024-07-01 at 4.23.01 AM.png

    Example: SQL query can be something like select * from table_name

2. Define alert conditions

There are 3 types of alert conditions that users can setup

  1. Predefined Conditions

    Predefined conditions start with is operator. Conditions are specific to the data type of field.

    Screenshot 2024-07-01 at 4.29.41 AM.png Screenshot 2024-07-01 at 4.29.54 AM.png
  2. Conditions with values set by user

    These conditions have standard operators for data types like

    • numerics: equal, greater_than, greater_than_equal, less_than, less_than_equal, in_set, not_in_set
    • date, time: equal, greater_than, greater_than_equal, less_than, less_than_equal
    • string: equal, contains, starts_with, ends_with, matches_regex, in_set, not_in_set
    • boolean: equal

    The user needs to provide a field and a value for these conditions.

    Screenshot 2024-07-01 at 4.33.53 AM.png Screenshot 2024-07-01 at 4.34.26 AM.png Screenshot 2024-07-01 at 4.34.56 AM.png
  3. SQL expressions

    For greater flexibility users can also pass in a SQL expression as a condition and test it as well.

    Screenshot 2024-07-01 at 4.41.16 AM.png

Users can use all of these above conditions to create more complex alert conditions to alert on.

For example: alert conditions combinations like test_string = ‘qwerty’ AND (test_number > 100 OR test_boolean = True) AND test_date > July 12, 2024 will look like this:

3. Define Schedule

Select when the monitor should run. There are three options:

  • On a schedule: input a regular, periodic schedule. Options for handling daylight savings are available in the advanced dropdown.
  • CRON based: input a cron expression to run on a schedule specified by the cron. Options for handling daylight savings are available in the advanced dropdown.
  • Manual trigger: Select this option when you want to manually trigger using Run button on the monitor description page.

4. Send notifications

Select which audiences should receive notifications when an anomaly is detected.

Text in the Notes section will be included directly in notifications. The "Show notes tips" dropdown includes details on how to @mention an individual or team if you are sending notifications to Slack.

Additional settings exist for setting the description of the monitor which will be set as the header for notifications sent to notification channels, also pre-setting a priority on any incidents generated by the monitor.

See here for more information on Audiences and Notifications.


Alerts and Notifications

Alerts Page

Alert page shows list of Alerts. Upon clicking the Alert Details page shows:

  • Number of rows breached the alert conditions
  • Invalid rows
  • Graph of previous breaches

Notifications

Validation Monitors alert when the alert conditions passed for the data source yield rows > 0.

The notifications shown to the user contains:

  • Description of the Validation
  • Number of rows breached the alert condition
  • Last breach info

Examples

Validate that field values are not null

In this case we need to alert if test_field is null:

Validate that numerical values are not negative

In this case we need to alert if numeric_field is negative:

Validate that strings are formatted correctly

In this case we need to alert if string_field is not a specified format, say, uuid - string_field is not UUID:

Or alert if us_phone_field is not US phone number:

See here for more formats.

Validate that timestamps are not in the future

In this case we need to alert if timestamp_field is not in the future:

See here for more valid conditions.

Available Conditions

Threshold Types Definitions:

  • Predefined Value: Monte Carlo has already defined the threshold. For example, to check if a number is negative or if the field is null. User does not need to define any threshold.
  • Manual: User defined thresholds

Completeness

ConditionField TypesThreshold TypesMaC Name
not a number (NaN)numericPredefined Valueis_nan
zeronumericPredefined Valueis_zero
null stringstringPredefined Valuenull_string
containing spacesstringPredefined Valuecontaining_spaces
empty stringstringPredefined Valueempty_string
nullnumeric, date, time, stringPredefined Valuenull
all spacesstringPredefined Valueall_spaces

Comparison

ConditionField TypesThreshold TypesMaC Name
equal tonumeric, date, time, stringManualequal
greater thannumeric, date, timeManualgreater_than
greater than or equal tonumeric, date, timeManualgreater_than_or_equal
less thannumeric, date, timeManualless_than
less than or equal tonumeric, date, timeManualless_than_or_equal
containsstringManualcontains
starts withstringManualstarts_with
ends withstringManualends_with
is in setnumeric, stringManualin_set

Format

ConditionField TypesThreshold TypesMaC Name
emailstringPredefined Valueemail
timestamp format ISO-8601stringPredefined Valuetimestamp_iso_8601
timestamp format DD-MM-YYYYstringPredefined Valuetimestamp_dd-mm-yyyy
timestamp format DD/MM/YYYYstringPredefined Valuetimestamp_dd/mm/yyyy
timestamp format MM-DD-YYYYstringPredefined Valuetimestamp_mm-dd-yyyy
timestamp format MM/DD/YYYYstringPredefined Valuetimestamp_mm/dd/yyyy
"True", "TRUE", or "true"stringPredefined Valuetrue_string
"False", "FALSE", or "false"stringPredefined Valuefalse_string
UUIDstringPredefined Valueis_uuid
Canada postal codestringPredefined Valueca_postal_code
Canada SINstringPredefined Valuecan_sin
France postal codestringPredefined Valuefr_postal_code
France INSEE codestringPredefined Valuefr_insee_code
Germany postal codestringPredefined Valuede_postal_code
Germany Tax IDstringPredefined Valuede_tax_id
Ireland postal codestringPredefined Valueie_postal_code
Ireland PPSNstringPredefined Valueie_ppsn
Italy postal codestringPredefined Valueit_postal_code
Italy fiscal codestringPredefined Valueit_fiscal_code
Netherlands postal codestringPredefined Valuenl_postal_code
Netherlands Citizen Service Number (BSN)stringPredefined Valuenl_bsn
Poland postal codestringPredefined Valuepl_postal_code
Poland National Identification Number (PESEL)stringPredefined Valuepl_pesel
Spain postal codestringPredefined Valuees_postal_code
Spain National Identity Document Number (DNI)stringPredefined Valuees_dni
Switzerland postal codestringPredefined Valuech_postal_code
Switzerland Social Security Number (SSN)stringPredefined Valuech_oasi
Turkey postal codestringPredefined Valuetr_postal_code
Turkey ID numberstringPredefined Valuetr_id_no
UK postal codestringPredefined Valueuk_postal_code
UK National Insurance Number (NINO)stringPredefined Valueuk_nino
US ZIP code (e.g., 94109)stringPredefined Valueus_zip_code
US phone numberstringPredefined Valueus_phone_number
US state codestringPredefined Valueus_state_code
US SSNstringPredefined Valueus_ssn
matches regexstringManualmatches_regex

Numeric Range

ConditionField TypesThreshold TypesMaC Name
between 0 and 100 (inclusive)numericPredefined Valueis_between_0_and_100
between 0 and 1 (inclusive)numericPredefined Valueis_between_0_and_1
negativenumericPredefined Valueis_negative

Time Range

ConditionField TypesThreshold TypesMaC Name
Sundaydate, timePredefined Valuesunday
Mondaydate, timePredefined Valuemonday
Tuesdaydate, timePredefined Valuetuesday
Wednesdaydate, timePredefined Valuewednesday
Thursdaydate, timePredefined Valuethursday
Fridaydate, timePredefined Valuefriday
Saturdaydate, timePredefined Valuesaturday
Weekdaydate, timePredefined Valueweekday
in the past calendar weekdate, timePredefined Valuein_past_calendar_week
in the past calendar monthdate, timePredefined Valuein_past_calendar_month
in the past 7 daysdate, timePredefined Valuein_past_7_days
in the past 30 daysdate, timePredefined Valuein_past_30_days
yesterdaydate, timePredefined Valueyesterday
todaydate, timePredefined Valuetoday
in the futuredate, timePredefined Valuein_future
in the pastdate, timePredefined Valuein_past
in the past 365 daysdate, timePredefined Valuein_past_365_days