Partition Filtering for Monitors
When Monte Carlo writes monitor queries to run on your warehouse or lake, it uses the time-based partitions identified during metadata collection. Users do not need to specify partitions when creating most monitors, because they are inferred automatically.
When creating a Field Health or Dimension Tracking Monitor, under Advanced Options there is an option to Enable auto-partition filtering. It is on by default if a time-based partition is detected. If no time-based partition is detected on the table, this option will not be available.
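To illustrate why partition filtering matters, here is a minimal sketch of how a time-based partition predicate might be added to a monitor-style query so that only recent partitions are scanned. This is not Monte Carlo's actual query generation. The function names, the table `analytics.events`, the column `event_date`, and the 7-day lookback are all hypothetical.

```python
from datetime import date, timedelta


def build_partition_filter(partition_column: str, lookback_days: int, today: date) -> str:
    """Return a SQL predicate restricting a scan to recent daily partitions.

    Illustrative only: the column name and lookback window are assumptions,
    not values taken from Monte Carlo.
    """
    if lookback_days < 0:
        raise ValueError("lookback_days must be non-negative")
    start = today - timedelta(days=lookback_days)
    # ISO 8601 date literals keep the predicate unambiguous across engines.
    return (
        f"{partition_column} >= DATE '{start.isoformat()}' "
        f"AND {partition_column} <= DATE '{today.isoformat()}'"
    )


def build_monitor_query(table: str, partition_column: str, lookback_days: int, today: date) -> str:
    """Wrap a simple row-count query with the partition predicate."""
    predicate = build_partition_filter(partition_column, lookback_days, today)
    return f"SELECT COUNT(*) AS row_count FROM {table} WHERE {predicate}"


# Hypothetical table and column names, for illustration.
query = build_monitor_query("analytics.events", "event_date", 7, date(2024, 3, 15))
print(query)
```

Because the predicate references the partition column directly, a partition-aware engine can skip partitions outside the window instead of scanning the whole table.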
- Hive and Glue Data Lakes
Types of Queries Supported
Monte Carlo automatically detects date partitions on tables and filters on them when writing queries for the following functionality:
- Field Health Monitors
- Dimension Tracking Monitors
- Root Cause Analysis Queries
This feature is available for Field Health and Dimension Tracking Monitors. It is not supported for monitors where Monte Carlo does not write the SQL automatically (such as Custom Monitors - SQL Rules), or for monitors that use All Records instead of Recent Data.
Formats of Partitions Supported
Partitions must be stored in one of the following formats in order to be inferred by Monte Carlo:
- Timestamp Fields: any native timestamp field in the listed supported integrations
- Date or String Fields - in the following formats (based on standard C formatting):
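As a rough way to check whether string partition values conform to a C-style (`strftime`) pattern, the sketch below parses each value and then re-formats it to confirm an exact match. The helper name `matches_format` and the example patterns `%Y-%m-%d` and `%Y%m%d` are assumptions chosen for illustration. They are not a statement of which formats Monte Carlo accepts.

```python
from datetime import datetime


def matches_format(value: str, fmt: str) -> bool:
    """Return True if `value` is exactly representable by the strftime pattern `fmt`.

    Hypothetical helper for illustration; the patterns used below are examples,
    not Monte Carlo's supported list.
    """
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        return False
    # Round-trip so that loosely parsed values such as "2024-3-5" are rejected.
    return parsed.strftime(fmt) == value


print(matches_format("2024-03-15", "%Y-%m-%d"))
print(matches_format("15/03/2024", "%Y-%m-%d"))
```

The round-trip comparison is a deliberate choice: `strptime` alone tolerates values without zero padding, which would hide inconsistently formatted partition values.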
If your partitions are separated into multiple String or Numeric fields, they must meet both the naming convention and formats below.
Field Naming Convention
- Hour (optional):
Field Supported Formats:
- BigQuery table partition of
- Databricks table partitioned by