Table Monitors

Settings > Table Monitors lets you manage which assets are ingested into Monte Carlo and which of those assets have Table Monitors applied:

Ingested tables are all assets (tables, views, external tables) with metadata and lineage in Monte Carlo.
Table monitors allow you to view freshness, volume, and schema changes for an asset, and be alerted to anomalies.

Users can apply Table Monitors to all assets in a database, schema, or data product, or they can set filters. Filters can use asset tags, regex matching on asset name, or recent query history (for example, read queries in the last 14 days).

Depending on your contract structure with Monte Carlo, having a Table Monitor may or may not be a prerequisite for applying other kinds of monitors (such as Metrics or Validations) to that asset.

Settings > Table Monitors shows how many Assets (tables, views, external tables) are ingested and how many have Table Monitors applied.

🕗
Timing of Chart Updates
Though changes to ingestion and table monitoring go into effect immediately, the charts in Settings > Table Monitors can take up to 8 hours to update after changes are made.

Access

The Usage UI is accessible to users with Account Owner, Domain Manager or Editor roles. Learn more about these roles under Managed Roles and Groups.

Configuration

Ingestion

Settings > Table Monitors shows a list of Warehouses and Data Lakes that have been integrated with Monte Carlo. Select an integration to configure by clicking on its name.
After clicking into a Warehouse or Data Lake, you will see a list of Databases that Monte Carlo can see. Each Database can be toggled on/off for ingestion. Select the Database name you want to configure or select "Set up" under the Monitored tables column.
Within each Database, you will see a list of visible schemas. You can control which schemas to ingest by specifying rules. By default, all schemas are ingested into Monte Carlo, and you can apply exclude rules to keep specific schemas out. Select "Add exception rule" to add a new rule. Multiple rules can be added. You may specify schemas by name with the following conditions:
1. Exact match
2. Starts with
3. Ends with
4. Contains
5. Matches Regex
After selecting "Save", the "Ingested" and "Excluded from Ingestion" tabs at the bottom will be updated to reflect which schemas in that database are included in or excluded from ingestion. The numbers in the header will also be updated to reflect the total counts of ingested and monitored tables.
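
To make the five conditions concrete, here is a minimal Python sketch of how exclude rules might partition a list of schemas. The function names, the condition keys, and the evaluation logic are assumptions for illustration only; the regex condition is treated as anchored at both ends, in line with the automatic anchors described in the best practices on this page. Monte Carlo evaluates the real rules itself, so treat this as a way to reason about a rule set before saving it.

```python
import re

# Condition keys mirror the UI options; the evaluation logic is an assumption.
def schema_matches(name: str, condition: str, value: str) -> bool:
    if condition == "exact_match":
        return name == value
    if condition == "starts_with":
        return name.startswith(value)
    if condition == "ends_with":
        return name.endswith(value)
    if condition == "contains":
        return value in name
    if condition == "matches_regex":
        # fullmatch behaves like a pattern wrapped in ^ and $.
        return re.fullmatch(value, name) is not None
    raise ValueError(f"unknown condition: {condition}")


def split_schemas(schemas, exclude_rules):
    """Return (ingested, excluded) given (condition, value) exclude rules."""
    ingested, excluded = [], []
    for schema in schemas:
        if any(schema_matches(schema, cond, val) for cond, val in exclude_rules):
            excluded.append(schema)
        else:
            ingested.append(schema)
    return ingested, excluded


# Hypothetical schema names and rules.
schemas = ["analytics", "analytics_tmp", "staging", "raw_2023"]
rules = [("ends_with", "_tmp"), ("matches_regex", "raw_[0-9]+")]
ingested, excluded = split_schemas(schemas, rules)
```

All comparisons here are case-sensitive, so "Analytics" would not match an exact-match rule for "analytics".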

🚧
Best practices when using regex pattern for ingestion
Please adhere to the following guidelines to ensure clarity, compatibility, and efficiency for your ingestion. Misconfigured or burdensome regexes can hamper the critical ingestion pipeline:

Simplicity in Patterns: Utilize straightforward regex patterns. Complicated constructs like negative lookaheads and word boundaries are not fully supported.

Automatic Anchors: Our system automatically prepends ^ and appends $ to your regex patterns. Please do not include these characters in your patterns. If you need to use them, please reach out to our support team to learn more.

Ingestion rules are sensitive to the specific integration they are used for. For clarity, all regex patterns must adhere to POSIX standards. We strongly recommend selecting regex patterns that are compatible with the following technologies:

PostgreSQL

Python 3+: re library

The applicable database: the same regex pattern needs to work when querying the specific database, as the rule is evaluated there.
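
As a rough local pre-check, the guidelines above can be turned into a small linter. This is a heuristic sketch, not Monte Carlo's validation: the function name and messages are hypothetical, it flags every lookaround construct (the guidelines only name negative lookaheads, so this is deliberately conservative), and it only confirms that the pattern compiles in Python's re library. The pattern still needs to work in the target database, where the rule is evaluated.

```python
import re


def lint_ingestion_regex(pattern: str) -> list[str]:
    """Return warnings for a candidate ingestion regex (heuristic checks only)."""
    warnings = []
    # Anchors are added automatically, so explicit ones should be dropped.
    ends_with_anchor = pattern.endswith("$") and not pattern.endswith("\\$")
    if pattern.startswith("^") or ends_with_anchor:
        warnings.append("remove explicit ^ or $ anchors; they are added automatically")
    # Lookaheads and lookbehinds: (?=...), (?!...), (?<=...), (?<!...)
    if re.search(r"\(\?<?[=!]", pattern):
        warnings.append("lookaround constructs are not fully supported")
    # Simple substring check; an escaped backslash before 'b' would be a false positive.
    if "\\b" in pattern or "\\B" in pattern:
        warnings.append("word boundaries are not fully supported")
    try:
        re.compile(pattern)
    except re.error as exc:
        warnings.append(f"does not compile in Python re: {exc}")
    return warnings
```

An empty list means none of these heuristics fired. It does not guarantee that the pattern is POSIX-compliant or accepted by every integration.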

Table Monitors

Under the appropriate Database on Settings > Table Monitors, select the Schema that contains tables you want to add Table Monitors to.
By default, no tables are monitored in a Schema. Select "All tables" or "Select tables" at the top under Monitor rules to add Table Monitors. For "Select tables", there are several rule types you can use to either include or exclude tables from being monitored:

Rule type	Description	Supported operators
Table name	Match on the table name	Starts with, Ends with, Contains, Matches pattern
Table type	Match on the type of table	is
Table tag	Match on tags that exist on a table	is one of
Read or write activity	Read or write activity on the table within the specified range	7-31 days
Write activity	Write activity on the table within the specified range	7-31 days
Read activity	Read activity on the table within the specified range	7-31 days
All tables	Matches all tables	--

After selecting "Save", the With Table Monitors and Without Table Monitors columns will update. The numbers in the header will also reflect the updated total counts.

📘
Rules are case-sensitive
Be aware that schema and table names are case-sensitive when specifying patterns for exclusion from ingestion or inclusion in monitoring.

📘
Using the "Table Name: Matches pattern" rule
Use "*" to match one or more characters.
For example, the pattern "prod_*_table" matches a table named "prod_1_18_snapshot_table".
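
The wildcard behavior described above can be approximated in Python by translating each "*" into "one or more characters". This is a sketch under assumptions: the helper names are hypothetical, and it assumes the pattern must cover the whole table name and is case-sensitive. Note that Python's fnmatch is not a drop-in substitute, because its "*" also matches zero characters.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a '*' wildcard pattern into a compiled regex.

    Each '*' becomes '.+' (one or more characters); everything else is
    matched literally.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".+".join(literal_parts))


def table_matches(pattern: str, table_name: str) -> bool:
    # Assumption: the pattern has to match the entire table name.
    return pattern_to_regex(pattern).fullmatch(table_name) is not None
```

With this translation, "prod_*_table" matches "prod_1_18_snapshot_table" but not "prod__table", because the wildcard needs at least one character to consume.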

📘
Tip: Exclude just a few tables
Specify "All tables" in your monitor rules then add a few "Except table name" rules if you want to monitor most of a schema but exclude a few tables.
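
One way to reason about the tip above is as "include everything, then subtract the exceptions". The sketch below models that with a hypothetical NameRule class and a subset of the Table name operators; the class, its fields, and the evaluation order are illustrative assumptions, not Monte Carlo's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRule:
    """Hypothetical 'Except table name' rule."""
    operator: str  # "starts_with", "ends_with", or "contains"
    value: str

    def matches(self, table_name: str) -> bool:
        if self.operator == "starts_with":
            return table_name.startswith(self.value)
        if self.operator == "ends_with":
            return table_name.endswith(self.value)
        if self.operator == "contains":
            return self.value in table_name
        raise ValueError(f"unknown operator: {self.operator}")


def monitored_tables(all_tables, except_rules):
    """Start from 'All tables' and drop anything an except rule matches."""
    return [t for t in all_tables if not any(r.matches(t) for r in except_rules)]


# Made-up table names for illustration.
tables = ["orders", "orders_backup", "tmp_load", "customers"]
rules = [NameRule("ends_with", "_backup"), NameRule("starts_with", "tmp_")]
result = monitored_tables(tables, rules)
```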

Export list of assets with Table Monitors

A CSV export can be downloaded from the Table Monitors card in Settings > Table Monitors. The download includes all tables that are monitored at that point in time. Changes to the monitoring rules in the Usage UI are reflected immediately in any subsequent download of the CSV.

Columns included in the export: Integration, Database, Schema, Table Name, Type, Importance Score, Last Activity
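
Once downloaded, the export can be summarized with a few lines of Python. The sketch below assumes the CSV header row uses exactly the column names listed above; the sample rows and their value formats (for example, how Importance Score and Last Activity are rendered) are made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample; a real export would be read from the downloaded file instead.
SAMPLE_EXPORT = """Integration,Database,Schema,Table Name,Type,Importance Score,Last Activity
warehouse_a,analytics,sales,orders,table,0.9,2024-01-05
warehouse_a,analytics,sales,customers,table,0.7,2024-01-04
warehouse_a,analytics,finance,invoices,view,0.4,2024-01-02
"""


def count_by_schema(csv_text: str) -> Counter:
    """Count monitored tables per (Database, Schema) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["Database"], row["Schema"]) for row in reader)


counts = count_by_schema(SAMPLE_EXPORT)
```

To run this against a real download, read the file's contents into a string and pass it to count_by_schema.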

CLI

Management of the collection block list is supported on CLI v0.40.0+. View CLI docs here: https://clidocs.getmontecarlo.com/

You can see which schemas and entities you already have specified to be blocked from collection using the get-collection-block-list command.

% montecarlo management get-collection-block-list --help
Usage: montecarlo management get-collection-block-list [OPTIONS]

  List entities blocked from collection on this account.

Options:
  --resource-name TEXT  Name of a specific resource to filter by. Shows all
                        resources by default.
  --help                Show this message and exit.

You can make changes to the collection block list using the update-collection-block-list command.

% montecarlo management update-collection-block-list --help
Usage: montecarlo management update-collection-block-list [OPTIONS]

  Update entities for which collection is blocked on this account.

Options:
  --add / --remove                Whether the entities being specified should
                                  be added or removed from the block list.
                                  [required]
  --resource-name TEXT            Name of a specific resource to apply
                                  collection block to. This option cannot be
                                  used with 'filename'. This option requires
                                  setting 'project'.
  --project TEXT                  Top-level object hierarchy e.g. database,
                                  catalog, etc. This option cannot be used
                                  with 'filename'. This option requires
                                  setting 'resource-name'.
  --dataset TEXT                  Intermediate object hierarchy e.g. schema,
                                  database, etc. This option cannot be used
                                  with 'filename'. This option requires
                                  setting 'resource-name', and 'project'.
  --collection-block-list-filename TEXT
                                  Filename that contains collection block
                                  definitions. This file is expected to be in
                                  a CSV format with the headers resource_name,
                                  project, and dataset. This option cannot be
                                  used with 'resource-name', 'dataset', and
                                  'project'.
  --help                          Show this message and exit.

Resources are Monte Carlo integrations.
Projects correspond to a metastore in Databricks (such as hive_metastore) or a database in Redshift.
Datasets correspond to a schema in Databricks or Redshift.
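
For bulk changes, the file passed to --collection-block-list-filename can be generated rather than written by hand. The following sketch renders CSV text with the resource_name, project, and dataset headers named in the help output above. The function name and the sample entries are hypothetical, and every row fills in all three columns because this page does not say whether a blank dataset is accepted in the file.

```python
import csv
import io

HEADERS = ["resource_name", "project", "dataset"]


def render_block_list(entries: list[dict]) -> str:
    """Render block list entries as CSV text with the expected headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADERS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


# Hypothetical integration, database, and schema names.
csv_text = render_block_list([
    {"resource_name": "my-warehouse", "project": "analytics", "dataset": "scratch"},
    {"resource_name": "my-warehouse", "project": "analytics", "dataset": "tmp"},
])
```

After saving the text to a file such as block_list.csv, it could be applied with the options shown in the help output: montecarlo management update-collection-block-list --add --collection-block-list-filename block_list.csv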

GraphQL API

Manage which databases or schemas are excluded from ingestion.
- Get list of rules: GetCollectionBlockList
- Add a rule: addToCollectionBlockList
- Remove a rule: removeFromCollectionBlockList
- Update the list of rules: ModifyCollectionBlockList
  - ⚠️Caution⚠️ The complete list of rules to apply must be specified in the update. Take care to not wipe out all your rules!
Manage which tables are included for Table Monitors.
- Get list of rules: getMonitoredTableRuleList
- Update the list of rules: updateMonitoredTableRuleList. This can also be used to add or remove rules.
  - ⚠️Caution⚠️ The complete list of rules to apply must be specified in the update. Take care to not wipe out all your rules!
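
Because the update operations replace the complete rule list, a common safeguard is to read the current rules, merge the change locally, and only then send the full result. The sketch below covers only the merge step. The rule dictionaries and their keys are hypothetical placeholders, and the actual query and mutation arguments are not shown because they are not described on this page; consult the API reference for the real shapes.

```python
def merge_rules(existing, to_add=(), to_remove=(), allow_empty=False):
    """Build the complete rule list to send in a full-replace update.

    Rules are treated as opaque values; their real structure is an
    assumption left to the API reference.
    """
    merged = [rule for rule in existing if rule not in to_remove]
    for rule in to_add:
        if rule not in merged:
            merged.append(rule)
    # Guard against accidentally wiping out every rule.
    if existing and not merged and not allow_empty:
        raise ValueError("refusing to build an empty rule list; pass allow_empty=True to override")
    return merged


# Hypothetical rule shapes, for illustration only.
current = [{"pattern": "tmp_"}, {"pattern": "_backup"}]
updated = merge_rules(current, to_add=[{"pattern": "scratch"}], to_remove=[{"pattern": "tmp_"}])
```

The allow_empty guard is a design choice for this sketch: clearing all rules has to be an explicit decision rather than a side effect of a bad merge.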

Limitations

The following Integrations are not currently supported for configuration under Settings > Table Monitors:
- Glue
- Pinecone
- Kafka; Confluent
There is currently a limit of 100 table monitor/except rules within a given schema.
The criteria for a rule may only contain up to 255 characters.
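
If rules are generated programmatically, the two limits above can be checked before submitting. This is a minimal sketch: the function name is hypothetical, and it assumes the 255-character limit applies to the length of each rule's criteria string as Python counts characters.

```python
MAX_RULES_PER_SCHEMA = 100   # monitor/except rules per schema
MAX_CRITERIA_LENGTH = 255    # characters per rule criteria


def validate_rules(criteria: list[str]) -> list[str]:
    """Return a list of limit violations for one schema's rule criteria."""
    problems = []
    if len(criteria) > MAX_RULES_PER_SCHEMA:
        problems.append(
            f"{len(criteria)} rules exceed the limit of {MAX_RULES_PER_SCHEMA} per schema"
        )
    for index, value in enumerate(criteria, start=1):
        if len(value) > MAX_CRITERIA_LENGTH:
            problems.append(
                f"rule {index}: criteria is {len(value)} characters "
                f"(limit {MAX_CRITERIA_LENGTH})"
            )
    return problems
```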