Insights

Access Monte Carlo-generated metadata

We offer several ways to access the metadata Monte Carlo collects, supporting a range of analytical and tracking use cases. Many customers use this data to define and track SLI performance, monitor data asset usage, identify their most (or least) important data assets, and more.

This document outlines the options for accessing this data.

Dashboard

Users can download CSV reports directly from the UI. Navigate to the "Dashboard" tab and click to download the CSV reports:

Insight Reports

CLI

Users can also leverage the CLI to programmatically download all CSV reports and/or upload directly to S3.

Supported schemes:

  • file:// - save insight locally.
  • s3:// - save insight to S3.

Follow this guide to install and configure the CLI. For reference, see help for these commands:

❗️

These commands will overwrite a file if it exists in the path and create any missing directories or prefixes.

$ montecarlo insights
Usage: montecarlo insights [OPTIONS] COMMAND [ARGS]...

  Aggregated insights on your tables.

Options:
  --help  Show this message and exit.

Commands:
  get-cleanup-suggestions    Get cleanup suggestions insight.
  get-coverage-overview      Get coverage overview (monitors) insight.
  get-deteriorating-queries  Get deteriorating queries insight.
  get-events                 Get events insight.
  get-incident-queries       Get incident query changes insight.
  get-key-assets             Get key assets insight.
  get-rule-results           Get rule and SLI results insight.
  get-table-activity         Get table read/write activity insight.
  list                       List insights details and availability.

# Save an insight locally to a directory called 'mc_data' with filename 'assets.csv'
$ montecarlo insights get-key-assets file://mc_data/assets.csv

# Save an insight to an S3 bucket called 'bucket' with key 'mc_data/alerts.csv'
$ montecarlo insights get-events s3://bucket/mc_data/alerts.csv

# List all insights details and availability
$ montecarlo insights list
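
For scheduled exports, the commands above can be wrapped in a small script. The following is a minimal Python sketch, not an official client: it only assembles the same `montecarlo insights <command> <destination>` invocations shown above and checks that the destination uses one of the two supported prefixes. The helper names `build_command` and `export_all`, and the output file naming, are illustrative assumptions.

```python
import subprocess

# Subcommands copied from the `montecarlo insights` help output above.
INSIGHT_COMMANDS = [
    "get-cleanup-suggestions",
    "get-coverage-overview",
    "get-deteriorating-queries",
    "get-events",
    "get-incident-queries",
    "get-key-assets",
    "get-rule-results",
    "get-table-activity",
]
SUPPORTED_PREFIXES = ("file://", "s3://")


def build_command(command, destination):
    """Return the argv list for one insight export (hypothetical helper)."""
    if command not in INSIGHT_COMMANDS:
        raise ValueError(f"unknown insight command: {command}")
    if not destination.startswith(SUPPORTED_PREFIXES):
        raise ValueError(f"destination must start with one of {SUPPORTED_PREFIXES}")
    return ["montecarlo", "insights", command, destination]


def export_all(prefix):
    """Run every export, writing <prefix>/<insight_name>.csv.

    Existing files at the destination are overwritten, as noted above.
    """
    for command in INSIGHT_COMMANDS:
        name = command[len("get-"):].replace("-", "_")
        subprocess.run(build_command(command, f"{prefix}/{name}.csv"), check=True)
```

Calling `export_all("file://mc_data")` would then run each export in turn and stop at the first command that exits with an error. This assumes the CLI is already installed and configured.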

Snowflake Data Marketplace

Customers can request access to a subset of the Insight reports directly from their Snowflake environment via Snowflake's Data Sharing capabilities. This is available for Snowflake accounts in the following regions:

  • AWS: us-east-1, us-west-2, ca-central-1, eu-west-1, eu-central-1
  • Azure: us-east-2, eu-west

Reach out to your CSM or to [email protected] to learn more about this feature!

Insights Inventory

Key Assets
Tables and views with a calculated Importance Score (based on number of dependencies, average reads/writes, users, and more). The lookback window for this insight is 30 days.
Use this to:

  • Identify important tables to add custom monitors to
  • Communicate table/view usage to stakeholders
  • Prioritize datasets for refactoring or migrations
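
As a rough illustration of the first use case, the sketch below ranks a downloaded Key Assets CSV by score to shortlist tables for custom monitors. The column names `full_table_id` and `importance_score` and the sample rows are assumptions made for this example; check the header of your actual export and adjust them.

```python
import csv
import io


def top_key_assets(csv_file, n=10, id_col="full_table_id", score_col="importance_score"):
    """Return (table id, score) for the n highest-scoring rows.

    id_col and score_col are assumed column names, not confirmed ones.
    Rows with an empty score are skipped.
    """
    rows = [r for r in csv.DictReader(csv_file) if r.get(score_col)]
    rows.sort(key=lambda r: float(r[score_col]), reverse=True)
    return [(r[id_col], float(r[score_col])) for r in rows[:n]]


# Made-up rows standing in for a real export such as mc_data/assets.csv.
sample = io.StringIO(
    "full_table_id,importance_score\n"
    "db:analytics.orders,0.92\n"
    "db:analytics.tmp_scratch,0.11\n"
    "db:analytics.customers,0.78\n"
)
top = top_key_assets(sample, n=2)
print(top)
```

With a real export, pass an open file handle instead of the in-memory sample, for example `open("mc_data/assets.csv", newline="")`.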

Coverage Overview
Active monitors that Monte Carlo is applying to your data, including all out-of-the-box and advanced monitors.
Use this to:

  • Quantify MC monitor coverage across schemas and tables
  • Report increase in custom monitor adoption over time

Table Cleanup Suggestions
Dormant tables and views with no recent query activity.
Use this to:

  • Reduce storage costs by deleting unused tables
  • Deprioritize unused datasets and tables during migrations

Field-level Cleanup Suggestions
Dormant fields with no recent query activity.
Use this to:

  • Prioritize which tables are most ripe for cleanup
  • Simplify tables by removing unused fields

Events
Details for all anomalies and schema changes detected by Monte Carlo, which then get grouped together into incidents. The lookback window for this insight is 90 days.
Use this to:

  • Understand the threshold for what triggered an anomaly
  • See which tables or views are consistently changing or unreliable

Table Read/Write Activity
Table read and write counts by day/week/month for each table in your data warehouse or data lake.
Use this to:

  • Aggregate read/write query activity by schema, team, or other filter
  • Gauge importance and prioritize development on tables with the highest read/write activity

Incident Query Changes
Changes in query patterns that may have led to an Incident on a table. Covers Incidents from the past two weeks, with a 36-hour delay.
Use this to:

  • Identify if an Incident was caused by a new or updated query
  • Identify if an Incident was caused by a recurring query that ceased

Deteriorating Queries
Queries displaying a consistent increase in execution runtime over the past 30 days. This insight will be available only if Monte Carlo detects anomalies in query runtime.
Use this to:

  • Prevent data downtime by identifying queries that are at risk of timing out
  • Optimize queries that are not scaling well
  • Reduce cost by editing queries with increasing compute time

Rule and SLI Results
Log of query outputs and pass/breach results for SQL Rules and SLIs for the past 90 days.
Use this to:

  • Track progress towards SLAs or SLOs by aggregating SLI results over time
  • Fine-tune Rules by understanding Rule results over time
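
One way to track progress toward an SLO is to reduce the pass/breach log to a pass rate per rule. The sketch below does this for a downloaded Rule and SLI Results CSV. The column names `rule_name` and `status`, the `pass` status value, and the sample rows are assumptions for illustration and may not match the real export.

```python
import csv
import io
from collections import defaultdict


def pass_rate_by_rule(csv_file, rule_col="rule_name", status_col="status", pass_value="pass"):
    """Return {rule: fraction of results that passed}.

    rule_col, status_col and pass_value are assumed names; adjust to your export.
    """
    counts = defaultdict(lambda: [0, 0])  # rule -> [passes, total]
    for row in csv.DictReader(csv_file):
        entry = counts[row[rule_col]]
        if row[status_col].strip().lower() == pass_value:
            entry[0] += 1
        entry[1] += 1
    return {rule: passes / total for rule, (passes, total) in counts.items()}


# Made-up rows standing in for a real export.
sample = io.StringIO(
    "rule_name,status\n"
    "orders_not_null,pass\n"
    "orders_not_null,pass\n"
    "orders_not_null,breach\n"
    "orders_not_null,pass\n"
    "revenue_positive,breach\n"
    "revenue_positive,pass\n"
)
rates = pass_rate_by_rule(sample)
print(rates)
```

Comparing each rate against a target (for example 0.99) gives a simple SLO check. Grouping by date as well as by rule would show the trend over the 90-day window.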

Dimension Tracking Suggestions
Top 500 field recommendations for Dimension Tracking monitors, based on Table Importance Score (see Key Assets) and by using NLP on field names.
Use this to:

  • Identify fields for Dimension Tracking monitors

Field Health Suggestions
Top 100 table recommendations for Field Health monitors, based on Table Importance Score (see Key Assets). Note that we exclude external tables as they tend to consume more compute.
Use this to:

  • Identify tables for Field Health monitors

Misconfigured Monitors
Custom monitors (e.g. Field Health) that were configured in a way that won't result in meaningful anomaly detection, along with suggestions on how to fix them.
Use this to:

  • Ensure custom monitors are working as intended
  • Clean up outdated custom monitors

BI Dashboard Analytics
Usage data for Looker dashboards, such as total views and days since last view.
Use this to:

  • See which dashboards are safe to deprecate or delete
  • Prioritize heavily used dashboards in a migration

Incident History
All incidents detected by Monte Carlo over the last 6 months. The lookback window for this insight is 1 year.
Use this to:

  • Report on the statuses of your Incidents
  • Calculate Incident Response Rate and Time to First Response
  • See trends in your anomalies and schema changes
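
Incident Response Rate and Time to First Response can be derived from an Incident History CSV roughly as follows. This is a sketch under stated assumptions: the column names `created_at` and `first_response_at`, the ISO 8601 timestamp format, and the convention that an empty value means "no response yet" are all hypothetical and should be checked against the real export.

```python
import csv
import io
from datetime import datetime


def response_metrics(csv_file, created_col="created_at", responded_col="first_response_at"):
    """Return (response_rate, mean_hours_to_first_response).

    Column names are assumptions. An empty first-response value is
    treated as an incident that has not been responded to.
    """
    total = 0
    hours = []
    for row in csv.DictReader(csv_file):
        total += 1
        if row[responded_col]:
            created = datetime.fromisoformat(row[created_col])
            responded = datetime.fromisoformat(row[responded_col])
            hours.append((responded - created).total_seconds() / 3600)
    rate = len(hours) / total if total else 0.0
    mean_hours = sum(hours) / len(hours) if hours else None
    return rate, mean_hours


# Made-up rows standing in for a real export.
sample = io.StringIO(
    "incident_id,created_at,first_response_at\n"
    "1,2024-01-01T00:00:00,2024-01-01T02:00:00\n"
    "2,2024-01-02T08:00:00,2024-01-02T12:00:00\n"
    "3,2024-01-03T09:00:00,\n"
    "4,2024-01-04T09:00:00,2024-01-04T09:00:00\n"
)
rate, mean_hours = response_metrics(sample)
print(rate, mean_hours)
```

The mean is sensitive to a few slow responses, so a median (`statistics.median`) may be a better summary for reporting.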

Heavy Queries
Queries from the last week with the longest runtime and most bytes scanned. The top 5 queries are shown for each warehouse and executor.
Use this to:

  • Find and optimize queries that are causing performance issues or stressing resources
  • Identify irresponsible query behavior
