Data Collection: Details per Integration
See the table below for where and how Monte Carlo collects data for each integration.
| Integration | Metadata | Query Logs | Freshness | Volume | Advanced Monitors |
| --- | --- | --- | --- | --- | --- |
| Redshift | Every hour, from the information schema | Every 10 minutes, from the internal log | Calculated from query logs, based on queries that are deemed to update tables | Taken from metadata every hour | Collected through SQL queries, based on user configuration |
| Snowflake | Every hour, from the information schema | Every hour, from the internal log | Taken from metadata every hour | Taken from metadata every hour | Collected through SQL queries, based on user configuration |
| BigQuery | Every hour, from the information schema | Every hour, from the internal log | Taken from metadata every hour | Taken from metadata every hour | Collected through SQL queries, based on user configuration |
| Databricks | Every hour, from the metastore | Every hour, from the internal log | Taken from metadata every hour | Row volume, taken with metadata every hour | Collected through queries, based on user configuration. We recommend SQL Warehouses. |
| Data Lakes on S3 | Every hour, from the metastore (Glue/Hive) | Every hour, from Hive logs on S3 / Presto logs on S3 / Athena | Collected through SQL queries, based on user configuration. Queries can be executed using Hive/Presto/Athena/Spark | | |
| Tableau API | Every 12 hours, from the API | | | | |
| Looker Git | Every 12 hours, from cloud-hosted repositories | | | | |
| Looker API | Every 4 days, from the API. Note: the Looker API connection retrieves data only every 4 days due to Looker API limits. | | | | |
| PowerBI | Every 12 hours, from the API | | | | |
| dbt Cloud | Every hour, from the API | | | | |
| dbt Core | No interval; pushed to Monte Carlo when the CLI command is run | | | | |
| Airflow | No interval; pushed to Monte Carlo when DAGs run | Inferred from tags on queries run in the warehouse | | | |
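For Redshift, the Freshness column above notes that freshness is calculated from query logs, based on queries that are deemed to update tables. The Python sketch below illustrates that general idea only; the log format, the write-statement pattern, and the `last_update` helper are illustrative assumptions, not Monte Carlo's implementation.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Illustrative assumption: query-log entries are (timestamp, sql_text) pairs.
# Statements considered to "update" a table (a simplified pattern).
UPDATE_PATTERN = re.compile(
    r"\b(INSERT\s+INTO|COPY|MERGE\s+INTO|UPDATE|DELETE\s+FROM)\s+",
    re.IGNORECASE,
)


def last_update(table: str, query_log: Iterable[Tuple[datetime, str]]) -> Optional[datetime]:
    """Return the timestamp of the most recent logged query deemed to update `table`."""
    latest = None
    for ts, sql in query_log:
        # A query counts as an update if it is a write statement that references the table.
        if UPDATE_PATTERN.search(sql) and table.lower() in sql.lower():
            if latest is None or ts > latest:
                latest = ts
    return latest


# Freshness for the table is then "now minus last_update(table)".
```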
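The Advanced Monitors column says data is collected through SQL queries, based on user configuration. The sketch below shows one way a user-supplied configuration could be turned into such a query; the config shape, table, and field names are hypothetical examples, not Monte Carlo's actual monitor definitions.

```python
from dataclasses import dataclass


@dataclass
class NullRateMonitor:
    """Hypothetical monitor config: measure how often a field is NULL in a table."""
    table: str  # fully qualified table name, e.g. "analytics.orders" (hypothetical)
    field: str  # column to check, e.g. "customer_id" (hypothetical)

    def to_sql(self) -> str:
        # Build the SQL the collector would run in the warehouse for this monitor.
        return (
            f"SELECT COUNT(*) AS row_count, "
            f"SUM(CASE WHEN {self.field} IS NULL THEN 1 ELSE 0 END) AS null_count "
            f"FROM {self.table}"
        )


# Example usage: generate the query for a configured monitor.
print(NullRateMonitor(table="analytics.orders", field="customer_id").to_sql())
```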