Data Collection: Details per Integration

The table below shows where and how often Monte Carlo collects data for each integration.

| Integration | Metadata | Query Logs | Freshness | Volume | Advanced Monitors |
| --- | --- | --- | --- | --- | --- |
| Redshift | 1 hour from information schema | 10 minutes from internal log | Calculated from query logs, based on queries that are deemed to update tables | Taken from metadata information every hour | Collected through SQL queries, based on user configuration |
| Snowflake | 1 hour from information schema | 1 hour from internal log | Taken from metadata information every hour | Taken from metadata information every hour | Collected through SQL queries, based on user configuration |
| BigQuery | 1 hour from information schema | 1 hour from internal log | Taken from metadata information every hour | Taken from metadata information every hour | Collected through SQL queries, based on user configuration |
| Databricks | 1 hour from metastore | 1 hour from internal log | Taken from metadata information every hour | Row volume from metadata every hour | Collected through queries, based on user configuration. We recommend SQL Warehouses. |
| Data Lakes on S3 | 1 hour from metastore (Glue/Hive) | 1 hour from Hive logs on S3/Presto logs on S3/Athena | | | Collected through SQL queries, based on user configuration. Queries can be executed using Hive/Presto/Athena/Spark. |
| Tableau API | 12 hours from API | | | | |
| Looker Git | 12 hours from cloud-hosted repos | | | | |
| Looker API | 4 days | | | | |
| PowerBI | 12 hours from API | | | | |
| dbt Cloud | 1 hour from API | | | | |
| dbt Core | No interval - pushed to Monte Carlo when CLI command is run | | | | |
| Airflow | No interval - pushed to Monte Carlo when DAGs run | Inferred from tags on queries run in warehouse | | | |

Note: The Looker API connection retrieves data every 4 days due to Looker API limits.
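
For Redshift, freshness is calculated from query logs, based on queries that are deemed to update tables. This page does not say how Monte Carlo makes that decision, so the following is only a minimal sketch of the general idea, under a simple assumption: a query counts as an update if it starts with a write keyword such as `INSERT INTO`, `UPDATE`, `DELETE FROM`, `MERGE INTO`, or `COPY`. The function name `latest_updates`, the `WRITE_PATTERN` heuristic, and the sample log entries are all hypothetical and are not part of any Monte Carlo or Redshift API.

```python
import re
from datetime import datetime

# Hypothetical heuristic: a statement that begins with one of these keywords
# is treated as a write to the table named right after the keyword.
WRITE_PATTERN = re.compile(
    r"^\s*(?:INSERT\s+INTO|UPDATE|DELETE\s+FROM|MERGE\s+INTO|COPY)\s+([\w.]+)",
    re.IGNORECASE,
)


def latest_updates(query_log):
    """Map each table to the finish time of its most recent write-like query.

    query_log is an iterable of (finished_at, sql_text) pairs.
    """
    latest = {}
    for finished_at, sql in query_log:
        match = WRITE_PATTERN.match(sql)
        if match is None:
            continue  # read-only or unrecognized statement, ignored
        table = match.group(1).lower()
        if table not in latest or finished_at > latest[table]:
            latest[table] = finished_at
    return latest


# Illustrative log entries, made up for this example.
sample_log = [
    (datetime(2024, 1, 1, 8, 0), "INSERT INTO analytics.orders SELECT * FROM staging.orders"),
    (datetime(2024, 1, 1, 9, 30), "SELECT count(*) FROM analytics.orders"),
    (datetime(2024, 1, 1, 10, 15), "update analytics.orders SET status = 'shipped'"),
    (datetime(2024, 1, 1, 7, 45), "COPY analytics.events FROM 's3://example-bucket/events/'"),
]

freshness = latest_updates(sample_log)
for table, updated_at in sorted(freshness.items()):
    print(table, updated_at.isoformat())
```

In this sketch the `SELECT` entry is skipped, so `analytics.orders` is reported as last updated by the later `UPDATE` statement. A real implementation would need proper SQL parsing to handle CTEs, quoted identifiers, and multi-statement queries, which a keyword match cannot cover.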