Airflow
With the Monte Carlo Airflow Integration, you can be alerted to failures, quickly determine which Airflow DAG may have caused a data-level issue, and control Airflow with Rules and Circuit Breakers. Seeing Airflow and your Data Warehouse together in a single pane of glass speeds up incident investigation and resolution.
Setup and Getting Started
- Create the Airflow integration at https://getmontecarlo.com/settings/integrations
- Set up Airflow Lineage with the Airflow in Lineage annotation setup documentation.
- Set up Airflow Alerts and Tasks with the Airflow alerts and Task Observability callback setup documentation (strongly recommended)
- Use Circuit Breakers to control Airflow based on Monte Carlo rules with the Circuit breakers setup documentation (optional)
Airflow DAGs and Tasks in Monte Carlo Lineage
Understand which Airflow Task creates the lineage between two tables in your Data Warehouse. Click the Airflow icon to see recent DAG & Task Runs for this lineage edge.
To set up Airflow Lineage, follow the Airflow in Lineage setup documentation.
Airflow DAG and Task Observability - Alerts
Use Monte Carlo as a single pane of glass for all data quality context for each table, including Airflow DAG/task runs. Check the status and duration of recent Airflow DAGs & Tasks relevant to a table of interest.
To set up DAG & Task Run Tracking, follow the Airflow DAG & Task Observability setup documentation. It's strongly recommended to also set up the Airflow in Lineage integration so that DAGs are properly correlated with Tables.
Monte Carlo Notifications of Airflow Failure Alerts
Monte Carlo allows you to surface Airflow failures and errors as Monte Carlo alerts. Among other things, this will enable you to:
- Route and receive notifications similar to other Monte Carlo alerts
- Analyze the downstream impact of those alerts
- Create holistic incident reporting and tooling for all data issues
To set this up:
- Create an audience that includes an "Other Notification"
- Select Airflow Job Failures as the alert type
- Under "Affected Data", select "Databases, schemas, tables, jobs, and tags" and add the Airflow DAGs of interest to include in this audience.
If a domain already includes the desired Airflow DAGs, you can also select that domain as "Affected Data" to include alerts for those DAGs in the audience.
If "All" is selected as "Affected Data" for an audience, alerts for all Airflow DAGs will be included in that audience.
Monte Carlo Airflow Operators for Rules & Circuit Breakers
Run Monte Carlo rules or "circuit break" pipelines when data does not meet a set of quality or integrity thresholds. This is useful, for example, for checking whether data meets your requirements between transformation steps, or after ETL/ELT jobs execute but before BI dashboards are updated.
To set up Airflow Operators or Circuit Breakers, follow the Circuit breakers setup documentation.
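As a rough illustration of the circuit-breaker idea described above, the following standard-library Python sketch halts a pipeline when a rule is breached, so that later steps (such as a dashboard refresh) never run. It does not use the Monte Carlo operators or Airflow itself; `RuleResult`, `circuit_breaker`, `run_pipeline`, and the step and rule names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


class CircuitBreakerTripped(Exception):
    """Raised when a rule is breached and downstream steps must not run."""


@dataclass
class RuleResult:
    rule_name: str  # hypothetical label for the rule being checked
    breached: bool  # True when the data failed the rule's threshold


def circuit_breaker(check_rule: Callable[[], RuleResult]) -> None:
    """Evaluate a rule and raise if it is breached."""
    result = check_rule()
    if result.breached:
        raise CircuitBreakerTripped(f"rule '{result.rule_name}' breached")


def run_pipeline(steps: List[Callable[[], None]]) -> List[str]:
    """Run steps in order and return the names of the steps that completed."""
    completed: List[str] = []
    for step in steps:
        try:
            step()
        except CircuitBreakerTripped:
            break  # stop here; nothing downstream is executed
        completed.append(step.__name__)
    return completed


# Hypothetical pipeline: transform, check quality, then refresh a dashboard.
def transform() -> None:
    pass


def check_row_count() -> None:
    # Assumption for the example: the rule reports a breach.
    circuit_breaker(lambda: RuleResult("row_count_not_zero", breached=True))


def refresh_dashboard() -> None:
    pass


completed_steps = run_pipeline([transform, check_row_count, refresh_dashboard])
print(completed_steps)
```

In this sketch the breaker sits between the transformation and the dashboard refresh, which mirrors the "after ETL/ELT jobs execute but before BI dashboards are updated" placement. For the actual operators and their parameters, refer to the Circuit breakers setup documentation.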
FAQs
Why don't I have to give my Airflow credentials, as with other integrations?
Monte Carlo observes Airflow in two ways: through your existing query log integrations, and through callbacks that Airflow sends to report status to Monte Carlo. Monte Carlo does not reach out to Airflow to gather information, so it does not need Airflow credentials.
Are multiple Airflow instances supported?
Yes! For information on how to support lineage for multiple Airflow environments, see https://docs.getmontecarlo.com/docs/airflow-in-lineage#multiple-airflow-environments
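The push model behind callbacks can be sketched in plain Python. This is only an illustration of the direction of data flow (the pipeline reports outward, nothing polls it); it is not the Monte Carlo callback implementation. The context here is a simplified dictionary rather than Airflow's real context object, and `build_status_payload`, `make_callback`, the payload fields, and the `send` function are assumptions made for the example.

```python
import json
from typing import Any, Callable, Dict, List


def build_status_payload(context: Dict[str, Any], state: str) -> str:
    """Serialize run identifiers and the final state as JSON."""
    payload = {
        "dag_id": context["dag_id"],
        "task_id": context.get("task_id"),  # None for DAG-level callbacks
        "run_id": context["run_id"],
        "state": state,
    }
    return json.dumps(payload, sort_keys=True)


def make_callback(
    state: str, send: Callable[[str], None]
) -> Callable[[Dict[str, Any]], None]:
    """Build a callback that pushes a status payload through `send`."""

    def callback(context: Dict[str, Any]) -> None:
        send(build_status_payload(context, state))

    return callback


# Stand-in for an outbound request: collect payloads in a list.
sent: List[str] = []
on_failure = make_callback("failed", sent.append)

# Simulate the scheduler invoking the callback after a task fails.
on_failure(
    {"dag_id": "daily_orders", "task_id": "load_orders", "run_id": "run_1"}
)
print(sent[0])
```

Because the callback runs inside the Airflow environment and sends data outward, the receiving side never needs a connection into Airflow, which is why no Airflow credentials are requested.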
Which Airflow providers are supported?
All providers are supported, including MWAA (AWS), Cloud Composer (GCP) and Astronomer, as well as self-managed installations.
How long does it take for Airflow data to show up in Monte Carlo?
DAG and Task run data will be immediately available in the Asset Page and Incidents. However, Airflow data in Lineage and the Catalog may have delays of up to 24 hours due to batch processing.
Does Monte Carlo support alerting or observability for SLA misses?
Currently, Monte Carlo does not support alerting or observability for SLA misses. If you are interested in having this functionality, please let your Monte Carlo representative know, or contact [email protected].
Is something missing? Request more functionality for the Monte Carlo Airflow Integration here.