Databricks Workflows (public preview)
Databricks Workflows is Databricks' managed orchestration service, which enables users to easily create and manage multitask workflows for ETL, analytics, and machine learning pipelines.
Integrating MC with Databricks Workflows allows you to quickly determine which Databricks workflow potentially caused a downstream anomaly, accelerating your time to resolution. You can also manage your workflow failure alerts alongside all other data quality alerts in Monte Carlo, giving you centralized incident triage, notification routing, and data quality reporting across all your data and system issues.
1. Databricks Workflows on Lineage and as Assets
Databricks Job on Lineage
Understand which Databricks Workflows job creates the lineage between two tables in your Databricks environment.
Click the Databricks icon to see recent job runs for this lineage edge.
Databricks Jobs As Assets
Use Monte Carlo as a single pane of glass for all data quality context for each table, including Databricks Job runs. Check the status and duration of recent Databricks job runs relevant to a table of interest.
Setup and Getting Started
If you set up the MC-Databricks integration after 10/1/2024 following the latest documentation, no additional configuration should be needed. If not, ensure the following two conditions are met:
i. Have a working MC-Databricks connection, following steps 1 through 6 of the instructions here, and
ii. Make sure SELECT permissions on the job-related system tables below are granted to the Service Principal. These are also covered in the docs here:
GRANT SELECT ON system.lakeflow.jobs TO <monte_carlo_service_principal>;
GRANT SELECT ON system.lakeflow.job_tasks TO <monte_carlo_service_principal>;
GRANT SELECT ON system.lakeflow.job_run_timeline TO <monte_carlo_service_principal>;
GRANT SELECT ON system.lakeflow.job_task_run_timeline TO <monte_carlo_service_principal>;
Monte Carlo will then automatically start collecting Databricks Workflows job information.
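To confirm the grants took effect, you can run a quick read against each system table using the Service Principal's credentials. The snippet below is a minimal sketch using the databricks-sql-connector Python package; the hostname, HTTP path, and token are placeholders that must be replaced with values for your workspace and the Monte Carlo Service Principal.

# Sketch: verify the Monte Carlo Service Principal can read the
# job-related system tables. Assumes `pip install databricks-sql-connector`
# and credentials for the Service Principal (placeholders below).
from databricks import sql

with sql.connect(
    server_hostname="<your-workspace>.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/<warehouse-id>",           # placeholder
    access_token="<service-principal-token>",                 # placeholder
) as connection:
    with connection.cursor() as cursor:
        # If any of these fail with a permissions error, re-run the
        # corresponding GRANT statement above.
        for table in (
            "system.lakeflow.jobs",
            "system.lakeflow.job_tasks",
            "system.lakeflow.job_run_timeline",
            "system.lakeflow.job_task_run_timeline",
        ):
            cursor.execute(f"SELECT * FROM {table} LIMIT 1")
            cursor.fetchall()
            print(f"OK: {table}")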
2. Databricks Workflows Failures as MC Alerts
Databricks Job Alerts in MC
Receive alerts on Databricks Workflows failures in Monte Carlo, so you can centrally manage and report on these issues along with all your other data incidents. More information on how alerts work in MC is available here.
Setup
1. Create Webhook Key and Secret in Monte Carlo
Create via UI
You can use the Monte Carlo UI to create a Databricks webhook integration key. On the Integrations page, under the Orchestration section, you will see a Databricks Workflows connection for each Databricks SQL Warehouse connection in Monte Carlo.
Click the three dots at the end of the line and select Manage Keys. You will use this key to create the webhook notification in Databricks.
Click the "Generate integration key" button; the generated integration key will appear below.
Create via CLI
Alternatively, you can use the Monte Carlo CLI to create a Databricks webhook integration key:
% montecarlo integrations create-databricks-webhook-key --help
Usage: montecarlo integrations create-databricks-webhook-key
[OPTIONS]
Create an integration key for a Databricks webhook
Options:
--integration-name TEXT Name of associated Databricks metastore integration
(required if you have more than one)
--option-file FILE Read configuration from FILE.
--help Show this message and exit.
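For example, if you have more than one Databricks metastore integration, pass the integration name explicitly (the value below is a placeholder):
% montecarlo integrations create-databricks-webhook-key --integration-name '<your-databricks-integration-name>'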
2. Add Webhook notification in Databricks
Note: This must be done by a workspace admin in Databricks.
Create a notification destination pointing to the Monte Carlo Databricks Webhook endpoint below.
https://integrations.getmontecarlo.com/webhooks/databricks
In the Databricks UI, under Settings, navigate to the Notifications tab and click "Manage" to create a new notification destination.
In the Create New Destination screen, select type "Webhook", give it a name, and paste the webhook endpoint URL from above into the form.
For more details on creating notification destinations in Databricks, see the Databricks documentation.
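Workspace admins who prefer to script this step can also create the destination programmatically. The sketch below assumes the Databricks Notification Destinations REST endpoint (/api/2.0/notification-destinations) and a workspace admin token; the host and token are placeholders.

# Sketch: create a webhook notification destination pointing at the
# Monte Carlo endpoint. Assumes a workspace admin personal access token.
import requests

DATABRICKS_HOST = "https://<your-databricks-workspace>"  # placeholder
ADMIN_TOKEN = "<workspace-admin-token>"                  # placeholder

response = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/notification-destinations",
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    json={
        "display_name": "Monte Carlo",
        "config": {
            "generic_webhook": {
                "url": "https://integrations.getmontecarlo.com/webhooks/databricks"
            }
        },
    },
)
response.raise_for_status()
# The returned id is the destination UUID used in later steps.
print(response.json()["id"])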
3. Add Notifications and Permissions to Databricks Jobs
Next, you will set up the notification and permission for each Databricks job whose failures you want routed to MC as alerts. You can do this in the Databricks UI for each job one by one, or use the script provided below to configure all jobs in bulk. We recommend the bulk script, because (i) it is fast to set up, and (ii) you can still enable or disable alert generation per job in MC settings in step 4 below.
via Databricks UI
For each job whose failures you want sent to Monte Carlo, go to that job's page in the Databricks UI. On the right-hand panel under Job Notifications, click "Edit notifications". Add a new notification, select the webhook destination you created in the previous step, check Failure, and save the notification.
As part of the failure alerting workflow, Monte Carlo uses the Databricks Jobs API to gather additional context for the alert, which requires the Can View permission on those jobs. To grant that permission, go back to the job's page in Databricks, navigate to the Permissions section on the right-hand side, and click "Edit permissions".
In the Permission Settings screen, grant your Monte Carlo service principal the Can View permission.
via Script (bulk option available) [Recommended]
Get the UUID of the Monte Carlo notification destination you created earlier via Databricks UI -> Settings -> Notifications. Click the copy button next to "Edit destination settings" to copy the destination ID as the UUID.
Then get the Application ID of the Service Principal used to grant Monte Carlo access. You can find this at https://<your-databricks-workspace>/settings/workspace/identity-and-access/service-principals
You will use the notification destination UUID and the MC Service Principal's Application ID in the script located here.
You can choose to enable alerts for all jobs, or specify which jobs to enable alerts for by using the --databricks-job-name option.
Usage: enable_monte_carlo_databricks_job_incidents.py [OPTIONS]
Enable Monte Carlo incidents for Databricks jobs
Options:
--mcd-notification-id TEXT UUID of the existing Databricks Notification
pointing to the MC Webhook endpoint.
[required]
--mcd-service-principal-name TEXT
Application ID of the existing Monte Carlo
service principal in Databricks. [required]
--databricks-job-name TEXT Databricks Job Name to enable MC incidents
for. Can be used multiple times. If not
specified, enable MC incidents for all jobs.
--help Show this message and exit.
Example:
python enable_monte_carlo_databricks_job_incidents.py --mcd-notification-id '<mc_webhook_uuid>' --mcd-service-principal-name '<application_id>' --databricks-job-name 'Dashboard Job' --databricks-job-name 'Job 2'
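Under the hood, a script like this needs to make two calls per job: one to attach the notification destination to the job's on-failure webhooks, and one to grant the Monte Carlo service principal the Can View permission. The following is a minimal sketch of those calls against the Databricks REST API (Jobs API 2.1 and the Permissions API) for a single job; it is an illustration rather than the actual script, and the host, token, and job ID are placeholders.

# Sketch: enable MC failure alerts for a single job. Host, token, and
# job ID are placeholders; the linked script handles all jobs in bulk.
import requests

DATABRICKS_HOST = "https://<your-databricks-workspace>"  # placeholder
ADMIN_TOKEN = "<workspace-admin-token>"                  # placeholder
JOB_ID = 123456789                                       # placeholder
MCD_NOTIFICATION_ID = "<mc_webhook_uuid>"
MCD_SERVICE_PRINCIPAL = "<application_id>"

headers = {"Authorization": f"Bearer {ADMIN_TOKEN}"}

# 1. Add the MC notification destination to the job's failure webhooks.
# Note: this overwrites any webhook_notifications already set on the job;
# a real script should merge with the job's existing settings.
requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/update",
    headers=headers,
    json={
        "job_id": JOB_ID,
        "new_settings": {
            "webhook_notifications": {
                "on_failure": [{"id": MCD_NOTIFICATION_ID}]
            }
        },
    },
).raise_for_status()

# 2. Grant the MC service principal Can View on the job.
requests.patch(
    f"{DATABRICKS_HOST}/api/2.0/permissions/jobs/{JOB_ID}",
    headers=headers,
    json={
        "access_control_list": [
            {
                "service_principal_name": MCD_SERVICE_PRINCIPAL,
                "permission_level": "CAN_VIEW",
            }
        ]
    },
).raise_for_status()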
4. Enable Alerts for desired jobs in Monte Carlo
On the same connection where you created the webhook key, open the three-dot menu and select the Configure Jobs option.
Here you can choose which jobs to generate alerts for. By default, alerts are disabled for all jobs. To bulk enable or disable jobs, select the checkboxes on the left for all jobs you want to alert on, click the blue "Alerts" button, and then enable or disable alerts for all selected jobs.
5. Set up audience
Finally, make sure to set up an audience in MC to route your Databricks workflow failures. To do so, go to the MC UI, Settings -> Notifications, then either create a new audience or edit an existing one to include your Databricks workflow failures.
You can send failures for all Databricks Workflows to an audience by selecting Databricks job failures in Alert Type and All for Affected Data.
Or you can route only a subset of Databricks jobs to an audience by selecting Databases, schemas, tables, jobs, and tags for Affected Data, then adding the jobs you want to include.
For more information on MC audiences, see Audiences.
Now you should start receiving Databricks workflow failure alerts in the desired audience!