Migrating the Databricks Metadata Job to a Job Cluster

  1. In your Databricks workspace, go to the Workflows pane.
  2. Click the Monte Carlo job; its name should be prefixed with monte-carlo-metadata-collection.
  3. On the right pane, under Compute, click the Swap button.
  4. In the pop-up box, click New job cluster.
  5. Create the job cluster with your desired settings. For the least expensive recommended settings, see below. If the job takes too long at these settings, you can increase the cluster size or the number of workers.

    1. Nodes: Single node

    2. Node type: i4i.large (cheapest AWS cluster available)

    3. Important 🚨: Under Advanced options, on the Spark tab, append the following entries under Spark config:

      spark.databricks.clusterSource API
      spark.databricks.hive.metastore.client.pool.size 40
      spark.databricks.isv.product MonteCarlo+ObservabilityPlatform
  6. Click Confirm, then Update.
  7. Under Permissions, make sure the Service Principal or User (for a Personal Access Token) is set as Is Owner.
  8. Verify that the job is able to run with the available permissions.
  9. Run the following command with the Monte Carlo CLI:
    1. montecarlo integrations update --connection-id <connection_id> --changes '{"uses_job_cluster": true}'
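For reference, the job cluster settings above can also be expressed as a Databricks Jobs API `new_cluster` spec. This is a sketch, not an exact export: the `spark_version` value is an assumption (use whatever runtime your workspace offers), while the single-node profile keys (`spark.databricks.cluster.profile`, `spark.master`, and the `ResourceClass` tag) follow Databricks' documented single-node convention.

```json
{
  "new_cluster": {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "i4i.large",
    "num_workers": 0,
    "spark_conf": {
      "spark.databricks.cluster.profile": "singleNode",
      "spark.master": "local[*]",
      "spark.databricks.clusterSource": "API",
      "spark.databricks.hive.metastore.client.pool.size": "40",
      "spark.databricks.isv.product": "MonteCarlo+ObservabilityPlatform"
    },
    "custom_tags": {
      "ResourceClass": "SingleNode"
    }
  }
}
```

The three `spark_conf` entries from step 5 are included alongside the single-node keys; if you later scale up, remove the single-node keys and set `num_workers` (or an autoscale range) instead.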