Spark
To connect Monte Carlo to a Spark cluster for health queries, follow these steps:
- Create a Spark cluster for Monte Carlo. Alternatively, you may use an existing cluster that you already have in your environment. Ensure that auto-termination is off.
- Ensure that the Spark Thrift Server is enabled. Different distributions handle this server differently, for example, it is enabled by default on Databricks but needs to be enabled explicitly on EMR.
- Ensure that Monte Carlo's data collector has network connectivity to the cluster (VPC peering might be required).
- Create a service account on your Spark cluster (if necessary).
- Provide service account credentials to Monte Carlo as described below.
Adding a Databricks connection
- A more convenient way to add a connection to an all-purpose cluster is through the integrations wizard in the Monte Carlo application.
- The minimum supported Databricks version is 7.x.
Providing account credentials to Monte Carlo
You will provide connection details for Spark using Monte Carlo's CLI:
- Please follow this guide to install and configure the CLI.
- Monte Carlo supports three types of connections: binary, HTTP and Databricks. Please use the command
montecarlo integrations [add-spark-binary-mode | add-spark-http-mode | add-spark-databricks]
to set up Spark connectivity. For reference, see help for the commands below.
$ montecarlo integrations add-spark-binary-mode --help
Usage: montecarlo integrations add-spark-binary-mode [OPTIONS]
Setup a thrift binary Spark integration. For health queries.
Options:
--host TEXT Hostname. [required]
--database TEXT Name of database. [required]
--port INTEGER Port. [default: 10000]
--user TEXT Username with access to spark. [required]
--password TEXT User\'s password. If you prefer a prompt (with hidden
input) enter -1. [required]
--name TEXT Friendly name of the warehouse which the connection
will belong to.
--collector-id UUID ID for the data collector. To disambiguate accounts
with multiple collectors.
--skip-validation Skip all connection tests. This option cannot be used
with 'validate-only'.
--validate-only Run connection tests without adding. This option cannot
be used with 'skip-validation'.
--auto-yes Skip any interactive approval.
--option-file FILE Read configuration from FILE.
--help Show this message and exit.
$ montecarlo integrations add-spark-http-mode --help
Usage: montecarlo integrations add-spark-http-mode [OPTIONS]
Setup a thrift HTTP Spark integration. For health queries.
Options:
--url TEXT HTTP URL. [required]
--user TEXT Username with access to spark. [required]
--password TEXT User\'s password. If you prefer a prompt (with hidden
input) enter -1. [required]
--name TEXT Friendly name of the warehouse which the connection
will belong to.
--collector-id UUID ID for the data collector. To disambiguate accounts
with multiple collectors.
--skip-validation Skip all connection tests. This option cannot be used
with 'validate-only'.
--validate-only Run connection tests without adding. This option cannot
be used with 'skip-validation'.
--auto-yes Skip any interactive approval.
--option-file FILE Read configuration from FILE.
--help Show this message and exit.
$ montecarlo integrations add-spark-databricks --help
Usage: montecarlo integrations add-spark-databricks [OPTIONS]
Setup a Spark integration for Databricks. For health queries.
Options:
--databricks-workspace-url TEXT
Databricks workspace URL. [required]
--databricks-workspace-id TEXT Databricks workspace ID. [required]
--databricks-cluster-id TEXT Databricks cluster ID. [required]
--databricks-token TEXT Databricks access token. If you prefer a
prompt (with hidden input) enter -1.
[required]
--name TEXT Friendly name of the warehouse which the connection
will belong to.
--collector-id UUID ID for the data collector. To disambiguate
accounts with multiple collectors.
--skip-validation Skip all connection tests. This option
cannot be used with 'validate-only'.
--validate-only Run connection tests without adding. This
option cannot be used with 'skip-
validation'.
--auto-yes Skip any interactive approval.
--option-file FILE Read configuration from FILE.
--help Show this message and exit.
Updated about 1 year ago