Dremio (public preview)

What is Dremio?

Dremio is a data lakehouse platform that enables fast and efficient querying of large datasets across various data sources without requiring data movement. It combines the flexibility of data lakes with the performance of data warehouses, offering advanced features like in-memory acceleration, semantic layers, and seamless integration with BI tools. Organizations use Dremio to simplify data access, accelerate analytics, and empower users to explore and analyze data directly from their data lakes in real time.

Why Connect Dremio to Monte Carlo?

Integrating Monte Carlo with Dremio allows you to monitor your Dremio data sources through custom SQL monitors, which can be created in the UI wizard or programmatically via monitors as code, the API, or the SDK. These monitors can notify relevant stakeholders of incidents and circuit-break pipelines. For a full list of supported monitor types, see Monitor & Lineage Support.

Monitor & Lineage Support

Below are the monitors and lineage capabilities that the Dremio integration supports today. Please reach out to your Monte Carlo representative if you need additional monitors.

Category             Monitor / Lineage Capabilities    Support
Table Monitor        Freshness
Table Monitor        Volume
Table Monitor        Schema Changes
Metric Monitor       Metric
Validation Monitor   Custom SQL
Validation Monitor   Comparison
Validation Monitor   Validation
Job Monitor          Query performance
Lineage              Lineage

Connecting to Dremio

Use the Monte Carlo CLI to add the Dremio integration. If you have not set up the Monte Carlo CLI before, start here.

Note: You will need Monte Carlo CLI v0.104.1 or later.
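The version requirement can be checked before going further. The following is a minimal sketch, not an official procedure: it assumes the CLI was installed from the `montecarlodata` pip package and that your `sort` supports `-V` (version sort, as in GNU coreutils). The `version_ge` helper is a hypothetical name introduced for illustration.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V`, which orders dotted version strings numerically.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED="0.104.1"

# Assumption: the CLI comes from the `montecarlodata` pip package.
INSTALLED="$(pip show montecarlodata 2>/dev/null | awk '/^Version:/ {print $2}')"

if [ -z "$INSTALLED" ]; then
    echo "Monte Carlo CLI not found via pip"
elif version_ge "$INSTALLED" "$REQUIRED"; then
    echo "CLI $INSTALLED meets the $REQUIRED requirement"
else
    echo "CLI $INSTALLED is older than $REQUIRED; upgrade before adding Dremio"
fi
```

A plain string comparison would rank 0.99.0 above 0.104.1, which is why the sketch uses a version-aware sort.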

Monte Carlo will use a Personal Access Token to authenticate to Dremio. For steps on how to create a Dremio PAT, see the instructions below for your type of Dremio deployment:

Now that we have the Monte Carlo CLI installed and a Dremio PAT created, we can add the integration.

The structure of the command is:

montecarlo integrations add-dremio [OPTIONS]

To get a full list of options, run: montecarlo integrations add-dremio --help

Usage: montecarlo integrations add-dremio [OPTIONS]

  Setup a Dremio integration. For metadata, and custom SQL monitors.

Options:
  --name TEXT          Friendly name for the created integration (e.g.
                       warehouse). Name must be unique.  [required]
  --token TEXT         Token for authentication  [required]
  --host TEXT          Hostname of coordinator node or data.dremio.cloud if
                       using Dremio cloud.  [required]
  --port TEXT          Dremio Arrow Flight server port. 443 if using Dremio
                       Cloud  [required]
  --agent-id UUID      ID for the agent. To disambiguate accounts with
                       multiple agents. This option cannot be used with 'dc-
                       id'.
  --collector-id UUID  ID for the data collector. To disambiguate accounts
                       with multiple collectors. This option cannot be used
                       with 'agent-id'.
  --skip-validation    Skip all connection tests. This option cannot be used
                       with 'validate-only'.
  --validate-only      Run connection tests without adding. This option cannot
                       be used with 'skip-validation'.
  --auto-yes           Skip any interactive approval.
  --option-file FILE   Read configuration from FILE.
  --tls                Use TLS for connection. Required for Dremio cloud.
  --help               Show this message and exit.

Dremio Software:

  • Host: The hostname or IP address of your Dremio coordinator node
  • Port: Dremio’s Arrow Flight server port. This is 32010 unless it was changed on the coordinator node
  • Token: The PAT created in Dremio

An example command to onboard a Dremio Software connection would be:

# --port is Dremio's Arrow Flight server port (32010 unless changed)
montecarlo integrations add-dremio \
	--name Acme-Co-Dremio-Software \
	--token <mytoken> \
	--host <hostname of coordinator node> \
	--port 32010 \
	--agent-id <UUID of MC Agent>
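A common pattern is to test the connection with --validate-only before adding the integration, and to read the token from an environment variable so that it does not end up in shell history. The sketch below shows one way to do this. Every flag comes from the help output above, but the function name `add_dremio_software` and the variables `MC_CLI` and `DREMIO_PAT` are hypothetical names chosen for illustration.

```shell
# MC_CLI can be set to `echo` for a dry run; it defaults to the real CLI.
MC_CLI="${MC_CLI:-montecarlo}"

# $1 = integration name, $2 = coordinator host.
# Any remaining arguments (e.g. --validate-only) are passed through unchanged.
add_dremio_software() {
    name="$1"
    host="$2"
    shift 2
    # Stops with an error message if DREMIO_PAT is unset or empty.
    "$MC_CLI" integrations add-dremio \
        --name "$name" \
        --token "${DREMIO_PAT:?export DREMIO_PAT first}" \
        --host "$host" \
        --port 32010 \
        "$@"
}
```

With `DREMIO_PAT` exported, calling `add_dremio_software Acme-Co-Dremio-Software <hostname> --validate-only` runs only the connection tests. Repeating the call without --validate-only adds the integration.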

Dremio Cloud:

  • Host: data.dremio.cloud
  • Port: 443
  • TLS: To connect to Dremio Cloud, you must pass the --tls flag, which enables TLS for the connection from Monte Carlo to Dremio.
  • Token: The PAT created in Dremio

An example command to onboard a Dremio Cloud connection would be:

# For Dremio Cloud, --host data.dremio.cloud, --port 443, and --tls are all required
montecarlo integrations add-dremio \
	--name Acme-Co-Dremio-Cloud \
	--token <mytoken> \
	--host data.dremio.cloud \
	--port 443 \
	--tls \
	--agent-id <UUID of MC Agent>
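If the onboarding command is generated by a script, the three Dremio Cloud requirements listed above can be checked before the CLI is called. This is an illustrative sketch only: `check_dremio_cloud_settings` is a hypothetical helper, not part of the Monte Carlo CLI, and its "yes" convention for the TLS argument is an assumption of this example.

```shell
# Succeed only if the settings match the Dremio Cloud requirements:
# host data.dremio.cloud, port 443, and TLS enabled.
# $1 = host, $2 = port, $3 = "yes" if --tls will be passed
check_dremio_cloud_settings() {
    if [ "$1" != "data.dremio.cloud" ]; then
        echo "host must be data.dremio.cloud, got: $1" >&2
        return 1
    fi
    if [ "$2" != "443" ]; then
        echo "port must be 443 for Dremio Cloud, got: $2" >&2
        return 1
    fi
    if [ "$3" != "yes" ]; then
        echo "--tls is required for Dremio Cloud" >&2
        return 1
    fi
    return 0
}
```

A script would call `check_dremio_cloud_settings "$HOST" "$PORT" "$USE_TLS"` and run `montecarlo integrations add-dremio` only when the check succeeds.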

FAQs

For Dremio Cloud, which Sonar project is monitored if I have multiple?

Monte Carlo monitors the default Sonar project. Instructions on how to set the default Sonar project can be found here.