Information Monte Carlo Collects

The following information may be processed and stored by Monte Carlo:

| Data Type | Details | Purpose | Stored on |
| --- | --- | --- | --- |
| Metadata | Information about tables, schemas, data freshness and volume, names and attributes of BI reports/dashboards, and other such attributes of customer data assets. These data are collected directly from customer warehouses, lakes, and BI tools via APIs, JDBC connections, and other methods. | Build a catalog of warehouse, lake, and BI objects along with schema information so that Monte Carlo can provide its Data + AI observability reports and services. | Cloud service |
| Metrics | Row counts, byte counts, last modification date, and other similar table-level metrics. These data are collected directly from customer warehouses, lakes, and BI tools via APIs, JDBC connections, and other methods. | Track freshness, volume, and other aspects of data health and distribution. | Cloud service |
| Query logs | History of queries, as well as metadata about them (timestamp, user performing the query, errors if any, etc.). These data are collected directly from customer warehouses, lakes, and BI tools via APIs, JDBC connections, and other methods. | Track lineage, usage analytics, and query history to help with troubleshooting and prevention use cases. | Cloud service |
| Aggregated statistics | Aggregated statistical measures of the data in selected tables. Statistics may include null rates, distinct values, row counts, percentiles, and other similar metrics. These data are collected directly from customer warehouses, lakes, and BI tools via APIs, JDBC connections, and other methods. | Track data health and corruption using ML-based anomaly detection as well as customer-provided rules. | Cloud service |
| Application Data | Customer accounts, user settings, configurations, IP address, incidents, and other elements necessary to set up the Service. | This information is generated as users sign up and interact with the Service, and is also used for user authentication. | Cloud service |
| Data Sampling | A sample set of individual values or data records from the customer tenant in clear text form that are associated with a data reliability incident detected by Monte Carlo. For more information on data sampling, see here. | Help users quickly identify the nature of data issues and their root cause. | Data store |
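
To make the "Aggregated statistics" row more concrete, the following is a minimal sketch of how summary measures such as row count, null rate, distinct values, and percentiles could be derived from a single column. It is an illustration only and does not represent Monte Carlo's implementation: the function names `nearest_rank_percentile` and `aggregate_column`, the choice of the nearest-rank percentile method, and the sample values are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence


def nearest_rank_percentile(sorted_values: Sequence[float], pct: float) -> float:
    """Return the nearest-rank percentile (0 < pct <= 100) of pre-sorted values."""
    if not sorted_values:
        raise ValueError("cannot take a percentile of an empty sequence")
    # Nearest-rank method: the smallest value with at least pct% of values at or below it.
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def aggregate_column(values: Sequence[Optional[float]]) -> dict:
    """Summarize one column; None stands in for a NULL value (hypothetical helper)."""
    non_null = sorted(v for v in values if v is not None)
    row_count = len(values)
    return {
        "row_count": row_count,
        "null_rate": (row_count - len(non_null)) / row_count if row_count else 0.0,
        "distinct_count": len(set(non_null)),
        "p50": nearest_rank_percentile(non_null, 50) if non_null else None,
        "p95": nearest_rank_percentile(non_null, 95) if non_null else None,
    }


if __name__ == "__main__":
    # Hypothetical column values, with None representing NULL.
    column = [10.0, None, 20.0, 20.0, 30.0, None, 40.0, 50.0]
    print(aggregate_column(column))
```

The result of this sketch is a handful of summary numbers for the column rather than a list of its individual records, which is the distinction the table draws between the "Aggregated statistics" and "Data Sampling" rows.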