PII Filtering

PII (Personally Identifiable Information) Filtering lets you filter sensitive data before it is stored on Monte Carlo's servers.

Filtering is performed by redacting data when it is collected by or submitted to Monte Carlo. Our integrations match private or sensitive data and replace it with placeholder text indicating that it was filtered.

How is data identified as sensitive?

Monte Carlo supports a set of standard rules, defined as regular expressions, to identify sensitive data. If a rule is enabled for the account and some text matches it, the matching fragment is replaced with the text "<filtered:filter_name>", where filter_name is the rule name, for example "email_address".
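As a rough illustration of the rule-based replacement described above, the following Python sketch applies a set of named regular expressions and substitutes each match with the placeholder. The two patterns are copied from the getPiiFilters example response shown later on this page. The RULES dictionary and the redact function are hypothetical names used for illustration only, not Monte Carlo's actual implementation.

```python
import re

# Rule name -> pattern. The patterns mirror the getPiiFilters example
# response on this page; the dictionary itself is illustrative.
RULES = {
    "us_ssn": r"\d{3}-\d{2}-\d{4}",
    "email_address": r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+",
}


def redact(text: str, enabled: set[str]) -> str:
    """Replace every match of each enabled rule with <filtered:rule_name>."""
    for name, pattern in RULES.items():
        if name in enabled:
            text = re.sub(pattern, f"<filtered:{name}>", text)
    return text


# Only the email_address rule is enabled here, so the SSN is left untouched.
sample = "contact jane.doe@example.com, ssn 123-45-6789"
print(redact(sample, {"email_address"}))
# contact <filtered:email_address>, ssn 123-45-6789
```

Rules that are not in the enabled set are skipped entirely, which is the behavior the per-rule enable/disable setting implies.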

How is data redacted?

Query Logs

Suppose the following SQL query was executed on your warehouse:

SELECT * FROM users WHERE email='[email protected]'

Monte Carlo shows that query under Query Logs for the table "users". If PII Filtering is enabled for your account and the email_address rule is enabled, the Query Log in Monte Carlo is displayed like this:

SELECT * FROM users WHERE email='<filtered:email_address>'

This filter is applied inside the Data Collector component, which might be running in your infrastructure. In that case, sensitive data never reaches Monte Carlo infrastructure.

Field Values

The filters are also applied to values we collect in custom monitors. For example, if you have metric or field-health monitors on a field whose content matches a rule, we will redact it. The monitor will not work as intended, because it receives the placeholder instead of the original value, but the PII will be safeguarded.
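To make the trade-off concrete, here is a small Python sketch of one plausible way redaction can distort a monitor: a distinct-value count over an e-mail field. The pattern is the email_address pattern from the getPiiFilters example on this page. The sample values and the redact_email helper are made up for illustration and do not describe how Monte Carlo computes its metrics.

```python
import re

# Pattern copied from the getPiiFilters example response on this page.
EMAIL_PATTERN = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"


def redact_email(value: str) -> str:
    """Replace any e-mail address in the value with the placeholder."""
    return re.sub(EMAIL_PATTERN, "<filtered:email_address>", value)


# Hypothetical field values collected by a monitor.
collected = ["ann@example.com", "bob@example.com", "cy@example.org"]
redacted = [redact_email(v) for v in collected]

print(len(set(collected)))  # 3 distinct values before redaction
print(len(set(redacted)))   # 1 distinct value after redaction
```

Every address collapses to the same placeholder, so any metric that depends on the actual values (uniqueness, length, format) no longer reflects the underlying data.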

When is data redacted?

PII filtering follows the same core principles across all deployment types, but the location where redaction occurs depends on your specific deployment setup.

Not sure what deployment type you are using? Check out our FAQ.

Cloud Deployments

In cloud deployments, data is encrypted and transmitted to the Monte Carlo cloud. Once received, it is held temporarily in memory and redacted according to your defined PII rules.

Hybrid Deployments

For hybrid deployments, data is encrypted by your Agent before being sent to the Monte Carlo cloud. Similar to cloud deployments, the data is held in memory and redacted based on your PII rules.

Legacy Deployments

In legacy deployments, where you host your own data collector, data remains in memory on the self-hosted collector and is redacted locally, according to the PII rules. Once redacted, the data is encrypted and transmitted to the Monte Carlo cloud.

Out of the Box PII Rules

The following rules are enabled by default. This means that if you enable PII Filtering for your account as described in Enabling PII Filtering below, all rules listed here are applied to your data automatically:

E-mail Address
US Social Security Number

You can use the getPiiFilters query in the API to list all rules in the system and see whether each one is enabled for your account. Note: there is currently no mechanism for adding further rules; please contact us to discuss options.

Failure Mode

Monte Carlo allows you to configure what happens if an error occurs during the data filtering process. Depending on how sensitive your data is, you might prefer either to continue and leave the data unredacted, or to stop the data processing in the Data Collector.
This behavior is controlled by the Fail Mode setting, which has two available values:

OPEN (the default value): data processing is not aborted if an error occurs during the PII Filtering process. An error message is logged and data processing continues.
CLOSE: data processing stops if an error occurs during the PII Filtering process, preventing data that couldn't be checked from leaving the Data Collector.
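The difference between the two values can be sketched in a few lines of Python. This is only an illustration of the fail-open versus fail-closed idea. The FailMode enum, the filter_with_fail_mode function, and the use of an invalid pattern to trigger an error are all assumptions made for the example, not the Data Collector's real code.

```python
import logging
import re
from enum import Enum

logger = logging.getLogger("pii_filtering_sketch")


class FailMode(Enum):
    OPEN = "OPEN"
    CLOSE = "CLOSE"


def filter_with_fail_mode(text, pattern, name, fail_mode):
    """Redact text; on error, pass it through (OPEN) or abort (CLOSE)."""
    try:
        return re.sub(pattern, f"<filtered:{name}>", text)
    except Exception as exc:  # any error raised while filtering
        if fail_mode is FailMode.OPEN:
            # Fail open: log the error and continue with unredacted data.
            logger.error("PII filtering failed: %s", exc)
            return text
        # Fail closed: stop processing so unchecked data goes no further.
        raise RuntimeError("PII filtering failed; processing stopped") from exc


# "(" is an invalid regular expression, used here only to force an error.
print(filter_with_fail_mode("a@b.co", "(", "broken_rule", FailMode.OPEN))
```

With FailMode.OPEN the original text is returned unchanged after the error is logged. With FailMode.CLOSE the same call raises instead, so the caller never sees the unchecked text.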

Enabling PII Filtering

PII Filtering can be enabled in two ways:

CLI

You can use the Monte Carlo CLI to enable PII Filtering for your account with the following command:

montecarlo management configure-pii-filtering --enable

You can also use the CLI to change the value of the Fail Mode setting:

montecarlo management configure-pii-filtering --enable --fail-mode CLOSE

And you can check your PII Filtering settings with the get-pii-preferences command:

montecarlo management get-pii-preferences

API

Additionally, you can use the Monte Carlo API to enable or disable PII Filtering for your account through the updatePiiFilteringPreferences mutation, as in the following example:

mutation set_pii_prefs {
  updatePiiFilteringPreferences(enabled:true, failMode:CLOSE) {
    success
  }
}

{
  "data": {
    "updatePiiFilteringPreferences": {
      "success": true
    }
  }
}
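If you want to send this mutation from a script, a minimal Python sketch using only the standard library might look like the following. The endpoint URL and the x-mcd-id / x-mcd-token header names are assumptions here and should be confirmed against the Monte Carlo API documentation. The build_request helper and the placeholder credentials are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint and header names; confirm them in the API documentation.
API_URL = "https://api.getmontecarlo.com/graphql"

MUTATION = """
mutation set_pii_prefs {
  updatePiiFilteringPreferences(enabled: true, failMode: CLOSE) {
    success
  }
}
"""


def build_request(query: str, key_id: str, key_secret: str) -> urllib.request.Request:
    """Wrap a GraphQL document in a JSON POST request (nothing is sent yet)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-mcd-id": key_id,
            "x-mcd-token": key_secret,
        },
        method="POST",
    )


request = build_request(MUTATION, "<key-id>", "<key-secret>")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

The sketch stops before the network call so it can be inspected safely. A successful response would have the shape shown in the JSON example above.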

Enabling/Disabling individual PII Rules

As mentioned before, when PII Filtering is enabled for an account, all rules that are enabled by default are enabled for the account. You can adjust this later by enabling or disabling PII Rules individually.
To do so, use the setPiiFilterStatus mutation in the API, passing a list of PiiFilterStatusPair objects.

The following example enables the filtering for e-mail addresses (email_address rule) and disables it for Social Security Numbers (us_ssn rule):

mutation set_pii_filters {
  setPiiFilterStatus(piiFilterStatusPairs: [
    {
      filterName: "email_address",
      enabled: true
    },
    {
      filterName: "us_ssn",
      enabled: false
    }
  ])
  {
    success
  }
}

{
  "data": {
    "setPiiFilterStatus": {
      "success": true
    }
  }
}
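When several rules need to be toggled at once, the mutation text can be generated rather than written by hand. The Python sketch below builds the same mutation as the example above from a dictionary of rule names and desired states. The build_set_filter_status_mutation function is a hypothetical helper, and it assumes rule names are simple identifiers such as email_address.

```python
import json


def build_set_filter_status_mutation(statuses: dict[str, bool]) -> str:
    """Build a setPiiFilterStatus mutation from {rule_name: enabled}."""
    # json.dumps renders names as quoted strings and booleans as true/false,
    # which matches GraphQL literal syntax for these simple values.
    pairs = ", ".join(
        f"{{filterName: {json.dumps(name)}, enabled: {json.dumps(enabled)}}}"
        for name, enabled in statuses.items()
    )
    return (
        "mutation set_pii_filters { "
        f"setPiiFilterStatus(piiFilterStatusPairs: [{pairs}]) "
        "{ success } }"
    )


print(build_set_filter_status_mutation({"email_address": True, "us_ssn": False}))
```

The resulting string can be sent as the query field of a GraphQL request in the same way as any other mutation on this page.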

You can also use the API to get the filtering preferences:

query getPiiFilteringPreferences {
  getPiiFilteringPreferences {
    enabled,
    failMode
  }
}

{
  "data": {
    "getPiiFilteringPreferences": {
      "enabled": true,
      "failMode": "CLOSE"
    }
  }
}

You can also get the status of each filter:

query getPiiFilters {
  getPiiFilters {
    name,
    pattern,
    enabled
  }
}

{
  "data": {
    "getPiiFilters": [
      {
        "name": "us_ssn",
        "pattern": "\\d{3}-\\d{2}-\\d{4}",
        "enabled": false
      },
      {
        "name": "email_address",
        "pattern": "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+",
        "enabled": true
      }
    ]
  }
}
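Because the response includes each rule's pattern, you can check locally which parts of a sample string the enabled rules would match. The Python sketch below parses the example response above and tries the enabled patterns against a made-up sample. It assumes the patterns behave the same way under Python's re module as they do in Monte Carlo's filtering, which is not guaranteed.

```python
import json
import re

# The example getPiiFilters response from this page, as raw JSON text.
response_text = r"""
{
  "data": {
    "getPiiFilters": [
      {"name": "us_ssn", "pattern": "\\d{3}-\\d{2}-\\d{4}", "enabled": false},
      {"name": "email_address",
       "pattern": "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+",
       "enabled": true}
    ]
  }
}
"""

filters = json.loads(response_text)["data"]["getPiiFilters"]

# Keep only the rules that are enabled for the account.
enabled = {f["name"]: re.compile(f["pattern"]) for f in filters if f["enabled"]}

# Hypothetical sample text containing both kinds of sensitive data.
sample = "contact: jane.doe@example.com, ssn: 123-45-6789"
matches = {name: rx.findall(sample) for name, rx in enabled.items()}
print(matches)
# {'email_address': ['jane.doe@example.com']}
```

The SSN in the sample is not reported because the us_ssn rule is disabled in this response.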