Available Metrics
The following metrics are supported by Metric Monitors. Users can also define custom metrics using SQL. The metric names for Monitors as Code (MaC) config are listed in the MaC Name
column for each metric.
Uniqueness
Uniqueness metrics check for duplicates in unique keys like UUIDs, and for changes in cardinality.
Metric | Description | Column Types | Threshold Types | MaC Name |
---|---|---|---|---|
Unique (%) | Percentage of unique values across all rows | All | Manual & ML | UNIQUE_RATE |
Unique (count) | Count of unique values | All | Manual | UNIQUE_COUNT |
Duplicate (count) | Count of duplicate values, calculated as the difference between the total number of rows with non-null values and total number of distinct non-null values | All | Manual | DUPLICATE_COUNT |
Completeness
Completeness metrics check for ways that data can be null or otherwise unpopulated.
Metric | Description | Column Types | Threshold Types | MaC Name |
---|---|---|---|---|
Null (%) | Percentage of rows where value is null | All | Manual & ML | NULL_RATE |
Null (count) | Count of rows with null value | All | Manual | NULL_COUNT |
Non-null (count) | Count of rows with non-null value | All | Manual | NON_NULL_COUNT |
Empty string (%) | Percentage of rows where the value is an empty string | String | Manual & ML | EMPTY_STRING_RATE |
Empty string (count) | Count of rows where the value is an empty string | String | Manual | EMPTY_STRING_COUNT |
All spaces (%) | Percentage of rows where the text value is whitespace only | String | Manual & ML | TEXT_ALL_SPACE_RATE |
All spaces (count) | Count of rows where the text value is whitespace only | String | Manual | TEXT_ALL_SPACE_COUNT |
NaN (%) | Percentage of rows with a value of NaN (Not a Number), meaning the value is undefined | Numeric | Manual & ML | NAN_RATE |
NaN (count) | Count of rows with a value of NaN (Not a Number), meaning the value is undefined | Numeric | Manual | NAN_COUNT |
"none" or "null" (%) | Percentage of rows where the text value is a null keyword ("none", "null", "nil", "nothing" or "n/a") | String | Manual & ML | TEXT_NULL_KEYWORD_RATE |
"none" or "null" (count) | Count of rows where the text value is a null keyword ("none", "null", "nil", "nothing" or "n/a") | String | Manual | TEXT_NULL_KEYWORD_COUNT |
Distribution
Distribution metrics check for shifts in the numeric profile of data.
Metric | Description | Column Types | Threshold Types | MaC Name |
---|---|---|---|---|
Mean | Average value across all rows | Numeric | Manual & ML | NUMERIC_MEAN |
Median | Median value across all rows | Numeric | Manual & ML | NUMERIC_MEDIAN |
Min | Minimum value across all rows | Numeric | Manual & ML | NUMERIC_MIN |
Max | Maximum value across all rows | Numeric | Manual & ML | NUMERIC_MAX |
20th percentile | 20th percentile of values | Numeric | Manual & ML | PERCENTILE_20 |
40th percentile | 40th percentile of values | Numeric | Manual & ML | PERCENTILE_40 |
60th percentile | 60th percentile of values | Numeric | Manual & ML | PERCENTILE_60 |
80th percentile | 80th percentile of values | Numeric | Manual & ML | PERCENTILE_80 |
Zero (%) | Percentage rows with value equal to zero | Numeric | Manual & ML | ZERO_RATE |
Zero (count) | Count of rows with value equal to zero | Numeric | Manual | ZERO_COUNT |
Negative (%) | Percentage of rows that have a negative value | Numeric | Manual & ML | NEGATIVE_RATE |
Negative (count) | Count of rows that have a negative value | Numeric | Manual | NEGATIVE_COUNT |
Standard deviation | Standard deviation of values | Numeric | Manual & ML | NUMERIC_STDDEV |
Sum | Sum of values across all rows | Numeric | Manual | SUM |
True (%) | Percentage of rows where the value is true | Boolean | Manual & ML | TRUE_RATE |
True (count) | Count of rows where the value is true | Boolean | Manual | TRUE_COUNT |
False (%) | Percentage of rows where the value is false | Boolean | Manual & ML | FALSE_RATE |
False (count) | Count of rows where the value is false | Boolean | Manual | FALSE_COUNT |
Validity
Validity metrics check that values are honoring expected and usable formats, including common data entry errors.
Metric | Description | Column Types | Threshold Types | MaC Name |
---|---|---|---|---|
String length max | Maximum character length | String | Manual | TEXT_MAX_LENGTH |
String length min | Minimum character length | String | Manual | TEXT_MIN_LENGTH |
String length mean | Average character length | String | Manual | TEXT_MEAN_LENGTH |
String length standard deviation | Standard deviation of character length | String | Manual | TEXT_STD_LENGTH |
Integer (%) | Percentage of rows where the text value is an integer | String | Manual & ML | TEXT_INT_RATE |
Not integer (count) | Count of rows where the text value is not an integer | String | Manual | TEXT_NOT_INT_COUNT |
Float (%) | Percentage of rows where the text value is a floating-point number | String | Manual & ML | TEXT_NUMBER_RATE |
Not float (count) | Count of rows where the text value is not a floating-point number | String | Manual | TEXT_NOT_NUMBER_COUNT |
UUID (%) | Percentage of rows where the text value is a UUID (e.g., b391e7d2-80e0-4749-8c60-c76031c43dfe). The comparison is case insensitive. | String | Manual & ML | TEXT_UUID_RATE |
Not UUID (count) | Count of rows where the text value is not a UUID (e.g., b391e7d2-80e0-4749-8c60-c76031c43dfe). The comparison is case insensitive. | String | Manual | TEXT_NOT_UUID_COUNT |
SSN (%) | Percentage of rows where the text value is formatted as Social Security Number from the United States (e.g., 123-45-6789) | String | Manual & ML | TEXT_SSN_RATE |
Not SSN (count) | Count of rows where the text value is not formatted as Social Security Number from the United States (e.g., 123-45-6789) | String | Manual | TEXT_NOT_SSN_COUNT |
USA phone number (%) | Percentage of rows where the text value is a USA phone number (e.g., (123) 456-7890, +1 (123) 456-7890 etc) | String | Manual & ML | TEXT_US_PHONE_RATE |
Not USA phone number (count) | Count of rows where the text value is not a USA phone number (e.g., (123) 456-7890, +1 (123) 456-7890 etc) | String | Manual | TEXT_NOT_US_PHONE_COUNT |
USA state code (%) | Percentage of rows where the text value is a USA state code (e.g., TX) | String | Manual & ML | TEXT_US_STATE_CODE_RATE |
Not USA state code (count) | Count of rows where the text value is not a USA state code (e.g., TX) | String | Manual | TEXT_NOT_US_STATE_CODE_COUNT |
USA ZIP code (%) | Percentage of rows where the text value is a USA ZIP code (e.g., 94109) | String | Manual & ML | TEXT_US_ZIP_CODE_RATE |
Not USA ZIP code (count) | Count of rows where the text value is not a USA ZIP code (e.g., 94109) | String | Manual | TEXT_NOT_US_ZIP_CODE_COUNT |
Email (%) | Percentage of rows where the text value is an email address (e.g., [email protected]) | String | Manual & ML | TEXT_EMAIL_ADDRESS_RATE |
Not email (Count) | Count of rows where the text value is not an email address (e.g., [email protected]) | String | Manual | TEXT_NOT_EMAIL_ADDRESS_COUNT |
Timestamp (%) | Percentage of rows where the text value is an ISO-8601 format date or timestamp (e.g., 2023-01-12 16:50:11.045746 +00:00, 2023-01-12T16:50:11Z) | String | Manual & ML | TEXT_TIMESTAMP_RATE |
Not timestamp (count) | Count of rows where the text value is not an ISO-8601 format date or timestamp (e.g., 2023-01-12 16:50:11.045746 +00:00, 2023-01-12T16:50:11Z) | String | Manual | TEXT_NOT_TIMESTAMP_COUNT |
In past (%) | Percentage of rows where the value is a date or time occurring before the time when the metric is evaluated, with a granularity of seconds used for comparison | Date | Manual & ML | PAST_TIMESTAMP_RATE |
In past (count) | Count of rows where the value is a date or time occurring before the time when the metric is evaluated, with a granularity of seconds used for comparison | Date | Manual | PAST_TIMESTAMP_COUNT |
In future (%) | Percentage of rows where the value is a date or time occurring after the time when the metric is evaluated, with a granularity of seconds used for comparison | Date | Manual & ML | FUTURE_TIMESTAMP_RATE |
In future (count) | Count of rows where the value is a date or time occurring after the time when the metric is evaluated, with a granularity of seconds used for comparison | Date | Manual | FUTURE_TIMESTAMP_COUNT |
Unix time 0 (%) | Percentage of rows where the value is Unix time 0 | Date | Manual & ML | UNIX_ZERO_RATE |
Unix time 0 (count) | Count of rows where the value is Unix time 0 | Date | Manual | UNIX_ZERO_COUNT |
Custom
Users can define custom metrics using SQL. This is helpful to create a metric that:
- Isn't available above, such as
APPROX_PERCENTILE(order_amount, 0.99)
- References multiple fields at once, such as
AVG(price + tax_collected)
COUNT(DISTINCT(CONCAT(invoice_number, line_item)))/COUNT(*)
To create a custom metric, select Custom in the metric dropdown in the Define alert conditions section of a Metric monitor. Currently, only manual thresholds are supported for custom metrics.
Updated 6 days ago