Importance Score Calculation
Table
A table’s importance score is based on the following parameters:
- The number of reads
- The number of users (query executors)
- The degree of connectivity (i.e., the table’s number of parent/child tables)
- The update periodicity
- Age & freshness
The score is only calculated for tables that are specified in the warehouse’s metadata AND have had at least one read or write query in the last 30 days (a table that was not queried in the last 30 days appears in the “Cleanup Suggestions” insight instead). Moreover, if the table is stale, newly created, or very rarely used, its score is set to zero.
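As a rough illustration of this eligibility rule, here is a minimal sketch; the field names (`in_metadata`, `last_query_at`, etc.) are hypothetical and not part of the product:

```python
from datetime import datetime, timedelta, timezone

def is_score_eligible(table: dict) -> bool:
    """Hypothetical eligibility check based on the rule above."""
    window_start = datetime.now(timezone.utc) - timedelta(days=30)
    if not table["in_metadata"]:
        return False  # table is not specified in the warehouse's metadata
    if table["last_query_at"] is None or table["last_query_at"] < window_start:
        return False  # surfaced under the "Cleanup Suggestions" insight instead
    if table["is_stale"] or table["is_newly_created"] or table["is_rarely_used"]:
        return False  # score is forced to zero
    return True
```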
We then divide the tables/views (whose score ≠ 0) into 3 scoring groups:
- Source tables (tables that have no parent table)
- ETL tables (tables that have both parent and child tables)
- Analytic tables (tables that have a parent but no children and are being queried by users/BI tools)
For each of these scoring groups (independently), a score is then generated for each scoring parameter (reads, users, connectivity, periodicity). Then, for each table, these scores are combined into a single importance score that ranges between 0 and 1.
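As an illustration of the grouping step, here is a minimal sketch; the function and argument names are hypothetical:

```python
def scoring_group(num_parents: int, num_children: int, queried_by_users_or_bi: bool):
    """Assign a table or view to one of the three scoring groups listed above."""
    if num_parents == 0:
        return "source"     # no parent tables
    if num_children > 0:
        return "etl"        # has both parent and child tables
    if queried_by_users_or_bi:
        return "analytic"   # has parents, no children, and is read by users/BI tools
    return None             # does not fall into any scoring group
```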
How does the importance score calculation work in detail?
- The score is automatically set to zero if the table was not queried in the last 14 days or was queried on fewer than 3 distinct days out of the last 30 days.
- The score is an absolute score - each table (or view) is scored individually within a “scoring group” (source tables, ETL tables, analytic tables) based on the following scoring parameters: reads, users, degree of connectivity, and periodicity.
- Reads score - If the avg. number of reads (in the last 30 days) < 1 then the score is set to zero. Otherwise, the number of reads is transformed to a percentile (0-1).
- Users score - If the number of query executors < 3 then the score is set to zero. Otherwise, the number of users is transformed to a percentile (0-1).
- Degree score - If the table degree (number of direct parent or child tables) < 3 then the score is set to zero. Otherwise, the degree score is set to 0.5.
- Periodicity score - If the table (or view) is updated periodically, the score is set to 1.2; otherwise it is 0. Since views are not written to and therefore have no “updates”, we define a “periodic” view as a view that is connected to a periodic table.
- Three of the four parameters above (reads, users, and degree) are combined into a single group score. The periodicity score is kept separate, and these two components (periodicity and the group score) are given equal weight when calculating the final score, as shown in the formula and the sketch below. A final score of ≥ 0.6 indicates that a table is important.
importance_score = w * periodicity_score + (1 - w) * group_score, where the two components are weighted equally (w = 0.5).
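Putting the rules above together, the following sketch shows one way the per-parameter scores and the final combination could be computed. It is illustrative only: the percentile helper, the simple mean used for the group score, and w = 0.5 are assumptions based on the description above, not the exact production logic.

```python
def percentile_rank(value, population):
    # Fraction of values in the scoring group that are <= value, on a 0-1 scale.
    if not population:
        return 0.0
    return sum(v <= value for v in population) / len(population)

def reads_score(avg_reads_30d, group_avg_reads):
    # Zero if the 30-day average read count is below 1, otherwise a percentile (0-1).
    return 0.0 if avg_reads_30d < 1 else percentile_rank(avg_reads_30d, group_avg_reads)

def users_score(num_users, group_user_counts):
    # Zero if there are fewer than 3 distinct query executors, otherwise a percentile (0-1).
    return 0.0 if num_users < 3 else percentile_rank(num_users, group_user_counts)

def degree_score(degree):
    # Zero if the table has fewer than 3 direct parent/child tables, otherwise 0.5.
    return 0.5 if degree >= 3 else 0.0

def periodicity_score(is_periodic):
    # 1.2 for periodically updated tables (or views connected to one), 0 otherwise.
    return 1.2 if is_periodic else 0.0

def importance_score(reads, users, degree, periodicity, w=0.5):
    # Reads, users, and degree form a single group score (a simple mean is assumed
    # here); periodicity is kept separate and the two are weighted equally (w = 0.5).
    group = (reads + users + degree) / 3
    return w * periodicity + (1 - w) * group

# A table with a final score >= 0.6 is flagged as important.
```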
Looker Dashboard
A Dashboard's importance score is based on the following parameters:
- Number of recent usages
- Days since last used
- Days since created
Each of these parameters is weighted differently, and the weighted combination gives each Dashboard a score between 0 and 1. Dashboards with a score of 0.8 or greater are considered Important / Key Assets.
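The exact weights are not documented here, so the sketch below is purely illustrative: the weight values and the 30-day normalization are assumptions, while the 0-1 range and the 0.8 Key Asset threshold come from the text above.

```python
def dashboard_importance_score(recent_usages, days_since_last_used, days_since_created,
                               w_usage=0.5, w_recency=0.3, w_age=0.2):
    # Hypothetical weights and normalization; only the 0-1 range and the 0.8
    # threshold are taken from the documentation.
    usage = min(recent_usages / 30.0, 1.0)
    recency = max(0.0, 1.0 - days_since_last_used / 30.0)
    age = min(days_since_created / 30.0, 1.0)
    score = w_usage * usage + w_recency * recency + w_age * age
    return score, score >= 0.8  # (score, Important / Key Asset flag)
```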
Databricks
If query log collection is enabled (Unity Catalog required), importance scores in Databricks are calculated the same way as in other warehouses. Without query logs, Databricks assets have an importance score calculation that is based on a single parameter:
- The number of freshness events in the last 30 days
If a Databricks asset has more than 5 freshness events in the last 30 days, it is given an importance score of 0.9 and is considered an Important / Key Asset. Otherwise, the asset is given an importance score of 0.0.
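This rule is simple enough to state directly in code; a minimal sketch of the no-query-logs path:

```python
def databricks_importance_score(freshness_events_last_30d: int) -> float:
    # Applies only when query logs are not collected; with query logs the asset
    # is scored like any other warehouse table.
    return 0.9 if freshness_events_last_30d > 5 else 0.0

is_key_asset = databricks_importance_score(7) >= 0.9  # True: more than 5 events in 30 days
```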