BigQuery
Admin credentials required
To create the read-only service account for Monte Carlo, you will need owner permissions on BigQuery.
This guide explains how to create a read-only service account for Monte Carlo on BigQuery.
To review all steps necessary to integrate a data warehouse with Monte Carlo, please see here.
Creating a service account for a single BigQuery project
First, create a role for Monte Carlo's service account:
- Under IAM & Admin, go to the Roles section in your Google Cloud Platform console.
- Select the project to which your BigQuery warehouse belongs using the combo box on the top left of your dashboard.
- Click the Create Role button at the top of the tab.
- Give the new role a name. We recommend "Data Reliability Monitor".
- Change the Role launch stage to "General Availability".
- Click Add Permissions and add the permissions specified below to the role. To make the process faster, consider filtering the permission list by the role BigQuery Admin.
Now, create a service account:
- Under IAM & Admin, go to the Service Accounts section in your Google Cloud Platform console.
- Click the Create Service Account button at the top of the tab.
- Give the account a name and continue. We recommend naming the account "monte-carlo".
- Assign the role you previously created to the service account and continue.
- Click the Create Key button, select JSON as the type and click Create. A JSON file will download; please keep it safe, as it grants access to your BigQuery data.
- Click Done to complete the creation of Monte Carlo's service account.
Finally, upload the JSON file you downloaded into Monte Carlo's onboarding wizard to finalize the integration.
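Before uploading, it can be worth confirming that the downloaded file really is a service account key. The following is a minimal sketch, not part of Monte Carlo's tooling: it checks a handful of fields that Google Cloud service account key files contain. The helper names `check_key` and `check_key_file` are hypothetical, and the list of fields checked is deliberately partial.

```python
import json

# A partial list of fields found in a Google Cloud service account key file.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key(key):
    """Return a list of problems found in a parsed key file (empty if none)."""
    problems = [
        f"missing field: {name}" for name in EXPECTED_FIELDS if not key.get(name)
    ]
    key_type = key.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type}")
    return problems


def check_key_file(path):
    """Load a key file from disk and run check_key on it."""
    with open(path) as handle:
        return check_key(json.load(handle))
```

Calling `check_key_file` with the path of the downloaded file returns an empty list when the basic fields are in place. The sketch never prints the private key, which keeps the secret out of terminal history and logs.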
Permissions Monte Carlo requires
bigquery.datasets.get
bigquery.datasets.getIamPolicy
bigquery.jobs.get
bigquery.jobs.list
bigquery.jobs.listAll
bigquery.jobs.create
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
storage.buckets.list
storage.buckets.get
storage.objects.list
storage.objects.get
resourcemanager.projects.get
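As an alternative to adding each permission by hand in the console, the list above could be captured in a role definition file. The following is a minimal sketch under stated assumptions, not an official Monte Carlo script: it prints a JSON document in the shape that `gcloud iam roles create` accepts through its `--file` flag. The title follows the recommendation earlier in this guide; the description text and the helper name `build_role_definition` are assumptions made for illustration.

```python
import json

# Permissions copied from the list above.
REQUIRED_PERMISSIONS = [
    "bigquery.datasets.get",
    "bigquery.datasets.getIamPolicy",
    "bigquery.jobs.get",
    "bigquery.jobs.list",
    "bigquery.jobs.listAll",
    "bigquery.jobs.create",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "storage.buckets.list",
    "storage.buckets.get",
    "storage.objects.list",
    "storage.objects.get",
    "resourcemanager.projects.get",
]


def build_role_definition(title="Data Reliability Monitor"):
    """Return a custom-role definition as a plain dict."""
    return {
        "title": title,
        # Illustrative wording; the guide does not prescribe a description.
        "description": "Read-only access for Monte Carlo",
        "stage": "GA",  # corresponds to the "General Availability" launch stage
        "includedPermissions": sorted(REQUIRED_PERMISSIONS),
    }


if __name__ == "__main__":
    # Print the definition so it can be redirected into a file of your choice.
    print(json.dumps(build_role_definition(), indent=2))
```

The output could be saved to a file and passed to `gcloud iam roles create` together with a role ID and `--project` value of your choosing. Keeping the list in one place also makes it easier to review what the role grants.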
Monitoring multiple projects using Monte Carlo
Allowing Monte Carlo's service account to access multiple projects can help with the following use cases:
- Tracking multiple datasets spread across more than one BigQuery project.
- Tracking query logs and lineage generated by users associated with more than one project.
To add an additional project to Monte Carlo's service account:
- Select the project using the combo box on the top left of the Google Cloud Platform console.
- Create a role that grants BigQuery access with the appropriate permissions following the instructions above. Optionally, you may use the following command in your terminal to copy the role over from another project (this is much quicker!):
gcloud iam roles copy
- Under Access, go to the IAM section in your Google Cloud Platform console.
- Click the Add button, provide Monte Carlo's service account email address, select the role you created for this project and click Save.
You may add any number of projects to Monte Carlo's service account.
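When many projects are involved, the steps above can be scripted. The following is a rough sketch, assuming the role already exists in one source project: it only assembles and prints the `gcloud` invocations for each additional project and does not run them. The flags shown are the non-interactive form of `gcloud iam roles copy` plus `gcloud projects add-iam-policy-binding`. The project IDs, the role ID `dataReliabilityMonitor`, the service account email, and the helper name `commands_for_project` are all hypothetical placeholders.

```python
import shlex


def commands_for_project(source_project, target_project, role_id, service_account_email):
    """Return the two gcloud invocations for one additional project."""
    # Copy the custom role from the source project into the target project.
    copy_role = [
        "gcloud", "iam", "roles", "copy",
        f"--source={role_id}",
        f"--source-project={source_project}",
        f"--destination={role_id}",
        f"--dest-project={target_project}",
    ]
    # Grant the copied role to the service account on the target project.
    grant_role = [
        "gcloud", "projects", "add-iam-policy-binding", target_project,
        f"--member=serviceAccount:{service_account_email}",
        f"--role=projects/{target_project}/roles/{role_id}",
    ]
    return [copy_role, grant_role]


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    email = "monte-carlo@analytics-prod.iam.gserviceaccount.com"
    for project in ["marketing-prod", "finance-prod"]:
        for command in commands_for_project(
            "analytics-prod", project, "dataReliabilityMonitor", email
        ):
            print(shlex.join(command))
```

Printing the commands rather than executing them leaves room to review each one before it touches IAM.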
BigQuery Permissions
BigQuery Permission | Monte Carlo Requirement |
---|---|
bigquery.datasets.get | Get metadata about a dataset. Allows Monte Carlo to see the INFORMATION_SCHEMA views as well as list out available Datasets in BigQuery. |
bigquery.datasets.getIamPolicy | Read a dataset's IAM policy. Allows Monte Carlo to read a list of available datasets in BigQuery. Datasets store tables in a project. |
bigquery.jobs.get | Get data and metadata on any job (query) in BigQuery. Allows Monte Carlo to read BigQuery query logs. |
bigquery.jobs.list | List all jobs and retrieve metadata on any job submitted by any user. For jobs submitted by other users, details and metadata are redacted. Allows Monte Carlo to get metadata on BigQuery query logs, such as start time, end time, and amount of data processed. |
bigquery.jobs.listAll | List all jobs and retrieve metadata on any job submitted by any user. Allows Monte Carlo to get metadata on BigQuery query logs. |
bigquery.jobs.create | Run jobs (including queries) within the project. Allows Monte Carlo to run SELECT queries on the INFORMATION_SCHEMA views and on the tables to be monitored. |
bigquery.tables.get | Get BigQuery table metadata. Allows Monte Carlo to get table metadata, including table creation time, number of rows, and byte size of the data. |
bigquery.tables.getData | Get table data. This permission is required for querying table data. Allows Monte Carlo to run SELECT queries for opt-in monitors on aggregated statistics. |
bigquery.tables.list | List tables and metadata on tables. Allows Monte Carlo to see all tables in a dataset and their metadata. |
resourcemanager.projects.get | Read project metadata. Allows Monte Carlo to list the available GCP projects and iterate through them to collect BigQuery metadata. Note that the Monte Carlo service account can only see the projects it has been given access to, so this permission is necessary. |
External Table Permissions
The following permissions are required for External Tables in BigQuery.
Cloud Storage Permission | Monte Carlo Requirement |
---|---|
storage.buckets.list | List buckets in a project. Also read bucket metadata, excluding IAM policies, when listing. Allows Monte Carlo to see metadata of External Tables. |
storage.buckets.get | Read bucket metadata, excluding IAM policies, and list or read the Pub/Sub notification configurations on a bucket. Allows Monte Carlo to get metadata of External Tables. |
storage.objects.list | List objects in a bucket. Also read object metadata, excluding ACLs, when listing. Allows Monte Carlo to see metadata of External Tables. |
storage.objects.get | Read object data and metadata, excluding ACLs. Allows Monte Carlo to get metadata of External Tables. |
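To double-check an existing role against the two tables above, a small comparison can help. The following is a minimal sketch, assuming the role description is available as a parsed JSON object with an `includedPermissions` list, which is the shape returned by `gcloud iam roles describe` with `--format=json`. The helper name `missing_permissions` and its `include_external_tables` parameter are hypothetical.

```python
# Permissions from the BigQuery Permissions table.
BIGQUERY_PERMISSIONS = {
    "bigquery.datasets.get",
    "bigquery.datasets.getIamPolicy",
    "bigquery.jobs.get",
    "bigquery.jobs.list",
    "bigquery.jobs.listAll",
    "bigquery.jobs.create",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "resourcemanager.projects.get",
}

# Permissions from the External Table Permissions table.
EXTERNAL_TABLE_PERMISSIONS = {
    "storage.buckets.list",
    "storage.buckets.get",
    "storage.objects.list",
    "storage.objects.get",
}


def missing_permissions(role, include_external_tables=True):
    """Return the required permissions absent from a role description, sorted."""
    required = set(BIGQUERY_PERMISSIONS)
    if include_external_tables:
        required |= EXTERNAL_TABLE_PERMISSIONS
    granted = set(role.get("includedPermissions", []))
    return sorted(required - granted)
```

An empty result suggests the role covers everything listed in this guide. Passing `include_external_tables=False` limits the check to the first table.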
Limitations
- Field Lineage is not available for nested struct data types.