Google Cloud: Data Store Deployment (Beta)

How to create and register

Prerequisites

  1. You are an admin in GCP (for steps 1 and 2).
  2. You are an Account Owner (for step 3).

This guide outlines how to set up a Data Store for storing troubleshooting and temporary data in your GCP account.

Steps

Automation with Infrastructure as Code (IaC)

This config can be used to automate steps 1 and 2 and manage resources as code with Terraform: https://mcd-public-resources.s3.amazonaws.com/terraform/gcs_data_store.tf

If you wish to use it, download and review the file, then deploy it in your GCP account.

Note that this will persist a key in the remote state used by Terraform. Please take appropriate measures to protect your remote state.

1. Create a GCS Bucket

Use the GCP Console, the CLI, or your preferred IaC tool to create a new GCS bucket in your GCP account with no public access. Note that registration (step 3) will fail if the bucket is publicly accessible.

We strongly recommend that you do not reuse an existing bucket or share this bucket with other jobs, as Monte Carlo might overwrite existing data. We also recommend the following settings:

  • Google-managed encryption
  • Expiration lifecycles: 90 days or less for all objects under the following prefixes:
      • custom-sql-output-samples/
      • rca
      • idempotent
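
As a rough illustration of the expiration setting, the sketch below writes a GCS lifecycle configuration that deletes objects older than 90 days under the three prefixes listed above. The bucket name `example-bucket` and the file name `lifecycle.json` are placeholders, not values from Monte Carlo, and the `gcloud` commands in the trailing comments are one possible way to create the bucket and apply the rule; adjust them to your own tooling.

```shell
# Hypothetical bucket name; replace with your own.
BUCKET="example-bucket"

# Delete objects older than 90 days under the recommended prefixes.
cat > lifecycle.json <<'EOF'
{
  "lifecycle": {
    "rule": [
      {
        "action": { "type": "Delete" },
        "condition": {
          "age": 90,
          "matchesPrefix": ["custom-sql-output-samples/", "rca", "idempotent"]
        }
      }
    ]
  }
}
EOF

echo "Wrote lifecycle.json for gs://${BUCKET}"

# To create the bucket without public access and apply the rule, run:
#   gcloud storage buckets create "gs://${BUCKET}" \
#     --uniform-bucket-level-access --public-access-prevention
#   gcloud storage buckets update "gs://${BUCKET}" --lifecycle-file=lifecycle.json
```

A shorter `age` value is also consistent with the "90 days or less" recommendation.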

2. Create an IAM Role and Service Account

First, create a role in the same project as above:

  1. Under IAM & Admin, go to the Roles section in your Google Cloud Platform console.
  2. Click the "Create Role" button at the top of the tab.
  3. Give the new role a name. We recommend "Monte Carlo Data Store".
  4. Change the Role launch stage to "General Availability".
  5. Click "Add Permissions", add the permissions listed below, and then select "Create":
      • storage.objects.create
      • storage.objects.delete
      • storage.objects.get
      • storage.objects.list
      • storage.objects.update
      • storage.buckets.get
      • storage.buckets.getIamPolicy
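
If you would rather script the role than click through the console, the following sketch writes a role definition file containing the same permissions. The project ID `example-project`, the role ID `monteCarloDataStore`, the description text, and the file name `role.yaml` are illustrative assumptions; the `gcloud` command in the trailing comment is one way to create the role from that file.

```shell
# Hypothetical identifiers; replace with your own.
PROJECT_ID="example-project"
ROLE_ID="monteCarloDataStore"

# Role definition with the permissions listed in this step.
cat > role.yaml <<'EOF'
title: Monte Carlo Data Store
description: Object access for the Monte Carlo data store bucket
stage: GA
includedPermissions:
- storage.objects.create
- storage.objects.delete
- storage.objects.get
- storage.objects.list
- storage.objects.update
- storage.buckets.get
- storage.buckets.getIamPolicy
EOF

echo "Wrote role.yaml for projects/${PROJECT_ID}/roles/${ROLE_ID}"

# To create the role, run:
#   gcloud iam roles create "$ROLE_ID" --project="$PROJECT_ID" --file=role.yaml
```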

Now, create a service account in the same project as above:

  1. Under IAM & Admin, go to the Service Accounts section in your Google Cloud Platform console.
  2. Click the "Create Service Account" button at the top of the tab.
  3. Give the account a name and continue. We recommend "monte-carlo-data-store-sa".
  4. Click "Done" to complete the creation of Monte Carlo's service account.
  5. Then navigate to the service account you just created and create a JSON key. A JSON file will download – please keep it safe.

Finally, add the role and principal to the bucket you created above:

  1. Under Cloud Storage, select the bucket you created in step 1 and navigate to the Permissions tab.
  2. Select "Grant Access", enter the service account you just created as the principal, and choose the role you just created as the role. Save your changes.
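
The service account and bucket permission steps above could also be scripted. The sketch below wraps three `gcloud` commands in a shell function so nothing runs until you call it. The project ID, bucket name, role ID, display name, and key file name are all placeholders chosen for illustration, and the role is assumed to already exist in the same project.

```shell
# Hypothetical values; replace with your own.
PROJECT_ID="example-project"
BUCKET="example-bucket"
SA_NAME="monte-carlo-data-store-sa"
ROLE_ID="monteCarloDataStore"   # assumed ID of the custom role created earlier
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

create_service_account_and_grant() {
  # Create the service account.
  gcloud iam service-accounts create "$SA_NAME" \
    --project="$PROJECT_ID" \
    --display-name="Monte Carlo Data Store" &&
  # Download a JSON key; keep this file safe.
  gcloud iam service-accounts keys create example.json \
    --iam-account="$SA_EMAIL" &&
  # Bind the custom role to the service account on the bucket only.
  gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="projects/${PROJECT_ID}/roles/${ROLE_ID}"
}

# Uncomment to run against your project:
# create_service_account_and_grant
```

Binding the role on the bucket rather than on the project keeps the service account's access limited to the data store.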

3. Register the Data Store

After creating the bucket, role, and service account, you can register the Data Store via either the Monte Carlo UI or the CLI.

After this step is complete, all integrations that you add to this collection service will automatically use this bucket for storing troubleshooting and temporary data. You can add these integrations as you normally would, using Monte Carlo's UI wizard or CLI.

UI

If you are onboarding a new account, you can also register by following the steps on the onscreen


  1. Navigate to settings/integrations/agents and select the Create button.
  2. Follow the onscreen wizard for the "GCP" Platform and "Data Store" Type. The "Bucket name" is the name of the bucket you created in step one, and the "credentials JSON" is the service account key from step two.
GCP Data Store Creation Wizard

GCP Data Store Registration Wizard

CLI

Use the command montecarlo agents register-gcs-store to register.

For reference on this command please see here. And see here for how to install and configure the CLI.

The bucket-name is the bucket you created in step one and key-file is the service account key from step two.

montecarlo agents register-gcs-store \
  --bucket-name example-bucket \
  --key-file example.json

FAQs

Can I deploy resources as code?

Absolutely! This config can be used to automate steps 1 and 2 and manage resources as code with Terraform (source).

If you wish to use it, download and review the file, then deploy it in your GCP account.

Note that this will persist a key in the remote state used by Terraform. Please take appropriate measures to protect your remote state.

Can I further constrain access to this Data Store (GCS Bucket)?

Updated IPs

For all accounts created after April 24th, 2024 the Monte Carlo platform will generally use the following IP addresses to connect to your integration (cloud-only), agent (GCP and Azure), and/or object store:

  • 34.200.118.118
  • 35.169.25.209

Please be sure to allowlist both as requests from the Monte Carlo Platform* can originate from either one. If your account was created before this date, please reach out to your Monte Carlo representative.

*If you are leveraging a Customer-hosted Agent, these are not the same as the IP addresses that the agent will use to connect to your resource. See the "Egress" FAQs per platform for more details and options to constrain outbound access.

Absolutely! By default, access is controlled via the service account key, but if you prefer, you can further restrict requests via an IP allowlist. For instance, you can:

  1. Reach out to your Monte Carlo representative or support for an IP address to allowlist. All inbound requests to the GCS data store will originate here.
  2. Create a new GCP Project for Monte Carlo. We recommend you do not share this project.
  3. Create a GCS bucket and Role with Service Account in the project created in step #2. You can do this by following the steps here or via this automation.
  4. Create an Access Level (e.g. via the Access Context Manager) to include the IP address from step #1 and any other IP addresses or members you'd want to be able to administer or access this project.
  5. Create a Service Perimeter with the Access Level from step #4 and the "Cloud Storage" service. If you prefer you can first create this in "Dry run" mode to validate.
  6. Continue with registration.
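
To make steps 4 and 5 more concrete, here is a minimal sketch that writes a basic Access Level spec for Access Context Manager. It uses the two IP addresses from the "Updated IPs" note as an example; confirm the addresses that apply to your account with Monte Carlo before relying on them. The file name, the level and perimeter names, `POLICY_ID`, and `PROJECT_NUMBER` are placeholders, and the `gcloud` commands in the trailing comments are one possible way to apply the spec.

```shell
# Basic access level spec allowing requests from the listed IPs only.
# Add any administrator IP ranges you also need before applying.
cat > access_level.yaml <<'EOF'
- ipSubnetworks:
  - 34.200.118.118/32
  - 35.169.25.209/32
EOF

echo "Wrote access_level.yaml"

# To create the Access Level and a Service Perimeter around Cloud Storage, run
# (POLICY_ID and PROJECT_NUMBER are placeholders for your own values):
#   gcloud access-context-manager levels create monte_carlo_ips \
#     --title="Monte Carlo IPs" \
#     --basic-level-spec=access_level.yaml \
#     --policy=POLICY_ID
#   gcloud access-context-manager perimeters create monte_carlo_data_store \
#     --title="Monte Carlo Data Store" \
#     --resources=projects/PROJECT_NUMBER \
#     --restricted-services=storage.googleapis.com \
#     --access-levels=monte_carlo_ips \
#     --policy=POLICY_ID
```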

How do I check the reachability between Monte Carlo and the Data Store?

Reachability is automatically validated during registration, but you can also use this CLI command or the "test" button in the UI to test it at any time.