Prerequisites

You need permission to create IAM roles and policies in AWS.

To connect Monte Carlo to an AWS Glue metadata store, follow these steps:

  1. Create a role that allows Glue access for Monte Carlo's data collector.
  2. Provide the role's information to Monte Carlo to validate and complete the integration.

Creating an IAM role for Glue access

To give Monte Carlo access to your Glue data catalog, create an IAM role with the necessary API permissions:

  1. Copy the policy below, replacing the following placeholders:
  • <account-id>: your AWS account ID.
  • <data-bucket>: the S3 bucket that stores the data for your tables. If there is more than one bucket, add each additional bucket to the resource lists of the two S3 statements.
  • <region>: an AWS region, or * to match all regions.
  • <database-name>: a database name, or * to give Monte Carlo access to all databases.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<data-bucket>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::<data-bucket>/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "glue:GetConnections",
            "Resource": [
                "arn:aws:glue:<region>:<account-id>:catalog",
                "arn:aws:glue:<region>:<account-id>:connection/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "glue:GetDatabases",
            "Resource": [
                "arn:aws:glue:<region>:<account-id>:catalog",
                "arn:aws:glue:<region>:<account-id>:database/<database-name>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetTables",
                "glue:GetTable",
                "glue:GetPartitions",
                "glue:GetPartition"
            ],
            "Resource": [
                "arn:aws:glue:<region>:<account-id>:catalog",
                "arn:aws:glue:<region>:<account-id>:database/<database-name>",
                "arn:aws:glue:<region>:<account-id>:table/<database-name>/*"
            ]
        }
    ]
}
  2. Follow the steps outlined here to create the IAM role, attaching the policy from step 1 to the role as part of the process.
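
The placeholder substitution from step 1 is easy to get wrong by hand, so it may help to script it. The following is a minimal Python sketch, not part of Monte Carlo's tooling: the `render_policy` helper, the account ID, and the region are hypothetical values chosen for illustration, and the template is only an excerpt of the policy above. It fills in the placeholders, fails if any are left unfilled, and confirms that the result is valid JSON.

```python
import json
import re


def render_policy(template, values):
    """Fill <placeholder> tokens in a policy template and parse the result.

    `values` maps placeholder names (without angle brackets) to replacements.
    Raises ValueError if any placeholder remains unfilled.
    """
    rendered = template
    for name, value in values.items():
        rendered = rendered.replace("<" + name + ">", value)

    # Any remaining <lower-case-token> means a placeholder was missed.
    leftover = re.findall(r"<[a-z-]+>", rendered)
    if leftover:
        raise ValueError("unfilled placeholders: %s" % sorted(set(leftover)))

    # json.loads raises if the substituted text is not valid JSON.
    return json.loads(rendered)


# Excerpt of the policy above (one statement only, to keep the sketch short).
TEMPLATE_EXCERPT = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "glue:GetDatabases",
            "Resource": [
                "arn:aws:glue:<region>:<account-id>:catalog",
                "arn:aws:glue:<region>:<account-id>:database/<database-name>"
            ]
        }
    ]
}
"""

# Hypothetical example values; substitute your own.
EXAMPLE_VALUES = {
    "account-id": "123456789012",
    "region": "us-east-1",
    "database-name": "*",
}

policy = render_policy(TEMPLATE_EXCERPT, EXAMPLE_VALUES)
print(json.dumps(policy, indent=4))
```

To use this with the full policy, paste the complete document into the template string and add a `data-bucket` entry to the values. The printed JSON can then be pasted into the policy editor when you create the role.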

Providing role information to Monte Carlo

You will provide connection details for Glue using Monte Carlo's CLI:

  1. Follow this guide to install and configure the CLI.
  2. Run the command montecarlo integrations add-glue to set up Glue connectivity. For reference, the command's help output is shown below:
$ montecarlo integrations add-glue --help
Usage: montecarlo integrations add-glue [OPTIONS]

  Setup a Glue integration. For metadata.

Options:
  --region TEXT        Glue catalog region. If not specified the region the
                       collector is deployed in is used.

  --role TEXT          Assumable role ARN to use for accessing AWS resources.
  --external-id TEXT   An external id, per assumable role conditions.
  --collector-id UUID  ID for the data collector. To disambiguate accounts
                       with multiple collectors.

  --skip-validation    Skip all connection tests. This option cannot be used
                       with 'validate-only'.

  --validate-only      Run connection tests without adding. This option cannot
                       be used with 'skip-validation'.

  --auto-yes           Skip any interactive approval.  [default: False]
  --option-file FILE   Read configuration from FILE.
  --help               Show this message and exit.
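
As an illustration of how these options combine, here is a minimal Python sketch that assembles an add-glue invocation using only flags listed in the help output above. The `build_add_glue_command` and `run` helpers and the role ARN are hypothetical, chosen for illustration. By default the sketch only prints the command; it runs it only if you pass `dry_run=False`, which assumes the montecarlo CLI is installed and configured.

```python
import shlex
import subprocess


def build_add_glue_command(role_arn, region=None, external_id=None,
                           validate_only=False):
    """Assemble the argument list for `montecarlo integrations add-glue`.

    Only flags shown in the command's help output are used.
    """
    cmd = ["montecarlo", "integrations", "add-glue", "--role", role_arn]
    if region:
        cmd += ["--region", region]
    if external_id:
        cmd += ["--external-id", external_id]
    if validate_only:
        # Runs the connection tests without adding the integration.
        cmd.append("--validate-only")
    return cmd


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Hypothetical role ARN; use the ARN of the role you created earlier.
command = build_add_glue_command(
    role_arn="arn:aws:iam::123456789012:role/monte-carlo-glue",
    region="us-east-1",
    validate_only=True,
)
run(command)
```

Running with --validate-only first is one way to check the role's permissions before committing; once the connection tests pass, the same command without that flag adds the integration.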
