📝
Prerequisites

You are an admin in AWS.

You have admin permissions in your data warehouse.

You are an Account Owner.

This guide outlines how to set up an Agent (with object storage) and the OpenTelemetry Collector in your AWS cloud using Terraform.

These FAQs answer common questions like how to review resources and what integrations are supported.

📘
If you already have the Monte Carlo Agent deployed to your cloud vendor and want to add the OpenTelemetry Collector to your deployment, skip Steps 1-2 and reference this FAQ. Afterwards, continue with Step 3.

Steps

1. Deploy the Agent

Before getting started, please review the Monte Carlo AWS account your collection service is hosted in.

When provisioning resources for Monte Carlo deployments on the V2 Platform, use the Collection AWS account ID provided on the Account information page. Accounts created after April 24th, 2024, will automatically be on the V2 platform or newer.

If you are using an older version of the platform, please contact your Monte Carlo representative for the ID.

A VPC is required to run the agent. Specifying one also enables connectivity scenarios such as using an IP allowlist for your resource, peering, or deploying into your existing VPC. See more details here.

Deploy with Terraform

Monte Carlo provides Terraform modules that can be used to deploy the MC Agent and the OpenTelemetry Collector with shared storage to AWS. You can review a full example of how to use these modules in the mcd-public-resources public GitHub repository.

Please make sure you have the Terraform CLI installed and an active session to your AWS account available in your terminal.

# Monte Carlo Agent Module
module "agent" {
  source  = "monte-carlo-data/mcd-agent/aws"
  version = "1.0.3"

  cloud_account_id  = var.cloud_account_id
  private_subnets   = var.existing_subnet_ids
  image             = var.agent_image_uri
  region            = var.region
  remote_upgradable = var.remote_upgradable
}

# OpenTelemetry Collector Module
module "opentelemetry_collector" {
  source  = "monte-carlo-data/otel-collector/aws"
  version = "0.4.3"

  deployment_name                = "Provide any name for the deployment"
  existing_vpc_id                = "Provide a VPC ID from your AWS account"
  existing_subnet_ids            = ["Provide at least two private subnet IDs from your AWS account"]
  existing_security_group_id     = "(Optional, but recommended) Provide a Security Group ID allowing your AI agents to communicate with the OpenTelemetry Collector"
  telemetry_data_bucket_arn      = module.agent.mcd_agent_storage_bucket_arn
  # These remaining variables will be modified later
  external_id                    = "N/A"
  external_access_principal      = "N/A"
  external_access_principal_type = "AWS"

  # Set this flag to true to deploy the necessary AWS resources if you're using Glue and Athena as your warehouse
  deploy_athena_resources = false
}
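The agent module above reads five input variables that the snippet does not declare. The following is a minimal sketch of matching declarations, not the contents of the published example: the types and descriptions are assumptions inferred from the variable names, and the two-subnet check assumes the same subnet list is shared with the collector module, which asks for at least two private subnets.

```hcl
# Sketch of a variables.tf for the agent module block above.
# Types and descriptions are assumptions inferred from the variable names.

variable "cloud_account_id" {
  description = "Monte Carlo Collection AWS account ID from the Account information page."
  type        = string
}

variable "existing_subnet_ids" {
  description = "Private subnet IDs in the VPC where the agent runs."
  type        = list(string)

  # Assumption: mirror the collector's requirement of at least two subnets.
  validation {
    condition     = length(var.existing_subnet_ids) >= 2
    error_message = "Provide at least two private subnet IDs."
  }
}

variable "agent_image_uri" {
  description = "URI of the agent container image."
  type        = string
}

variable "region" {
  description = "AWS region to deploy into."
  type        = string
}

variable "remote_upgradable" {
  description = "Assumed meaning: whether the agent can be upgraded remotely."
  type        = bool
}
```

Values can then be supplied through a terraform.tfvars file or -var flags when running terraform plan and terraform apply.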

Initialize Terraform:

terraform init

Create the Terraform plan and review the output:

terraform plan

Apply the Terraform plan:

terraform apply

Validate that the deployment succeeded by reviewing the command output and by using the AWS Console to locate the newly created resources.


Note that the Monte Carlo Collection AWS account ID is not the ID of the account where you deploy the agent. Make sure it is the ID you provide as the "Monte Carlo AWS Account ID" parameter (cloud_account_id in the module above) when deploying the agent; otherwise, registration will fail.

📘
Note: By default, the OpenTelemetry Collector deployed with the Monte Carlo Agent only allows ingress from resources associated with the security group created by the deployment. You might need to set the existing_security_group_id parameter so that the OpenTelemetry Collector can receive incoming traces from your AI agent. The security group definition will vary depending on your network configuration.

If you wish to use an existing S3 bucket to store the OpenTelemetry trace data, specify the ARN of that bucket in the telemetry_data_bucket_arn parameter. Unless this parameter is specified, the Data Store S3 bucket created by the MC Agent deployment is used to store the OpenTelemetry trace data.
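One way to supply the two optional inputs discussed above is to look up the existing resources with AWS provider data sources instead of hard-coding ARNs and IDs. This is only a sketch: the bucket name, the security group name, and the local names are placeholders, not values from the Monte Carlo modules.

```hcl
# Look up an existing S3 bucket by name (placeholder name).
data "aws_s3_bucket" "telemetry" {
  bucket = "my-existing-telemetry-bucket"
}

# Look up the security group to pass as existing_security_group_id
# (placeholder name; it must match exactly one group).
data "aws_security_group" "collector_access" {
  name = "my-collector-access-sg"
}

locals {
  # Reference these from the collector module block, for example:
  #   telemetry_data_bucket_arn  = local.telemetry_data_bucket_arn
  #   existing_security_group_id = local.existing_security_group_id
  telemetry_data_bucket_arn  = data.aws_s3_bucket.telemetry.arn
  existing_security_group_id = data.aws_security_group.collector_access.id
}
```

Because the lookups run during terraform plan, a wrong name fails early instead of at apply time.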

2. Register the Agent

After deploying the agent, you can register it via either the Monte Carlo UI or the CLI.

See here for examples of how to retrieve the deployment outputs (i.e., the registration inputs).

UI

👍
If you are onboarding a new account, you can also register by following the steps on the onboarding screen.

Navigate to settings/integrations/agents and select the Create button.
Follow the onscreen wizard for the "AWS" Platform and "Data Store + Agent" Type.

Monte Carlo Registration Wizard UI Example

CLI

Use montecarlo agents register-aws-agent to register.

See the reference documentation here, and see here for how to install and configure the CLI. For instance:

montecarlo agents register-aws-agent \
  --lambda-arn arn:aws:lambda:us-east-1:123456789:function:mcd-agent-AgentLambda \
  --assumable-role arn:aws:iam::123456789:role/mcd-agent-InvocationRole-12345 \
  --external-id f3840b31-772e-4fe3-8a5f-3aa5ff7e6fec

3. Configure your Data Warehouse Ingestion Pipeline

📘
Prerequisite: Data Warehouse S3 Access Configuration
Before continuing, your data warehouse must be configured to access the AWS S3 bucket containing the OpenTelemetry trace data. If your data warehouse is not currently configured to access the S3 bucket, refer to the guides below for Monte Carlo's recommendation on how to configure S3 access in your data warehouse.

Warehouse Vendor	Guide
Snowflake	Configure Snowflake Storage Integration and Stage
Databricks	Configure Databricks External Location
Athena	N/A

Next, we need to configure your data ingestion pipeline to write the OpenTelemetry trace data from S3 to your data warehouse so it can be monitored by Monte Carlo. Follow the guide relevant to your data warehouse vendor for steps on how to configure this pipeline.

Warehouse Vendor	Guide
Snowflake	Configure Snowflake Snowpipe
Databricks	Configure Databricks Delta Live Table
Athena	Configure Glue Crawler

4. Configure your AI Agent

Congrats! You have now configured the Monte Carlo Agent and OpenTelemetry Collector to process traces from your AI agent and write them to your data warehouse.

The final step is to configure your AI agent to begin sending traces to the OpenTelemetry Collector.

Add the Monte Carlo OpenTelemetry SDK to your AI agent's source code.
Use the opentelemetry_collector_http_endpoint output from the Terraform deployment earlier as the URL to provide to the Monte Carlo OpenTelemetry SDK.
Follow the Monte Carlo OpenTelemetry SDK library's instructions to configure instrumentation.
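To make the endpoint mentioned in the steps above easy to retrieve, the root configuration can re-export it. This is a minimal sketch that assumes the collector module exposes an output named opentelemetry_collector_http_endpoint, matching the name used above; check the module's documented outputs before relying on it.

```hcl
# Re-export the collector endpoint from the root configuration.
# Assumption: the module publishes an output with this exact name.
output "opentelemetry_collector_http_endpoint" {
  description = "HTTP endpoint that AI agents send OpenTelemetry traces to."
  value       = module.opentelemetry_collector.opentelemetry_collector_http_endpoint
}
```

After terraform apply, running terraform output opentelemetry_collector_http_endpoint prints the value to pass to the SDK.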

You can now validate that the deployment is working as expected by observing files being written to the S3 bucket and data being ingested into the relevant table in your warehouse.

You can begin creating Agent Monitors in Monte Carlo following the instructions here.

FAQs

What if I already deployed the Monte Carlo Agent?

If you already have the Monte Carlo Agent deployed to your cloud vendor, you can deploy the OpenTelemetry Collector separately alongside it via Terraform.

Before getting started, please review the Monte Carlo AWS account your collection service is hosted in.

If you are using an older version of the platform, please contact your Monte Carlo representative for the ID.

A VPC is required to run the collector. Specifying one also enables connectivity scenarios such as using an IP allowlist for your resource, peering, or deploying into your existing VPC. See more details here. Be sure to use the same VPC that is associated with your Monte Carlo Agent.
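One way to keep the collector in the agent's VPC is to derive the VPC ID from a subnet the agent already uses, rather than entering it separately. The sketch below assumes you know the agent's private subnet IDs; the variable name agent_subnet_ids and the local name are hypothetical.

```hcl
# Hypothetical variable holding the private subnets the agent already uses.
variable "agent_subnet_ids" {
  description = "Private subnet IDs used by the deployed Monte Carlo Agent."
  type        = list(string)
}

# Read the first subnet to discover which VPC it belongs to.
data "aws_subnet" "agent" {
  id = var.agent_subnet_ids[0]
}

locals {
  # Pass to the collector module as:
  #   existing_vpc_id     = local.agent_vpc_id
  #   existing_subnet_ids = var.agent_subnet_ids
  agent_vpc_id = data.aws_subnet.agent.vpc_id
}
```

Deriving the VPC this way means the subnet list and the VPC ID cannot drift apart.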

Monte Carlo provides a Terraform module that can be used to deploy the OpenTelemetry Collector to AWS. You can review and use this module from the public GitHub repository.

Please make sure you have the Terraform CLI installed (version >=1.9.0) and an active session to your AWS account available in your terminal. You can use the monte-carlo-data/otel-collector/aws module like this:

module "opentelemetry_collector" {
  source  = "monte-carlo-data/otel-collector/aws"
  version = "0.1.1"

  deployment_name                = "Provide any name for the deployment"
  existing_vpc_id                = "Provide a VPC ID from your AWS account"
  existing_subnet_ids            = ["Provide at least two private subnet IDs from your AWS account"]
  telemetry_data_bucket_arn      = "Provide the ARN of an existing S3 bucket to store telemetry trace data"
  existing_security_group_id     = "(Optional, but recommended) Provide a Security Group ID allowing your AI agents to communicate with the OpenTelemetry Collector"
  # These remaining variables will be modified later
  external_id                    = "N/A"
  external_access_principal      = "N/A"
  external_access_principal_type = "AWS"
}
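The CLI version requirement stated above can also be enforced in the configuration itself, so that an older Terraform binary fails fast. This is a minimal sketch; the provider region is a placeholder, and credentials are assumed to come from the active AWS session in your terminal.

```hcl
terraform {
  # Matches the minimum CLI version stated in this guide.
  required_version = ">= 1.9.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Placeholder region; use the region of your Monte Carlo Agent deployment.
provider "aws" {
  region = "us-east-1"
}
```

With this in place, terraform init reports an error if the installed CLI is older than 1.9.0.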

Initialize Terraform:

terraform init

Create the Terraform plan and review the output:

terraform plan

Apply the Terraform plan:

terraform apply

Validate that the deployment succeeded by reviewing the command output and by using the AWS Console to locate the newly created resources.


📘
Note: By default, the OpenTelemetry Collector deployed with the Monte Carlo Agent only allows ingress from resources associated with the security group created by the deployment. You might need to set the existing_security_group_id parameter so that the OpenTelemetry Collector can receive incoming traces from your AI agent. The security group definition will vary depending on your network configuration.

Continue with Step 3 above.

Can I review agent resources and code?

Absolutely! You can find details here:

Component	Repository	Target
Code	https://github.com/monte-carlo-data/apollo-agent	https://hub.docker.com/r/montecarlodata/agent*
Terraform Module	https://github.com/monte-carlo-data/terraform-aws-otel-collector	https://registry.terraform.io/modules/monte-carlo-data/otel-collector/aws/latest
Terraform Resources	https://github.com/monte-carlo-data/mcd-public-resources/tree/main/templates/terraform/aws_agent_with_opentelemetry_collector	N/A

*Note that, due to an AWS limitation, the agent image is also uploaded to and then sourced from AWS ECR when executed on Lambda.

Repository: 590183797493.dkr.ecr.*.amazonaws.com/mcd-agent

What additional AWS resources are deployed for Athena warehouse ingestion?

Monte Carlo supports monitoring a Glue table containing your AI agent traces via an Athena integration. To write traces to a Glue table that is queryable from Athena, additional AWS resources must be deployed. If you're using Terraform, deploy these resources by setting the variable deploy_athena_resources = true on the monte-carlo-data/otel-collector/aws Terraform module. If you're using CloudFormation, deploy the same resources with the additional CloudFormation stack template provided here.

Additional resources:

S3 Resources:
- SQS Queue: Subscribes to the SNS topic to receive notifications when new data arrives
- SNS Topic: (Optional) Created automatically if not provided. Receives S3 event notifications
- S3 Bucket Notifications: (Optional) Created automatically if SNS topic is not provided. Configures S3 to publish events to the SNS topic
Glue Resources:
- Glue Classifier: A grok classifier for parsing telemetry data
- IAM Role for Glue Crawler: Grants permissions to access S3, read from SQS, and use AWS Glue services
- AWS Glue Crawler: Automatically processes new telemetry data in S3 and creates/updates tables in the Glue Data Catalog
Lambda Resources:
- Lambda UDF function: A Lambda function invocable from Athena to access Bedrock models for LLM evaluations
- IAM Role for Lambda UDF: Grants the Lambda function access to Bedrock

To review the details of the additional Athena AWS resources, please review these resources:

Additional FAQs?

Other applicable FAQs for deploying the Monte Carlo Agent to AWS can be found here.