GCP: BigQuery Configuration

How to configure BigQuery for OpenTelemetry instrumentation collection

This guide contains instructions on how to configure BigQuery to ingest AI agent traces from a PubSub Subscription into a table that can be monitored by Monte Carlo.

Configure BigQuery Table

Now that you have the OpenTelemetry Collector deployed along with a PubSub Topic in your GCP account, we can configure a BigQuery table and a PubSub Subscription to write messages from the Topic to the new table.

Run this SQL to create the output table in BigQuery:

-- Create output table for agent traces
CREATE TABLE IF NOT EXISTS `my_project.my_dataset.my_table`
(
  message_id STRING,
  subscription_name STRING,
  attributes JSON,
  publish_time TIMESTAMP,
  data BYTES
)
PARTITION BY DATE(_PARTITIONTIME);

Now, return to your Terraform deployment of the OpenTelemetry Collector and set the bigquery_table_id variable to the identifier of this newly created table (e.g. my_project.my_dataset.my_table).

Applying this change creates the PubSub Subscription on the Topic and configures it to write to your BigQuery table.

module "opentelemetry_collector" {
  source  = "monte-carlo-data/otel-collector/google"
  version = "0.1.1"
  ...
  # Set this variable to your table identifier:
  bigquery_table_id = "my_project.my_dataset.my_table"
}
  1. Initialize Terraform:
terraform init
  2. Create the Terraform plan and review the output:
terraform plan
  3. Apply the Terraform plan:
terraform apply
  4. Validate that the deployment succeeded by reviewing the command output and by using the GCP Console to locate the newly created resources.

FAQs

How can I route traces from one agent to a different table in BigQuery?

If you have multiple agents sending traces to the OpenTelemetry Collector and want the traces from one or more of these agents written to a different BigQuery table, you can achieve this by creating additional PubSub Subscriptions.

First, you must set the service.name attribute in your OpenTelemetry traces to the name of your agent. In the montecarlo-opentelemetry Python library, this is accomplished by providing the agent_name property to the mc.setup(...) method. The value of this attribute is carried in the resource attributes of the trace data published to the Topic, which is what the Transform below inspects.

Next, create an additional PubSub Subscription for each agent whose traces should be written to a different table. When creating the Subscription, use the same upstream Topic and specify a Transform using the template below. The Transform passes through only the traces from the agent named in the script and filters out all others. Configure the Subscription to write to the alternative BigQuery table. Repeat these steps for any additional agents.

function filterByServiceName(message, metadata) {
  // Service name to include in this subscription
  // TODO: update this!
  const desiredServiceName = 'my-ai-agent';

  // Find the service.name attribute on the resource of the first resourceSpans entry
  const serviceNameAttr = JSON.parse(message.data).resourceSpans[0].resource.attributes.find(
    attr => attr.key === "service.name"
  );
  
  // Check if service.name exists and equals the desired service name above
  if (!serviceNameAttr || 
      !serviceNameAttr.value || 
      !serviceNameAttr.value.stringValue || 
      serviceNameAttr.value.stringValue !== desiredServiceName) {
    return null; // Filter out messages that don't match
  }
  
  return message;
}
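
Before attaching the Transform to a Subscription, it can help to exercise it locally. The following is a minimal sketch, assuming Node.js is available and that the message data is a JSON string shaped the way the Transform above reads it (resourceSpans[0].resource.attributes). The makeMessage helper and the sample agent names are hypothetical and exist only for this illustration; real trace payloads will contain many more fields.

```javascript
// Copy of the Transform above, with the service name left at its placeholder value.
function filterByServiceName(message, metadata) {
  const desiredServiceName = 'my-ai-agent';

  const serviceNameAttr = JSON.parse(message.data).resourceSpans[0].resource.attributes.find(
    attr => attr.key === "service.name"
  );

  if (!serviceNameAttr ||
      !serviceNameAttr.value ||
      !serviceNameAttr.value.stringValue ||
      serviceNameAttr.value.stringValue !== desiredServiceName) {
    return null; // Filter out messages that don't match
  }

  return message;
}

// Hypothetical helper (not part of any library): builds a minimal message whose
// data field mimics the payload shape the Transform reads. This shape is an assumption.
function makeMessage(serviceName) {
  const payload = {
    resourceSpans: [
      {
        resource: {
          attributes: [{ key: "service.name", value: { stringValue: serviceName } }]
        },
        scopeSpans: []
      }
    ]
  };
  return { data: JSON.stringify(payload), attributes: {} };
}

const matching = makeMessage('my-ai-agent');
const other = makeMessage('another-agent');

// The second argument (metadata) is unused by this Transform, so an empty object suffices.
const keptResult = filterByServiceName(matching, {});
const droppedResult = filterByServiceName(other, {});

console.log(keptResult === matching ? 'kept' : 'unexpected');
console.log(droppedResult === null ? 'dropped' : 'unexpected');
```

A matching message is returned unchanged, while a message from any other agent yields null and is filtered out. Note that the Transform as written throws if the data is not valid JSON or contains no resourceSpans entries, so it may be worth testing those cases against the payloads your agents actually emit.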