GitHub

Integrate Monte Carlo with GitHub to gain visibility into how code changes impact your data

GitHub Integration

Overview

The GitHub integration allows customers to:

  1. Reduce time to resolution by easily checking potentially relevant pull requests in the context of an incident via PRs overlaid on incident charts. See more under Pull requests documentation.

  2. Prevent bad data changes before they reach production using MC Prevent, Monte Carlo's suite of agents that surfaces data observability context at every stage of the development lifecycle.

  3. Get context on tables via reviewing recent pull request history on the asset page (requires dbt integration).


MC Prevent

MC Prevent is Monte Carlo's suite of agents for preventing bad data changes before they reach production. It covers every stage of the development lifecycle:

| Stage | Tool | What it does |
|---|---|---|
| Development | Code Change | Surfaces table health, alerts, and blast radius inside your AI editor as you write code |
| Code review | PR Agent | Reviews every pull request and posts a risk assessment comment with downstream impact analysis |
| CI/CD | CI Agent | Evaluates the PR Agent's assessment in your pipeline and returns a pass / warn / fail verdict |
| Validation | SQL Notebooks | Interactive, cell-based SQL notebooks for running targeted validation queries against your warehouse |

This doc covers the PR Agent and CI Agent. For the editor plugin, see the Code Change documentation. For validation notebooks, see the SQL Notebooks documentation.


dbt Integrations

Note: Having a dbt integration configured is no longer required for the GitHub integration, but having both provides information for mapping between dbt models and assets for code changes.

Follow docs here to set up your dbt integration.

dbt Cloud

The setup should be automatic. Follow the GitHub Integration Setup steps below to enable.

For legacy data collector customers before v14050, the remote location of dbt projects needs to be configured manually (see dbt Core below).

dbt Core

For dbt Core integrations, the remote location of each dbt project needs to be provided. Go to the Integrations settings page once you have completed the GitHub Integration setup. Under Notifications and Collaboration, the GitHub integration row will show an alert requiring more GitHub information. Click into the editing drawer via the alert or pen icon to input the missing remote URL.

Alternatively, configure the remote location of each dbt project using the GraphQL API:

mutation updateDbtProjectInfo($uuid: UUID!, $remoteUrl: String, $subdirectory: String) {
    updateDbtProjectInfo(uuid: $uuid, remoteUrl: $remoteUrl, subdirectory: $subdirectory) {
        project {
            uuid
        }
    }
}

Parameters:

| Parameter | Description |
|---|---|
| uuid | dbt project UUID |
| remoteUrl | e.g. [email protected]:monte-carlo-data/dbt.git |
| subdirectory | Root directory of the dbt project within the repo (e.g. analytics). Only needed if the dbt project is in a subdirectory. Leave empty for projects at the repo root. |
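
As a minimal sketch, the mutation and its variables can be packaged into a standard GraphQL request body as shown here. The function name `build_update_dbt_project_payload` and the placeholder UUID and repository URL are hypothetical. The API endpoint and authentication headers are not covered in this section, so the sketch stops at building the payload rather than sending it.

```python
import json

# Mutation text copied from the section above.
UPDATE_DBT_PROJECT_INFO = """
mutation updateDbtProjectInfo($uuid: UUID!, $remoteUrl: String, $subdirectory: String) {
    updateDbtProjectInfo(uuid: $uuid, remoteUrl: $remoteUrl, subdirectory: $subdirectory) {
        project {
            uuid
        }
    }
}
"""


def build_update_dbt_project_payload(uuid, remote_url, subdirectory=None):
    """Return a JSON-serializable GraphQL request body (hypothetical helper)."""
    variables = {"uuid": uuid, "remoteUrl": remote_url}
    # Only send subdirectory when the dbt project is not at the repo root.
    if subdirectory:
        variables["subdirectory"] = subdirectory
    return {"query": UPDATE_DBT_PROJECT_INFO, "variables": variables}


if __name__ == "__main__":
    payload = build_update_dbt_project_payload(
        uuid="00000000-0000-0000-0000-000000000000",  # placeholder project UUID
        remote_url="https://github.com/example-org/example-repo",  # placeholder URL
        subdirectory="analytics",
    )
    print(json.dumps(payload, indent=2))
```

The resulting dictionary can be posted as JSON with whichever HTTP client and credentials you already use for the Monte Carlo API.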

GitHub Integration Setup

You can set up the integration by installing an instance of the official MC GitHub App for your organization. If you manage multiple GitHub organizations with code relevant to data collected by MC, you need to install the app for each organization.

Permissions required:

  • Read access to administration and metadata
  • Read and write access to issues and pull requests
  • Repository checks (read and write)
  • Repository contents (read-only)

Steps:

  1. In Monte Carlo, go to Settings → Integrations
  2. Click Add
  3. Under the Code Repositories section, select GitHub
  4. In the GitHub UI, select:
    • The organization
    • Optionally, the specific repositories accessible to MC
  5. Click Install and Authorize
  6. You will be redirected back to Monte Carlo

If you have the owner role in GitHub, the integration will appear immediately in Settings → Integrations → Notifications and Collaborations.

If you are not the owner, a request is sent to the GitHub account owner for approval. The integration will appear once they approve.

Once set up, MC will start collecting pull requests that merged after the integration setup time. MC does not have access to historical PRs.


PR Agent

Feature flag required

The PR Agent requires the pr_agent flag to be enabled on your account. This flag is available in Monte Carlo Settings > AI Agents.

If the GitHub integration is installed but pr_agent is not enabled, you will continue to receive the standard downstream impact report on PRs (if that feature is enabled for your account), but the PR Agent will not trigger. The CI Agent requires the PR Agent to be enabled; if pr_agent is not enabled and the CI Agent is configured, the CI check will pass after polling times out, but no risk assessment will be produced.


Once the GitHub integration is installed and pr_agent is enabled, Monte Carlo automatically reviews every pull request and posts a risk assessment.

The PR Agent evaluates:

  • Which Monte Carlo-monitored tables are affected by the PR's file changes
  • Downstream lineage and blast radius
  • Active alerts on impacted tables

Triggers:

  • Automatically when a PR is opened
  • Commenting mc review on the PR will retrigger the agent

Output:

Monte Carlo posts a comment on the PR with a plain-language risk summary.


CI Agent

The CI Agent is an optional GitHub Action that acts as a pipeline gate. It evaluates the PR Agent's risk assessment and returns a pass, warn, or fail verdict as a GitHub Check Run.

When combined with GitHub branch protection rules, the CI Agent can block high-risk PRs from merging until they are reviewed or overridden.

PR Agent vs. CI Agent

The PR Agent runs automatically once the GitHub integration is installed and pr_agent is enabled: you get AI risk assessment comments on every PR with no extra configuration.

The CI Agent is the optional next step: add the GitHub Action to your CI pipeline if you want a formal check that can block merges based on the PR Agent's assessment. The PR Agent must be enabled as a prerequisite, because the CI Agent reads its assessment to produce a verdict.

Setup for GitHub Actions

See the mc-prevent-action README for setup instructions.

Setup for CircleCI

See the mc-prevent-orb repository README for setup instructions.

How it works

  1. MC Prevent detects the pull request from CircleCI environment variables
  2. Calls the Monte Carlo MC Prevent API with the repo, PR number, and commit SHA
  3. If no assessment is available yet (the PR agent may still be analyzing), waits up to max-wait seconds
  4. If a cached verdict from a previous commit exists, reuses it immediately
  5. Displays the verdict and a human-readable summary explaining the risk
  6. Makes the raw API response available in a separate collapsed step ("Raw API response")

Verdicts

MC Prevent returns one of three verdicts based on the risk assessment:

| Verdict | What it means | CI job behavior (fail-on-error: true) | Check run on PR |
|---|---|---|---|
| pass | No significant risk detected | Job passes (green) | Green |
| warn | Risk detected; review recommended | Job fails (red), step auto-expands | Grey (neutral) |
| fail | High risk; merge not recommended | Job fails (red), step auto-expands | Red |

Note on CI job vs check run: The CI job can only show green or red. The "MC Prevent CI Gate Result" check run posted on the PR shows the actual severity: green for pass, grey for warn, red for fail. If you configure branch protection, require the check run (not the CI job) for accurate gating.

How the verdict is calculated

MC Prevent receives a risk assessment from the MC PR Agent for each data asset affected by the PR. It evaluates each asset against a decision matrix and takes the worst verdict across all assets.

Decision matrix (rules are evaluated top-to-bottom, first match wins):

| # | Condition | Verdict |
|---|---|---|
| 1 | Breaking change AND downstream key assets depend on it | fail |
| 2 | Active alerts highly correlated with the change | fail |
| 3 | Breaking change AND no key assets downstream | warn |
| 4 | Active alerts exist but low/no correlation with the change | warn |
| 5 | No monitor coverage AND key assets downstream | warn |
| 6 | Additive change, no active alerts | pass |
| 7 | No qualifying data assets identified | pass |

Signals used per asset (provided by the PR agent):

| Signal | Description |
|---|---|
| change_type | How the asset is affected: breaking or additive |
| alert_correlation | Whether active alerts are related to the change: high, low, or none |
| active_alerts | Number of unresolved alerts on the asset |
| downstream_key_assets | Key assets (dashboards, critical tables) that depend on this asset |
| monitor_coverage_gaps | Columns or aspects of the asset that have no monitor coverage |

Multi-asset PRs: When a PR affects multiple data assets, each is evaluated independently. The final verdict is the worst across all assets. For example, if one asset is fail and another is pass, the PR verdict is fail.
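
The decision matrix and the worst-verdict rule can be sketched in a few lines of Python. This is an illustration of the documented rules, not Monte Carlo's implementation: the `AssetSignals` class and both function names are hypothetical, and rule 5 is read here as "the asset has at least one monitor coverage gap", which is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

# Severity order used to pick the worst verdict across assets.
SEVERITY = {"pass": 0, "warn": 1, "fail": 2}


@dataclass
class AssetSignals:
    """Per-asset signals named in the table above (hypothetical container)."""
    change_type: str                   # "breaking" or "additive"
    alert_correlation: str = "none"    # "high", "low", or "none"
    active_alerts: int = 0
    downstream_key_assets: List[str] = field(default_factory=list)
    monitor_coverage_gaps: List[str] = field(default_factory=list)


def asset_verdict(a: AssetSignals) -> str:
    """Evaluate rules 1-6 top to bottom; the first match wins."""
    breaking = a.change_type == "breaking"
    has_key_assets = bool(a.downstream_key_assets)
    has_alerts = a.active_alerts > 0
    if breaking and has_key_assets:                    # rule 1
        return "fail"
    if has_alerts and a.alert_correlation == "high":   # rule 2
        return "fail"
    if breaking:                                       # rule 3 (no key assets left)
        return "warn"
    if has_alerts:                                     # rule 4 (low or no correlation)
        return "warn"
    if a.monitor_coverage_gaps and has_key_assets:     # rule 5 (assumed reading)
        return "warn"
    return "pass"                                      # rule 6


def pr_verdict(assets: List[AssetSignals]) -> str:
    """Worst verdict across all assets; rule 7 covers the empty case."""
    if not assets:
        return "pass"
    return max((asset_verdict(a) for a in assets), key=SEVERITY.__getitem__)
```

Because the rules are ordered, a breaking change with highly correlated alerts but no downstream key assets still fails: it misses rule 1 but matches rule 2 before reaching rule 3.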

What fail-on-error controls

| Setting | Behavior |
|---|---|
| fail-on-error: true (default) | Warn and fail both cause the CI job to exit non-zero (red). This draws attention to risks: the failed step auto-expands in CircleCI so the summary is immediately visible. |
| fail-on-error: false | The CI job always passes (green). The verdict is only visible in the job output and the check run on the PR. Use this for a silent, non-blocking setup. |
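
To make the interaction between the verdict and fail-on-error concrete, here is a small sketch of the mapping described above. The function name `ci_exit_code` and the `CHECK_RUN_COLOR` dictionary are hypothetical; the colors are taken from the Verdicts table, and the exit code value 1 is an assumption standing in for "non-zero".

```python
# Check run color on the PR, as listed in the Verdicts table.
CHECK_RUN_COLOR = {"pass": "green", "warn": "grey", "fail": "red"}


def ci_exit_code(verdict: str, fail_on_error: bool = True) -> int:
    """Return the CI job exit code: 0 shows green, non-zero shows red."""
    if verdict not in CHECK_RUN_COLOR:
        raise ValueError(f"unknown verdict: {verdict!r}")
    # With fail-on-error enabled, warn and fail both turn the job red.
    if fail_on_error and verdict in ("warn", "fail"):
        return 1
    # Otherwise the job stays green and the verdict is only reported.
    return 0
```

The sketch shows why a warn verdict looks identical to a fail verdict in the CI job and only differs in the check run color.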

Behavior by setup stage

MC Prevent is designed for progressive adoption. Once the allow-list is configured, it never blocks your CI due to incomplete setup, so you can configure the remaining pieces at your own pace.

| Setup stage | CI result | What you'll see |
|---|---|---|
| Orb referenced, URL not allow-listed | Pipeline will not run | The org admin must add the URL prefix to the allow-list first (see Step 1). |
| Allow-list configured, credentials not yet set | Pass (green) | Job skips instantly; no API call is made |
| Credentials configured, PR agent not yet enabled | Pass (green) | Job polls for up to max-wait seconds, then passes with no assessment |
| Credentials configured, PR agent enabled | Pass / Warn / Fail | Full risk verdict based on the PR agent's analysis |

Tip: To avoid the polling wait in stage three, enable the PR agent in Monte Carlo → Settings → AI Agents before (or shortly after) adding your API credentials.

Override

Add the mc-override label to your pull request to bypass MC Prevent.

  • The verdict returns pass regardless of risk
  • The "MC Prevent CI Gate Result" check run on the PR immediately flips to green; no commit or CI re-run is needed
  • All overrides are logged for audit

Thresholds and customization

Current thresholds (e.g., >5 downstream tables = high sensitivity, >50 write queries = elevated) are fixed defaults. Different teams have different tolerances, and configurable thresholds are on the roadmap.

In the meantime, if defaults consistently produce verdicts that don't match your team's intuition, use mc-override and share feedback with your Monte Carlo account team; it directly informs how we tune this.

For troubleshooting specific to GitHub Actions and CircleCI, check out their respective repositories.


FAQ

Q: I have a dbt Core integration and I'm not sure my remote URL and subdirectory are correct. What's the right format?

Remote URL should be one of:

  • https://github.com/<org>/<repo>
  • git://github.com/<org>/<repo>.git

Subdirectory should be the root directory of the dbt models within the repo, and is only needed if the dbt project is in a subdirectory. For example:

  • Model path analytics/models/foo/bar.sql → subdirectory is analytics
  • Model path models/foo/bar.sql → leave subdirectory empty
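
If you want to sanity-check your values before saving them, the two rules above can be sketched as follows. Both helper names are hypothetical, the regular expressions cover only the two URL shapes listed in this answer, and the subdirectory helper assumes your models live in a directory named `models` (the dbt default).

```python
import re
from pathlib import PurePosixPath

# The two remote URL shapes listed above.
REMOTE_URL_PATTERNS = (
    re.compile(r"^https://github\.com/[^/\s]+/[^/\s]+$"),
    re.compile(r"^git://github\.com/[^/\s]+/[^/\s]+\.git$"),
)


def is_valid_remote_url(url: str) -> bool:
    """True if the URL matches one of the two documented shapes."""
    return any(p.match(url) for p in REMOTE_URL_PATTERNS)


def subdirectory_for(model_path: str, models_dir: str = "models") -> str:
    """Return the path before the models directory, or "" at the repo root."""
    parts = PurePosixPath(model_path).parts
    if models_dir not in parts:
        raise ValueError(f"{models_dir!r} not found in {model_path!r}")
    return "/".join(parts[: parts.index(models_dir)])


if __name__ == "__main__":
    print(subdirectory_for("analytics/models/foo/bar.sql"))
    print(repr(subdirectory_for("models/foo/bar.sql")))
```

An empty string from `subdirectory_for` corresponds to leaving the subdirectory field empty.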

Q: I just set up the integration. Why don't I see pull requests showing up yet?

MC only collects PRs that merged after the integration setup time. It does not have access to historical PRs, so it may take a few hours or days for PRs to start appearing depending on how frequently your team merges.


Q: Why can't I see pull requests for schema change alerts?

Schema change monitors don't use machine learning or GitHub integration. They compare table schema hour-over-hour and identify differences, so they will not show GitHub pull requests.


Q: Why can't I see pull requests in the asset summary page?

dbt integration is required to see pull request history on the asset page.


Q: My PRs are not showing up in Monte Carlo and my GitHub organization uses an IP allowlist. Do I need to allowlist any Monte Carlo IPs?

Yes. Add Monte Carlo's SaaS public IP address to your GitHub organization's IP allowlist. You can find the IP address on the Account Information page.


Q: The CI Agent action is waiting but never getting an assessment. What's happening?

The most common cause is that the PR Agent is not yet enabled: the CI Agent requires the PR Agent to run first. Confirm the flag is enabled in Monte Carlo Settings → AI Agents.

If pr_agent is enabled, the PR Agent runs asynchronously after a PR is opened. The CI Agent polls every 30 seconds for up to 5 minutes by default. If no assessment is found within that window, it defaults to pass. Increase the max-wait input if your PRs consistently take longer to assess.
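
The polling behavior described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the CI Agent's actual code: `wait_for_verdict` and its `fetch` callback are hypothetical, with `fetch` standing in for whatever call retrieves the assessment and returning `None` while none is available. The 30-second interval and 300-second limit mirror the defaults mentioned in this answer.

```python
import time
from typing import Callable, Optional


def wait_for_verdict(
    fetch: Callable[[], Optional[str]],
    max_wait: float = 300.0,   # 5 minutes
    interval: float = 30.0,    # poll every 30 seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until a verdict arrives or max_wait elapses, then default to pass."""
    deadline = clock() + max_wait
    while True:
        verdict = fetch()
        if verdict is not None:
            return verdict
        remaining = deadline - clock()
        if remaining <= 0:
            # No assessment within the window: default to pass.
            return "pass"
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting. Raising `max_wait` lengthens the window, which matches the advice to increase the max-wait input when assessments are slow.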


Q: I set up the GitHub integration but didn't add the CI Agent GitHub Action. Will I still get risk assessments?

Yes, you will receive a PR Agent risk assessment comment on every PR. The CI Agent GitHub Action is only needed if you want a pipeline gate that can block merges based on that assessment.