Validation Notebook Generation (public preview)

AI-powered generation of SQL validation notebooks for dbt changes.

In preview
This feature is in preview. See the Integration & Feature Lifecycles documentation for details on what this means.

Overview

The Validation Notebook Generator is a Claude Code skill that automatically creates SQL Notebooks for validating dbt model changes. Given a GitHub PR or local dbt repository, it analyzes changed models and generates a notebook with targeted validation queries comparing baseline and development data.

Prerequisites

Claude Code installed and configured
mc-generate-validation-notebook plugin installed (from the mcd-agent-toolkit marketplace)
MC Bridge running and connected to your warehouse (see MC Bridge setup)
Access to the dbt repository (GitHub token for PR mode, or local clone)

Installation

Install the Monte Carlo validation notebook plugin from within Claude Code:

/plugin marketplace add monte-carlo-data/mcd-agent-toolkit
/plugin install mc-generate-validation-notebook@mcd-agent-toolkit

This adds the mc-generate-validation-notebook:generate-validation-notebook skill to your Claude Code environment.

Usage

Run the skill from Claude Code:

/mc-generate-validation-notebook:generate-validation-notebook <target>

Where <target> is either:

A GitHub PR URL: https://github.com/your-org/dbt/pull/123
A local directory path: . or /path/to/dbt/repo
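
As a rough illustration of how the two target forms differ, the sketch below classifies a target string as either a GitHub PR URL or a local path. The function name `classify_target` and the returned dictionary shape are hypothetical, chosen for this example only; the skill's actual parsing logic is not documented here and may differ.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper for illustration; not part of the skill's published interface.
PR_PATH = re.compile(r"^/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)/?$")


def classify_target(target: str) -> dict:
    """Return a 'pr' description for a GitHub PR URL, otherwise a 'local' one."""
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https") and parsed.netloc == "github.com":
        match = PR_PATH.match(parsed.path)
        if match is None:
            raise ValueError(f"not a pull request URL: {target}")
        return {
            "mode": "pr",
            "owner": match["owner"],
            "repo": match["repo"],
            "number": int(match["number"]),
        }
    # Anything that is not a github.com URL is treated as a filesystem path.
    return {"mode": "local", "path": target}


print(classify_target("https://github.com/your-org/dbt/pull/123"))
print(classify_target("."))
```

In PR mode the owner, repository, and PR number are what a GitHub API lookup would need; in local mode the path is used as given.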

The skill analyzes changed dbt models, generates validation SQL, builds a notebook YAML, and outputs an import URL that opens the notebook directly in Monte Carlo.

What Gets Generated

The generated notebook includes:

Parameter cells — prod_db and dev_db parameters for switching between your production and development databases
Markdown summary — overview of the PR, changed models, and what each validation checks
SQL validation queries — targeted queries organized by validation pattern

Notebook Structure

1. Parameter: prod_db (e.g., ANALYTICS)
2. Parameter: dev_db (e.g., PERSONAL_JSMITH)
3. Markdown: PR summary and changed models
4. SQL queries: Validation checks per model

Validation Patterns

The generator produces different queries depending on the nature of each model change:

For Modified Models

Pattern	Description	When Generated
Row Count	Total row count in prod	Always
Row Count Comparison	Compare row counts between prod and dev	Always
Segmentation Distribution	Distribution of top segmentation fields	Always
Changed Field Distribution	Value distribution for modified columns	When columns are changed
Before/After Comparison	Compare field distributions between prod and dev	When columns are changed
NULL Rate Check	NULL counts and percentages for new/modified columns	When new columns or COALESCE changes detected
Uniqueness Check	Duplicate detection on unique keys	When model has unique_key or is incremental
Time-Axis Continuity	Daily row counts over time	When a time axis column is detected
Sample Data Preview	Preview rows from the table	Always
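
To make two of the patterns above concrete, here is a minimal sketch of a Row Count Comparison and a Uniqueness Check. It uses Python's built-in sqlite3 module, with two attached in-memory databases standing in for the prod_db and dev_db parameters. The table `orders`, its columns, and the exact SQL text are assumptions for illustration; the queries the generator actually emits depend on your warehouse and model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two separate in-memory databases stand in for the prod_db and dev_db parameters.
conn.execute("ATTACH DATABASE ':memory:' AS prod_db")
conn.execute("ATTACH DATABASE ':memory:' AS dev_db")

# Hypothetical sample data: the dev build accidentally duplicates order_id 3.
sample_rows = {
    "prod_db": [(1, "a"), (2, "b"), (3, "c")],
    "dev_db": [(1, "a"), (2, "b"), (3, "c"), (3, "c")],
}
for db, rows in sample_rows.items():
    conn.execute(f"CREATE TABLE {db}.orders (order_id INTEGER, status TEXT)")
    conn.executemany(f"INSERT INTO {db}.orders VALUES (?, ?)", rows)

# Row Count Comparison: one row with the prod and dev counts side by side.
ROW_COUNT_COMPARISON = """
SELECT
    (SELECT COUNT(*) FROM prod_db.orders) AS prod_rows,
    (SELECT COUNT(*) FROM dev_db.orders)  AS dev_rows
"""

# Uniqueness Check: any key value that appears more than once in dev.
UNIQUENESS_CHECK = """
SELECT order_id, COUNT(*) AS occurrences
FROM dev_db.orders
GROUP BY order_id
HAVING COUNT(*) > 1
"""

prod_rows, dev_rows = conn.execute(ROW_COUNT_COMPARISON).fetchone()
duplicates = conn.execute(UNIQUENESS_CHECK).fetchall()
print(prod_rows, dev_rows, duplicates)
```

An empty result from the uniqueness query means the key is unique. Here the extra dev row shows up both as a row count difference and as a duplicate key.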

For New Models

New models skip comparison queries (no baseline exists) and instead focus on:

Row count and segmentation distribution against dev_db only
NULL rate check across all output columns
Sample data preview
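
A NULL rate check across all output columns can be sketched as a single aggregate query with one counter per column. The example below again uses sqlite3 as a stand-in warehouse; the table `new_model`, its columns, and the helper `null_rate_sql` are hypothetical and only illustrate the shape of such a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical new model that exists only in the development database.
conn.execute("CREATE TABLE new_model (id INTEGER, revenue REAL, region TEXT)")
conn.executemany(
    "INSERT INTO new_model VALUES (?, ?, ?)",
    [(1, 10.0, "EU"), (2, None, "US"), (3, 5.5, None), (4, None, "EU")],
)

columns = ["id", "revenue", "region"]


def null_rate_sql(table: str, cols: list) -> str:
    """Build one query that returns the total row count and a NULL count per column."""
    counters = ", ".join(
        f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) AS {col}_nulls" for col in cols
    )
    return f"SELECT COUNT(*) AS total_rows, {counters} FROM {table}"


row = conn.execute(null_rate_sql("new_model", columns)).fetchone()
total_rows = row[0]
# Convert the per-column NULL counts into rates between 0 and 1.
null_rates = {col: nulls / total_rows for col, nulls in zip(columns, row[1:])}
print(total_rows, null_rates)
```

Scanning the table once for all columns keeps the check cheap on large models. The sketch assumes a non-empty table, since the rate divides by the row count.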

Customization

Additional Instructions

You can provide extra context when invoking the skill:

/mc-generate-validation-notebook:generate-validation-notebook https://github.com/org/dbt/pull/123

Additional instructions: Focus on the revenue columns and check for negative values.

The AI incorporates your instructions into the generated validation queries.

Schema Resolution

The generator automatically resolves output schemas for each dbt model using:

Model-level {{ config(schema='...') }} overrides
dbt_project.yml path-based routing rules
dbt's default generate_schema_name behavior, which appends the custom schema to the target schema (e.g., with target schema PROD, custom schema stage becomes PROD_STAGE)
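
The order of these three sources can be sketched as follows. The first function mirrors dbt's default generate_schema_name macro: it returns the target schema when no custom schema is set, otherwise the target schema, an underscore, and the custom schema. Everything else is a simplification for illustration: `resolve_schema` is a hypothetical name, dbt_project.yml routing is reduced to a path-prefix lookup, and the final upper-casing is an assumption made to match the PROD_STAGE example. The generator's real resolution logic may differ.

```python
from typing import Dict, Optional


def default_generate_schema_name(target_schema: str, custom_schema: Optional[str]) -> str:
    """Mirror dbt's default macro: target schema alone, or target_custom."""
    if custom_schema is None:
        return target_schema
    return f"{target_schema}_{custom_schema.strip()}"


def resolve_schema(
    model_path: str,
    model_config_schema: Optional[str],
    project_path_schemas: Dict[str, str],
    target_schema: str,
) -> str:
    """Hypothetical resolver: model-level config first, then path rules, then the default."""
    custom = model_config_schema
    if custom is None:
        # Simplified stand-in for dbt_project.yml routing: longest matching folder wins.
        matches = [
            folder
            for folder in project_path_schemas
            if model_path.startswith(folder.rstrip("/") + "/")
        ]
        if matches:
            custom = project_path_schemas[max(matches, key=len)]
    # Upper-casing is an assumption, used here to match the PROD_STAGE example.
    return default_generate_schema_name(target_schema, custom).upper()


rules = {"models/staging": "stage"}
print(resolve_schema("models/staging/stg_orders.sql", None, rules, "prod"))
print(resolve_schema("models/staging/stg_orders.sql", "finance", rules, "prod"))
print(resolve_schema("models/marts/fct_orders.sql", None, rules, "prod"))
```

A project that overrides generate_schema_name with its own macro would not follow the default concatenation shown here.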
