Validation Notebook Generation (public preview)
AI-powered generation of SQL validation notebooks for dbt changes.
This feature is in preview. See the Integration & Feature Lifecycles documentation for more information on what this means.
Overview
The Validation Notebook Generator is a Claude Code skill that automatically creates SQL Notebooks for validating dbt model changes. Given a GitHub PR or local dbt repository, it analyzes changed models and generates a notebook with targeted validation queries comparing baseline and development data.
Prerequisites
- Claude Code installed and configured
- mcd-skills plugin installed (provides the monte-carlo:generate-validation-notebook skill)
- MC Bridge running and connected to your warehouse (see MC Bridge setup)
- Access to the dbt repository (GitHub token for PR mode, or local clone)
Installation
Install the Monte Carlo skills plugin from within Claude Code:
```
/plugin marketplace add monte-carlo-data/mcd-skills
/plugin install monte-carlo@mcd-skills
```
This adds the monte-carlo:generate-validation-notebook skill to your Claude Code environment.
Usage
Run the skill from Claude Code:
```
/monte-carlo:generate-validation-notebook <target>
```
Where <target> is either:
- A GitHub PR URL: https://github.com/your-org/dbt/pull/123
- A local directory path: . or /path/to/dbt/repo
The skill analyzes changed dbt models, generates validation SQL, builds a notebook YAML, and outputs an import URL that opens the notebook directly in Monte Carlo.
What Gets Generated
The generated notebook includes:
- Parameter cells – prod_db and dev_db parameters for switching between your production and development databases
- Markdown summary – overview of the PR, changed models, and what each validation checks
- SQL validation queries – targeted queries organized by validation pattern
Notebook Structure
1. Parameter: prod_db (e.g., ANALYTICS)
2. Parameter: dev_db (e.g., PERSONAL_JSMITH)
3. Markdown: PR summary and changed models
4. SQL queries: Validation checks per model
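The SQL cells reference the parameter cells rather than hard-coding database names. As an illustrative sketch (the exact templating syntax depends on Monte Carlo's SQL Notebooks; the {{prod_db}} placeholder and the orders table here are assumptions, not output from the generator):

```sql
-- Hypothetical validation cell: count rows in the production copy of a
-- model, with the database supplied by the prod_db parameter cell.
SELECT COUNT(*) AS prod_row_count
FROM {{prod_db}}.analytics.orders;
```

Because the database is parameterized, the same notebook can be re-pointed at a different dev database by editing a single parameter cell.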
Validation Patterns
The generator produces different queries depending on the nature of each model change:
For Modified Models
| Pattern | Description | When Generated |
|---|---|---|
| Row Count | Total row count in prod | Always |
| Row Count Comparison | Compare row counts between prod and dev | Always |
| Segmentation Distribution | Distribution of top segmentation fields | Always |
| Changed Field Distribution | Value distribution for modified columns | When columns are changed |
| Before/After Comparison | Compare field distributions between prod and dev | When columns are changed |
| NULL Rate Check | NULL counts and percentages for new/modified columns | When new columns or COALESCE changes detected |
| Uniqueness Check | Duplicate detection on unique keys | When model has unique_key or is incremental |
| Time-Axis Continuity | Daily row counts over time | When a time axis column is detected |
| Sample Data Preview | Preview rows from the table | Always |
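To make the patterns above concrete, here is a sketch of what a Row Count Comparison query might look like for a modified model. The schema and table names (analytics.orders) are hypothetical, and {{prod_db}}/{{dev_db}} stand in for the notebook's parameter cells; the actual generated SQL will vary by model and warehouse:

```sql
-- Illustrative Row Count Comparison: prod vs. dev copies of one model.
SELECT
  (SELECT COUNT(*) FROM {{prod_db}}.analytics.orders) AS prod_rows,
  (SELECT COUNT(*) FROM {{dev_db}}.analytics.orders)  AS dev_rows,
  (SELECT COUNT(*) FROM {{dev_db}}.analytics.orders)
    - (SELECT COUNT(*) FROM {{prod_db}}.analytics.orders) AS row_diff;
```

A large unexplained row_diff is usually the first signal that a change to filters, joins, or incremental logic needs closer inspection.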
For New Models
New models skip comparison queries (no baseline exists) and instead focus on:
- Row count and segmentation distribution against dev_db only
- NULL rate check across all output columns
- Sample data preview
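For a new model, the NULL rate check runs against the dev database only, since no production baseline exists. A hedged sketch, assuming a Snowflake-style warehouse (COUNT_IF) and a hypothetical model and column name:

```sql
-- Illustrative NULL rate check for a new model, dev side only.
SELECT
  COUNT(*) AS total_rows,
  COUNT_IF(customer_id IS NULL) AS customer_id_nulls,
  ROUND(100.0 * COUNT_IF(customer_id IS NULL) / NULLIF(COUNT(*), 0), 2)
    AS customer_id_null_pct
FROM {{dev_db}}.analytics.new_orders_model;
```

The generator emits one such aggregate per output column, so an unexpectedly high NULL percentage on any column surfaces immediately.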
Customization
Additional Instructions
You can provide extra context when invoking the skill:
```
/monte-carlo:generate-validation-notebook https://github.com/org/dbt/pull/123
Additional instructions: Focus on the revenue columns and check for negative values.
```
The AI incorporates your instructions into the generated validation queries.
Schema Resolution
The generator automatically resolves output schemas for each dbt model using:
- Model-level {{ config(schema='...') }} overrides
- dbt_project.yml path-based routing rules
- dbt's default generate_schema_name behavior (e.g., custom schema stage becomes PROD_STAGE)
