Validation Notebook Generation (public preview)

AI-powered generation of SQL validation notebooks for dbt changes.

In preview

This feature is in preview. See the Integration & Feature Lifecycles documentation for details on what this means.

Overview

The Validation Notebook Generator is a Claude Code skill that automatically creates SQL Notebooks for validating dbt model changes. Given a GitHub PR or local dbt repository, it analyzes changed models and generates a notebook with targeted validation queries comparing baseline and development data.

Prerequisites

  • Claude Code installed and configured
  • mcd-skills plugin installed (provides the monte-carlo:generate-validation-notebook skill)
  • MC Bridge running and connected to your warehouse (see MC Bridge setup)
  • Access to the dbt repository (GitHub token for PR mode, or local clone)

Installation

Install the Monte Carlo skills plugin from within Claude Code:

/plugin marketplace add monte-carlo-data/mcd-skills
/plugin install monte-carlo@mcd-skills

This adds the monte-carlo:generate-validation-notebook skill to your Claude Code environment.

Usage

Run the skill from Claude Code:

/monte-carlo:generate-validation-notebook <target>

Where <target> is either:

  • A GitHub PR URL: https://github.com/your-org/dbt/pull/123
  • A local directory path: . or /path/to/dbt/repo
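
To make the two target forms concrete, here is a minimal Python sketch of how a target string could be classified. The helper `classify_target` and its return shape are hypothetical and written only for illustration; they are not the skill's actual implementation.

```python
import re
from pathlib import Path

# Hypothetical pattern for a GitHub pull request URL (assumption for illustration).
PR_URL = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)/?$"
)


def classify_target(target: str) -> dict:
    """Return a small description of the target: PR mode or local mode."""
    match = PR_URL.match(target.strip())
    if match:
        return {
            "mode": "pr",
            "owner": match["owner"],
            "repo": match["repo"],
            "number": int(match["number"]),
        }
    # Anything that is not a PR URL is treated as a local directory path.
    return {"mode": "local", "path": str(Path(target).expanduser())}


print(classify_target("https://github.com/your-org/dbt/pull/123"))
print(classify_target("."))
```

In this sketch, anything that does not look like a pull request URL falls through to local mode, which mirrors the two options listed above.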

The skill analyzes changed dbt models, generates validation SQL, builds a notebook YAML, and outputs an import URL that opens the notebook directly in Monte Carlo.

What Gets Generated

The generated notebook includes:

  • Parameter cells: prod_db and dev_db parameters for switching between your production and development databases
  • Markdown summary: overview of the PR, changed models, and what each validation checks
  • SQL validation queries: targeted queries organized by validation pattern

Notebook Structure

1. Parameter: prod_db (e.g., ANALYTICS)
2. Parameter: dev_db (e.g., PERSONAL_JSMITH)
3. Markdown: PR summary and changed models
4. SQL queries: Validation checks per model
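
As a rough illustration of how the two parameters let a single query target both databases, the Python sketch below builds a row count comparison. The function name, the schema and table names, and the use of plain string formatting are assumptions made for this example; the notebook's own parameter substitution syntax is not described here.

```python
def row_count_comparison_sql(prod_db: str, dev_db: str, schema: str, table: str) -> str:
    """Build one query that reports the row count of the same table in prod and dev."""
    return (
        f"SELECT 'prod' AS source, COUNT(*) AS row_count FROM {prod_db}.{schema}.{table}\n"
        "UNION ALL\n"
        f"SELECT 'dev' AS source, COUNT(*) AS row_count FROM {dev_db}.{schema}.{table}"
    )


# Database names come from the examples above; schema and table are hypothetical.
print(row_count_comparison_sql("ANALYTICS", "PERSONAL_JSMITH", "PROD_STAGE", "ORDERS"))
```

Changing only the prod_db or dev_db value points the same query at a different pair of databases, which is the purpose of the parameter cells.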

Validation Patterns

The generator produces different queries depending on the nature of each model change:

For Modified Models

| Pattern | Description | When Generated |
| --- | --- | --- |
| Row Count | Total row count in prod | Always |
| Row Count Comparison | Compare row counts between prod and dev | Always |
| Segmentation Distribution | Distribution of top segmentation fields | Always |
| Changed Field Distribution | Value distribution for modified columns | When columns are changed |
| Before/After Comparison | Compare field distributions between prod and dev | When columns are changed |
| NULL Rate Check | NULL counts and percentages for new/modified columns | When new columns or COALESCE changes detected |
| Uniqueness Check | Duplicate detection on unique keys | When model has unique_key or is incremental |
| Time-Axis Continuity | Daily row counts over time | When a time axis column is detected |
| Sample Data Preview | Preview rows from the table | Always |
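
The "When Generated" conditions can be read as a simple selection rule. The Python sketch below encodes them for a modified model; the `ModelChange` class and its field names are hypothetical, and the generator's real detection logic (for example, how it spots a time axis column) is not documented here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelChange:
    """Hypothetical summary of what changed in one modified dbt model."""
    changed_columns: List[str] = field(default_factory=list)
    new_columns: List[str] = field(default_factory=list)
    coalesce_changed: bool = False
    has_unique_key: bool = False
    is_incremental: bool = False
    time_axis_column: Optional[str] = None


def patterns_for_modified_model(change: ModelChange) -> List[str]:
    """Return the validation patterns that apply, in the order listed above."""
    patterns = ["Row Count", "Row Count Comparison", "Segmentation Distribution"]
    if change.changed_columns:
        patterns += ["Changed Field Distribution", "Before/After Comparison"]
    if change.new_columns or change.coalesce_changed:
        patterns.append("NULL Rate Check")
    if change.has_unique_key or change.is_incremental:
        patterns.append("Uniqueness Check")
    if change.time_axis_column:
        patterns.append("Time-Axis Continuity")
    patterns.append("Sample Data Preview")  # always generated
    return patterns


print(patterns_for_modified_model(ModelChange(changed_columns=["revenue"], is_incremental=True)))
```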

For New Models

New models skip comparison queries (no baseline exists) and instead focus on:

  • Row count and segmentation distribution against dev_db only
  • NULL rate check across all output columns
  • Sample data preview
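
For a new model, a NULL rate check across all output columns might look like the query built below. This is a sketch under assumptions: the function name, the column list, and the exact SQL shape are illustrative, and the queries the generator actually emits may differ.

```python
from typing import List


def null_rate_sql(dev_db: str, schema: str, table: str, columns: List[str]) -> str:
    """Build a query reporting NULL counts and percentages per column in dev only."""
    if not columns:
        raise ValueError("at least one column is required")
    selects = ["COUNT(*) AS total_rows"]
    for col in columns:
        null_count = f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END)"
        selects.append(f"{null_count} AS {col}_null_count")
        # NULLIF guards against division by zero on an empty table.
        selects.append(
            f"ROUND(100.0 * {null_count} / NULLIF(COUNT(*), 0), 2) AS {col}_null_pct"
        )
    return "SELECT\n    " + ",\n    ".join(selects) + f"\nFROM {dev_db}.{schema}.{table}"


# Schema, table, and column names are hypothetical.
print(null_rate_sql("PERSONAL_JSMITH", "PROD_STAGE", "NEW_ORDERS", ["order_id", "revenue"]))
```

Because no baseline exists for a new model, the query reads from dev_db alone and has no prod counterpart.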

Customization

Additional Instructions

You can provide extra context when invoking the skill:

/monte-carlo:generate-validation-notebook https://github.com/org/dbt/pull/123

Additional instructions: Focus on the revenue columns and check for negative values.

The AI incorporates your instructions into the generated validation queries.

Schema Resolution

The generator automatically resolves output schemas for each dbt model using:

  1. Model-level {{ config(schema='...') }} overrides
  2. dbt_project.yml path-based routing rules
  3. dbt's default generate_schema_name behavior (e.g., with a target schema of PROD, the custom schema stage becomes PROD_STAGE)
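
The resolution steps above can be sketched in Python as follows. The function names and arguments are hypothetical. The sketch assumes that a model-level config takes priority over a dbt_project.yml rule, and it mirrors dbt's default generate_schema_name macro, which returns the target schema when no custom schema is set and otherwise joins the two with an underscore.

```python
from typing import Optional


def default_generate_schema_name(target_schema: str, custom_schema: Optional[str]) -> str:
    """Mirror dbt's default macro: target schema alone, or target_custom."""
    if custom_schema is None:
        return target_schema
    return f"{target_schema}_{custom_schema.strip()}"


def resolve_schema(
    target_schema: str,
    model_config_schema: Optional[str] = None,
    project_schema: Optional[str] = None,
) -> str:
    """Pick the custom schema (model config first, then project rule) and apply the default macro."""
    custom = model_config_schema if model_config_schema is not None else project_schema
    return default_generate_schema_name(target_schema, custom)


print(resolve_schema("PROD", project_schema="STAGE"))
print(resolve_schema("PROD", model_config_schema="MARTS", project_schema="STAGE"))
print(resolve_schema("PROD"))
```

A project that overrides generate_schema_name with its own macro would need different logic; this sketch covers only the default behavior.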