dbt Model Generation
Overview
This section covers the automated generation of dbt models and YAML documentation from schema mappings using Chicory AI agents. The system creates production-ready dbt artifacts following best practices.
Generated Artifacts
1. dbt Model SQL
The Chicory agent generates complete dbt model SQL files with:
Proper source references
Column transformations
Data quality logic
Performance optimizations
Example Generated Model:
{{
config(
materialized='table',
tags=['auto-generated', 'dimension', 'customer'],
description='Customer dimension table with standardized attributes'
)
}}
with source_data as (
select
customer_id,
first_name,
last_name,
email,
phone,
created_date,
updated_date
from {{ source('raw_crm', 'customers') }}
),
transformed as (
select
-- Surrogate key
{{ dbt_utils.surrogate_key(['customer_id']) }} as customer_sk,
-- Business key
cast(customer_id as string) as customer_bk,
-- Standardized names
initcap(trim(first_name)) as first_name,
initcap(trim(last_name)) as last_name,
trim(concat(
initcap(trim(first_name)),
' ',
initcap(trim(last_name))
)) as full_name,
-- Contact information
lower(trim(email)) as email_address,
regexp_replace(phone, r'[^\d]', '') as phone_number,
-- Temporal attributes
cast(created_date as date) as customer_created_date,
cast(updated_date as timestamp) as last_updated_ts,
-- Audit columns
current_timestamp() as created_at,
current_timestamp() as updated_at,
true as is_active
from source_data
)
select * from transformed2. YAML Documentation
Comprehensive YAML documentation with tests and metadata:
Generated Model Types
1. Dimension Tables
For dimension tables, the agent generates:
Surrogate keys using dbt_utils
SCD Type 2 structure (when applicable)
Standardized attribute transformations
Comprehensive business key mapping
2. Fact Tables
For fact tables, the agent includes:
Foreign key relationships
Measure calculations
Grain documentation
Aggregation logic
3. Staging Tables
For staging models, the agent provides:
Basic data type casting
Column renaming for consistency
Initial data quality checks
Source system documentation
Customization Options
1. Model Configuration
Customize generated models through dbt_project.yml:
2. Transformation Templates
Create custom transformation templates for specific patterns:
3. Testing Standards
Define organization-specific testing standards:
Advanced Features
1. Incremental Models
For large datasets, the agent can generate incremental models:
2. Slowly Changing Dimensions
Generate SCD Type 2 logic for dimension tables:
3. Data Quality Monitoring
Generate data quality monitoring models:
Integration with dbt Packages
1. dbt-utils Integration
The generated models leverage dbt-utils for:
Surrogate key generation
Cross-database compatibility
Common transformations
2. dbt-expectations Integration
For advanced data quality testing:
Performance Optimization
1. Partitioning and Clustering
For BigQuery targets:
2. Query Optimization
The agent applies optimization techniques:
Validation and Testing
1. Model Compilation
Test generated models compile correctly:
2. Data Quality Testing
Run comprehensive tests:
3. Documentation Generation
Generate and serve documentation:
Monitoring and Maintenance
1. Model Performance Monitoring
2. Automated Alerts
Set up alerts for:
Model compilation failures
Test failures
Performance degradation
Data freshness issues
3. Model Evolution
Track model changes over time:
Next: Testing & Validation
Last updated