How Categorical Contrasts Work
This document explains the internal architecture and mechanism of categorical contrast computation in FormulaCompiler.jl, showing how the ContrastEvaluator achieves zero-allocation performance through the CounterfactualVector system.
Table of Contents
- Overview
- Simple Approach: Direct Data Modification
- CounterfactualVector Mechanism
- ContrastEvaluator Initialization
- Contrast Computation Flow
- Gradient Computation
- Performance Characteristics
- Advanced Topics
Overview
Categorical contrasts compute counterfactual discrete differences: comparing the same observation under different categorical levels.
# For a specific individual/observation at row i
Δ = X(row=i, treatment="Drug") - X(row=i, treatment="Control")

This answers: "What would be the treatment effect for this specific individual, holding all other characteristics constant?"
Important: This is NOT comparing two different observations that happen to have different treatment levels (which would confound treatment effects with individual differences).
FormulaCompiler provides two approaches for computing these counterfactual contrasts:
1. Simple Approach: Direct Data Modification
For simple cases, you can modify data directly and compute differences:
# Evaluate with different categorical levels
data_control = merge(data, (treatment = fill("Control", n_rows),))
data_drug = merge(data, (treatment = fill("Drug", n_rows),))
X_control = modelrow(compiled, data_control, row)
X_drug = modelrow(compiled, data_drug, row)
Δ = X_drug - X_control

Pros: Simple, straightforward, no special APIs
Cons: Allocates new data structures, not suitable for batch processing
2. Zero-Allocation Approach: ContrastEvaluator + CounterfactualVectors
For performance-critical batch operations, use the zero-allocation system:
evaluator = contrastevaluator(compiled, data, [:treatment])
Δ = Vector{Float64}(undef, length(compiled))
contrast_modelrow!(Δ, evaluator, row, :treatment, "Control", "Drug")

Pros: Zero allocations, optimized for batch processing
Cons: More setup, requires understanding of the evaluator pattern
The rest of this document focuses on Approach 2 (the zero-allocation system).
Simple Approach: Direct Data Modification
For exploratory analysis, one-off contrasts, or when performance isn't critical, you can compute contrasts by directly modifying data and evaluating the compiled formula twice.
This approach computes counterfactual contrasts: comparing the same observation under different categorical levels. This is fundamentally different from comparing two different observations that happen to have different levels.
Counterfactual (what we're doing here):
# Same person (row 5), different treatments
X_row5_if_control = modelrow(compiled, data_control, 5)
X_row5_if_drug = modelrow(compiled, data_drug, 5)
Δ_counterfactual = X_row5_if_drug - X_row5_if_control

This answers: "What would be the effect if this person received Drug instead of Control?"
Cross-sectional (NOT what we're doing):
# Different people, different treatments
X_person_in_control = modelrow(compiled, data, row_a) # Person A (in Control)
X_person_in_drug = modelrow(compiled, data, row_b) # Person B (in Drug)
Δ_cross_sectional = X_person_in_drug - X_person_in_control

This confounds treatment effect with individual differences!
Basic Pattern
using FormulaCompiler, GLM, DataFrames, Tables, LinearAlgebra
# Fit model and compile
df = DataFrame(
y = randn(100),
x = randn(100),
treatment = rand(["Control", "Drug_A", "Drug_B"], 100)
)
model = lm(@formula(y ~ x * treatment), df)
data = Tables.columntable(df)
compiled = compile_formula(model, data)
# Choose a row to analyze
row = 1
# Create modified data with different treatment levels
n_rows = length(data.y)
data_control = merge(data, (treatment = fill("Control", n_rows),))
data_drug = merge(data, (treatment = fill("Drug_A", n_rows),))
# Evaluate model rows with different levels
X_control = modelrow(compiled, data_control, row)
X_drug = modelrow(compiled, data_drug, row)
# Compute contrast
Δ = X_drug .- X_control
# Compute effect and standard error
β = coef(model)
effect = dot(β, Δ)
# Gradient for uncertainty quantification (linear scale)
∇β = Δ # Parameter gradient: ∂(effect)/∂β = Δ
vcov_matrix = vcov(model)
se = sqrt(dot(∇β, vcov_matrix, ∇β))
println("Treatment effect: $effect ± $se")What This Does
merge(data, (treatment = fill("Control", n_rows),))
- Creates a new NamedTuple with all columns from data
- Replaces the :treatment column with a vector of all "Control" values
- Crucially: keeps all other covariates (x, age, etc.) at their original values for each row
- Memory: allocates a new vector for the treatment column (~100 bytes for 100 rows)

modelrow(compiled, data_control, row)
- Evaluates the compiled formula for row 1 using data_control
- Returns the model matrix row X for the same individual (row 1) as if they had treatment="Control"
- All other characteristics (x, age, etc.) remain as observed for row 1
- Memory: allocates a new output vector (~80 bytes for a typical model)

X_drug .- X_control
- Element-wise subtraction: the discrete effect vector
- Shows the treatment effect for this specific individual with their specific covariate values
- Holds everything else constant (ceteris paribus)
Key insight: Both X_control and X_drug are evaluated for the same row (same person/observation), just with different treatment assignments. This is a counterfactual comparison, not a comparison across different observations.
Why Counterfactual Contrasts Matter
The counterfactual approach isolates the treatment effect by holding all other variables constant:
# Example: Effect for a 45-year-old with specific characteristics
row = 15 # Person with: age=45, x=2.3, education="College"
# Counterfactual contrast: Same person, different treatment
X_if_control = modelrow(compiled, data_control, 15) # This person AS IF Control
X_if_drug = modelrow(compiled, data_drug, 15) # This person AS IF Drug
Δ_counterfactual = X_if_drug - X_if_control
# Result: Pure treatment effect for THIS SPECIFIC person
# Holds constant: age=45, x=2.3, education="College"
effect = dot(β, Δ_counterfactual)

Interpretation: "If this 45-year-old college graduate with x=2.3 received Drug instead of Control, the predicted outcome would change by effect."
Contrast with cross-sectional comparison (wrong approach):
# Find someone in Control and someone in Drug
row_control = findfirst(data.treatment .== "Control") # Person A: age=45, x=2.3
row_drug = findfirst(data.treatment .== "Drug") # Person B: age=60, x=-1.5
X_person_control = modelrow(compiled, data, row_control)
X_person_drug = modelrow(compiled, data, row_drug)
Δ_wrong = X_person_drug - X_person_control
# PROBLEM: This confounds treatment with age difference (60 vs 45)
# and x difference (-1.5 vs 2.3)!

Why this matters for marginal effects:
- Discrete marginal effects measure ceteris paribus changes (holding everything else constant)
- Counterfactual contrasts implement this mathematically
- This is the standard definition in econometrics and causal inference
Computing Scalar Effects and Gradients
To get a scalar treatment effect, multiply by coefficients. For uncertainty quantification, also compute the parameter gradient:
β = coef(model)
# Discrete effect on linear predictor (η scale)
effect_eta = dot(β, Δ)
# Parameter gradient for linear scale (needed for standard errors)
# For linear scale, the gradient IS the contrast vector
∇β_eta = Δ # ∂(effect_eta)/∂β = Δ
# Standard error via delta method
vcov_matrix = vcov(model)
se_eta = sqrt(dot(∇β_eta, vcov_matrix, ∇β_eta))
# Confidence interval
ci_lower_eta = effect_eta - 1.96 * se_eta
ci_upper_eta = effect_eta + 1.96 * se_eta
println("Linear scale effect: $effect_eta ± $se_eta")
println("95% CI: [$ci_lower_eta, $ci_upper_eta]")Response scale (for GLM models with link functions):
using GLM
# Compute linear predictors
η_control = dot(β, X_control)
η_drug = dot(β, X_drug)
# Apply link function
link = GLM.LogitLink() # Example: logistic regression
μ_control = GLM.linkinv(link, η_control)
μ_drug = GLM.linkinv(link, η_drug)
# Discrete effect on response scale
effect_mu = μ_drug - μ_control
# Parameter gradient for response scale (chain rule)
# ∇β = g'(η_drug) × X_drug - g'(η_control) × X_control
g_prime_control = GLM.mueta(link, η_control) # dμ/dη at control
g_prime_drug = GLM.mueta(link, η_drug) # dμ/dη at drug
∇β_mu = g_prime_drug .* X_drug .- g_prime_control .* X_control
# Standard error via delta method
se_mu = sqrt(dot(∇β_mu, vcov_matrix, ∇β_mu))
# Confidence interval
ci_lower_mu = effect_mu - 1.96 * se_mu
ci_upper_mu = effect_mu + 1.96 * se_mu
println("Response scale effect: $effect_mu ± $se_mu")
println("95% CI: [$ci_lower_mu, $ci_upper_mu]")Why gradients matter:
- Enable uncertainty quantification (standard errors, confidence intervals)
- Essential for hypothesis testing
- Required for proper statistical inference
- Delta method: SE = √(∇β' Σ ∇β) where Σ is the covariance matrix
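The reasoning behind the delta method formula: the reported effect is a function θ(β) of the estimated coefficients, so a first-order expansion around the estimate gives

θ(β̂) ≈ θ(β) + ∇β'(β̂ - β)   ⟹   Var(θ(β̂)) ≈ ∇β' Σ ∇β

On the linear scale θ(β) = Δ'β is exactly linear, so ∇β = Δ and the approximation is exact; on the response scale θ(β) = g⁻¹(X₁'β) - g⁻¹(X₀'β), which yields the chain-rule gradient used above.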
Multiple Rows or Levels
For multiple comparisons, use loops and include gradients for inference:
# Compare all levels to reference (with uncertainty)
reference_level = "Control"
other_levels = ["Drug_A", "Drug_B"]
for level in other_levels
data_level = merge(data, (treatment = fill(level, n_rows),))
X_level = modelrow(compiled, data_level, row)
X_ref = modelrow(compiled, data_control, row)
Δ = X_level .- X_ref
effect = dot(β, Δ)
# Gradient and standard error
∇β = Δ # For linear scale
se = sqrt(dot(∇β, vcov_matrix, ∇β))
println("Effect of $level vs $reference_level: $effect ± $se")
end
# Analyze multiple rows (with confidence intervals)
rows_of_interest = [1, 10, 50]
for row in rows_of_interest
X_control_row = modelrow(compiled, data_control, row)
X_drug_row = modelrow(compiled, data_drug, row)
Δ_row = X_drug_row .- X_control_row
effect_row = dot(β, Δ_row)
se_row = sqrt(dot(Δ_row, vcov_matrix, Δ_row))
ci_lower = effect_row - 1.96 * se_row
ci_upper = effect_row + 1.96 * se_row
println("Row $row effect: $effect_row, 95% CI: [$ci_lower, $ci_upper]")
end

When to Use This Approach
Good for:
- Exploratory analysis (quick checks, prototyping)
- One-off comparisons (single contrasts)
- Simple scripts where performance isn't critical
- Teaching/learning (simpler to understand)
Not ideal for:
- Batch processing (computing 1000+ contrasts)
- Performance-critical code (allocations add up)
- Production pipelines (want zero-allocation guarantees)
Memory Cost Analysis
For a dataset with 1000 rows and a typical model:
Per contrast computed:
- New treatment vector: ~8KB (1000 × Float64)
- Model row output: ~80 bytes (typical)
- Intermediate allocations: ~100 bytes
- Total per contrast: ~8.2KB
For 1000 contrasts: ~8.2MB allocated
Compare to ContrastEvaluator: 0 bytes per contrast after setup.
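To verify these costs on your own model, you can benchmark both paths directly. A minimal sketch, assuming the compiled, data, Δ, and evaluator objects from the examples in this document, and using BenchmarkTools for measurement:
using BenchmarkTools
# Simple approach: allocates a fresh treatment column plus an output row on every call
n_rows = length(data.y)
@benchmark modelrow($compiled, merge($data, (treatment = fill("Drug_A", $n_rows),)), 1)
# Zero-allocation approach: reuses the evaluator's pre-allocated buffers (expect 0 bytes)
@benchmark contrast_modelrow!($Δ, $evaluator, 1, :treatment, "Control", "Drug_A")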
Integration with Existing Data Workflows
This approach works naturally with DataFrames manipulation:
using DataFrames, Chain
# Create counterfactual datasets using DataFrames operations
df_control = @chain df begin
transform(:treatment => (_ -> "Control") => :treatment)
end
df_drug = @chain df begin
transform(:treatment => (_ -> "Drug_A") => :treatment)
end
# Convert to columntables and evaluate
data_control = Tables.columntable(df_control)
data_drug = Tables.columntable(df_drug)
X_control = modelrow(compiled, data_control, row)
X_drug = modelrow(compiled, data_drug, row)
Δ = X_drug .- X_control

Complete Example: Logistic Regression with Uncertainty
For GLM models with link functions, compute both linear and response scale effects:
using GLM, DataFrames, Tables, LinearAlgebra
# Fit logistic regression
df = DataFrame(
outcome = rand([0, 1], 100),
age = rand(18:80, 100),
treatment = rand(["Control", "Drug"], 100)
)
model = glm(@formula(outcome ~ age * treatment), df, Binomial(), LogitLink())
data = Tables.columntable(df)
compiled = compile_formula(model, data)
# Counterfactual data
n_rows = length(data.outcome)
data_control = merge(data, (treatment = fill("Control", n_rows),))
data_drug = merge(data, (treatment = fill("Drug", n_rows),))
# Analyze specific individual
row = 1
β = coef(model)
vcov_matrix = vcov(model)
link = LogitLink()
# Evaluate model rows
X_control = modelrow(compiled, data_control, row)
X_drug = modelrow(compiled, data_drug, row)
Δ = X_drug .- X_control
# Linear scale (log-odds) with uncertainty
effect_eta = dot(β, Δ)
∇β_eta = Δ
se_eta = sqrt(dot(∇β_eta, vcov_matrix, ∇β_eta))
println("Log-odds effect: $effect_eta ± $se_eta")
# Response scale (probability) with uncertainty
η_control = dot(β, X_control)
η_drug = dot(β, X_drug)
μ_control = GLM.linkinv(link, η_control) # Probability if Control
μ_drug = GLM.linkinv(link, η_drug) # Probability if Drug
effect_mu = μ_drug - μ_control
# Gradient for response scale (chain rule)
g_prime_control = GLM.mueta(link, η_control)
g_prime_drug = GLM.mueta(link, η_drug)
∇β_mu = g_prime_drug .* X_drug .- g_prime_control .* X_control
se_mu = sqrt(dot(∇β_mu, vcov_matrix, ∇β_mu))
println("Probability effect: $effect_mu ± $se_mu")
println(" P(outcome=1|Control) = $μ_control")
println(" P(outcome=1|Drug) = $μ_drug")Alternative: Using Dictionaries
For ad-hoc modifications without merge:
# Create modified data manually
data_control = (
x = data.x,
y = data.y,
treatment = fill("Control", length(data.y)) # Override one column
)
data_drug = (
x = data.x,
y = data.y,
treatment = fill("Drug_A", length(data.y))
)
X_control = modelrow(compiled, data_control, row)
X_drug = modelrow(compiled, data_drug, row)
Δ = X_drug .- X_control
# With gradient
β = coef(model)
effect = dot(β, Δ)
se = sqrt(dot(Δ, vcov(model), Δ))

Summary: Simple vs Zero-Allocation
| Aspect | Simple Approach | ContrastEvaluator |
|---|---|---|
| Ease of use | Very simple | Requires setup |
| Code clarity | Clear intent | More abstraction |
| Memory per contrast | ~8KB | 0 bytes |
| Best for | 1-10 contrasts | 100+ contrasts |
| Setup cost | None | ~50μs + ~50KB |
| Learning curve | Minimal | Moderate |
Recommendation: Start with the simple approach for exploration. Switch to ContrastEvaluator when:
- You need to compute 100+ contrasts
- Performance/memory is critical
- Building production pipelines
CounterfactualVector Mechanism
The ContrastEvaluator achieves the same counterfactual comparison as the simple approach, but with zero allocations. It uses CounterfactualVectors to efficiently substitute values for a single row without copying data.
Key insight: Like the simple approach, this compares the same row under different categorical levels. The only difference is performance optimization, not the analytical concept.
Core Concept
A CounterfactualVector wraps an original data column and intercepts access to a specific row, returning a replacement value instead of the original:
# Original data column
data.treatment = ["Control", "Drug_A", "Drug_B", "Control", ...]
# Create CounterfactualVector
cf_vec = CategoricalCounterfactualVector(data.treatment, row=1, replacement="Drug_A")
# Behavior:
cf_vec[1] # → "Drug_A" (counterfactual - substituted value)
cf_vec[2] # → "Drug_A" (original value)
cf_vec[3] # → "Drug_B" (original value)
cf_vec[4] # → "Control" (original value)

Implementation
All CounterfactualVector types implement this interface:
abstract type CounterfactualVector{T} <: AbstractVector{T} end
# Generic getindex implementation
@inline Base.getindex(v::CounterfactualVector, i::Int) =
    (i == v.row ? v.replacement : v.base[i])

Key insight: This is just a conditional branch: no array copying, no allocations.
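The pattern is small enough to reproduce outside the package. The following is an illustrative, self-contained sketch of the same idea (simplified names, not FormulaCompiler's actual definitions):
# Illustrative stand-in for the wrapper pattern (not the package's real types)
mutable struct DemoCounterfactualVector{T} <: AbstractVector{T}
    base::Vector{T}     # original column (never copied)
    row::Int            # the single row to override
    replacement::T      # counterfactual value for that row
end

Base.size(v::DemoCounterfactualVector) = size(v.base)
Base.getindex(v::DemoCounterfactualVector, i::Int) =
    i == v.row ? v.replacement : v.base[i]

v = DemoCounterfactualVector(["Control", "Drug_A", "Drug_B"], 1, "Drug_A")
v[1]  # "Drug_A" (substituted)
v[2]  # "Drug_A" (original value at row 2)
v[3]  # "Drug_B" (original value at row 3)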
Typed Variants
Different data types have specialized implementations:
# For numeric variables (Float64, Int64, etc.)
mutable struct NumericCounterfactualVector{T<:Real} <: CounterfactualVector{T}
base::Vector{T} # Original data
row::Int # Row index to override
replacement::T # Replacement value
end
# For categorical variables
mutable struct CategoricalCounterfactualVector{T,R} <: CounterfactualVector{CategoricalValue{T,R}}
base::CategoricalArray{T,1,R}
row::Int
replacement::CategoricalValue{T,R}
end
# For boolean variables
mutable struct BoolCounterfactualVector <: CounterfactualVector{Bool}
base::Vector{Bool}
row::Int
replacement::Bool
end

Type stability: Each variant has concrete types, enabling compiler optimizations.
Mutable Updates
The vectors are mutable so fields can be updated without allocations:
# Update which row to override
cf_vec.row = 5
# Update the replacement value
cf_vec.replacement = "Drug_B"
# No allocations - just field assignments

This enables reusing the same CounterfactualVector for multiple contrasts.
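Continuing the illustrative sketch from earlier, retargeting the same wrapper is just two field writes per contrast:
# Reuse one wrapper across rows and levels (demo type from the sketch above)
for r in 1:3
    v.row = r
    v.replacement = "Control"
    # ... evaluate with v here; no new vectors were created
end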
ContrastEvaluator Initialization
Construction Process
When you create a ContrastEvaluator, it builds the complete counterfactual infrastructure:
evaluator = contrastevaluator(compiled, data, [:treatment, :education, :female])

Step 1: Build Internal Counterfactual Structure
The ContrastEvaluator internally creates CounterfactualVector wrappers for specified variables:
# Internal implementation creates:
# - A tuple of typed `CounterfactualVector` objects (one per variable)
# - A `NamedTuple` that merges original data with `CounterfactualVector` wrappers

Example structure:
# Original data
data = (
x = [1.0, 2.0, 3.0, ...],
outcome = [0.5, 1.2, 0.8, ...],
treatment = ["Control", "Drug_A", "Drug_B", ...],
education = ["HS", "College", "HS", ...],
female = [0, 1, 1, 0, ...]
)
# Internal data_counterfactual structure
data_counterfactual = (
x = data.x, # Original (not in vars)
outcome = data.outcome, # Original (not in vars)
treatment = counterfactuals[1], # CategoricalCounterfactualVector
education = counterfactuals[2], # CategoricalCounterfactualVector
female = counterfactuals[3] # NumericCounterfactualVector{Float64}
)

Key insight: Variables not in vars use original columns; variables in vars are wrapped in CounterfactualVectors.
Step 2: Pre-compute Categorical Level Mappings
# Lines 106-122 in contrast_evaluator.jl
categorical_level_maps = Dict{Symbol, Dict{String, CategoricalValue}}()
for (i, var) in enumerate(vars)
cf_vec = counterfactuals[i]
if cf_vec isa CategoricalCounterfactualVector
# Build String → CategoricalValue mapping
level_map = Dict{String, CategoricalValue}()
base_array = cf_vec.base
for level_str in levels(base_array)
# Find a CategoricalValue instance for this level
matching_idx = findfirst(x -> string(x) == level_str, base_array)
if matching_idx !== nothing
level_map[level_str] = base_array[matching_idx]
end
end
categorical_level_maps[var] = level_map
end
end

Purpose: Pre-computing these mappings avoids allocations during contrast evaluation. Converting strings like "Control" to CategoricalValue objects normally allocates, but looking up in a pre-built Dict does not.
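Outside the package, the same kind of mapping can be built directly with CategoricalArrays; a standalone sketch of the idea (not the package's internals):
using CategoricalArrays
treatment = categorical(["Control", "Drug_A", "Drug_B", "Control"])
# Build the String → CategoricalValue map once; later lookups reuse these values
level_map = Dict(lev => treatment[findfirst(x -> string(x) == lev, treatment)]
                 for lev in levels(treatment))
level_map["Drug_A"]  # a CategoricalValue, ready to assign as a replacement value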
Example:
categorical_level_maps[:treatment] = Dict(
"Control" => CategoricalValue("Control", pool),
"Drug_A" => CategoricalValue("Drug_A", pool),
"Drug_B" => CategoricalValue("Drug_B", pool)
)

Step 3: Detect Binary Variables
# Lines 124-141 in contrast_evaluator.jl
binary_vars = Set{Symbol}()
binary_coef_indices = Dict{Symbol, Int}()
for (i, var) in enumerate(vars)
cf_vec = counterfactuals[i]
col = getproperty(data, var)
if _is_truly_binary_variable(cf_vec, col)
binary_vars = union(binary_vars, [var])
# Find coefficient index for fast path optimization
coef_idx = _find_binary_coefficient_index(compiled, var)
if coef_idx !== nothing
binary_coef_indices[var] = coef_idx
end
end
end

Purpose: Binary variables (0/1, true/false) have a fast path: the contrast is simply ±1 at the coefficient position, skipping full model evaluation.
Step 4: Allocate Buffers
# Lines 143-157 in contrast_evaluator.jl
ContrastEvaluator(
compiled,
vars,
data_counterfactual,
counterfactuals,
Vector{Float64}(undef, length(compiled)), # y_from_buf
Vector{Float64}(undef, length(compiled)), # y_to_buf
categorical_level_maps,
binary_vars,
binary_coef_indices,
Vector{Float64}(undef, length(compiled)), # gradient_buffer
Vector{Float64}(undef, length(compiled)), # xrow_from_buf
Vector{Float64}(undef, length(compiled)), # xrow_to_buf
1 # row
)

One-time cost: All memory allocation happens during construction. Runtime evaluation reuses these buffers.
Contrast Computation Flow
Overview
Computing a contrast involves:
- Update counterfactual to "from" level → evaluate → store result
- Update counterfactual to "to" level → evaluate → store result
- Compute difference
Let's trace through an example step-by-step.
Example Call
contrast_modelrow!(Δ, evaluator, 1, :treatment, "Control", "Drug")  # row = 1

This computes: Δ = X(treatment="Drug") - X(treatment="Control") for row 1.
Step 1: Update Counterfactual to "from" Level
# Line 213 in contrast_evaluator.jl
update_counterfactual_for_var!(
evaluator.counterfactuals, # Tuple of CounterfactualVectors
evaluator.vars, # [:treatment, :education, :female]
:treatment, # Variable to modify
row, # Row index = 1
"Control", # Baseline level
evaluator.categorical_level_maps # Pre-computed mappings
)

What update_counterfactual_for_var! does (lines 385-400 in typed_overrides.jl):
# 1. Find the CounterfactualVector for :treatment
cf_vec = get_counterfactual_for_var(counterfactuals, vars, :treatment)
# → Returns counterfactuals[1] (CategoricalCounterfactualVector for :treatment)
# 2. Update which row it's overriding
update_counterfactual_row!(cf_vec, row)
# → Sets cf_vec.row = 1
# 3. Use pre-computed mapping to get CategoricalValue (zero allocations!)
level_map = categorical_level_maps[:treatment]
replacement_str = string("Control") # → "Control"
cat_val = level_map[replacement_str] # Dict lookup - no allocation!
# → Gets CategoricalValue("Control") from pre-built map
# 4. Update the replacement value (mutable field assignment)
update_counterfactual_replacement!(cf_vec, cat_val)
# → Sets cf_vec.replacement = CategoricalValue("Control")

Result: The CounterfactualVector for :treatment now behaves as:
data_counterfactual.treatment[1] # → "Control" (counterfactual)
data_counterfactual.treatment[2] # → "Drug_A" (original)
data_counterfactual.treatment[3] # → "Drug_B" (original)
# ... all other rows return original values

Step 2: Evaluate Formula with "from" Level
# Line 214 in contrast_evaluator.jl
evaluator.compiled(evaluator.y_from_buf, evaluator.data_counterfactual, row)

What happens:
- The compiled formula evaluates row 1 using data_counterfactual
- When it accesses data_counterfactual.treatment[1], it gets "Control" (the counterfactual)
- All other variables use their original values (no substitution)
- Result: y_from_buf contains the model matrix row X₀ with treatment="Control"
Visualization:
Original data row 1: treatment="Drug_A", education="HS", x=1.5
Counterfactual row 1: treatment="Control", education="HS", x=1.5
↑
substituted value

Step 3: Update Counterfactual to "to" Level
# Line 216 in contrast_evaluator.jl
update_counterfactual_for_var!(
evaluator.counterfactuals,
evaluator.vars,
:treatment,
row,
"Drug", # Now set to comparison level
evaluator.categorical_level_maps
)Same process as Step 1, but now:
cf_vec.row = 1 # Same row
cf_vec.replacement = CategoricalValue("Drug") # Different levelResult: The same CounterfactualVector now returns "Drug" for row 1.
Step 4: Evaluate Formula with "to" Level
# Line 217 in contrast_evaluator.jl
evaluator.compiled(evaluator.y_to_buf, evaluator.data_counterfactual, row)

Result: y_to_buf contains the model matrix row X₁ with treatment="Drug".
Step 5: Compute Discrete Effect
# Line 219 in contrast_evaluator.jl
Δ .= evaluator.y_to_buf .- evaluator.y_from_buf

Result: Δ = X₁ - X₀ (the discrete effect vector)
This is the contrast vector showing how each coefficient contributes to the treatment effect.
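As with the simple approach, this contrast vector plugs straight into the effect and delta-method formulas shown earlier:
using LinearAlgebra  # for dot
β = coef(model)
effect = dot(β, Δ)                 # discrete effect on the linear predictor
se = sqrt(dot(Δ, vcov(model), Δ))  # delta-method SE, since ∇β = Δ on this scale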
Complete Flow Diagram
┌─────────────────────────────────────────────────────────────────┐
│ 1. Update CounterfactualVector: treatment[1] → "Control" │
│ - Mutable field assignment: cf_vec.replacement = "Control" │
│ - Uses pre-computed categorical mapping (0 allocations) │
└───────────────────────┬─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ 2. Evaluate formula with data_counterfactual │
│ - compiled(y_from_buf, data_counterfactual, row=1) │
│ - Formula accesses data_counterfactual.treatment[1] │
│ - CounterfactualVector returns "Control" (not original) │
│ - Result: y_from_buf = X₀ (model row with Control) │
└───────────────────────┬─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ 3. Update CounterfactualVector: treatment[1] → "Drug" │
│ - Mutable field assignment: cf_vec.replacement = "Drug" │
│ - Same vector, just update replacement field (0 allocations) │
└───────────────────────┬─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ 4. Evaluate formula with data_counterfactual (again) │
│ - compiled(y_to_buf, data_counterfactual, row=1) │
│ - CounterfactualVector now returns "Drug" │
│ - Result: y_to_buf = X₁ (model row with Drug) │
└───────────────────────┬─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ 5. Compute difference │
│ - Δ .= y_to_buf .- y_from_buf │
│ - Result: Δ = X₁ - X₀ (discrete effect) │
└─────────────────────────────────────────────────────────────────┘

Gradient Computation
Purpose
Parameter gradients enable uncertainty quantification via the delta method:
# Compute gradient ∇β where ∇β[i] = ∂(discrete_effect)/∂β[i]
contrast_gradient!(∇β, evaluator, row, :treatment, "Control", "Drug", β)
# Delta method standard error
se = sqrt(∇β' * vcov_matrix * ∇β)

Linear Scale Gradients
For discrete effects on the linear predictor scale η = Xβ:
discrete_effect = η₁ - η₀ = (X₁'β) - (X₀'β) = (X₁ - X₀)'β

The gradient is simply:
∇β = ΔX = X₁ - X₀

Implementation (lines 786-796 in contrast_evaluator.jl):
# Update to "from" level and evaluate
update_counterfactual_for_var!(evaluator.counterfactuals, evaluator.vars, var, row, from, ...)
evaluator.compiled(evaluator.xrow_from_buf, evaluator.data_counterfactual, row)
# → xrow_from_buf = X₀
# Update to "to" level and evaluate
update_counterfactual_for_var!(evaluator.counterfactuals, evaluator.vars, var, row, to, ...)
evaluator.compiled(evaluator.xrow_to_buf, evaluator.data_counterfactual, row)
# → xrow_to_buf = X₁
# Gradient is the contrast vector
∇β .= xrow_to_buf .- xrow_from_buf  # ∇β = X₁ - X₀

Response Scale Gradients
For discrete effects on the response scale μ = g⁻¹(η):
discrete_effect = μ₁ - μ₀ = g⁻¹(η₁) - g⁻¹(η₀)

By the chain rule:
∇β = g'(η₁) × X₁ - g'(η₀) × X₀

where g'(η) = dμ/dη is the link function derivative.
Implementation (lines 825-842 in contrast_evaluator.jl):
# Compute X₀ and η₀ = X₀'β
update_counterfactual_for_var!(...)
evaluator.compiled(evaluator.xrow_from_buf, evaluator.data_counterfactual, row)
η₀ = dot(β, evaluator.xrow_from_buf)
# Compute X₁ and η₁ = X₁'β
update_counterfactual_for_var!(...)
evaluator.compiled(evaluator.xrow_to_buf, evaluator.data_counterfactual, row)
η₁ = dot(β, evaluator.xrow_to_buf)
# Compute link function derivatives
g_prime_η₀ = _dmu_deta(link, η₀) # dμ/dη at η₀
g_prime_η₁ = _dmu_deta(link, η₁) # dμ/dη at η₁
# Apply chain rule
∇β .= g_prime_η₁ .* xrow_to_buf .- g_prime_η₀ .* xrow_from_buf

Supported Link Functions
All GLM link functions are supported:
- IdentityLink: g'(η) = 1
- LogLink: g'(η) = exp(η)
- LogitLink: g'(η) = exp(η) / (1 + exp(η))²
- ProbitLink: g'(η) = φ(η) (standard normal PDF)
- CloglogLink, CauchitLink, InverseLink, SqrtLink, etc.
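These derivatives are what GLM.mueta returns, so any link can be sanity-checked numerically:
using GLM
η = 0.3
GLM.mueta(IdentityLink(), η)  # 1.0
GLM.mueta(LogLink(), η)       # exp(0.3) ≈ 1.3499
GLM.mueta(LogitLink(), η)     # exp(0.3) / (1 + exp(0.3))^2 ≈ 0.2445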
Performance Characteristics
Zero Allocations Achieved Through
Pre-allocated buffers
- All output buffers allocated once during construction
- Reused across all contrast computations
- Buffers: y_from_buf, y_to_buf, xrow_from_buf, xrow_to_buf, gradient_buffer

Mutable CounterfactualVectors
- Update fields in place: cf_vec.row = new_row, cf_vec.replacement = new_value
- No array copying or temporary allocations
- Type-stable concrete types enable compiler optimizations
Pre-computed categorical mappings
- String → CategoricalValue lookups cached at construction
- Dictionary lookups don't allocate (just pointer access)
- Avoids repeated level searches in CategoricalArray
Type specialization
- All CounterfactualVector types are concrete (not abstract)
- Compiled formula is fully type-specialized
- No runtime dispatch on hot paths
Memory Efficiency Comparison
Traditional approach (allocates O(n) memory):
# Copy entire dataset and modify
data_control = deepcopy(data)
data_control.treatment .= "Control" # Allocate new column!
X₀ = modelmatrix(formula, DataFrame(data_control))
data_drug = deepcopy(data)
data_drug.treatment .= "Drug" # Another allocation!
X₁ = modelmatrix(formula, DataFrame(data_drug))
Δ = X₁[row, :] - X₀[row, :]

Memory used: ~2n + O(model matrix size)
CounterfactualVector approach (O(1) memory):
# Create evaluator once (one-time cost)
evaluator = contrastevaluator(compiled, data, [:treatment])
Δ = Vector{Float64}(undef, length(compiled))
# Compute contrast (zero allocations)
contrast_modelrow!(Δ, evaluator, row, :treatment, "Control", "Drug")

Memory used: ~32 bytes (for mutable field updates)
Savings: >99.999% memory reduction for large datasets
Timing Benchmarks
From test suite (test/test_contrast_evaluator.jl):
# Construction (one-time cost)
@benchmark contrastevaluator($compiled, $data, [:treatment, :education])
# Typical: ~10-50μs, <50KB allocated
# Contrast computation (repeated operations)
@benchmark contrast_modelrow!($Δ, $evaluator, 1, :treatment, "Control", "Drug")
# Typical: ~50-200ns, 0 bytes allocated
# Gradient computation
@benchmark contrast_gradient!($∇β, $evaluator, 1, :treatment, "Control", "Drug", $β)
# Typical: ~100-300ns, 0 bytes allocated

Batch Processing
Processing multiple contrasts reuses the same buffers:
evaluator = contrastevaluator(compiled, data, [:treatment])
Δ = Vector{Float64}(undef, length(compiled))
# Zero allocations for all iterations
for row in 1:1000
contrast_modelrow!(Δ, evaluator, row, :treatment, "Control", "Drug")
# Process Δ...
end

Performance: Constant memory usage regardless of number of contrasts.
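The same loop extends to scalar effects and standard errors by combining the pieces shown earlier (a sketch assuming model, compiled, and data from the examples above):
using LinearAlgebra  # for dot
evaluator = contrastevaluator(compiled, data, [:treatment])
Δ = Vector{Float64}(undef, length(compiled))
β = coef(model)
Σ = vcov(model)
effects = Vector{Float64}(undef, 1000)
ses = Vector{Float64}(undef, 1000)
for row in 1:1000
    contrast_modelrow!(Δ, evaluator, row, :treatment, "Control", "Drug")
    effects[row] = dot(β, Δ)        # linear-scale effect for this row
    ses[row] = sqrt(dot(Δ, Σ, Δ))   # delta-method SE (∇β = Δ on this scale)
end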
Advanced Topics
Binary Variable Fast Path
Binary variables (0/1, true/false) have optimized computation that skips formula evaluation:
# For binary variables, the contrast is simply ±1 at the coefficient position
function _contrast_modelrow_binary_fast_path!(Δ, evaluator, var, from, to)
coef_idx = evaluator.binary_coef_indices[var]
# Zero out all coefficients
@inbounds for i in eachindex(Δ)
Δ[i] = 0.0
end
# Set single non-zero element
contrast_direction = _binary_contrast_direction(from, to) # ±1.0
@inbounds Δ[coef_idx] = contrast_direction
return Δ
end

Speedup: ~10x faster than the general path for binary variables.
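In scalar terms, when this fast path applies the linear-scale effect collapses to a single coefficient (using the names from the sketch above):
# Δ is zero except for ±1 at coef_idx, so dot(β, Δ) reduces to one term
effect = contrast_direction * β[coef_idx]  # e.g. +β[coef_idx] for a false → true contrast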
Multiple Variables
The same evaluator can process different variables:
evaluator = contrastevaluator(compiled, data, [:treatment, :region, :education])
Δ_treatment = Vector{Float64}(undef, length(compiled))
Δ_region = Vector{Float64}(undef, length(compiled))
# Different variables, same evaluator, same buffers
contrast_modelrow!(Δ_treatment, evaluator, 1, :treatment, "Control", "Drug")
contrast_modelrow!(Δ_region, evaluator, 1, :region, "North", "South")

Memory: O(1) regardless of number of variables or contrasts.
Integration with Compiled Formula
The compiled formula is completely unaware that it's using CounterfactualVectors:
# Inside the compiled formula evaluation
function (compiled::UnifiedCompiled)(output, data, row)
# Access column (might be CounterfactualVector)
treatment_val = data.treatment[row] # Dispatches to getindex
# CounterfactualVector returns replacement value if row matches
# Otherwise returns original value
# Formula sees no difference!
end

Transparency: The AbstractVector interface makes CounterfactualVectors indistinguishable from regular vectors to the formula compiler.
Type Stability Validation
All CounterfactualVector types maintain concrete element types:
# Numeric: Float64 in, Float64 out
cf_numeric::NumericCounterfactualVector{Float64}
eltype(cf_numeric) === Float64 # ✓
# Categorical: CategoricalValue in, CategoricalValue out
cf_cat::CategoricalCounterfactualVector{String, UInt32}
eltype(cf_cat) === CategoricalValue{String, UInt32} # ✓
# Boolean: Bool in, Bool out
cf_bool::BoolCounterfactualVector
eltype(cf_bool) === Bool  # ✓

Result: No type instabilities, no runtime dispatch, full compiler optimization.
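One way to confirm this on live objects is Test.@inferred, which throws if a concrete return type cannot be inferred (a sketch assuming cf_numeric and cf_cat are constructed instances as above):
using Test
@inferred getindex(cf_numeric, 1)  # inferred as Float64
@inferred getindex(cf_cat, 1)      # inferred as CategoricalValue{String, UInt32}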
Summary
The categorical contrast system achieves zero-allocation performance through:
- CounterfactualVectors: Mutable wrappers that intercept single-row access
- Pre-computed mappings: Categorical level lookups cached at construction
- Buffer reuse: All output arrays allocated once, reused for all contrasts
- Type specialization: Concrete types throughout enable compiler optimization
The pattern:
Construct evaluator → Update counterfactual → Evaluate → Update counterfactual → Evaluate → Difference

All with zero allocations after the one-time construction cost, enabling efficient batch processing of thousands of contrasts without memory overhead.