Model Validation
Our factor analysis models undergo rigorous statistical validation to ensure reliable, unbiased results. This page describes our validation methodology and current model performance.
Validation Framework
We employ a comprehensive validation framework aligned with actuarial and statistical best practices:
Statistical Significance Testing
Every factor included in our models is tested for statistical significance. We compute:
- Pearson correlation with the target variable (log RoL)
- P-values to confirm significance at the 0.05 level
- Effect direction validation to ensure factors behave as expected actuarially
Factors that do not meet statistical significance thresholds across models are removed or flagged.
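These significance checks can be sketched with `scipy.stats.pearsonr` on synthetic data (the variable names and thresholds here are illustrative, not our production code):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Synthetic stand-ins: one candidate factor and a log-RoL target that
# partly depends on it.
factor = rng.normal(size=500)
log_rol = 0.4 * factor + rng.normal(scale=1.0, size=500)

r, p_value = pearsonr(factor, log_rol)

significant = p_value < 0.05        # significance at the 0.05 level
expected_direction = r > 0          # direction check (positive effect expected here)

print(f"r={r:.3f}, p={p_value:.4f}, significant={significant}")
```

A factor failing either check (insignificant p-value, or a sign opposite to actuarial expectation) would be flagged for review.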
Multicollinearity Analysis
We compute Variance Inflation Factor (VIF) scores for all features to detect multicollinearity. High multicollinearity can cause unstable attributions, so we:
- Flag features with VIF > 10 (high multicollinearity)
- Monitor features with VIF 5-10 (moderate multicollinearity)
- Document known correlations that reflect business realities
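The VIF for feature j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing feature j on all other features. A minimal NumPy sketch (the data and thresholds are illustrative):

```python
import numpy as np

def vif_scores(X):
    """VIF for each column of X: regress it on the remaining columns."""
    n, k = X.shape
    scores = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other features
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        scores.append(1.0 / (1.0 - r2))
    return np.array(scores)

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
x3 = x1 + 0.1 * rng.normal(size=300)  # nearly collinear with x1
X = np.column_stack([x1, x2, x3])

vifs = vif_scores(X)
flags = ["high" if v > 10 else "moderate" if v > 5 else "ok" for v in vifs]
```

Here `x1` and `x3` are nearly collinear, so both receive high VIF scores, while the independent `x2` stays near 1.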
Cross-Validation
All models use 5-fold cross-validation to assess generalization:
- Data is split into 5 equal parts
- Each fold is used once as validation data
- R² is computed for each fold
- Mean and standard deviation across folds indicate stability
Low standard deviation indicates the model performs consistently across different data subsets.
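The procedure above can be sketched in plain NumPy with a least-squares model standing in for the actual estimator (data, model, and fold seed are all illustrative):

```python
import numpy as np

def kfold_r2(X, y, k=5, seed=0):
    """Per-fold R-squared for a plain least-squares fit (illustrative model)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)          # split data into k equal parts
    scores = []
    for i in range(k):
        test = folds[i]                      # each fold validates exactly once
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        A_tr = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A_tr, y[train], rcond=None)
        A_te = np.column_stack([np.ones(len(test)), X[test]])
        pred = A_te @ beta
        ss_res = np.sum((y[test] - pred) ** 2)
        ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
        scores.append(1.0 - ss_res / ss_tot)  # R-squared on the held-out fold
    return np.array(scores)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=500)

scores = kfold_r2(X, y)
mean_r2, std_r2 = scores.mean(), scores.std()
```

The reported numbers are exactly these two summaries: the mean R² across folds and its standard deviation.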
Residual Analysis
We analyze model residuals to detect systematic issues:
- Bias check: Mean residual should be near zero
- Heteroscedasticity: Error variance should be roughly constant
- Distribution: Residuals should be approximately symmetric
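All three residual checks reduce to simple summary statistics. A sketch on synthetic residuals (the cutoffs shown are illustrative, not our production tolerances):

```python
import numpy as np

rng = np.random.default_rng(3)
residuals = rng.normal(loc=0.0, scale=0.1, size=1000)  # stand-in model residuals
predictions = rng.uniform(0.0, 1.0, size=1000)         # stand-in predicted RoL

# Bias check: mean residual should be near zero.
bias = residuals.mean()

# Heteroscedasticity: compare error variance in the low vs high half of predictions.
low = residuals[predictions < np.median(predictions)]
high = residuals[predictions >= np.median(predictions)]
variance_ratio = max(low.var(), high.var()) / min(low.var(), high.var())

# Distribution: a simple skewness estimate should be near zero for symmetry.
skew = ((residuals - bias) ** 3).mean() / residuals.std() ** 3
```

A large `variance_ratio` would signal heteroscedasticity; a large `|skew|` would signal an asymmetric error distribution.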
Current Model Performance
Cross-Validation Results
| Model | R² (CV) | CV Std Dev | Interpretation |
|---|---|---|---|
| Package | 0.52 | 0.027 | Moderate-strong explanatory power |
| Property | 0.58 | 0.013 | Strong explanatory power |
| Liability | 0.68 | 0.008 | Strong explanatory power |
What these numbers mean:
- R² of 0.52-0.68 indicates the models explain 52-68% of the variation in log RoL
- Low standard deviation (0.008-0.027) shows stable, consistent performance
- These values are typical for insurance pricing models with diverse portfolios
Bias Assessment
All models pass our bias checks with mean prediction error below 0.2%:
| Model | Mean Error | Status |
|---|---|---|
| Package | +0.08% | Unbiased |
| Property | -0.18% | Unbiased |
| Liability | +0.03% | Unbiased |
This confirms the models do not systematically over- or under-predict RoL.
Error Variance Stability
We verify that prediction errors are consistent across the RoL range:
| Model | Variance Ratio | Status |
|---|---|---|
| Package | 1.13 | Stable |
| Property | 2.11 | Moderate variation |
| Liability | 1.22 | Stable |
The Property model shows somewhat higher error variance for low-RoL predictions. This is documented, and users should interpret low-RoL Property results with appropriate caution.
Early Stopping and Overfitting Prevention
Our models use aggressive regularization and early stopping to prevent overfitting:
- Shallow trees (max depth 3) limit model complexity
- Minimum samples per leaf (10) prevents fitting to noise
- Row and column subsampling (60%) adds randomness
- L1/L2 regularization smooths predictions
- Early stopping halts training when validation performance plateaus
The gap between training and cross-validation R² (7-8 percentage points) indicates good generalization without significant overfitting.
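The safeguards above map naturally onto gradient-boosted-tree hyperparameters. The sketch below uses XGBoost-style parameter names purely as an illustration; the actual library, names, and regularization strengths are assumptions, not our production configuration:

```python
# Hypothetical gradient-boosted-tree settings mirroring the safeguards above
# (XGBoost-style names; library and regularization strengths are assumptions).
params = {
    "max_depth": 3,           # shallow trees limit model complexity
    "min_child_weight": 10,   # rough analogue of minimum samples per leaf
    "subsample": 0.6,         # row subsampling adds randomness
    "colsample_bytree": 0.6,  # column subsampling adds randomness
    "reg_alpha": 1.0,         # L1 regularization (illustrative strength)
    "reg_lambda": 1.0,        # L2 regularization (illustrative strength)
}
early_stopping_rounds = 50    # halt when validation score stops improving
```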
Factor Significance Summary
Package Model (20 factors)
- 19 factors statistically significant (p < 0.05)
- 1 factor borderline (retained for consistency across models)
Property Model (17 factors)
- 13 factors statistically significant (p < 0.05)
- 4 factors borderline or model-specific
Liability Model (15 factors)
- 14 factors statistically significant (p < 0.05)
- 1 factor borderline
Continuous Monitoring
We continuously monitor model performance:
- Retraining as new data becomes available
- Performance tracking across retraining cycles
- Feature distribution monitoring to detect data drift
- Importance stability checks to ensure consistent rankings
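Feature-distribution drift can be screened with a two-sample Kolmogorov-Smirnov test, comparing a feature's current distribution against its training-time reference. A minimal sketch with synthetic data (the drift threshold of 0.01 is illustrative):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
reference = rng.normal(loc=0.0, scale=1.0, size=1000)  # feature at training time
current = rng.normal(loc=0.5, scale=1.0, size=1000)    # feature now (shifted mean)

stat, p_value = ks_2samp(reference, current)
drift_detected = p_value < 0.01  # illustrative significance cutoff
```

A detected shift would trigger investigation and, where warranted, retraining.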
Transparency and Limitations
We believe in transparent reporting. Known limitations include:
- Property model shows higher error variance for low-RoL predictions
- Some factors correlate with each other due to business realities (e.g., product types determine coverage inclusions)
- Results reflect statistical associations, not proven causal relationships
These limitations are documented and factored into our guidance for interpreting results.