Data Quality Controls and Validation

RoL-related fields receive heightened quality scrutiny due to their importance in analytics, benchmarking, and reporting.

Validation and enrichment are supported by a purpose-built AI extraction and validation system that has been iteratively improved through multiple human-in-the-loop (HITL) cycles. Where required, values are re-validated or re-extracted directly from filed source documentation.

Manual review by the data team is performed when automated validation is insufficient or where policy structure complexity warrants additional scrutiny.

Outlier Detection Methodology

RoL values are evaluated at the product level using a log-transformed interquartile range (IQR) methodology to establish expected inlier ranges.

The methodology operates as follows:

RoL values are log-transformed to reduce skew and stabilize variance
Quartiles (Q1 and Q3) are calculated on the transformed values
The interquartile range (IQR) is defined as Q3 − Q1
Lower and upper bounds are set at Q1 − k × IQR and Q3 + k × IQR, where k is the IQR multiplier
Observations are classified as inliers or outliers based on their position relative to these bounds
Bounds may be back-transformed (exponentiated) to the original RoL scale for interpretability
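
The steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the production implementation: the function names, the natural-log base, the "inclusive" quartile interpolation, and the default multiplier k = 1.5 (the conventional Tukey value) are all assumptions, since the methodology does not state them.

```python
import math
import statistics


def log_iqr_bounds(rol_values, k=1.5):
    """Return (lower, upper) inlier bounds on the original RoL scale.

    k is the IQR multiplier; 1.5 is an assumed default, not a documented value.
    """
    if any(v <= 0 for v in rol_values):
        raise ValueError("RoL values must be positive to be log-transformed")

    # Step 1: log-transform to reduce skew.
    logs = [math.log(v) for v in rol_values]

    # Step 2: quartiles on the transformed values (interpolation method assumed).
    q1, _median, q3 = statistics.quantiles(logs, n=4, method="inclusive")

    # Step 3: interquartile range.
    iqr = q3 - q1

    # Step 4: bounds on the log scale.
    lower_log = q1 - k * iqr
    upper_log = q3 + k * iqr

    # Step 6: back-transform to the original scale by exponentiation.
    return math.exp(lower_log), math.exp(upper_log)


def is_inlier(value, lower, upper):
    """Step 5: classify an observation against the bounds."""
    return lower <= value <= upper
```

For a hypothetical product with RoL values 0.01, 0.02, 0.04, 0.08, and 0.16, the quartiles fall on 0.02 and 0.08, so with k = 1.5 the back-transformed bounds are approximately 0.0025 and 0.64. Because the bounds are computed on the log scale, they are multiplicative rather than additive around the quartiles on the original scale.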

Outlier Severity Bands

Outliers are categorized into severity bands based on their distance from the nearest bound:

Inlier — Within expected range
Mild — Up to 20% beyond the nearest bound
Moderate — Greater than 20% and up to 50% beyond the nearest bound
Extreme — Greater than 50% beyond the nearest bound
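
A minimal sketch of the band assignment follows. It assumes that "beyond the nearest bound" is measured as a relative distance on the original RoL scale, expressed as a fraction of the bound that was breached; the methodology does not specify the scale or the denominator, so this interpretation, along with the function name, is hypothetical.

```python
def severity_band(value, lower, upper):
    """Assign a severity band given back-transformed inlier bounds.

    Assumption: distance is relative to the breached bound on the
    original RoL scale, e.g. 10% above the upper bound counts as Mild.
    """
    if lower <= value <= upper:
        return "Inlier"

    # Relative distance past whichever bound was breached.
    if value > upper:
        excess = (value - upper) / upper
    else:
        excess = (lower - value) / lower

    if excess <= 0.20:
        return "Mild"
    if excess <= 0.50:
        return "Moderate"
    return "Extreme"
```

With illustrative bounds of 0.05 and 0.40, a value of 0.44 sits 10% above the upper bound and is classed as Mild, while a value of 0.03 sits 40% below the lower bound and is classed as Moderate.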

Review Process

Outliers are not automatically excluded. Instead, they enter a validation workflow that includes:

Automated re-validation via AI extraction against filed documentation
Manual review by the data team where required

Only records that pass validation are approved for inclusion in Market Terminal outputs.
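
The routing logic of this workflow can be sketched as below. Every name here is hypothetical: auto_revalidate and manual_review are placeholder callables standing in for the AI re-validation step and the data team's review, and the three-way verdict (confirmed, refuted, inconclusive) is an assumed simplification of what "automated validation is insufficient" means in practice.

```python
from enum import Enum


class Status(Enum):
    APPROVED = "approved"
    NOT_APPROVED = "not_approved"


def review_outlier(record, auto_revalidate, manual_review):
    """Route a flagged record through automated, then manual, validation.

    auto_revalidate(record) returns True (value confirmed), False (value
    refuted), or None (inconclusive). manual_review(record) returns True
    or False. Both are hypothetical stand-ins for the real steps.
    """
    verdict = auto_revalidate(record)

    # Escalate to the data team only when automation cannot decide.
    if verdict is None:
        verdict = manual_review(record)

    # Only records that pass validation are approved for inclusion.
    return Status.APPROVED if verdict else Status.NOT_APPROVED
```

The point of the sketch is that an outlier flag alone never determines the outcome: a flagged record is approved or withheld only on the result of a validation step.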