Pipeline Overview

Input Processing

Total Items: 1M entries
Multi-Sample Config: 3 observations × 5 completions
Storage Format: JSONL

Validation Layer

Label Validation: §X.YZ format check
Cross-Reference: 15 entries/input
Taxonomy Match: 100% required

Hallucination Detection

Pattern Types: 4 categories
Detection Rate: ~5-8% flagged
False Positive: <1% expected

Review Scoring

Score Range: 0.0 - 1.0
Review Threshold: 0.7
Target Review: 3-5%
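
The threshold rule can be sketched as follows. This is a minimal illustration, not the pipeline's actual scoring code: it assumes that a higher score means a greater need for review and that the 0.7 threshold is inclusive, neither of which the overview states. The function name `select_for_review` is hypothetical.

```python
REVIEW_THRESHOLD = 0.7  # value taken from the overview


def select_for_review(scores, threshold=REVIEW_THRESHOLD):
    """Return (sorted ids scoring at or above threshold, fraction selected).

    Assumption: higher score = more likely to need human review.
    """
    for item_id, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {item_id!r} outside 0.0-1.0: {score}")
    selected = sorted(i for i, s in scores.items() if s >= threshold)
    rate = len(selected) / len(scores) if scores else 0.0
    return selected, rate
```

Comparing the returned rate with the 3-5% target is one way to check whether the threshold needs tuning on real data.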

Human Review

Review Budget: 30-50k items
Review Interface: React-based
Tracking: Full audit trail
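
The review budget is consistent with the other figures in the overview: a 3-5% review rate applied to 1M entries gives 30-50k items. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def review_budget(total_items, low_rate=0.03, high_rate=0.05):
    """Return the (low, high) number of items implied by a review-rate range."""
    return round(total_items * low_rate), round(total_items * high_rate)


low, high = review_budget(1_000_000)
print(f"{low:,} to {high:,} items")
```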

PRM Dataset

Final Format: PRM-ready
Quality Score: Validated
Coverage: 100%

Stage Details

  • Parse incoming JSONL entries
  • Group by input_id
  • Validate file structure
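
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: it assumes every record is a JSON object carrying an `input_id` field, and that a complete input has 3 × 5 = 15 entries, as the multi-sample configuration suggests. The function names are hypothetical.

```python
import json
from collections import defaultdict

# Assumed from the overview: 3 observations x 5 completions per input.
EXPECTED_PER_INPUT = 3 * 5


def group_entries(lines):
    """Parse JSONL lines and group the records by input_id.

    Returns (groups, errors), where errors is a list of
    (line_number, message) pairs for lines that failed validation.
    """
    groups = defaultdict(list)
    errors = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict) or "input_id" not in record:
            errors.append((lineno, "missing input_id"))
            continue
        groups[record["input_id"]].append(record)
    return dict(groups), errors


def incomplete_inputs(groups, expected=EXPECTED_PER_INPUT):
    """Return the input_ids whose entry count differs from the expected count."""
    return sorted(k for k, v in groups.items() if len(v) != expected)
```

Collecting errors instead of raising on the first bad line lets one pass over a large file report every structural problem at once.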

Examples & Details

Label Validation

Examples of correct and incorrect label usage

Valid Leaf Node

§9.C1 - Suggestive content

Uses proper leaf node category

Invalid Parent Node

§9.C - Suggestive material

Uses parent category instead of leaf node

Non-existent Category

§11.E1 - Custom category

Category doesn't exist in taxonomy

Validation against taxonomy rules
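
The three cases above can be told apart with a format check followed by a taxonomy lookup. The sketch below is illustrative only: the real taxonomy is not shown here, so `LEAF_NODES` is a hypothetical stand-in containing just the one leaf named in the examples, and the regular expression is an assumed reading of the §X.YZ format (number, uppercase letter, optional trailing digits).

```python
import re

# Hypothetical miniature taxonomy; the real one is not part of this document.
LEAF_NODES = {"§9.C1"}

# Assumed reading of the §X.YZ format: section number, letter, optional leaf digits.
LABEL_RE = re.compile(r"^§\d+\.[A-Z]\d*$")


def extract_code(text):
    """Take the label code from a string like '§9.C1 - Suggestive content'."""
    return text.split(" - ", 1)[0].strip()


def classify_label(label, leaves=LEAF_NODES):
    """Classify a label code as valid_leaf, invalid_parent, nonexistent, or malformed."""
    if not LABEL_RE.match(label):
        return "malformed"
    if label in leaves:
        return "valid_leaf"
    # A parent is a leaf code with its trailing digits removed, e.g. §9.C1 -> §9.C.
    parents = {leaf.rstrip("0123456789") for leaf in leaves}
    if label in parents:
        return "invalid_parent"
    return "nonexistent"


for text in ("§9.C1 - Suggestive content",
             "§9.C - Suggestive material",
             "§11.E1 - Custom category"):
    print(text, "->", classify_label(extract_code(text)))
```

Separating the parent case from the non-existent case makes it possible to report why a label failed, which matters when 100% taxonomy match is required.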
Conformal Group Inc. // 2025 // Confidential