Previously generated synthetic protocols and uploaded source protocols are available for re-processing:
| Job ID | Status | Step | Progress | Created |
|---|---|---|---|---|
| job_c4591001_clinica... | COMPLETED | DONE | 100% | 2026-03-22 07:43 |
| job_c4591001_clinica... | COMPLETED | DONE | 100% | 2026-03-21 22:34 |
| job_c4591001_clinica... | COMPLETED | DONE | 100% | 2026-03-21 21:42 |
| job_c4591001_clinica... | COMPLETED | DONE | 100% | 2026-03-21 20:27 |
| job_prot_sap_000_202... | COMPLETED | DONE | 100% | 2026-03-20 21:53 |
Current Cortex field registry: 238 fields across 15 modules. Uploads run the full Cortex pipeline by default, and the current rules are versioned from the canonical rule loaders.
Models: Cortex is controller-led.
- Tier 1: narrow deterministic extraction (~120 fields).
- Tier 2: manifest-gated learned ranking (~30 fields), currently operating in lexical fallback mode (no trained CrossEncoder bundle).
- Tier 3: universal non-system LLM extractor with bounded self-critique (~82 fields); primary model: ministral3-14b (Ollama Cloud).
- Tier 4: selective arbitration for hard fields; model: qwen3.5 (Ollama Cloud).
Zero OpenAI.
Data: Protocol PDFs are processed with Docling-first extraction. The active path is Docling text + table, yielding an evidence pack with page and line IDs. PyMuPDF is fallback-only when Docling returns no usable lines. Synthetic ground truth comes from synthetic_protocols. Artifacts remain immutable in GCS and job state in Firestore.
Process: An upload or synthetic generation creates the job. The worker runs extraction and builds the evidence pack; Cortex applies hybrid semantic zoning and field planning; bounded controller rounds then execute Tier 1, Tier 3, and selective Tier 4 before validation emits design_output_v1. Schema, Mapping, Amendment, and Validator continue downstream. Progress, step timings, token usage, and estimated cost are exposed on GET /api/v1/jobs/{id}.
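The job endpoint's payload can be summarized client-side. A minimal sketch, assuming a hypothetical response shape (the `status`, `step`, `progress`, and `usage` keys are illustrative, not the documented schema):

```python
# Hypothetical sketch: summarize a GET /api/v1/jobs/{id} payload.
# The field names below are assumptions, not the documented response schema.

def summarize_job(payload: dict) -> str:
    """Render a one-line status string from a job payload."""
    usage = payload.get("usage", {})
    return (
        f"{payload['job_id']}: {payload['status']} "
        f"(step={payload['step']}, progress={payload['progress']}%, "
        f"tokens={usage.get('total_tokens', 0)}, "
        f"cost=${usage.get('estimated_cost', 0.0):.4f})"
    )

example = {
    "job_id": "job_demo",
    "status": "COMPLETED",
    "step": "DONE",
    "progress": 100,
    "usage": {"total_tokens": 12345, "estimated_cost": 0.0421},
}
print(summarize_job(example))
```

In practice this string would be rendered from the live API response rather than a hard-coded dict.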
Synthetic scenarios (generator-native): BASELINE, DESIGN_CHALLENGE, SCHEMA_CHALLENGE, MAPPING_CHALLENGE, AMENDMENT_CHALLENGE, VALIDATOR_STRESS, FULL_STRESS, STATUS_CHALLENGE, NOISE_CHALLENGE. All run the short path (MODE_A) except VALIDATOR_STRESS and FULL_STRESS, which take the long path.
Contract-first pipeline with 238 fields across 15 modules (rules vdefs:3.4 gates:3.0). Each agent consumes upstream artifact(s) and emits one versioned JSON output. Mapping and Amendment run in parallel after Schema. Full audit trail and cost tracking per step.
_1_design.json → _2_schema.json → _3_mapping.json + _4_amendment.json → _5_validated.json
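The artifact chain above can be sketched as a sequential builder in which each step consumes only the upstream artifact, and Mapping and Amendment both branch off Schema. The step bodies here are placeholders; only the artifact names and dependency shape come from the document:

```python
# Minimal sketch of the contract-first artifact chain. Each stage emits one
# versioned JSON artifact; the dict values are placeholders for real contracts.

def run_pipeline(protocol_text: str) -> dict:
    artifacts = {}
    artifacts["_1_design.json"] = {"source_chars": len(protocol_text)}
    artifacts["_2_schema.json"] = {"from": "_1_design.json"}
    # Mapping and Amendment both consume Schema and can run in parallel.
    artifacts["_3_mapping.json"] = {"from": "_2_schema.json"}
    artifacts["_4_amendment.json"] = {"from": "_2_schema.json"}
    artifacts["_5_validated.json"] = {
        "from": ["_3_mapping.json", "_4_amendment.json"]
    }
    return artifacts
```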
Cortex processes the protocol with Docling-first extraction, builds an evidence pack, applies hybrid semantic zoning, and prepares field plans before entering bounded controller rounds. Tier 3 is the main non-system extractor, Tier 1 stays narrow, Tier 2 is manifest-gated, and Tier 4 is selective arbitration for hard fields. Validation and trace outputs stay explicit.
Deterministic normalization and CT mapping from Design records. Driven by Schema workbook rules and CDISC CT dictionary. Ownership modes: PASSTHROUGH, SINGLE-CT, UCUM, DERIVED.
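The four ownership modes amount to a deterministic dispatch per field. A hedged sketch follows: the mode names come from the document, but the lookup tables and the DERIVED rule are invented stand-ins:

```python
# Hedged sketch of ownership-mode dispatch for CT mapping. Mode names are from
# the document; the dictionaries and derivation logic are illustrative only.

CT_DICT = {"phase 3": "C15602"}               # hypothetical CDISC CT lookup
UCUM_UNITS = {"mg": "mg", "milligram": "mg"}  # hypothetical UCUM table

def map_value(mode: str, value: str) -> str:
    if mode == "PASSTHROUGH":
        return value                            # copy verbatim
    if mode == "SINGLE-CT":
        return CT_DICT.get(value.lower(), value)  # single controlled term
    if mode == "UCUM":
        return UCUM_UNITS.get(value.lower(), value)  # unit normalization
    if mode == "DERIVED":
        return value.strip().upper()            # stand-in for a derivation rule
    raise ValueError(f"unknown ownership mode: {mode}")
```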
Assembly engine: builds SDTM TS rows from schema output using the mapping matrix. Deterministic row builder (no AI). MVP: Study Definition + Study Design (31 fields).
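A deterministic, table-driven row builder of this kind can be sketched as below. The TSPARMCD codes are real SDTM examples, but the input shape and the mapping matrix are assumptions for illustration:

```python
# Sketch of a deterministic SDTM TS row builder: one row per mapped field,
# purely table-driven, no AI. Input shape and matrix are assumptions.

def build_ts_rows(study_id: str, fields: dict, matrix: dict) -> list[dict]:
    rows = []
    for field_name, tsparmcd in matrix.items():
        if field_name in fields:
            rows.append({
                "STUDYID": study_id,
                "DOMAIN": "TS",
                "TSPARMCD": tsparmcd,
                "TSVAL": fields[field_name],
            })
    return rows

rows = build_ts_rows(
    "C4591001",
    {"study_title": "A Phase 3 Study", "phase": "PHASE III TRIAL"},
    {"study_title": "TITLE", "phase": "TPHASE"},
)
```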
Computes N vs N-1 field-level diff for same protocol ID with severity tagging. Change types: ADD/MODIFY/DELETE. Severities: MAJOR (CT code changed), MINOR (SDTM but no CT), COSMETIC.
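The severity rule is a simple precedence check: a CT code change wins over an SDTM value change, which wins over anything else. A sketch, assuming hypothetical `ct_code`/`sdtm_value` keys on the per-field records:

```python
# Sketch of the N vs N-1 severity rule: MAJOR if the CT code changed,
# MINOR if an SDTM value changed without a CT change, else COSMETIC.
# The record keys (ct_code, sdtm_value) are assumptions.

def classify_change(old: dict, new: dict) -> str:
    if old.get("ct_code") != new.get("ct_code"):
        return "MAJOR"
    if old.get("sdtm_value") != new.get("sdtm_value"):
        return "MINOR"
    return "COSMETIC"
```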
Merges all upstream contracts into unified validated records with combined confidence, QC flags, and human review tasks. Two-stage confidence: verifier (design-dominant weighting) then gating (evidence + agreement + rules + extraction quality).
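The two-stage shape can be sketched as a design-dominant weighted score followed by a gate over the four signals. The weights and thresholds below are illustrative, not the shipped values:

```python
# Hedged sketch of two-stage confidence: a design-dominant verifier score,
# then gating on evidence, agreement, rules, and extraction quality.
# Weights and thresholds are illustrative assumptions.

def verifier_score(design: float, schema: float, mapping: float) -> float:
    return 0.6 * design + 0.2 * schema + 0.2 * mapping  # design-dominant

def gate(score: float, has_evidence: bool, agreement: bool,
         rules_pass: bool, extraction_ok: bool) -> str:
    if score >= 0.8 and all([has_evidence, agreement, rules_pass, extraction_ok]):
        return "AUTO_ACCEPT"
    if score >= 0.5:
        return "HUMAN_REVIEW"
    return "BLOCKED"
```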
REST API (FastAPI + Uvicorn). All endpoints under /api/v1/.
Every extracted value is tied to evidence (page, section, quote, bounding box) and preserved through all downstream contracts with full rule and model trace.
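One way to make that guarantee concrete is an immutable record that carries its evidence and trace with it. A minimal sketch; the field names are illustrative, not the actual contract schema:

```python
# Sketch of an evidence-linked value record. Field names are assumptions;
# frozen dataclasses keep the record immutable through downstream contracts.
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    page: int
    section: str
    quote: str
    bbox: tuple[float, float, float, float]  # x0, y0, x1, y1

@dataclass(frozen=True)
class ExtractedValue:
    field_name: str
    value: str
    evidence: Evidence
    trace: tuple[str, ...] = ()  # rule/model IDs that touched the value
```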
Tier 3 is the universal non-system extractor, but Cortex stays evidence-driven. Deterministic rules validate outputs, Tier 1 stays narrow, and Tier 4 arbitrates hard disagreements instead of applying a fixed winner order.
Each agent reads upstream artifacts only. No downstream component re-reads raw text after Design Agent. Immutable artifacts in GCS; operational state in Firestore.
Low-confidence, conflicting, or blocked records are flagged with explanation and evidence for manual resolution. AI review tips assist but never auto-resolve.
Domain-aware batch sizing with parallel execution. Zone-filtered evidence reduces token usage by 60-80%. Full cost tracking per agent and per API call.
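The zone-filtering savings come from dropping evidence lines whose zone is irrelevant to the field plan. A sketch under assumed zone labels and line shape:

```python
# Hedged sketch of zone-filtered evidence packing: keep only lines whose zone
# matches the field plan, which is where a 60-80% token reduction can come
# from. Zone labels and the line dict shape are assumptions.

def filter_evidence(lines: list[dict], wanted_zones: set[str]) -> list[dict]:
    return [ln for ln in lines if ln["zone"] in wanted_zones]

lines = [
    {"zone": "objectives", "text": "Primary objective..."},
    {"zone": "dosing", "text": "Dose: 30 ug"},
    {"zone": "appendix", "text": "Signature page"},
    {"zone": "appendix", "text": "Abbreviations"},
    {"zone": "appendix", "text": "References"},
]
kept = filter_evidence(lines, {"objectives", "dosing"})
reduction = 1 - len(kept) / len(lines)  # fraction of lines dropped
```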
Every pipeline step, review action, and status transition is logged as an audit event with timestamps, token usage, and cost breakdown.