CRC-Screen: Certified DNA-Synthesis Hazard Screening Under Taxonomic Shift
TLDR
CRC-Screen is a DNA-synthesis hazard screener with a certified bound on its expected miss rate. Across ten held-out taxonomic families it missed no hazards and raised no false flags on nine of the ten folds.
Key contributions
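- Shows that baseline hazard-list search collapses to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family absent from the reference set.
- Proposes CRC-Screen, which fuses k-mer Jaccard similarity, a five-LLM judge panel score, and embedding-centroid cosine similarity under a monotone logistic aggregator calibrated by Conformal Risk Control.
- Reports a 0% test miss rate on all ten leave-one-taxonomic-family-out folds and a 0% test false-flag rate on nine of them, at α = 0.05.
- Argues that calibration set size is the binding constraint on certified screening, because the finite-sample slack 1/(n_cal + 1) caps the miss rate that can be certified.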
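As a rough illustration of the first signal in the list above, the sketch below computes the maximum k-mer Jaccard similarity between a query sequence and a set of reference sequences. The source does not state the value of k, the alphabet, or how the reference set is organized, so `k=3`, the function names, and the toy strings are all assumptions made for illustration.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def max_kmer_similarity(query: str, references: list[str], k: int = 3) -> float:
    """Highest Jaccard similarity between the query and any reference (hypothetical helper)."""
    query_kmers = kmers(query, k)
    return max((jaccard(query_kmers, kmers(ref, k)) for ref in references), default=0.0)
```

A signal like this is only as good as its reference set: a query from an unseen family shares few k-mers with any reference and scores low, which is the failure mode the paper targets.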
Why it matters
List-based screening of DNA-synthesis orders breaks down when a hazard comes from a taxonomic family that the reference set has never seen. This paper presents a screener that carries a certified bound on its expected miss rate and, in the reported experiments, produced no false flags on nine of ten held-out families. It argues that the amount of calibration data, not algorithmic complexity, is the main obstacle to procurement-grade guarantees (α = 0.001).
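The calibration-data argument can be made concrete with a little arithmetic. Taking the finite-sample slack 1/(n_cal + 1) quoted in the abstract as the smallest miss rate that can be certified, a target α is reachable only when n_cal ≥ 1/α − 1. The helper below is a minimal sketch of that calculation; the function name is hypothetical and not taken from the paper's code.

```python
import math
from fractions import Fraction


def min_calibration_size(alpha: float) -> int:
    """Smallest n_cal with 1 / (n_cal + 1) <= alpha, computed in exact arithmetic."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Go through str() so that 0.001 becomes exactly 1/1000, not its float approximation.
    exact_alpha = Fraction(str(alpha))
    return math.ceil(1 / exact_alpha - 1)


print(min_calibration_size(0.05))   # 19
print(min_calibration_size(0.001))  # 999
```

Under this reading, α = 0.05 needs at least 19 calibration hazards, while α = 0.001 needs at least 999.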
Original Abstract
DNA-synthesis providers screen incoming orders by searching the requested sequence against curated hazard lists. We show that this baseline collapses to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family absent from the reference set: under Conformal Risk Control's certified miss-rate constraint, a low-discrimination signal forces the threshold below the entire test-benign mass. We compose three signals derived from a synthesis order's public annotation: $k$-mer Jaccard similarity to known toxins, the trimmed-mean score of a five-LLM judge panel, and cosine similarity to clustered embedding centroids. Fused under a monotone logistic aggregator and calibrated by Conformal Risk Control, the resulting screener certifies $\mathbb{E}[\mathrm{FNR}] \le \alpha$. Across ten leave-one-taxonomic-family-out folds at $\alpha=0.05$ on UniProt KW-0800 reviewed toxins, the calibrated screener achieves 0% test miss rate on every fold and 0% test false-flag rate on nine of ten folds. The bound's finite-sample slack $1/(n_{\mathrm{cal}}+1)$ caps the certifiable miss rate at 1.77% on our 200-hazard subsample; reaching procurement-grade $\alpha=10^{-3}$ requires an $18\times$ larger calibration set, which the full reviewed UniProt KW-0800 corpus is large enough to deliver. The binding constraint on certifiable DNA-synthesis screening is calibration data, not algorithms. Code: https://github.com/najmulhasan-code/crc-screen
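The calibration step described in the abstract can be sketched as follows. This is a minimal illustration of Conformal Risk Control for a binary miss loss, not the authors' implementation. It assumes each calibration hazard has a single fused score, that higher scores mean "more hazardous", and that an order is flagged when its score is at least the threshold. The function names are hypothetical.

```python
import math


def crc_threshold(hazard_scores: list[float], alpha: float) -> float:
    """Largest flagging threshold whose CRC-adjusted miss rate stays at or below alpha.

    With a 0/1 miss loss, the CRC condition (misses + 1) / (n + 1) <= alpha
    allows at most floor(alpha * (n + 1)) - 1 calibration hazards to be missed.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(hazard_scores)
    allowed_misses = math.floor(alpha * (n + 1)) - 1
    if allowed_misses < 0:
        # Too few calibration hazards: the only certifiable rule flags every order.
        return -math.inf
    # Flagging scores >= the (allowed_misses + 1)-th smallest score misses
    # at most allowed_misses calibration hazards.
    return sorted(hazard_scores)[allowed_misses]


def is_flagged(score: float, threshold: float) -> bool:
    """Flag an order when its fused score reaches the calibrated threshold."""
    return score >= threshold
```

When the calibration set is too small for the chosen α, the function returns negative infinity and every order is flagged. The same collapse in false-flag rate appears when a weakly discriminating signal pushes the threshold below the scores of benign orders.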