ArXiv TLDR

CRC-Screen: Certified DNA-Synthesis Hazard Screening Under Taxonomic Shift

2605.00074

Najmul Hasan

q-bio.GN, cs.AI

TLDR

CRC-Screen certifies a bounded expected miss rate for DNA-synthesis hazard screening and keeps false flags low even when hazards come from taxonomic families absent from the reference set.

Key contributions

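  • Shows that baseline hazard-list screening collapses to a 100% false-flag rate when the hazard comes from a taxonomic family absent from the reference set.
  • Proposes CRC-Screen, which fuses k-mer Jaccard similarity, a five-LLM judge panel score, and embedding-centroid cosine similarity, then calibrates the result with Conformal Risk Control.
  • Achieves a 0% test miss rate on all ten leave-one-taxonomic-family-out folds and a 0% test false-flag rate on nine of them, at α = 0.05.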
  • Demonstrates that calibration data size is the key bottleneck for certified screening.
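
The three signals named above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the choice of k = 3, and the trim of one score from each end of the panel are assumptions made for illustration, since the summary does not state those details.

```python
import math

def kmer_set(text, k=3):
    """Set of overlapping length-k substrings of text (k=3 is an assumed default)."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def kmer_jaccard(a, b, k=3):
    """Jaccard similarity |A & B| / |A | B| between the k-mer sets of two strings."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0

def trimmed_mean(scores, trim=1):
    """Mean after dropping the `trim` lowest and `trim` highest scores (trim=1 is assumed)."""
    ordered = sorted(scores)
    kept = ordered[trim:len(ordered) - trim]
    if not kept:
        raise ValueError("trim removes every score")
    return sum(kept) / len(kept)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def max_centroid_similarity(embedding, centroids):
    """Highest cosine similarity between an embedding and any cluster centroid."""
    return max(cosine(embedding, c) for c in centroids)
```

For example, `kmer_jaccard("ABCD", "BCDE", k=2)` compares the sets {AB, BC, CD} and {BC, CD, DE}, which share two of four distinct k-mers.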

Why it matters

List-based DNA-synthesis hazard screening breaks down when a hazardous sequence comes from a taxonomic family missing from the reference set. This paper introduces a screener whose expected miss rate carries a certified bound and which held up under such taxonomic shifts in the authors' experiments. It argues that calibration-set size, not algorithmic complexity, is the binding constraint on reaching a procurement-grade miss-rate target of 0.1%.
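
The calibration-data argument can be checked with a few lines of arithmetic. The sketch below uses the finite-sample slack 1/(n_cal + 1) quoted in the abstract and asks how many calibration hazards are needed before a target α is certifiable at all. The function names are hypothetical, and the calculation only reproduces the slack term, not the full procedure.

```python
import math

def slack(n_cal):
    """Finite-sample slack 1 / (n_cal + 1): the smallest miss rate that can be certified."""
    return 1.0 / (n_cal + 1)

def min_calibration_size(alpha):
    """Smallest n_cal with 1 / (n_cal + 1) <= alpha, i.e. n_cal >= 1/alpha - 1."""
    # The small epsilon guards against floating-point error when 1/alpha is an integer.
    return max(0, math.ceil(1.0 / alpha - 1.0 - 1e-9))

def scale_factor(current_cap, target_alpha):
    """How many times larger (n_cal + 1) must become to move the cap to target_alpha."""
    return current_cap / target_alpha

print(min_calibration_size(1e-3))   # calibration hazards needed for alpha = 0.001
print(scale_factor(0.0177, 1e-3))   # growth needed from the reported 1.77% cap
```

Under this formula a target of α = 0.001 needs at least 999 calibration hazards, and moving from a 1.77% cap to 0.1% means growing n_cal + 1 by a factor of about 17.7, in line with the roughly 18× figure the abstract reports.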

Original Abstract

DNA-synthesis providers screen incoming orders by searching the requested sequence against curated hazard lists. We show that this baseline collapses to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family absent from the reference set: under Conformal Risk Control's certified miss-rate constraint, a low-discrimination signal forces the threshold below the entire test-benign mass. We compose three signals derived from a synthesis order's public annotation: $k$-mer Jaccard similarity to known toxins, the trimmed-mean score of a five-LLM judge panel, and cosine similarity to clustered embedding centroids. Fused under a monotone logistic aggregator and calibrated by Conformal Risk Control, the resulting screener certifies $\mathbb{E}[\mathrm{FNR}] \le \alpha$. Across ten leave-one-taxonomic-family-out folds at $\alpha=0.05$ on UniProt KW-0800 reviewed toxins, the calibrated screener achieves 0% test miss rate on every fold and 0% test false-flag rate on nine of ten folds. The bound's finite-sample slack $1/(n_{\mathrm{cal}}+1)$ caps the certifiable miss rate at 1.77% on our 200-hazard subsample; reaching procurement-grade $\alpha=10^{-3}$ requires an $18\times$ larger calibration set, which the full reviewed UniProt KW-0800 corpus is large enough to deliver. The binding constraint on certifiable DNA-synthesis screening is calibration data, not algorithms. Code: https://github.com/najmulhasan-code/crc-screen
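
As a rough illustration of the fusion and calibration steps the abstract describes, the sketch below combines signals with a logistic function whose weights are constrained to be non-negative, then picks a flagging threshold from calibration hazards using the standard Conformal Risk Control rule for a 0/1 miss loss. It is a sketch under assumptions, not the repository's implementation: `fuse`, `crc_threshold`, and `flag` are hypothetical names, the weights and bias would have to be fitted separately, and the guarantee relies on calibration and test hazards being exchangeable.

```python
import math

def fuse(signals, weights, bias):
    """Monotone logistic fusion: sigmoid(bias + sum(w_i * x_i)) with every w_i >= 0."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep the score monotone")
    z = bias + sum(w * x for w, x in zip(weights, signals))
    return 1.0 / (1.0 + math.exp(-z))

def crc_threshold(hazard_scores, alpha):
    """Largest threshold t such that (misses(t) + 1) / (n + 1) <= alpha.

    A calibration hazard is missed when its fused score is strictly below t.
    Returns -inf when even zero misses cannot satisfy the bound (flag everything).
    """
    n = len(hazard_scores)
    # Number of calibration misses the bound can tolerate; epsilon guards float error.
    allowed = math.floor(alpha * (n + 1) + 1e-12) - 1
    if allowed < 0:
        return -math.inf      # alpha is below the 1 / (n + 1) slack
    if allowed >= n:
        return math.inf       # the bound tolerates missing every calibration hazard
    return sorted(hazard_scores)[allowed]

def flag(score, threshold):
    """An order is flagged when its fused score reaches the calibrated threshold."""
    return score >= threshold
```

With 19 calibration hazards and α = 0.05, the rule tolerates zero calibration misses, so the threshold sits at the lowest calibration hazard score. At α = 0.01 the slack 1/20 already exceeds α and the function falls back to flagging every order.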
