CRC-Screen: Certified DNA-Synthesis Hazard Screening Under Taxonomic Shift

arXiv cs.AI / 5/4/2026


Key Points

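The paper finds that the common baseline for DNA hazard screening, matching the requested sequence against curated hazard lists, can collapse to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family missing from the reference set: holding the certified miss rate pushes the threshold so low that every benign test order is flagged.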
It proposes “CRC-Screen,” which combines three signals derived from an order's public annotation (k-mer Jaccard similarity to known toxins, a trimmed-mean score from a five-LLM judge panel, and cosine similarity to clustered embedding centroids) and fuses them with a monotone logistic aggregator.
Calibrated with Conformal Risk Control (CRC), the screener comes with a statistical guarantee on its expected false-negative (miss) rate: it certifies E[FNR] ≤ α.
Experiments on ten leave-one-taxonomic-family-out folds (UniProt KW-0800 reviewed toxins, α=0.05) show 0% test miss rate on all folds and 0% test false-flag rate on 9 out of 10 folds.
The authors conclude that the main bottleneck for procurement-grade guarantees (e.g., α=10^-3) is the amount of calibration data, not the algorithm, and estimate an ~18× larger calibration set is needed relative to their 200-hazard subsample.
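
The three signals and the fusion step described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of `k=3`, the one-score trim on each side, and the weights are all hypothetical, and the summary does not say whether k-mers are taken over sequences or over annotation text, so the sketch works on plain strings.

```python
import math

def kmers(seq, k=3):
    """Return the set of overlapping length-k substrings of a string."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def max_kmer_jaccard(query, references, k=3):
    """Highest k-mer Jaccard similarity between the query and any reference."""
    q = kmers(query, k)
    return max((jaccard(q, kmers(r, k)) for r in references), default=0.0)

def trimmed_mean(scores, trim=1):
    """Mean after dropping the `trim` lowest and `trim` highest scores."""
    if len(scores) <= 2 * trim:
        raise ValueError("not enough scores to trim")
    kept = sorted(scores)[trim:len(scores) - trim]
    return sum(kept) / len(kept)

def fuse(signals, weights, bias=0.0):
    """Logistic fusion; non-negative weights keep it monotone in every signal."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative for monotonicity")
    z = bias + sum(w * s for w, s in zip(weights, signals))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical usage: one string signal, a five-judge panel, made-up weights.
sim = max_kmer_jaccard("ABCD", ["ABCE", "XYZW"])
panel = trimmed_mean([0.1, 0.9, 0.8, 0.7, 0.0])
score = fuse([sim, panel], weights=[2.0, 1.5], bias=-1.0)
```

Constraining the weights to be non-negative is one simple way to make a logistic aggregator monotone: raising any single signal can never lower the fused score, which is the property a threshold-based calibration relies on.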

Abstract

DNA-synthesis providers screen incoming orders by searching the requested sequence against curated hazard lists. We show that this baseline collapses to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family absent from the reference set: under Conformal Risk Control's certified miss-rate constraint, a low-discrimination signal forces the threshold below the entire test-benign mass. We compose three signals derived from a synthesis order's public annotation: $k$-mer Jaccard similarity to known toxins, the trimmed-mean score of a five-LLM judge panel, and cosine similarity to clustered embedding centroids. Fused under a monotone logistic aggregator and calibrated by Conformal Risk Control, the resulting screener certifies $\mathbb{E}[\mathrm{FNR}] \le \alpha$. Across ten leave-one-taxonomic-family-out folds at $\alpha=0.05$ on UniProt KW-0800 reviewed toxins, the calibrated screener achieves 0% test miss rate on every fold and 0% test false-flag rate on nine of ten folds. The bound's finite-sample slack $1/(n_{\mathrm{cal}}+1)$ caps the certifiable miss rate at 1.77% on our 200-hazard subsample; reaching procurement-grade $\alpha=10^{-3}$ requires an $18\times$ larger calibration set, which the full reviewed UniProt KW-0800 corpus is large enough to deliver. The binding constraint on certifiable DNA-synthesis screening is calibration data, not algorithms. Code: https://github.com/najmulhasan-code/crc-screen
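
To make the calibration step and the finite-sample slack concrete, here is a minimal Python sketch of threshold selection in the style of Conformal Risk Control, assuming a 0/1 miss loss on calibration hazards, a rule that flags an order when its fused score is at least the threshold, and exchangeability between calibration and test hazards. The function names are hypothetical and this is not taken from the linked repository.

```python
import math

def crc_threshold(hazard_scores, alpha):
    """Largest threshold t such that (misses(t) + 1) / (n + 1) <= alpha.

    misses(t) counts calibration hazards with score < t, i.e. hazards that
    the rule "flag if score >= t" would miss. Returns -inf when no finite
    threshold is certifiable, which means every order must be flagged.
    """
    n = len(hazard_scores)
    # Number of calibration misses the bound can tolerate (epsilon guards
    # against floating-point round-off just below an integer).
    allowed = math.floor(alpha * (n + 1) + 1e-12) - 1
    if allowed < 0:
        return float("-inf")
    if allowed >= n:
        return float("inf")
    return sorted(hazard_scores)[allowed]

def min_calibration_size(alpha):
    """Smallest n for which the slack 1 / (n + 1) does not exceed alpha."""
    return math.ceil(1.0 / alpha - 1.0 - 1e-9)

# Toy calibration set of 99 hazard scores (made-up values).
t = crc_threshold(list(range(1, 100)), alpha=0.05)
n_needed = min_calibration_size(1e-3)
```

Under these assumptions the smallest certifiable α is 1/(n_cal+1), so α = 10^-3 needs at least 999 calibration hazards. Read literally, a 1.77% floor corresponds to roughly 55 calibration hazards, and 999 is about 18 times that, which is consistent with the 18× figure in the abstract; how the 200-hazard subsample is split into calibration sets is not spelled out here.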

