Evaluating Human-AI Safety: A Framework for Measuring Harmful Capability Uplift
arXiv cs.AI · March 31, 2026
Key Points
- The paper argues that frontier AI safety evaluations should shift from static benchmarks and red-teaming toward human-centered measurements of risk.
- It proposes “harmful capability uplift” as a core metric, defined as the marginal increase in a user’s ability to cause harm when using a frontier model beyond what conventional tools already allow.
- The framework is grounded in prior social science research and includes methodological guidance for systematically measuring this uplift.
- The authors outline actionable next steps for developers, researchers, funders, and regulators to standardize harmful capability uplift evaluation.
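The core metric can be illustrated with a minimal sketch. Assuming (hypothetically, since the paper's exact operationalization is not given here) that uplift is measured as the difference in harmful-task success rates between a model-assisted group and a group using only conventional tools such as search engines:

```python
def success_rate(outcomes):
    """Fraction of participants who completed the task; outcomes is a list of booleans."""
    return sum(outcomes) / len(outcomes)

def capability_uplift(model_outcomes, baseline_outcomes):
    """Hypothetical operationalization of harmful capability uplift:
    the marginal increase in task success when using the frontier model
    over conventional tools alone."""
    return success_rate(model_outcomes) - success_rate(baseline_outcomes)

# Illustrative numbers (not from the paper): 7/10 succeed with the model
# vs. 4/10 with search engines alone.
uplift = capability_uplift([True] * 7 + [False] * 3,
                           [True] * 4 + [False] * 6)
# uplift ≈ 0.3
```

In practice such a difference would need a human-subjects study design with uncertainty estimates, which is the kind of social-science grounding the paper's methodological guidance addresses.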