OmniCompliance-100K: A Multi-Domain, Rule-Grounded, Real-World Safety Compliance Dataset
arXiv cs.CL / 3/17/2026
Key Points
- OmniCompliance-100K is a large, rule-grounded safety dataset for LLMs, containing 12,985 rules and 106,009 associated real-world compliance cases.
- The dataset spans 74 regulations and policies across domains including security, privacy, content safety, financial security, medical device risk management, educational integrity, and human rights protections.
- The dataset was collected with a web-searching agent so that its cases reflect real-world situations, and its rule-grounded design addresses gaps left by the ad hoc taxonomies of earlier safety datasets.
- Benchmarking experiments across different model scales reveal insights that can guide future LLM safety research and development.
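
To make the "rule-grounded" structure concrete, here is a minimal Python sketch of how rules and their associated cases could be represented and grouped. The class names, field names, and the one-rule-per-case link are assumptions made for illustration; the summary above does not describe the dataset's actual schema. Only the two counts (12,985 rules and 106,009 cases) come from the key points.

```python
from dataclasses import dataclass


# Hypothetical record layouts; the real dataset schema is not given in the summary.
@dataclass
class Rule:
    rule_id: str
    regulation: str  # one of the 74 regulations and policies
    domain: str      # e.g. "privacy" or "content safety"
    text: str


@dataclass
class ComplianceCase:
    case_id: str
    rule_id: str     # assumed: each case points at the rule it is grounded in
    description: str


def group_cases_by_rule(cases):
    """Map each rule_id to the list of cases that reference it."""
    grouped = {}
    for case in cases:
        grouped.setdefault(case.rule_id, []).append(case)
    return grouped


# Counts reported in the key points.
NUM_RULES = 12_985
NUM_CASES = 106_009

# Average number of cases per rule implied by those counts.
avg_cases_per_rule = NUM_CASES / NUM_RULES
print(f"{avg_cases_per_rule:.2f} cases per rule on average")
```

On the reported counts this works out to roughly eight cases per rule on average. How the cases are actually distributed across rules is not stated in the summary.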