LabelSets — open quality standard for AI training data (LQS v3.1) [D]

Reddit r/MachineLearning / 4/27/2026


Key Points

  • LabelSets introduces LQS v3.1, an open quality standard for AI/ML training data that rates datasets with seven scoring oracles spanning five algorithm families.
  • The approach uses conformal prediction intervals to estimate downstream F1 performance, along with Ed25519-signed certificates and a contamination check against 40+ public evaluations.
  • Users can freely audit datasets by pasting any Hugging Face dataset URL, and can verify certificates publicly via an unauthenticated verification API endpoint.
  • The calibration corpus is around 1,000 datasets today and is targeted to reach about 10,000 by Q3 2026, with certification explicitly indicating when calibration is sparse rather than overstating confidence.
  • The authors invite feedback on label dimensions, oracle agreement statistics (Cohen's and Fleiss' κ), and conformal calibration, and publish a CC BY 4.0 methodology paper with the full specification.

Built a third-party quality rating system for ML datasets. Multi-oracle (7 scorers across 5 algorithm families), conformal prediction intervals on downstream F1, Ed25519-signed certs, and a contamination check against 40+ public evals (MMLU, HumanEval, GSM8K, MedQA, LegalBench, etc.).
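The post doesn't say how the contamination check works internally; the methodology paper presumably does. As a rough illustration of the general technique, here is a minimal n-gram overlap check, where the 8-gram window and the row-level flagging rule are my assumptions, not the LQS spec:

```python
# Illustrative n-gram overlap contamination check. The n=8 window and
# the "flag a row if it shares any n-gram with an eval set" rule are
# assumptions for this sketch, not the published LQS method.

def ngrams(text, n=8):
    """Return the set of whitespace-token n-grams in `text`."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(dataset_rows, eval_rows, n=8):
    """Fraction of dataset rows sharing at least one n-gram with any eval row."""
    eval_grams = set()
    for row in eval_rows:
        eval_grams |= ngrams(row, n)
    flagged = sum(1 for row in dataset_rows if ngrams(row, n) & eval_grams)
    return flagged / max(len(dataset_rows), 1)
```

In practice the window size trades precision against recall: short n-grams flag common phrases, long ones miss lightly paraphrased leakage.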

Methodology paper, CC BY 4.0: https://labelsets.ai/paper

Free audit (paste any HF dataset URL): https://labelsets.ai/rate

Public verification API, no auth: GET /api/verify-lqs-cert/:hash
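The endpoint is keyed by a certificate hash. The cert schema and hash algorithm aren't given in the post, so the following is only a sketch of how a client might derive the `:hash` path parameter, assuming SHA-256 over canonical JSON and entirely hypothetical cert fields:

```python
import hashlib
import json

# Hypothetical sketch of deriving the :hash for
# GET /api/verify-lqs-cert/:hash. SHA-256 over sorted-key JSON and the
# cert fields below are assumptions, not the published LQS cert format.

def cert_hash(cert: dict) -> str:
    """SHA-256 hex digest of the certificate in a canonical JSON form."""
    canonical = json.dumps(cert, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

example_cert = {
    "dataset": "org/example-dataset",  # hypothetical field names
    "lqs_version": "3.1",
    "score": 87.4,
}
print(f"GET /api/verify-lqs-cert/{cert_hash(example_cert)}")
```

Canonicalization (sorted keys, fixed separators) matters here: without it, two semantically identical certs could hash differently and fail lookup.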

Calibration corpus is at ~1,000 datasets and growing toward 10,000 by Q3 2026 — where calibration is thin, the cert says so out loud rather than fabricating confidence.
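This "say so out loud" behavior falls naturally out of split conformal prediction: the interval width is a quantile of calibration residuals, and with too few calibration points the finite-sample quantile can't be computed at the requested coverage at all. A minimal sketch, assuming absolute error as the nonconformity score (the actual LQS score function is in the paper):

```python
import math

# Sketch of a split conformal interval on a downstream-F1 estimate.
# Absolute error as the nonconformity score is an assumption; the
# actual LQS score function and model are in the methodology paper.

def conformal_interval(point_estimate, calib_residuals, alpha=0.1):
    """Return a (lo, hi) interval with ~(1 - alpha) coverage.

    calib_residuals: |predicted F1 - observed F1| on held-out
    calibration datasets.
    """
    n = len(calib_residuals)
    # Finite-sample conformal quantile rank: ceil((n + 1)(1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this coverage level: report
        # the uninformative full range rather than fake precision.
        return (0.0, 1.0)
    q = sorted(calib_residuals)[rank - 1]
    return (max(0.0, point_estimate - q), min(1.0, point_estimate + q))
```

With one calibration point and alpha=0.1 this returns (0.0, 1.0), i.e. the cert admits it knows nothing; as the calibration corpus grows toward 10,000, the quantile tightens and the interval narrows.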

Happy to take feedback on the dimension list, the oracle agreement math (Cohen + Fleiss κ reporting), or the conformal prediction calibration. The methodology paper has the full spec — anywhere we got the math wrong, we want to know.
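For reference when checking the agreement math, these are the standard textbook forms of the two statistics named above; any thresholds LQS applies on top of them are its own:

```python
from collections import Counter

# Standard formulas for the two agreement statistics mentioned:
# Cohen's kappa (two raters) and Fleiss' kappa (many raters).

def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum((ca[l] / n) * (cb[l] / n)                  # chance agreement
             for l in set(a) | set(b))
    return (po - pe) / (1 - pe)

def fleiss_kappa(ratings):
    """Fleiss' kappa. ratings[i][c] = #raters assigning category c to item i."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    p_cat = [sum(row[c] for row in ratings) / (n_items * n_raters)
             for c in range(len(ratings[0]))]
    p_bar = sum(sum(x * (x - 1) for x in row) / (n_raters * (n_raters - 1))
                for row in ratings) / n_items
    p_e = sum(p * p for p in p_cat)
    return (p_bar - p_e) / (1 - p_e)
```

Both reduce to 1.0 under perfect agreement and 0.0 when observed agreement equals the chance baseline, which makes them easy to sanity-check against a reported table.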

submitted by /u/plomii