Chinese-SkillSpan: A Span-Level Dataset for ESCO-Aligned Competency Extraction from Chinese Job Ads
arXiv cs.CL / 4/28/2026
Key Points
- The paper introduces Chinese-SkillSpan, which the authors describe as the first Chinese JobSkillNER dataset built specifically for recruitment texts; it targets span-level extraction of job skills from Chinese job ads.
- It defines Chinese-tailored annotation guidelines and uses an LLM-empowered macro–micro collaborative labeling pipeline, with expert sentence-level adjudication to refine initial outputs.
- The authors annotated over 20,000 instances from four major recruitment platforms covering 2014–2025, aligning the dataset with the ESCO occupational skill standard across four competency dimensions.
- Experiments indicate that the dataset supports effective model training and evaluation; the authors aim to fill a major resource gap and position it as a benchmark for intelligent recruitment research.
- Code and data are publicly released to support further research and reproducibility.