The AI Skills Shift: Mapping Skill Obsolescence, Emergence, and Transition Pathways in the LLM Era

arXiv cs.CL / 4/9/2026


Key Points

  • The paper introduces the Skill Automation Feasibility Index (SAFI), benchmarking four frontier LLMs on 263 text-based tasks mapped to all 35 U.S. O*NET skills to estimate automation susceptibility of occupational skills.
  • It combines SAFI with real-world AI adoption data (from the Anthropic Economic Index) to build an “AI Impact Matrix” that categorizes skills into High Displacement Risk, Upskilling Required, AI-Augmented, and Lower Displacement Risk.
  • The findings indicate math and programming have the highest automation feasibility scores, while active listening and reading comprehension score lowest, implying uneven displacement pressure across skill types.
  • The study reports a “capability-demand inversion” (skills most demanded in AI-exposed jobs are where the benchmarked LLMs perform relatively poorly) and suggests that observed AI use is mostly augmentation (78.7%) rather than full automation.
  • It concludes that text-based automation feasibility appears more dependent on the skill itself than on the specific model, and notes SAFI measures LLM performance on text representations rather than complete job execution, with all data/code/model responses open-sourced.

Abstract

As Large Language Models reshape the global labor market, policymakers and workers need empirical data on which occupational skills may be most susceptible to automation. We present the Skill Automation Feasibility Index (SAFI), benchmarking four frontier LLMs -- LLaMA 3.3 70B, Mistral Large, Qwen 2.5 72B, and Gemini 2.5 Flash -- across 263 text-based tasks spanning all 35 skills in the U.S. Department of Labor's O*NET taxonomy (1,052 total model calls, 0% failure rate). Cross-referencing with real-world AI adoption data from the Anthropic Economic Index (756 occupations, 17,998 tasks), we propose an AI Impact Matrix -- an interpretive framework that positions skills along four quadrants: High Displacement Risk, Upskilling Required, AI-Augmented, and Lower Displacement Risk. Key findings: (1) Mathematics (SAFI: 73.2) and Programming (71.8) receive the highest automation feasibility scores; Active Listening (42.2) and Reading Comprehension (45.5) receive the lowest; (2) a "capability-demand inversion": the skills most demanded in AI-exposed jobs are those at which the benchmarked LLMs perform least well; (3) 78.7% of observed AI interactions are augmentation, not automation; (4) all four models converge to similar skill profiles (3.6-point spread), suggesting that text-based automation feasibility may be more skill-dependent than model-dependent. SAFI measures LLM performance on text-based representations of skills, not full occupational execution. All data, code, and model responses are open-sourced.
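To make the AI Impact Matrix concrete, here is a minimal sketch of how a skill could be mapped to one of the four quadrants from a SAFI score and an adoption measure. This is not the paper's released code: the threshold values, the adoption scale, and the exact quadrant-to-axis mapping are all assumptions for illustration, and the adoption figures below are invented; only the SAFI scores come from the abstract.

```python
# Hypothetical cutoffs -- the paper's actual quadrant boundaries may differ.
SAFI_THRESHOLD = 60.0      # assumed cutoff for "high automation feasibility"
ADOPTION_THRESHOLD = 0.5   # assumed cutoff for "high real-world AI adoption"

def impact_quadrant(safi: float, adoption: float) -> str:
    """Assign a skill to an AI Impact Matrix quadrant.

    Axis semantics are an assumption: high feasibility + high adoption
    is read as displacement risk, high feasibility alone as upskilling
    pressure, high adoption alone as augmentation.
    """
    if safi >= SAFI_THRESHOLD and adoption >= ADOPTION_THRESHOLD:
        return "High Displacement Risk"
    if safi >= SAFI_THRESHOLD:
        return "Upskilling Required"
    if adoption >= ADOPTION_THRESHOLD:
        return "AI-Augmented"
    return "Lower Displacement Risk"

# SAFI scores from the abstract; adoption values are illustrative only.
skills = {
    "Mathematics": (73.2, 0.6),
    "Programming": (71.8, 0.4),
    "Active Listening": (42.2, 0.7),
    "Reading Comprehension": (45.5, 0.2),
}

for name, (safi, adoption) in skills.items():
    print(f"{name}: {impact_quadrant(safi, adoption)}")
```

The two-threshold structure is what makes this a quadrant framework: each axis contributes one binary cut, yielding exactly four categories.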