Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

arXiv cs.AI / 4/8/2026


Key Points

  • The paper introduces Graph of Skills (GoS), an inference-time structural retrieval layer designed to scale agent skill libraries to thousands of reusable skills without loading everything into the model context.
  • GoS builds an executable skill graph offline and then retrieves a dependency-aware, bounded set of skills at inference time using hybrid semantic–lexical seeding, reverse-weighted Personalized PageRank, and context-budgeted hydration.
  • Experiments on SkillsBench and ALFWorld show GoS improves average reward by 43.6% compared with a full-skill-loading baseline while cutting input tokens by 37.8%.
  • The method generalizes across multiple model families (Claude Sonnet, GPT-5.2 Codex, and MiniMax), and ablation and scaling studies show it maintains strong performance on skill libraries of 200 to 2,000 skills.
  • Overall, GoS targets the core scaling bottleneck of context saturation—reducing token cost, latency, and hallucination risk while preserving task performance.
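The retrieval step above names reverse-weighted Personalized PageRank over the skill graph. The paper's exact formulation and weighting are not reproduced here; the following is a minimal sketch assuming a generic power-iteration PPR in which mass flows from seed skills along dependency edges, so that a seed's prerequisites accumulate score. The graph, skill names, and parameters are illustrative, not from the paper.

```python
def reverse_ppr(edges, seeds, alpha=0.15, iters=50):
    """Personalized PageRank over skill -> dependency edges.

    edges: list of (skill, dependency) pairs; seeds: {skill: weight}.
    Mass teleports back to the seeds with probability alpha, so skills
    reachable from the seeds (their transitive dependencies) score high.
    """
    out, nodes = {}, set()
    for src, dep in edges:
        out.setdefault(src, []).append(dep)
        nodes.update((src, dep))
    nodes.update(seeds)
    total = sum(seeds.values())
    p = {n: seeds.get(n, 0.0) / total for n in nodes}  # personalization vector
    rank = dict(p)
    for _ in range(iters):
        nxt = {n: alpha * p[n] for n in nodes}  # teleport to seeds
        for n in nodes:
            targets = out.get(n, [])
            if targets:
                share = (1 - alpha) * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: return its mass to the seed distribution.
                for m in nodes:
                    nxt[m] += (1 - alpha) * rank[n] * p[m]
        rank = nxt
    return rank

# Hypothetical skill dependencies: booking a flight needs search and payment.
edges = [("book_flight", "search_flights"), ("book_flight", "pay"),
         ("pay", "auth"), ("search_flights", "http_get")]
scores = reverse_ppr(edges, {"book_flight": 1.0})
top = sorted(scores, key=scores.get, reverse=True)
```

Here `top` ranks the seed skill first, followed by its direct and transitive dependencies, which is the dependency-aware bundle the bullet describes; a real system would seed multiple skills from the hybrid semantic-lexical stage rather than a single one.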

Abstract

Skill usage has become a core component of modern agent systems and can substantially improve agents' ability to complete complex tasks. In real-world settings, where agents must monitor and interact with numerous personal applications, web browsers, and other environment interfaces, skill libraries can scale to thousands of reusable skills. Scaling to larger skill sets introduces a key challenge: loading the full skill set saturates the context window, driving up token costs, hallucination, and latency. In this paper, we present Graph of Skills (GoS), an inference-time structural retrieval layer for large skill libraries. GoS constructs an executable skill graph offline from skill packages, then at inference time retrieves a bounded, dependency-aware skill bundle through hybrid semantic-lexical seeding, reverse-weighted Personalized PageRank, and context-budgeted hydration. On SkillsBench and ALFWorld, GoS improves average reward by 43.6% over the vanilla full-skill-loading baseline while reducing input tokens by 37.8%, and it generalizes across three model families: Claude Sonnet, GPT-5.2 Codex, and MiniMax. Additional ablation studies across skill libraries ranging from 200 to 2,000 skills further show that GoS consistently outperforms both vanilla skill loading and simple vector retrieval in balancing reward, token efficiency, and runtime.
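The abstract's "context-budgeted hydration" step can be read as: expand ranked skills into their full definitions, best-first, until a token budget is exhausted. The sketch below assumes that interpretation and falls back to a one-line stub when a full definition no longer fits; the paper's actual policy, token estimator, and skill contents may differ, and all names here are hypothetical.

```python
def hydrate(ranked, full_docs, token_budget, est_tokens=lambda s: len(s) // 4):
    """Greedily load full skill definitions under a context budget.

    ranked: skill names, best-first (e.g. from PageRank scores).
    full_docs: skill name -> full definition text.
    est_tokens: crude chars/4 token estimate (an assumption, not the paper's).
    """
    bundle, used = [], 0
    for name in ranked:
        cost = est_tokens(full_docs[name])
        if used + cost <= token_budget:
            bundle.append((name, full_docs[name]))  # fully hydrated
            used += cost
        else:
            # Budget exceeded: fall back to a cheap stub the agent can expand later.
            stub = f"{name}: (definition omitted; retrieve on demand)"
            scost = est_tokens(stub)
            if used + scost <= token_budget:
                bundle.append((name, stub))
                used += scost
    return bundle, used

docs = {"book_flight": "x" * 800, "pay": "x" * 400, "auth": "x" * 400}
bundle, used = hydrate(["book_flight", "pay", "auth"], docs, token_budget=250)
```

With this budget only the top-ranked skill is fully hydrated; the rest appear as stubs, which is one plausible way a bounded bundle can cut input tokens without dropping dependency information entirely.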