I ported Anthropic's official skill-creator from Claude Code to OpenCode — now you can create and evaluate AI agent skills with any model

Reddit r/LocalLLaMA / 4/11/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A developer open-sourced an “eval-driven” AI agent skill creator that ports Anthropic’s official Claude Code skill-creator to OpenCode using TypeScript.
  • The tool supports guided skill creation via an intake interview, automatically generates eval sets (should-trigger/should-not-trigger prompts), and measures trigger accuracy by comparing runs with and without the skill.
  • It iteratively optimizes skill descriptions using an LLM loop with a train/test split (up to five iterations), and provides an HTML viewer plus variance/benchmark reporting for human review.
  • Because it is designed to work with OpenCode, it can evaluate and develop skills using any of OpenCode’s 300+ models, including locally hosted models.
  • Installation is offered via an npm one-command workflow, with the project released under an Apache 2.0 license and attributed to Anthropic’s original approach.

Hey r/LocalLLaMA — I open-sourced a tool that brings eval-driven development to AI agent skills. It's based on Anthropic's official skill-creator for Claude Code, but rewritten in TypeScript to work with OpenCode (which supports 300+ models including local ones).

The problem: creating skills for AI agents is trial and error. You write a skill, test it manually, and hope it triggers on the right prompts. There's no systematic way to measure whether a skill actually works.

What this does:

  • Guided skill creation with an intake interview
  • Auto-generates eval test sets (should-trigger and should-not-trigger queries)
  • Runs evals with and without the skill to measure trigger accuracy
  • Optimizes skill descriptions through an iterative LLM loop (60/40 train/test split, up to 5 iterations)
  • Visual HTML eval viewer for human review
  • Benchmarks with variance analysis across iterations
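To make "trigger accuracy" concrete, here is a minimal sketch (not the tool's actual code) of how it can be computed from the generated eval set: each case records whether the skill should have triggered and whether it actually did.

```typescript
// One eval case: a prompt, the expected trigger behavior, and what
// the agent actually did during the eval run.
type EvalCase = {
  prompt: string;
  shouldTrigger: boolean;
  didTrigger: boolean;
};

// Trigger accuracy = fraction of cases where actual behavior
// matched expected behavior (covers both false negatives on
// should-trigger prompts and false positives on should-not-trigger ones).
function triggerAccuracy(cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  const correct = cases.filter(c => c.shouldTrigger === c.didTrigger).length;
  return correct / cases.length;
}

// Hypothetical eval results for an image-processing skill:
const results: EvalCase[] = [
  { prompt: "resize this image",   shouldTrigger: true,  didTrigger: true },
  { prompt: "what's the weather",  shouldTrigger: false, didTrigger: false },
  { prompt: "crop the photo",      shouldTrigger: true,  didTrigger: false },
  { prompt: "compress to webp",    shouldTrigger: true,  didTrigger: true },
];

console.log(triggerAccuracy(results)); // 0.75
```

Running the same eval set with the skill installed and without it gives a baseline, so you can see how much the skill's description actually changes trigger behavior.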

The most interesting part for this community: it works with any of OpenCode's supported models. If you're running local models through OpenCode, you can use this tool with them.
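The 60/40 train/test split and capped iteration loop from the feature list can be sketched roughly like this. This is a hedged illustration, not the actual implementation: `proposeDescription` and `evalScore` are hypothetical stand-ins for the tool's LLM rewrite call and eval run.

```typescript
// Split eval prompts 60/40: optimize against train, report on test.
function splitTrainTest<T>(items: T[], trainFrac = 0.6): [T[], T[]] {
  const cut = Math.round(items.length * trainFrac);
  return [items.slice(0, cut), items.slice(cut)];
}

// Stand-in for the LLM step that proposes a revised skill description.
const proposeDescription = (prev: string, iter: number): string =>
  `${prev} (revision ${iter})`;

// Stand-in for running the eval set and returning trigger accuracy.
const evalScore = (desc: string, evalSet: string[]): number =>
  Math.min(1, desc.length / 100); // dummy scorer for illustration only

const evalPrompts = Array.from({ length: 10 }, (_, i) => `prompt-${i}`);
const [train, test] = splitTrainTest(evalPrompts); // 6 train, 4 test

// Up to five iterations; keep the best candidate by train-set score.
const MAX_ITERATIONS = 5;
let bestDesc = "Resize and convert images";
let bestTrainScore = evalScore(bestDesc, train);
for (let iter = 1; iter <= MAX_ITERATIONS; iter++) {
  const candidate = proposeDescription(bestDesc, iter);
  const score = evalScore(candidate, train);
  if (score > bestTrainScore) {
    bestDesc = candidate;
    bestTrainScore = score;
  }
}

// Final quality is reported on the held-out test split, so a
// description that merely overfits the train prompts shows up.
const finalScore = evalScore(bestDesc, test);
console.log(train.length, test.length, finalScore > 0);
```

The held-out test split is what makes this eval-driven rather than just iterative: improvements that don't generalize beyond the prompts used for optimization get caught.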

One-command install:

npx opencode-skill-creator install --global 

Apache 2.0 license. Based on Anthropic's skill-creator with attribution.

GitHub: https://github.com/antongulin/opencode-skill-creator

npm: https://www.npmjs.com/package/opencode-skill-creator

Happy to answer questions about the eval methodology, local model support, or architecture.

submitted by /u/antonusaca