npm audit Is Broken: The Claude Code Skill I Built to Fix It

Dev.to / 2026/4/9

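Key points

  • The article argues that `npm audit` is "broken by design": it reports that a vulnerability exists rather than whether it can actually be exploited, so developers end up ignoring most findings as noise.
  • In a typical Next.js project, it says, most high-severity warnings come from devDependencies or deep transitive dependencies and are unlikely to be reachable in production.
  • The proposed solution is a Claude Code skill that sorts findings into four buckets: CRITICAL-RUNTIME, DEV-ONLY, TRANSITIVE-UNREACHABLE, and CONDITIONAL-UNLIKELY.
  • The skill's workflow first runs the standard audit tool, then analyzes where each vulnerable dependency is used to decide whether the vulnerable code path or its preconditions can actually be triggered in the app's runtime environment.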

npm audit Is Broken — Here's the Claude Code Skill I Built to Fix It

Dan Abramov called it in 2021: npm audit is "broken by design." Run npm audit on a real project and you might see 47 "high severity" warnings, 45 of which are in dev dependencies, transitive deps you never import, or vulnerabilities that require conditions your app never meets.

The result? Developers ignore npm audit entirely. The 2 real vulnerabilities hide in the noise.

I built a Claude Code skill that actually fixes this. Not by silencing warnings — by classifying them.

The Problem: Signal vs. Noise

Here's what npm audit gives you on a standard Next.js project:

found 23 vulnerabilities (4 moderate, 15 high, 4 critical)

Looks terrifying. But when you actually trace each one:

  • 18 are in devDependencies — never shipped to production
  • 3 are in transitive deps 4 levels deep — your code never calls them
  • 1 requires the attacker to control a specific header that your reverse proxy strips
  • 1 is a real, exploitable prototype pollution in a direct dependency

That last one matters. The other 22 are noise. But npm audit treats them all the same.

This is the "broken by design" problem. npm audit reports vulnerability existence, not vulnerability exploitability. It doesn't know your architecture, your deployment model, or your actual import graph.

The Fix: Classification-Based Auditing

Instead of npm's severity labels (low, moderate, high, critical), my dependency auditor skill classifies each finding into one of four buckets:

1. CRITICAL-RUNTIME

The dependency is imported in production code. The vulnerability is triggerable through your app's actual execution paths. Fix immediately.

2. DEV-ONLY

The vulnerability is in a devDependency. Unless your CI/CD pipeline is the attack vector, this doesn't affect production. Fix at your convenience.

3. TRANSITIVE-UNREACHABLE

The vulnerable function exists in a transitive dependency, but your code never calls the vulnerable code path. Monitor but don't panic.

4. CONDITIONAL-UNLIKELY

The vulnerability requires specific conditions (certain input formats, disabled security headers, specific OS) that don't apply to your deployment. Acknowledge and document.
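
The four buckets map naturally onto a small enum. This is a minimal sketch, not the skill's own code: the `Bucket`, `RECOMMENDED_ACTION`, and `needs_immediate_fix` names are made up here for illustration, and the actions are paraphrased from the descriptions above.

```python
from enum import Enum


class Bucket(Enum):
    """The four classification buckets described above."""
    CRITICAL_RUNTIME = "CRITICAL-RUNTIME"
    DEV_ONLY = "DEV-ONLY"
    TRANSITIVE_UNREACHABLE = "TRANSITIVE-UNREACHABLE"
    CONDITIONAL_UNLIKELY = "CONDITIONAL-UNLIKELY"


# Hypothetical mapping from each bucket to the recommended response.
RECOMMENDED_ACTION = {
    Bucket.CRITICAL_RUNTIME: "fix immediately",
    Bucket.DEV_ONLY: "fix at your convenience",
    Bucket.TRANSITIVE_UNREACHABLE: "monitor",
    Bucket.CONDITIONAL_UNLIKELY: "acknowledge and document",
}


def needs_immediate_fix(bucket: Bucket) -> bool:
    """In this sketch only CRITICAL-RUNTIME findings demand urgent action."""
    return bucket is Bucket.CRITICAL_RUNTIME
```

Using the bucket label as the enum value means a string such as "DEV-ONLY" from a report can be turned back into a member with `Bucket("DEV-ONLY")`.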

How It Works

The skill runs in three phases:

Phase 1: Run the Real Audit Tool

Don't reinvent the wheel. Run the ecosystem's native audit tool and capture its structured output:

# npm
npm audit --json > audit_results.json

# pip
pip-audit --format=json --output=audit_results.json

# cargo
cargo audit --json > audit_results.json

# Go
govulncheck -json ./... > audit_results.json

This gives us the raw vulnerability data: advisory IDs (CVE, GHSA, RUSTSEC, and similar, depending on the tool), affected versions, severity scores, and advisory descriptions.
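
To make the npm case concrete, here is a rough sketch of flattening the JSON report into one record per package. It assumes the version 2 report layout that recent npm releases emit (a top-level `vulnerabilities` object keyed by package name); the `load_npm_findings` helper, the output field names, and the sample data are illustrative and not part of the skill.

```python
import json
from typing import Any

# Hand-written sample in the assumed report shape (not real audit output).
SAMPLE_REPORT = """
{
  "auditReportVersion": 2,
  "vulnerabilities": {
    "qs": {
      "name": "qs",
      "severity": "high",
      "isDirect": false,
      "via": [{"title": "example advisory", "url": "https://example.invalid/advisory"}],
      "fixAvailable": true
    },
    "express": {
      "name": "express",
      "severity": "high",
      "isDirect": true,
      "via": ["qs"],
      "fixAvailable": true
    }
  }
}
"""


def load_npm_findings(report: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten an `npm audit --json` report into one record per package."""
    findings = []
    for name, entry in report.get("vulnerabilities", {}).items():
        # `via` mixes advisory objects with plain package names; a string
        # means the problem is inherited from another vulnerable package.
        advisories = [v for v in entry.get("via", []) if isinstance(v, dict)]
        findings.append({
            "package": name,
            "severity": entry.get("severity"),
            "is_direct": bool(entry.get("isDirect", False)),
            "advisory_urls": [a.get("url") for a in advisories],
            "fix_available": bool(entry.get("fixAvailable", False)),
        })
    return findings


findings = load_npm_findings(json.loads(SAMPLE_REPORT))
```

Each record keeps only what the later phases need: which package is affected, whether it is a direct dependency, and whether a fix is available.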

Phase 2: Trace the Import Graph

This is where the value is. For each vulnerability, trace whether your code actually reaches it:

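# Simplified logic; the actual skill handles edge cases.
# Assumes `vuln` is one parsed audit finding (a dict) and `project`
# exposes the dependency lists and lookup helpers used below.

def classify_vulnerability(vuln, project):
    """Return the bucket name for a single finding."""
    pkg = vuln["package"]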

    # Check: is this a dev dependency?
    if pkg in project.dev_dependencies:
        return "DEV-ONLY"

    # Check: is this a direct or transitive dependency?
    if pkg not in project.direct_dependencies:
        # Trace the import chain
        chain = project.trace_import_chain(pkg)
        if not chain.reaches_production_code():
            return "TRANSITIVE-UNREACHABLE"

    # Check: does the vulnerable function get called?
    vuln_functions = vuln.get("affected_functions", [])
    if vuln_functions:
        called = project.find_calls_to(pkg, vuln_functions)
        if not called:
            return "TRANSITIVE-UNREACHABLE"

    # Check: are the trigger conditions met?
    conditions = vuln.get("conditions", [])
    if conditions and not project.meets_conditions(conditions):
        return "CONDITIONAL-UNLIKELY"

    return "CRITICAL-RUNTIME"
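
The chain-tracing step can be approximated at package level with a breadth-first search over the dependency graph. The sketch below is an assumption about how such a helper might look, not the skill's implementation: `find_chain` and `is_dev_only` are hypothetical names, and the graph is a plain dict you would build from your lockfile. It answers "does any production dependency pull this package in?", which is coarser than checking whether a vulnerable function is actually called.

```python
from collections import deque


def find_chain(graph, roots, target):
    """Return the shortest dependency chain from any root to `target`, or None.

    `graph` maps a package name to the list of packages it depends on.
    """
    roots = list(roots)
    queue = deque([root] for root in roots)
    seen = set(roots)
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in graph.get(path[-1], []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


def is_dev_only(graph, prod_roots, dev_roots, target):
    """True when `target` is reachable from devDependencies but not from dependencies."""
    return (find_chain(graph, prod_roots, target) is None
            and find_chain(graph, dev_roots, target) is not None)


# Toy graph: express (production) pulls in qs; eslint (dev) pulls in minimatch.
graph = {
    "express": ["body-parser", "qs"],
    "body-parser": ["qs"],
    "eslint": ["minimatch"],
}
chain = find_chain(graph, ["express"], "qs")
```

Checking reachability from the production roots first also catches a case the simplified function misses: a package that is only a transitive dependency of a dev tool is still dev-only, even though it never appears in `devDependencies` itself.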

Phase 3: Auto-Remediate What's Safe

For CRITICAL-RUNTIME findings, the skill attempts automatic fixes:

  1. Check if a patched version exists
  2. Verify the patched version installs cleanly against your lockfile
  3. Run your test suite against the updated dependency
  4. If tests pass → apply the fix
  5. If tests fail → report the breaking change with the specific test failure

For DEV-ONLY findings, it applies fixes more aggressively since the blast radius is limited to your dev environment.
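
The remediation steps can be sketched as an "upgrade, test, roll back" loop. This is an illustration under stated assumptions rather than the skill's code: it assumes an npm project tracked in git with a committed `package.json` and `package-lock.json`, and the `try_upgrade` and `shell_runner` names are invented here. The command runner is injected so the flow can be exercised without touching a real project.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]

MANIFESTS = ["package.json", "package-lock.json"]


def shell_runner(cmd: Sequence[str]) -> int:
    """Run a command in the current directory and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def try_upgrade(package: str, version: str, run: Runner = shell_runner) -> str:
    """Install a patched version, run the tests, and roll back on failure."""
    if run(["npm", "install", f"{package}@{version}"]) != 0:
        # Discard any partial edits to the manifest and lockfile.
        run(["git", "checkout", "--", *MANIFESTS])
        return "install-failed"
    if run(["npm", "test"]) != 0:
        # Restore the committed manifest and lockfile, then reinstall from them.
        run(["git", "checkout", "--", *MANIFESTS])
        run(["npm", "ci"])
        return "tests-failed"
    return "fixed"
```

A caller would report the "tests-failed" result together with the captured test output, which matches step 5 above; capturing that output is left out here to keep the sketch short.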

Real Example: Auditing a Next.js + Prisma App

I ran the skill on one of our production apps. Raw npm audit output:

31 vulnerabilities (8 moderate, 17 high, 6 critical)

After classification:

CRITICAL-RUNTIME:  1  (prototype pollution in qs@6.5.3)
DEV-ONLY:         19  (eslint plugins, testing libs)
TRANSITIVE-UNREACHABLE: 9  (deep transitive, unused code paths)
CONDITIONAL-UNLIKELY:   2  (requires specific Node.js flags)

Action items:
✅ Auto-fixed: qs upgraded to 6.13.0 (tests pass)
📋 19 dev deps: batch update scheduled
⚠️  2 conditional: documented in security-notes.md

31 scary warnings → 1 actual action item → auto-fixed in 12 seconds.

Multi-Ecosystem Support

The skill works across ecosystems because the classification logic is universal — only the audit tool and dependency resolution differ:

Ecosystem     | Audit Tool  | Lockfile                       | Classification
npm/yarn/pnpm | npm audit   | package-lock.json              | Same 4 buckets
Python        | pip-audit   | requirements.txt / poetry.lock | Same 4 buckets
Rust          | cargo audit | Cargo.lock                     | Same 4 buckets
Go            | govulncheck | go.sum                         | Same 4 buckets

Go's govulncheck is actually the gold standard here — it already does call-graph analysis. The skill wraps it into the same classification format for consistency.
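
Wrapping govulncheck might look roughly like the sketch below. It rests on assumptions about the `-json` output that you should check against your govulncheck version: that the output is a stream of concatenated JSON objects, that findings arrive under a `finding` key with an `osv` ID and a `trace` list, and that the first trace frame carries a `function` field only when the vulnerable symbol is actually called. The mapping onto bucket names and the `classify_govulncheck` helper are illustrative choices, not the skill's code.

```python
import json


def iter_json_stream(text: str):
    """Yield each top-level JSON value from a concatenated stream."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def classify_govulncheck(text: str) -> dict[str, str]:
    """Map each OSV ID to a bucket, keeping the strongest evidence seen."""
    buckets: dict[str, str] = {}
    for message in iter_json_stream(text):
        finding = message.get("finding")
        if not finding:
            continue  # config, progress, and osv messages are ignored here
        trace = finding.get("trace") or [{}]
        symbol_is_called = bool(trace[0].get("function"))
        osv_id = finding["osv"]
        if symbol_is_called:
            buckets[osv_id] = "CRITICAL-RUNTIME"
        else:
            # Do not downgrade an ID already proven reachable.
            buckets.setdefault(osv_id, "TRANSITIVE-UNREACHABLE")
    return buckets
```

A symbol-level finding is treated here as a CRITICAL-RUNTIME candidate; the conditional check from Phase 2 would still apply on top of it.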

License Compliance (Bonus)

While scanning dependencies, the skill also checks licenses. This catches the "someone switched to AGPL" problem before it reaches production:

License audit:
✅ 847 packages: MIT, Apache-2.0, ISC, BSD-2/3
⚠️  2 packages: LGPL-3.0 (acceptable for dynamic linking)
🚫 0 packages: GPL, AGPL, SSPL (would require review)
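
A license check along these lines can be as simple as matching SPDX identifiers against policy sets. The sketch below is one possible policy that mirrors the three groups in the report above; the set contents, the `license_status` name, and the status strings are assumptions, and it does not parse compound SPDX expressions such as "(MIT OR Apache-2.0)".

```python
ALLOWED = {"MIT", "Apache-2.0", "ISC", "BSD-2-Clause", "BSD-3-Clause"}
REVIEW = {"LGPL-3.0", "LGPL-3.0-only", "LGPL-3.0-or-later"}
BLOCKED_PREFIXES = ("GPL-", "AGPL-", "SSPL-")


def license_status(spdx_id: str) -> str:
    """Classify a single SPDX license identifier under this example policy."""
    if spdx_id in ALLOWED:
        return "ok"
    if spdx_id in REVIEW:
        return "warn"
    if spdx_id.startswith(BLOCKED_PREFIXES):
        return "block"
    # Anything unrecognized is surfaced instead of silently passing.
    return "unknown"
```

Returning "unknown" for unrecognized identifiers keeps a relicensed or unusual package visible instead of letting it slip through as allowed.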

SBOM Generation

The skill generates a Software Bill of Materials in CycloneDX or SPDX format. This is increasingly required for enterprise customers and government contracts:

# Generated automatically after each audit
output/sbom-cyclonedx.json  # CycloneDX 1.5
output/sbom-spdx.json       # SPDX 2.3
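
For a sense of what the CycloneDX file contains, here is a minimal sketch that builds a CycloneDX 1.5 JSON document from a list of npm packages. Only a handful of fields are filled in, the `cyclonedx_bom` helper is a hypothetical name, and scoped package names (such as `@scope/name`) would need percent-encoding in the package URL, which this sketch skips.

```python
import json


def cyclonedx_bom(packages: list[tuple[str, str]]) -> dict:
    """Build a minimal CycloneDX 1.5 document for unscoped npm packages."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": name,
                "version": version,
                # Package URL identifying the component in the npm ecosystem.
                "purl": f"pkg:npm/{name}@{version}",
            }
            for name, version in packages
        ],
    }


bom = cyclonedx_bom([("qs", "6.13.0")])
sbom_text = json.dumps(bom, indent=2)
```

A real SBOM would normally also carry metadata such as a serial number, a timestamp, and the tool that produced it.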

Why Not Just Use Snyk/Dependabot/Socket?

Those tools are good. But they share the same fundamental problem: they alert on existence, not exploitability.

Dependabot will open 23 PRs for dev dependency updates that don't affect production. Snyk will email you about a transitive dep 5 levels deep that your code never touches. Socket does better with supply-chain detection but doesn't classify by your actual usage.

The skill fills the gap: it runs after those tools and filters their output through your project's actual dependency graph.

Getting Started

If you use Claude Code, you can install the skill and run it on any project:

# Run the audit
/audit

# It automatically:
# 1. Detects your ecosystem (npm, pip, cargo, go)
# 2. Runs the native audit tool
# 3. Classifies each finding
# 4. Attempts auto-remediation for CRITICAL-RUNTIME
# 5. Generates the report
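
Step 1, ecosystem detection, can be approximated by looking for well-known manifest or lock files in the project root. This is a sketch of one way to do it, not a description of how the skill decides; the `MARKERS` table and `detect_ecosystems` helper are illustrative, and the audit commands are the ones listed in Phase 1.

```python
from pathlib import Path

# Files whose presence suggests an ecosystem (illustrative, not exhaustive).
MARKERS = {
    "npm": ("package-lock.json", "package.json"),
    "pip": ("requirements.txt", "poetry.lock"),
    "cargo": ("Cargo.lock",),
    "go": ("go.mod",),
}

# Native audit command for each ecosystem, as shown in Phase 1.
AUDIT_COMMANDS = {
    "npm": ["npm", "audit", "--json"],
    "pip": ["pip-audit", "--format=json"],
    "cargo": ["cargo", "audit", "--json"],
    "go": ["govulncheck", "-json", "./..."],
}


def detect_ecosystems(root: str) -> list[str]:
    """Return every ecosystem whose marker files exist directly under `root`."""
    base = Path(root)
    return [
        ecosystem
        for ecosystem, files in MARKERS.items()
        if any((base / name).is_file() for name in files)
    ]
```

Returning a list instead of a single value covers repositories that mix ecosystems, for example a Go backend with an npm frontend.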

The full skill with multi-ecosystem support, auto-remediation, license compliance, and SBOM generation is available as a Claude Code skill on Gumroad ($9).

For simpler setups, you can build the classification logic yourself using the patterns above. The key insight is: stop treating all vulnerabilities equally. Classify by exploitability, then fix only what matters.

Other Tools We Built

If you're building secure applications, check out our other Claude Code skills:

  • Security Scanner ($10) — Semgrep-powered vulnerability detection with custom rules
  • API Connector ($7) — Build platform integrations that follow existing codebase patterns
  • Dashboard Builder ($7) — Generate monitoring dashboards from metrics specs

npm audit was designed for a world where devs manually reviewed each finding. That world doesn't exist. Automate the classification, fix what matters, ignore the rest.