skills-check

The missing quality toolkit for Agent Skills
Freshness, security, quality, and efficiency: 14 commands, one toolkit

npx skills-check check

What is an Agent Skill?

Agent Skills are the knowledge layer between AI coding agents and the tools they use.

Skills are SKILL.md files: markdown documents with YAML frontmatter that tell AI coding agents such as Claude Code, Cursor, and Codex how to work with specific products, frameworks, and patterns. They look like documentation, but agents treat them as executable instructions.

Skills are loaded directly into an agent’s LLM context window, where they shape every decision the agent makes. A skill telling an agent to use a deprecated API or a nonexistent package isn’t just inaccurate — it produces broken code.

Because agents have file system and shell access, skill quality is a security and correctness concern, not just a readability one. A skill suggesting rm -rf or referencing a typosquatted package poses real risk.

The ecosystem splits responsibilities: skills.sh handles installation and distribution, while skills-check handles verification and quality — freshness, security, linting, token budgets, semver accuracy, policy enforcement, and testing.

Example SKILL.md frontmatter

---
name: react
description: Modern React patterns and best practices
version: 1.2.0
compatibility: "react@^19.1.0"
author: skills-community
---
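To make the frontmatter format concrete, here is a minimal Python sketch that reads flat `key: value` frontmatter like the example above. It is an illustration only, not how skills-check parses files: `parse_frontmatter` is a hypothetical helper, and it deliberately ignores nested YAML, lists, and multi-line values, which a real YAML parser would handle.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by `---` lines.

    Hypothetical helper for illustration: no nesting, lists, or
    multi-line values are supported.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


SKILL_MD = """---
name: react
description: Modern React patterns and best practices
version: 1.2.0
compatibility: "react@^19.1.0"
author: skills-community
---
# React skill body goes here
"""

meta = parse_frontmatter(SKILL_MD)
print(meta["name"], meta["version"], meta["compatibility"])
```

The `compatibility` field is the interesting one: it names the package and version range the skill claims to be written against.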

The problem

Agent skills are treated like documentation, but they’re really executable instructions. They go stale, reference packages that don’t exist, suggest dangerous commands, and silently bloat your context window. Nobody notices until something breaks.

⚠

Silent staleness

A renamed package, a deprecated API, a missing parameter — stale skills don’t always fail loudly. Sometimes they just quietly produce worse outcomes. A skill referencing React 18 APIs when React 19 is current can cause agents to generate incompatible code.

🛡

Safety is a blind spot

Skills can reference hallucinated packages, contain prompt injection patterns, or suggest commands that delete data. A skill recommending npm audit fix --force can break production dependencies. Without auditing, you’re trusting unknown instructions.

📦

Code has dependency management. Skills don’t.

npm outdated tells you when packages are behind. Dependabot opens PRs. But for agent knowledge? Nothing. A skill referencing @langchain/core v0.1 patterns when v0.3 has breaking API changes leaves your agents working with outdated instructions.

✅

skills-check fixes this

14 commands covering freshness, security, quality, token budgets, semver verification, and policy enforcement — everything you need to keep agent skills correct, safe, and efficient.

Read the full story: Your Agent’s Knowledge Has a Shelf Life →

How it compares

skills-check fills a gap that existing tools weren’t designed for.

🧹

Unlike linters (ESLint, Biome)

skills-check validates knowledge accuracy, not code syntax. It detects when a skill references React 18 patterns while React 19 is current, or when a recommended API has been deprecated.

🛡

Unlike security scanners (Snyk, Socket)

skills-check detects hallucinated packages and skill-specific injection patterns. It verifies that every package referenced in a skill actually exists on npm, PyPI, or crates.io.

📖

Unlike documentation tools

skills-check treats SKILL.md files as executable instructions and validates them accordingly — measuring token cost, enforcing organizational policy, and running eval test suites for regression detection.
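Measuring token cost can be pictured with a rough sketch like the one below. Everything in it is an assumption made for illustration: the four-characters-per-token ratio is a common rule of thumb rather than a real tokenizer, and `estimate_tokens` and `over_budget` are hypothetical names, not part of skills-check.

```python
import math

# Assumption: roughly 4 characters per token. A real tokenizer would
# give different counts, especially for code-heavy skills.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a skill body."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def over_budget(skills: dict[str, str], max_tokens: int) -> tuple[int, bool]:
    """Return the estimated total and whether it exceeds the ceiling."""
    total = sum(estimate_tokens(body) for body in skills.values())
    return total, total > max_tokens


# Placeholder skill bodies of 400 and 1000 characters.
skills = {"react": "x" * 400, "nextjs": "y" * 1000}
total, exceeded = over_budget(skills, max_tokens=300)
print(total, exceeded)  # 350 True
```

Even a crude estimate like this makes the cost visible: every skill loaded into the context window spends tokens that could otherwise go to the task itself.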

14 commands, one toolkit

Everything you need to keep agent skills fresh, safe, and efficient.

Freshness & Currency

✓

check

Detect version drift by comparing skill frontmatter against the npm registry.

↻

refresh

AI-assisted updates to stale skills using LLMs. Fetches changelogs and generates diffs.

⚑

report

Generate a formatted staleness report in markdown or JSON for your team or CI.
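The version-drift idea behind these commands can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: it treats "drift" as the latest published version falling outside the caret range declared in `compatibility`, it handles only plain `MAJOR.MINOR.PATCH` versions (no pre-release tags), and the latest version is passed in as an argument rather than fetched from a registry. All function names are hypothetical.

```python
def parse_version(v: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers (no pre-release support)."""
    major, minor, patch = (int(part) for part in v.split("."))
    return major, minor, patch


def satisfies_caret(version: str, caret_range: str) -> bool:
    """Check a version against a caret range such as '^19.1.0'."""
    base = parse_version(caret_range.lstrip("^"))
    candidate = parse_version(version)
    if candidate < base:
        return False
    # Caret ranges allow changes that keep the left-most non-zero part fixed.
    if base[0] > 0:
        upper = (base[0] + 1, 0, 0)
    elif base[1] > 0:
        upper = (0, base[1] + 1, 0)
    else:
        upper = (0, 0, base[2] + 1)
    return candidate < upper


def has_drifted(compatibility: str, latest: str) -> bool:
    """True when the latest release is outside the declared range."""
    # rpartition keeps scoped names such as '@langchain/core' intact.
    _package, _, version_range = compatibility.rpartition("@")
    return not satisfies_caret(latest, version_range)


print(has_drifted("react@^18.2.0", "19.1.0"))          # True
print(has_drifted("react@^19.1.0", "19.2.3"))          # False
print(has_drifted("@langchain/core@^0.1.0", "0.3.0"))  # True
```

Note the 0.x case: under caret semantics a minor bump below 1.0.0 counts as breaking, so a skill written for `^0.1.0` has drifted once 0.3.0 is current.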

Security & Quality

⚡

audit

Scan for hallucinated packages, prompt injection, dangerous commands, and dead URLs.

✦

lint

Validate metadata completeness, structural quality, and format in skill files.

⊞

policy

Enforce organizational trust rules for skills via .skill-policy.yml policy-as-code.
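As a rough picture of what scanning for dangerous commands involves, the sketch below flags two commands mentioned on this page. The pattern list and the `scan_for_risky_commands` function are hypothetical and far from exhaustive; a real audit would need many more rules and would also cover packages, injection patterns, and URLs.

```python
import re

# Hypothetical rule set for illustration only.
RISKY_PATTERNS = {
    "recursive force delete": re.compile(r"\brm\s+-(?:rf|fr)\b"),
    "forced dependency rewrite": re.compile(r"\bnpm\s+audit\s+fix\s+--force\b"),
}


def scan_for_risky_commands(skill_text: str) -> list[tuple[int, str]]:
    """Return (line number, rule label) for each risky command found."""
    findings = []
    for line_no, line in enumerate(skill_text.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


sample = (
    "Run npm install first.\n"
    "Then clean up with rm -rf ./build\n"
    "If audits fail, run npm audit fix --force\n"
)
for line_no, label in scan_for_risky_commands(sample):
    print(f"line {line_no}: {label}")
```

Line-level findings like these are what make a severity threshold meaningful in CI: each hit can be ranked and counted before deciding whether to fail the build.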

Analysis & Verification

≡

budget

Measure token cost per skill, detect redundancy, and track context window usage over time.

⚐

verify

Validate that content changes between skill versions match the declared semver bump.

▷

test

Run eval test suites declared in skill tests/ directories for regression detection.

🗝

fingerprint

Generate a fingerprint registry of installed skills with content hashes and watermarks.

▓

usage

Analyze skill telemetry events for usage patterns, cost estimation, and policy compliance.
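The idea behind verifying a semver bump can be illustrated with a small sketch. The heuristic here is an assumption invented for the example, not the rule skills-check applies: removing a markdown heading is treated as a breaking (major) change, adding one as a minor change, and any other edit as a patch. All names are hypothetical.

```python
RANK = {"none": 0, "patch": 1, "minor": 2, "major": 3}


def bump_kind(old: str, new: str) -> str:
    """Classify the declared bump between two MAJOR.MINOR.PATCH versions."""
    o = tuple(int(p) for p in old.split("."))
    n = tuple(int(p) for p in new.split("."))
    if n <= o:
        return "none"
    if n[0] > o[0]:
        return "major"
    if n[1] > o[1]:
        return "minor"
    return "patch"


def headings(text: str) -> set[str]:
    """Collect markdown heading lines as a crude outline of the skill."""
    return {ln.strip() for ln in text.splitlines() if ln.lstrip().startswith("#")}


def expected_bump(old_text: str, new_text: str) -> str:
    """Hypothetical heuristic: infer the bump the content change deserves."""
    old_h, new_h = headings(old_text), headings(new_text)
    if old_h - new_h:
        return "major"  # a section was removed or renamed
    if new_h - old_h:
        return "minor"  # a section was added
    return "patch" if old_text != new_text else "none"


def bump_is_honest(old_v: str, new_v: str, old_text: str, new_text: str) -> bool:
    """True when the declared bump is at least as large as the expected one."""
    return RANK[bump_kind(old_v, new_v)] >= RANK[expected_bump(old_text, new_text)]


old_text = "# React\n## Hooks\nUse hooks.\n"
new_text = "# React\n## Server Components\nUse server components.\n"
print(bump_is_honest("1.2.0", "1.2.1", old_text, new_text))  # False
print(bump_is_honest("1.2.0", "2.0.0", old_text, new_text))  # True
```

The point of the comparison is the mismatch case: a skill that drops a whole section while declaring only a patch release is understating how much changed.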

Setup

➜

init

Scan a skills directory for SKILL.md files and generate a skills-check.json registry.

Diagnostics & Maintenance

⚕

doctor

Validate environment prerequisites and release readiness.

⚒

fix

Apply deterministic, non-LLM autofixes to skill files.

Basic: check freshness on every push that touches a SKILL.md file, plus a weekly scheduled run

name: Skills Check
on:
  push:
    paths: ["**SKILL.md"]
  schedule:
    - cron: "0 9 * * 1" # weekly, Mondays at 09:00 UTC

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: voodootikigod/skills-check@v1

Multi-command — audit, lint, and budget in one step

- uses: voodootikigod/skills-check@v1
  with:
    commands: "check,audit,lint,budget"
    audit-fail-on: high
    budget-max-tokens: 50000

All 14 commands

check, audit, lint, budget, verify, test, policy, refresh, report, init, fingerprint, usage, doctor, fix

Configurable thresholds

Set failure thresholds per command: severity levels with audit-fail-on and lint-fail-on, and a token ceiling with budget-max-tokens

Structured outputs

JSON results, per-command exit codes, and finding counts for downstream steps

Quickstart

Five steps to keep your agent skills fresh, safe, and efficient.

Initialize your registry

Create a registry that tracks which products your skills cover and their verified versions. The init command discovers SKILL.md files and maps them to npm packages.

npx skills-check init

Check freshness and audit safety

Detect version drift and scan for security issues (hallucinated packages, prompt injection, dangerous commands) in one pass.

npx skills-check check && npx skills-check audit

Lint, budget, and verify

Validate metadata, measure context window token cost, and confirm that version bumps match content changes.

npx skills-check lint && npx skills-check budget && npx skills-check verify

Enforce policy and test

Enforce organizational trust rules and run eval test suites for regression detection.

npx skills-check policy check && npx skills-check test

Refresh stale skills

Use an LLM to propose targeted updates for stale skills, then generate a formatted report for review.

npx skills-check refresh && npx skills-check report