# Claude Code Skills
Reusable, folder-based extensions for Claude Code that encode domain knowledge, workflows, and tools the agent can invoke by name.
Last updated: 2026-04-12
## Overview
Skills are Claude Code’s primary extension point. A common misconception is that skills are “just markdown files”; in fact they are folders that can contain scripts, assets, data, config, and hooks, not just text. The agent can discover, read, and execute everything in that folder.
Skills map directly to the skill files concept in thin-harness-fat-skills: markdown procedures that encode judgment and process, invoked like method calls. Claude Code implements this concretely with frontmatter configuration, hook registration, and a marketplace distribution model.
## Nine Skill Types
Cataloging hundreds of skills in active use at Anthropic shows they cluster into nine categories:
| # | Type | Purpose | Example |
|---|---|---|---|
| 1 | Library & API Reference | How to use internal libs, CLIs, SDKs — edge cases + gotchas | billing-lib, frontend-design |
| 2 | Product Verification | Test/verify that code works; paired with Playwright, tmux, etc. | signup-flow-driver, checkout-verifier |
| 3 | Data Fetching & Analysis | Connect to data and monitoring stacks; specific dashboard IDs, join patterns | funnel-query, grafana |
| 4 | Business Process & Team Automation | Automate repetitive workflows into one command | standup-post, weekly-recap |
| 5 | Code Scaffolding & Templates | Generate framework boilerplate with org-specific annotations | new-migration, create-app |
| 6 | Code Quality & Review | Enforce code quality; can run via hooks or GitHub Actions | adversarial-review, code-style |
| 7 | CI/CD & Deployment | Fetch, push, deploy — with rollback and smoke tests | babysit-pr, deploy-<service> |
| 8 | Runbooks | Take a symptom, walk through multi-tool investigation, produce structured report | oncall-runner, log-correlator |
| 9 | Infrastructure Operations | Routine maintenance with guardrails for destructive actions | <resource>-orphans, cost-investigation |
## Tips for Writing Skills

### Don’t state the obvious
Claude already knows a lot about coding. Focus on information that pushes Claude out of its default patterns — org-specific conventions, internal gotchas, non-obvious edge cases.
### Build a Gotchas section
The highest-signal content in any skill. Built up from real failure points Claude hits when using the skill. Keep updating it as new edge cases emerge.
### Use the filesystem as progressive disclosure
A skill is a folder. Split detailed API docs into `references/api.md`, put output templates in `assets/`, add helper scripts in `scripts/`. Tell Claude what files are in the skill; it will read them at appropriate times. This prevents the main `SKILL.md` from becoming a wall of text.
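A hypothetical layout for such a skill folder (file names are illustrative, not mandated):

```
my-skill/
├── SKILL.md          # main procedure, gotchas, pointers to the files below
├── references/
│   └── api.md        # detailed API docs, read only when needed
├── assets/
│   └── report.md     # output template
└── scripts/
    └── fetch.py      # helper script Claude can execute
```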
### Avoid railroading Claude
Skills are reusable across many situations. Give Claude the information it needs but leave flexibility to adapt. Overly prescriptive steps make skills brittle.
### The description field is a trigger, not a summary
When Claude Code starts a session, it builds a listing of all skills with their descriptions. That listing is what Claude scans to decide “is there a skill for this?” The description should describe when to invoke the skill, not what it contains.
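A sketch of the difference, using hypothetical frontmatter for a billing-library skill:

```yaml
# Weak: summarizes what the skill contains
description: Documentation and examples for billing-lib.

# Better: describes when Claude should invoke it
description: >
  Use when writing or debugging code that calls billing-lib,
  especially proration, refunds, or invoice edge cases.
```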
### Config via `config.json`
For skills that need user-specific setup (e.g., which Slack channel), store config in a `config.json` in the skill directory. If missing, the agent asks the user to fill it in.
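For example, a standup skill might ship a `config.json` like this (the keys are illustrative, not a defined schema):

```json
{
  "slack_channel": "#team-standups",
  "timezone": "America/Los_Angeles",
  "include_weekends": false
}
```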
### Skill-scoped memory
Store persistent state in `${CLAUDE_PLUGIN_DATA}` (stable folder per plugin), not in the skill directory itself (which may be wiped on upgrade). A `standups.log` in a standup skill means the next run can diff against yesterday’s post.
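A minimal sketch of that pattern in a helper script, assuming the `CLAUDE_PLUGIN_DATA` environment variable points at the plugin's data folder (the log format here is invented):

```python
import os
from pathlib import Path

# ${CLAUDE_PLUGIN_DATA} is stable across upgrades; the skill folder may be wiped.
DATA_DIR = Path(os.environ.get("CLAUDE_PLUGIN_DATA", "."))
LOG = DATA_DIR / "standups.log"
SEP = "\n---\n"  # hypothetical separator between recorded posts

def last_post() -> str:
    """Return the most recent standup post, or '' on the first run."""
    if not LOG.exists():
        return ""
    entries = [e for e in LOG.read_text().split(SEP) if e.strip()]
    return entries[-1] if entries else ""

def record(post: str) -> None:
    """Append today's post so tomorrow's run can diff against it."""
    with LOG.open("a") as f:
        f.write(post + SEP)
```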
### Scripts > boilerplate
Include helper scripts and library functions so Claude spends its turns on composition and decisions, not reconstructing boilerplate. Example: a data science skill with fetch functions lets Claude generate analysis scripts on the fly.
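A sketch of the idea for a hypothetical data-science skill: ship the tedious fetch plumbing as a library so Claude only writes the analysis on top. The endpoint, schema, and function names are all invented for illustration:

```python
# scripts/fetchers.py -- helpers shipped with the skill (names are hypothetical)
import json
import urllib.request

BASE = "https://metrics.internal.example.com/api"  # assumed internal endpoint

def fetch_funnel(start: str, end: str) -> list:
    """Fetch funnel events so Claude composes analyses on top of this,
    instead of re-deriving auth, pagination, and schema every session."""
    url = f"{BASE}/funnel?start={start}&end={end}"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

def conversion_rate(events: list) -> float:
    """Share of events that reached the (assumed) 'purchase' step."""
    done = sum(1 for e in events if e.get("step") == "purchase")
    return done / len(events) if events else 0.0
```

With helpers like these in place, Claude's generated analysis script becomes a few lines of composition rather than a page of boilerplate.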
### On-demand hooks
Skills can register hooks that activate only for the duration of a session when the skill is invoked:
- `/careful` — blocks `rm -rf`, `DROP TABLE`, force-push, `kubectl delete`
- `/freeze` — blocks edits outside a specific directory (useful while debugging)
## Distributing Skills
Two channels:
- Repo-local: check into `./.claude/skills` — works well for small teams, but each skill adds to model context
- Plugin marketplace: upload to an internal marketplace; teams install only what they need
Marketplace curation pattern (Anthropic’s approach):
- No central gatekeeping team
- Skill owners post to a sandbox folder + share in Slack to build traction
- Once it gains users, owner PRs it into the marketplace
- Warning: it’s easy to create bad or redundant skills — curation before release matters
Composing skills: Reference other skills by name in your skill file. The model will invoke them if installed. (Native dependency management not yet built in.)
Measuring usage: Use a PreToolUse hook to log which skills are triggered. This surfaces both popular skills and ones that undertrigger.
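A minimal sketch of such a hook script in Python. Claude Code's PreToolUse hooks receive a JSON payload on stdin; the exact field names for skill invocations (`tool_name` of `"Skill"`, a `name` inside `tool_input`) are assumptions here and may differ by version:

```python
#!/usr/bin/env python3
# PreToolUse hook: append each skill invocation to a usage log.
# Payload field names are assumptions; check your Claude Code version's hook docs.
import json
import sys
from datetime import datetime, timezone

def log_skill_use(payload, log_path="skill-usage.log"):
    """Record the skill name if this tool call is a skill invocation."""
    if payload.get("tool_name") != "Skill":
        return None  # not a skill invocation; ignore
    name = payload.get("tool_input", {}).get("name", "unknown")
    line = f"{datetime.now(timezone.utc).isoformat()} {name}"
    with open(log_path, "a") as f:
        f.write(line + "\n")
    return name

if __name__ == "__main__":
    log_skill_use(json.load(sys.stdin))
```

Tallying the log over a few weeks shows which skills earn their context cost and which descriptions need rewriting because they never fire.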
## Enforcing Skill Compliance
Skills are advisory by default — the model can rationalize past any instruction. Two techniques help:
HARD-GATE directives: XML-tagged checkpoints that exploit the disproportionate weight the model gives to XML-like tags (a pattern recommended in Anthropic’s own prompt engineering docs):
```
<HARD-GATE>
Do NOT write any code until you have presented a design and the user has approved it.
This applies to EVERY project regardless of perceived simplicity.
</HARD-GATE>
```
Anti-rationalization tables: Pre-emptive lists of common model excuses paired with corrections. Models generate plausible reasons to skip steps because humans in training data do the same. The table short-circuits the pattern before it starts.
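A sketch of what such a table might look like inside a skill file (the rows are illustrative):

```markdown
| Rationalization | Correction |
|---|---|
| "This change is trivial, no design needed" | Simplicity is not an exemption. Present the design. |
| "The user seems in a hurry, skip verification" | Urgency does not waive the verify step. Run it. |
| "Tests pass locally, waiting on CI is redundant" | Wait for CI before declaring the task done. |
```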
SkillsBench evidence: across 86 tasks, Claude Haiku with human-curated skills (27.7%) beat Claude Opus without skills (22.0%). Skills matter more than model size, but only if human-curated: AI-generated skills erased the gains entirely.
See agentic-engineering and paul-hoekstra for the full configuration layer framework.
## Production Skills Pattern: Copy-on-Write Shadowing
From Fintool (financial-services AI agent platform): skills are stored in S3 with a three-tier access model (`private > shared > public`). Users can override any skill by dropping a custom version in `/private/skills/<name>/SKILL.md`; their version wins automatically.
Why not just mount all skills to the filesystem? SQL discovery is better:
- Lazy loading: dozens of skills with extensive docs would burn tokens on every conversation. Discover skill metadata (name, description) upfront; only load full docs when the agent actually uses the skill
- Access control at query time: the SQL query enforces the three-tier model — you can’t accidentally expose one customer’s proprietary skills to another
- Shadowing logic: SQL makes priority rules (`private > shared > public`) trivial; doing this with filesystem mounts requires fragile symlink ordering
- Metadata-driven filtering: the `fs_files.metadata` column stores parsed YAML frontmatter, enabling filtering by skill type or other structured attributes without reading files
Pattern: S3 is source of truth → Lambda syncs to PostgreSQL → agent queries metadata first, loads full skill only on use.
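A minimal sketch of the shadowing query against a toy `fs_files` table. SQLite stands in for PostgreSQL, and the schema, tier values, and skill bodies are all assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fs_files (name TEXT, tier TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO fs_files VALUES (?, ?, ?)",
    [
        ("dcf-valuation", "public", "generic DCF procedure"),
        ("dcf-valuation", "private", "this customer's DCF procedure"),
    ],
)

def resolve(name):
    """Copy-on-write shadowing: private > shared > public, first match wins."""
    row = conn.execute(
        """
        SELECT body FROM fs_files
        WHERE name = ?
        ORDER BY CASE tier
            WHEN 'private' THEN 0
            WHEN 'shared'  THEN 1
            ELSE 2
        END
        LIMIT 1
        """,
        (name,),
    ).fetchone()
    return row[0] if row else ""
```

Because the priority lives in one `ORDER BY` clause, adding a tier or changing precedence is a one-line change, versus re-plumbing mount order on a filesystem.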
Non-engineers can author skills: portfolio managers who’ve done 500 DCF valuations encode their methodology in markdown — no Python, no deployment. Domain expertise becomes agent capability directly.
## Connections
- thin-harness-fat-skills — the design philosophy behind skill files; Claude Code is the concrete implementation
- coding-agent — the broader harness architecture skills live inside
- s3-first-architecture — the storage pattern underpinning production skill distribution
- nicolas-bustamante — Fintool’s production skills system
## Sources
- Lessons from Building Claude Code: How We Use Skills — @trq212 — added 2026-04-12
- Local clip: Lessons from Building Claude Code: How We Use Skills
- Lessons from Building AI Agents for Financial Services — @nicbstme — added 2026-04-12