v3.22.0

SkillCheck Methodology

27 categories · 14 sources · 7 phases · 7 months

getskillcheck.com/methodology

Dec 2025 → Jun 2026

Phase I · Build to Spec

v1.0–1.4 · Dec 2025 · 3 sources

10 categories
1–10

Categories

01 Structure
02 Body
03 Naming
04 Semantics
05 Anti-Slop
06 Visual
07 Security
08 Quality
09 Token
10 Enterprise

R1 Claude Code Best Practices · Anthropic anthropic.com ↗
R2 MCP Best Practices modelcontextprotocol.io ↗
R3 WCAG 2.2 · W3C w3.org ↗

Phase II · From Practice

v3.10–3.14 · Mar 2026 · 1 source

08 categories
11–18

Categories

11 Quality Pro
12 Workflow
13 Ref. Integrity
14 Eval Readiness
15 Orch. Safety
16 Autonomy
17 Composability
18 Observability

R4 Lessons from Building Claude Code · T. Shihipar linkedin.com ↗

Phase III · From the Field

v3.16–3.17 · Apr 2026 · 2 sources

03 categories
19–21

Categories

19 Design Pattern
20 Trigger Collision
21 Eval Kit

R5 Testing Agent Skills Systematically with Evals · OpenAI openai.com ↗
R6 5 Agent Skill Design Patterns · Google Cloud x.com ↗

Phase IV · Ecosystem

v3.18.0 · Apr 2026 · 3 sources

01 category
22

Category

22 Knowledge Density

A single category, but a decisive one. Anchored by two arXiv papers on description-quality smells in MCP tool descriptions: Hasan et al. (856 tools across 103 servers) and Wang et al. (10,831 servers).

R7 MCP Tool Descriptions Are Smelly! · Hasan et al. arxiv.org ↗
R8 From Docs to Descriptions: Smell-Aware Evaluation · Wang et al. arxiv.org ↗
R9 Tool Description Quality Score (TDQS) · Glama glama.ai ↗

Phase V · Marketplace

v3.20.0 · Apr 2026 · 3 sources

03 categories
23–25

Categories

23 Agent Integration Readiness
24 Marketplace Governance
25 Memory Governance

R10 Building Agents That Reach Production with MCP · Anthropic claude.com ↗
R11 MCP vs CLIs · S. Morrow youtu.be ↗
R12 anthropics/knowledge-work-plugins github.com ↗

Phase VI · Agentic Safety

v3.21.0 · Jun 2026 · 1 source

01 category
26

Category

26 OWASP Agentic Top 10

Checks a skill against the OWASP Top 10 for Agentic Applications. Eight deterministic risks run free and local; three intent-based risks are graded by your own AI judge, no API key required.

R13 Top 10 for Agentic Applications 2026 · OWASP genai.owasp.org ↗

Phase VII · Lived-In-Ness

v3.22.0 · Jun 2026 · 1 source

01 category
27

Category

27 Repo Maturity

Reads a skill's git history to answer what the SKILL.md can't: is anyone actually using this, or was it written once and abandoned? Scored in its own domain, so a one-commit skill can still be excellent.

R14 Show Us Your (Agent) Skills · Vanishing Gradients github.com ↗
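As a rough illustration of what reading a skill's git history could look like, the sketch below collects commit dates for one file and summarizes them. It is not SkillCheck's implementation: the function names (`commit_dates`, `summarize`) and the three signals (commit count, distinct active days, span in days) are hypothetical choices made for this example. Only the `git log` flags used (`-C`, `--follow`, `--format=%aI`) are standard Git options.

```python
import subprocess
from datetime import datetime


def commit_dates(path: str, repo: str = ".") -> list[str]:
    """Return ISO 8601 author dates of commits touching `path`, newest first."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--follow", "--format=%aI", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def summarize(dates: list[str]) -> dict:
    """Hypothetical maturity signals derived from a list of ISO 8601 dates."""
    parsed = sorted(datetime.fromisoformat(d) for d in dates)
    if not parsed:
        return {"commits": 0, "active_days": 0, "span_days": 0}
    return {
        "commits": len(parsed),
        # Distinct calendar days with at least one commit.
        "active_days": len({p.date() for p in parsed}),
        # Whole days between the first and the most recent commit.
        "span_days": (parsed[-1] - parsed[0]).days,
    }
```

A typical call would be `summarize(commit_dates("SKILL.md"))`. A skill written once and never touched again yields one commit and a span of zero days. Keeping these signals in a separate summary, rather than subtracting them from the quality score, mirrors the slide's point that a one-commit skill can still be excellent.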

Empirical Anchors

Hasan et al. · description smell rate

97% of 856 MCP tools across 103 servers contained at least one description smell.

arXiv:2602.14878 ↗

Wang et al. · selection probability

72% vs 20%

Across 10,831 MCP servers, standard-compliant descriptions get picked by an agent 72% of the time. Non-compliant descriptions get picked 20% of the time.

arXiv:2602.18914 ↗

Starting score

100

Critical −20 · Warning −5 · Suggestion −1
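The deduction scheme above amounts to simple arithmetic, sketched here under stated assumptions. The starting score and the three penalties come from the slide. The function name `score`, the severity strings, and the clamp at a floor of 0 are assumptions for illustration, since the slide does not say whether a score can go negative.

```python
STARTING_SCORE = 100
# Penalties as listed on the slide: Critical −20, Warning −5, Suggestion −1.
PENALTIES = {"critical": 20, "warning": 5, "suggestion": 1}


def score(findings: list[str], floor: int = 0) -> int:
    """Subtract one penalty per finding from the starting score.

    `findings` is a list of severity names. Clamping at `floor` is an
    assumption of this sketch, not something the slide states.
    """
    total = STARTING_SCORE - sum(PENALTIES[severity] for severity in findings)
    return max(floor, total)


# One critical, two warnings, one suggestion: 100 - 20 - 5 - 5 - 1
print(score(["critical", "warning", "warning", "suggestion"]))  # 69
```

Under this scheme one critical finding costs as much as four warnings or twenty suggestions.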

getskillcheck.com/methodology · built by Olga Safonova