
SkillCheck
Methodology

25 categories · 14 sources · 5 phases · 5 months

Five phases of rubric construction, from December 2025 to April 2026. Each phase added a defined set of categories.

Phase I

Build to Spec

v1.0 – v1.4 · Dec 2025
10 categories (1–10)
  1. Structure
  2. Body
  3. Naming
  4. Semantics
  5. Anti-Slop
  6. Visual
  7. Security
  8. Quality
  9. Token
  10. Enterprise
R1 R2 R3
Phase II

From Practice

v3.10 – v3.14 · Mar 2026
8 categories (11–18)
  11. Quality Pro
  12. Workflow
  13. Reference Integrity
  14. Eval Readiness
  15. Orchestration Safety
  16. Autonomy Design
  17. Composability
  18. Observability
R4
Phase III

From Field

v3.16 – v3.17 · Apr 2026
3 categories (19–21)
  19. Design Pattern
  20. Trigger Collision
  21. Eval Kit
R5 R6
Phase IV

Ecosystem

v3.18.0 · Apr 2026
1 category (22)
  22. Knowledge Density
R7 R8 R9
Phase V

Marketplace

v3.20.0 · Apr 2026
3 categories (23–25)
  23. Agent Integration
  24. Mkt. Governance
  25. Memory Governance
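
The phase-to-category layout above can be written down as a small lookup table, which also makes it easy to confirm that the five phases add up to the 25 categories in the header. This is only an illustrative sketch: the names `PHASES` and `phase_of` are hypothetical and not part of SkillCheck itself.

```python
# Hypothetical encoding of the five phases and the category numbers each added.
# Phase names and ranges are copied from the timeline; the structure is an assumption.
PHASES = [
    ("I", "Build to Spec", range(1, 11)),     # categories 1-10
    ("II", "From Practice", range(11, 19)),   # categories 11-18
    ("III", "From Field", range(19, 22)),     # categories 19-21
    ("IV", "Ecosystem", range(22, 23)),       # category 22
    ("V", "Marketplace", range(23, 26)),      # categories 23-25
]


def phase_of(category):
    """Return (numeral, name) of the phase that introduced a category number."""
    for numeral, name, ids in PHASES:
        if category in ids:
            return numeral, name
    raise ValueError(f"no phase defines category {category}")


# Sum of per-phase category counts: 10 + 8 + 3 + 1 + 3.
total_categories = sum(len(ids) for _, _, ids in PHASES)

print(total_categories)   # 25
print(phase_of(22))       # ('IV', 'Ecosystem')
```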

Hasan et al.

97%

of 856 MCP tools across 103 servers contained at least one description smell.

arXiv:2602.14878 · Phase IV anchor

Wang et al.

72% vs. 20%

Across 10,831 MCP servers, standard-compliant descriptions get picked by an agent 72% of the time. Non-compliant descriptions get picked 20% of the time.

arXiv:2602.18914 · Phase IV anchor

Scoring Math

100 starting score
Critical −20
Warning −5
Suggestion −1
Strength +0
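
As a rough illustration of the deductions listed above, the following sketch computes a score from a list of findings. The function name `score`, the lowercase severity labels, and the clamp at 0 are assumptions made for this example; the source states only the starting score and the per-severity deductions.

```python
from collections import Counter

# Starting score and per-finding deductions, taken from the Scoring Math panel.
STARTING_SCORE = 100
DEDUCTIONS = {"critical": 20, "warning": 5, "suggestion": 1, "strength": 0}


def score(findings):
    """Return the score for an iterable of severity labels (hypothetical helper)."""
    counts = Counter(findings)
    unknown = set(counts) - set(DEDUCTIONS)
    if unknown:
        raise ValueError(f"unknown severity: {sorted(unknown)}")
    penalty = sum(DEDUCTIONS[severity] * n for severity, n in counts.items())
    # Assumption: the score never drops below 0; the source does not specify a floor.
    return max(0, STARTING_SCORE - penalty)


# One critical, two warnings, one suggestion, one strength: 100 - 20 - 10 - 1 - 0.
example = ["critical", "warning", "warning", "suggestion", "strength"]
print(score(example))  # 69
```

Under these weights a single critical finding costs as much as four warnings or twenty suggestions, and strengths are recorded without changing the score.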

Source Bank · 14 references

01 (Phase I) Claude Code Best Practices. Anthropic. anthropic.com
02 (Phase I) MCP Best Practices. modelcontextprotocol.io. modelcontextprotocol.io
03 (Phase I) WCAG 2.2. W3C. w3.org
04 (Phase II) Lessons from Building Claude Code. T. Shihipar, Anthropic. linkedin.com
05 (Phase III) Testing Agent Skills Systematically with Evals. OpenAI. developers.openai.com
06 (Phase III) 5 Agent Skill Design Patterns. Google Cloud Tech. x.com
07 (Phase IV) MCP Tool Descriptions Are Smelly! Hasan et al. arxiv.org
08 (Phase IV) From Docs to Descriptions: Smell-Aware Evaluation. Wang et al. arxiv.org
09 (Phase IV) Tool Description Quality Score (TDQS). Glama. glama.ai
10 (Phase V) Building Agents That Reach Production with MCP. Anthropic, Apr 2026. claude.com
11 (Phase V) MCP vs CLIs: Why Agents Need Purpose-Built Interfaces. S. Morrow, Apr 2026. youtu.be
12 (Phase V) knowledge-work-plugins. anthropics/, GitHub. github.com
13 (Phase V) servers. modelcontextprotocol/, GitHub. github.com
14 (Phase V) Built-in Memory for Claude Managed Agents. Anthropic, Apr 2026. claude.com