Every check category is built from published specs, peer-reviewed papers, and practitioner experience, not opinions. When we add a check, we document what informed it and why it holds up.
Every check traces to a published spec, a research paper, or documented practitioner experience. When we add one, we document what informed it and why it matters.
Checks run in layers: structure first, then semantics, then content quality, security, agent readiness, and finally whether the knowledge is substantive or just well-formatted filler.
A strong gotchas section, concrete code references, or clear error handling show up as positive signals in your report, not just the absence of penalties.
Free tells you if the skill is built correctly. Pro tells you if it's built well. Every finding carries a severity: critical, warning, suggestion, or strength.
SkillCheck started from one lab's guidelines. From there: practitioner field observations, cross-lab methodology, peer-reviewed academic research, and the OWASP agentic security catalogue. Each round made the checks harder to game.
Regex checks scan the skill content, skipping code blocks and frontmatter. Compound patterns require multiple signals on the same line to fire. Every result carries a severity that feeds the scoring engine.
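To make the scanning rule concrete, here is a minimal sketch of that idea, not SkillCheck's actual implementation. It assumes YAML frontmatter delimited by `---` lines and fenced code blocks opened by triple backticks or tildes. The rule id `hedged-filler`, its two signal patterns, and the helper names `scannable_lines` and `scan` are all hypothetical.

```python
import re

# Opening or closing line of a fenced code block (``` or ~~~).
FENCE = re.compile(r"^\s*(```|~~~)")

# Hypothetical compound rule: fires only if EVERY signal matches the same line.
RULES = [
    {
        "id": "hedged-filler",
        "severity": "warning",
        "signals": [
            re.compile(r"\bit is important to\b", re.IGNORECASE),
            re.compile(r"\b(note|remember)\b", re.IGNORECASE),
        ],
    },
]


def scannable_lines(text):
    """Yield (line_number, line) for lines outside frontmatter and code fences."""
    lines = text.splitlines()
    start = 0
    # Skip frontmatter: a leading '---' line through the next '---' line.
    if lines and lines[0].strip() == "---":
        for j in range(1, len(lines)):
            if lines[j].strip() == "---":
                start = j + 1
                break
    in_fence = False
    for idx in range(start, len(lines)):
        if FENCE.match(lines[idx]):
            in_fence = not in_fence  # toggle on each fence line, skip the fence itself
            continue
        if not in_fence:
            yield idx + 1, lines[idx]


def scan(text, rules=RULES):
    """Return one finding per (rule, line) where all of the rule's signals match."""
    findings = []
    for number, line in scannable_lines(text):
        for rule in rules:
            if all(signal.search(line) for signal in rule["signals"]):
                findings.append(
                    {"rule": rule["id"], "severity": rule["severity"], "line": number}
                )
    return findings
```

Because a compound rule needs every signal on one line, a sentence containing only "it is important to" does not fire, which keeps single-keyword false positives down. The same phrase inside frontmatter or a fenced block is never seen by the rules at all.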
Structural and pattern checks are reproducible: same input, same finding. Judgment checks have a wider tolerance band; the criteria are published so you can predict the outcome.
Required fields, file references, secrets, token counts. Exact: pass or fail, no ambiguity.
Anti-slop phrases, density signals, design patterns, governance checklists. Inspectable: read the rules and predict the outcome.
Contradictions, workflow clarity, subagent specificity, autonomy boundaries. Rubric-based against published criteria.
Every finding carries a severity. Strengths add no score but appear as positive signals in your report: proof that you built it well, not just not-wrongly.
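One way to picture how severities could feed a score while strengths stay score-neutral is the sketch below. The four severity names come from the text above. Everything else is an assumption made for illustration: the penalty weights, the 100-point starting score, and the names `Finding`, `score`, and `strengths` are hypothetical and do not describe SkillCheck's real scoring engine.

```python
from dataclasses import dataclass

# Hypothetical penalty per severity; only the severity names come from the text.
# "strength" is 0 because strengths add no score.
PENALTY = {"critical": 25, "warning": 10, "suggestion": 2, "strength": 0}


@dataclass(frozen=True)
class Finding:
    check: str     # which check produced the finding
    severity: str  # one of the keys in PENALTY
    message: str


def score(findings, start=100):
    """Subtract each finding's penalty from an assumed starting score, floored at 0."""
    total = start - sum(PENALTY[f.severity] for f in findings)
    return max(0, total)


def strengths(findings):
    """Return the findings that are reported as positive signals."""
    return [f for f in findings if f.severity == "strength"]
```

With these assumed weights, a report containing one critical finding and one strength scores 75. The strength leaves the number untouched but still appears in the list returned by `strengths`.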
Does the skill have the right structure, fields and sections? Free tells you what's missing. Open source, no install, no API key.
Is the content inside those sections actually good? Pro tells you whether what's there is real, across security, slop, readiness, and governance.
SkillCheck is an independent project, not affiliated with, endorsed by, or officially connected to Anthropic, OpenAI, Google, or any other AI lab. Research from those organizations informed specific check categories, as documented in the phases above. The implementation, scoring and quality judgments are SkillCheck's own.