Run Skill Eval
Run the skill's behavioral benchmark now and return the verdict. Bootstraps
cases from evidence if none exist. strict=true runs the with/without
median-of-3 strict-improvement gate (Phase 2); default is the cheap single
with-skill run (Phase 1). By-id-direct; cross-space. The streaming variant is
/eval/stream.
Path Parameters
skill_id*Skill Id
Query Parameters
strict?Strict
Response Body
application/json
application/json
curl -X POST "https://example.com/skills/string/eval/run"{}{
"detail": [
{
"loc": [
"string"
],
"msg": "string",
"type": "string",
"input": null,
"ctx": {}
}
]
}