About promptfoo
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.
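To make the "simple declarative configs" point concrete, here is a minimal sketch of an eval suite using promptfoo's Node API. The prompt text, provider strings, model names, and assertion below are illustrative placeholders rather than anything documented on this page; the more common workflow writes the same suite as a promptfooconfig.yaml and runs it with the `promptfoo eval` CLI.

```typescript
// Minimal promptfoo eval sketch (Node API). Provider and model identifiers
// are assumptions for illustration; adjust them to whatever your keys support.
import promptfoo from 'promptfoo';

async function main() {
  const results = await promptfoo.evaluate({
    // Prompt templates; {{topic}} is filled in from each test case's vars.
    prompts: ['Write a one-sentence summary of {{topic}}.'],
    // Run the same prompt against two providers to compare outputs.
    providers: ['openai:gpt-4o-mini', 'anthropic:messages:claude-3-5-sonnet-20241022'],
    tests: [
      {
        vars: { topic: 'prompt evaluation' },
        // Deterministic check: output must mention "prompt" (case-insensitive).
        assert: [{ type: 'icontains', value: 'prompt' }],
      },
    ],
  });
  // The returned summary includes per-test results and aggregate stats.
  console.log(results);
}

main();
```

In CI, the same suite is typically expressed as YAML and run with `promptfoo eval`, so failing assertions can gate the pipeline.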
Topics
No rules target promptfoo yet
No published rules, MCP servers, or skills target promptfoo yet. If you maintain a tool that works well with this project, you can publish it for free during the beta.
Related topics
- Evaluating Claude Code skills: trigger precision and output quality (2026). A Claude Code skill has two failure modes: it never fires when it should, or it fires and produces generic output. Most teams test only one. Here's the eval bundle that covers both.
- LLM evals: the Hamel process encoded as rulesets (2026). Hamel Husain's eval process: 60-80% of dev time on error analysis, custom annotation tools, binary judges, review 100 traces. Here's how to encode that as a tool-agnostic ruleset that survives the next acquisition.
- Promptfoo alternatives after the OpenAI acquisition (2026). OpenAI acquired Promptfoo in March 2026. ClickHouse acquired Langfuse in January. Two of the three biggest OSS eval tools changed hands in 8 weeks. Here's what to use now.
Why this page exists
RuleSell tracks the AI-coding ecosystem so you don't have to. When a repo like promptfoo picks up momentum, we surface the Claude Code skills, Cursor rules, MCP servers, and agent configs that target it, with real author attribution, SPDX license badges, and quality scores. Every listing ships with copy-paste install instructions for each environment.