# MCP Evals
## Overview
MCP Evals is a Node.js package and GitHub Action for evaluating MCP (Model Context Protocol) tool implementations using LLM-based scoring. This helps ensure your MCP server's tools are working correctly and performing well.
## Installation
### As a Node.js Package
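Install the package from npm (assuming it is published under the name `mcp-evals`):

```bash
npm install mcp-evals
```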
### As a GitHub Action
Add the following to your workflow file:
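A sketch of such a workflow. The action reference and the input names (`evals_path`, `server_path`, `openai_api_key`) are assumptions; check the action's own documentation for the exact values:

```yaml
name: Run MCP Evals

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  pull-requests: write # required so the action can comment on the PR

jobs:
  evals:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Action path and input names are illustrative; consult the action's
      # README for the exact `uses:` reference and inputs.
      - name: Run MCP Evals
        uses: mclenhard/mcp-evals@v1
        with:
          evals_path: 'src/evals.ts'
          server_path: 'src/index.ts'
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
```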
## Usage
### 1. Create Your Evaluation File
Create a file (e.g., `evals.ts`) that exports your evaluation configuration:
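A minimal sketch of such a file. The exports (`EvalConfig`, `EvalFunction`, `grade`) and the `grade` signature are assumptions about the package's API; the `@ai-sdk/openai` model helper is used for illustration. Adjust the imports to match the package's actual exports:

```typescript
import { openai } from "@ai-sdk/openai";
// `grade`, `EvalConfig`, and `EvalFunction` are assumed exports of mcp-evals.
import { grade, EvalConfig, EvalFunction } from "mcp-evals";

// One evaluation: pose a question that should exercise a tool,
// then let the grading model score the answer.
const weatherEval: EvalFunction = {
  name: "Weather Tool Evaluation",
  description: "Evaluates the accuracy and completeness of weather retrieval",
  run: async (model) => {
    const result = await grade(model, "What is the weather in New York?");
    return JSON.parse(result); // assumes grade returns a JSON string
  },
};

const config: EvalConfig = {
  model: openai("gpt-4"), // model used to run and score the evals
  evals: [weatherEval],   // evaluation functions to execute
};

export default config;
```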
### 2. Run the Evaluations
#### As a Node.js Package
You can run the evaluations using the CLI:
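For example (the `mcp-eval` binary name and argument order are assumptions; adjust to the CLI's actual usage):

```bash
# usage: npx mcp-eval <path to evals file> <path to server entry point>
npx mcp-eval src/evals.ts src/index.ts
```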
#### As a GitHub Action
The action will automatically:
- Run your evaluations
- Post the results as a comment on the PR
- Update the comment if the PR is updated
## Evaluation Results
Each evaluation returns an object with the following structure:
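A sketch of that shape, assuming the LLM grader scores a handful of dimensions on a numeric scale; the exact field names are assumptions:

```typescript
interface EvalResult {
  accuracy: number;         // how factually correct the answer was
  completeness: number;     // whether the answer fully covered the request
  relevance: number;        // how on-topic the answer was
  clarity: number;          // how clearly the answer was expressed
  reasoning: number;        // quality of the reasoning shown
  overall_comments: string; // free-form summary from the grading model
}
```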
## Configuration
### Environment Variables
- `OPENAI_API_KEY`: Your OpenAI API key (required)
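For local CLI runs, the key is typically supplied through the shell environment:

```bash
export OPENAI_API_KEY="sk-your-key" # replace with your actual key
```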
### Evaluation Configuration
The `EvalConfig` interface requires:

- `model`: The language model to use for evaluation (e.g., GPT-4)
- `evals`: Array of evaluation functions to run
Each evaluation function must implement:

- `name`: Name of the evaluation
- `description`: Description of what the evaluation tests
- `run`: Async function that takes a model and returns an `EvalResult`
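Taken together, these types might be declared roughly as follows. This is a sketch mirroring the descriptions above, not the package's published definitions; `EvalResult` is the shape sketched under Evaluation Results, and importing it from `mcp-evals` is an assumption:

```typescript
import type { EvalResult } from "mcp-evals"; // assumed export; see sketch above

interface EvalFunction {
  name: string;        // name of the evaluation
  description: string; // what the evaluation tests
  // Executes one evaluation against the given model and returns its scores.
  // `unknown` stands in for whatever model type the underlying LLM SDK uses.
  run: (model: unknown) => Promise<EvalResult>;
}

interface EvalConfig {
  model: unknown;        // the language model used for evaluation (e.g., GPT-4)
  evals: EvalFunction[]; // evaluation functions to run
}
```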
## License
MIT