GitHub - mandoline-ai/mandoline-mcp-server: MCP server that enables LLMs to evaluate themselves (original) (raw)

Mandoline MCP Server

Enable AI assistants like Claude Code, Claude Desktop, and Cursor to reflect on, critique, and continuously improve their own performance using Mandoline's evaluation framework via the Model Context Protocol.

Client Setup

Most users should start here. Use Mandoline's hosted MCP server to integrate evaluation tools into your AI assistant.

For each integration below, replace sk_**** with your actual API key from mandoline.ai/account.

Claude Code

Use the CLI to add the Mandoline MCP server to Claude Code:

claude mcp add --scope user --transport http mandoline https://mandoline.ai/mcp --header "x-api-key: sk_****"

You can use --scope user (across projects) or --scope project (current project only).

Note: Restart any active Claude Code sessions after configuration changes.

Verify: Run /mcp in Claude Code to see Mandoline listed as a connected server:

Tutorial: Watch Claude evaluate multiple code solutions and pick the best one.

Official Documentation: Claude Code MCP Guide

Codex

Use the CLI to add the Mandoline MCP server to Codex:

codex mcp add mandoline --env MANDOLINE_API_KEY=sk_**** -- npx -y mcp-remote https://mandoline.ai/mcp --header 'x-api-key: ${MANDOLINE_API_KEY}'

Note: Restart any active Codex sessions after configuration changes.

Verify: Run /mcp in Codex to see Mandoline listed as a connected server:

Official Documentation: Codex MCP Configuration

Claude Desktop

Edit your configuration file (Settings > Developer > Edit Config):

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{ "mcpServers": { "Mandoline": { "command": "npx", "args": [ "-y", "mcp-remote", "https://mandoline.ai/mcp", "--header", "x-api-key: ${MANDOLINE_API_KEY}" ], "env": { "MANDOLINE_API_KEY": "sk_****" } } } }

This configuration applies globally to all conversations.

Note: Restart Claude Desktop after configuration changes.

Verify: Look for Mandoline tools when you click the "Search and tools" button.

Official Documentation: MCP Quickstart Guide

Cursor

Create or edit your MCP configuration file:

{ "mcpServers": { "Mandoline": { "url": "https://mandoline.ai/mcp", "headers": { "x-api-key": "sk_****" } } } }

You can use your global configuration (affects all projects) ~/.cursor/mcp.json or project-local configuration (current project only) .cursor/mcp.json (in project root)

Note: Restart Cursor after configuration changes.

Verify: Check the Output panel (Ctrl+Shift+U) → "MCP Logs" for successful connection, or look for Mandoline tools in the Composer Agent.

Official Documentation: Cursor MCP Guide

Server Setup

Only needed if you want to run the server locally or contribute to development. Most users should use the hosted server above.

Prerequisites: Node.js 18+ and npm

Installation

Clone and build
git clone https://github.com/mandoline-ai/mandoline-mcp-server.git
cd mandoline-mcp-server
npm install
npm run build
Configure environment (optional)
cp .env.example .env.local

Edit .env.local to customize PORT, LOG_LEVEL, etc.

Start the server

The server runs on http://localhost:8080 by default.

Using Local Server

To use your local server instead of the hosted one, replace https://mandoline.ai/mcp with http://localhost:8080/mcp in the client configurations above.

Usage

Once integrated, you can use Mandoline evaluation tools directly in your AI assistant conversations.

Tools

Health

Tool	Purpose
get_server_health	Confirm the MCP server is reachable and returning a healthy status payload.

Metrics

Tool	Purpose
create_metric	Define custom evaluation criteria for your specific tasks
batch_create_metrics	Create multiple evaluation metrics in one operation
get_metric	Retrieve details about a specific metric
get_metrics	Browse your metrics with filtering and pagination
update_metric	Modify existing metric definitions

Evaluations

Tool	Purpose
create_evaluation	Score prompt/response pairs against your metrics
batch_create_evaluations	Evaluate the same content against multiple metrics
get_evaluation	Retrieve evaluation results and scores
get_evaluations	Browse evaluation history with filtering and pagination
update_evaluation	Add metadata or context to evaluations

Resources

Resource	Description
llms.txt	Mandoline docs index (tools, tutorials, blogs, leaderboards, SDKs); mirrored from https://mandoline.ai/llms.txt.
mcp	MCP setup guide for assistants; mirrored from https://mandoline.ai/mcp.

Support

Platform: https://mandoline.ai - Create account and get API keys
Documentation: https://mandoline.ai/docs - Evaluation guides and best practices
Issues: GitHub Issues - Bug reports and feature requests
Email: support@mandoline.ai - Direct support

License

Apache-2.0 License - see the LICENSE file for details.