root-signals/root-signals-mcp: MCP for Root Signals Evaluation Platform

Measurement & Control for LLM Automations

Root Signals MCP Server

A Model Context Protocol (MCP) server that exposes Root Signals evaluators as tools for AI assistants & agents.

Overview

This project serves as a bridge between the Root Signals API and MCP client applications, allowing AI assistants and agents to evaluate responses against various quality criteria.

Features

Tools

The server exposes the following tools:

  1. list_evaluators - Lists all available evaluators on your Root Signals account
  2. run_evaluation - Runs a standard evaluation using a specified evaluator ID
  3. run_evaluation_by_name - Runs a standard evaluation using a specified evaluator name
  4. run_coding_policy_adherence - Runs a coding policy adherence evaluation using policy documents such as AI rules files
  5. list_judges - Lists all available judges on your Root Signals account. A judge is a collection of evaluators forming an LLM-as-a-judge.
  6. run_judge - Runs a judge using a specified judge ID
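In MCP terms, each of these tools is invoked with a JSON-RPC `tools/call` request. The sketch below shows what a `run_evaluation` call might look like on the wire; the argument names are assumptions based on the tool descriptions above and the reference-client example later in this README, not a verbatim copy of the server's schema:

```python
import json

# Hypothetical MCP tools/call payload for the run_evaluation tool.
# The "arguments" keys below are assumptions based on the tool list above,
# not the server's authoritative schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_evaluation",
        "arguments": {
            "evaluator_id": "eval-123456789",
            "request": "What is the capital of France?",
            "response": "The capital of France is Paris.",
        },
    },
}

print(json.dumps(request, indent=2))
```

In practice your MCP client library builds this envelope for you; it is shown here only to make the tool-calling model concrete.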

How to use this server

1. Get Your API Key

Sign up & create a key or generate a temporary key

2. Run the MCP Server

```shell
docker run -e ROOT_SIGNALS_API_KEY= -p 0.0.0.0:9090:9090 --name=rs-mcp -d ghcr.io/root-signals/root-signals-mcp:latest
```

You should see logs similar to the following (note: /mcp is the new preferred endpoint; /sse remains available for backward compatibility):

```shell
docker logs rs-mcp
2025-03-25 12:03:24,167 - root_mcp_server.sse - INFO - Starting RootSignals MCP Server v0.1.0
2025-03-25 12:03:24,167 - root_mcp_server.sse - INFO - Environment: development
2025-03-25 12:03:24,167 - root_mcp_server.sse - INFO - Transport: stdio
2025-03-25 12:03:24,167 - root_mcp_server.sse - INFO - Host: 0.0.0.0, Port: 9090
2025-03-25 12:03:24,168 - root_mcp_server.sse - INFO - Initializing MCP server...
2025-03-25 12:03:24,168 - root_mcp_server - INFO - Fetching evaluators from RootSignals API...
2025-03-25 12:03:25,627 - root_mcp_server - INFO - Retrieved 100 evaluators from RootSignals API
2025-03-25 12:03:25,627 - root_mcp_server.sse - INFO - MCP server initialized successfully
2025-03-25 12:03:25,628 - root_mcp_server.sse - INFO - SSE server listening on http://0.0.0.0:9090/sse
```

From any other client that supports SSE transport, add the server to your config; for example, in Cursor:

```json
{
  "mcpServers": {
    "root-signals": {
      "url": "http://localhost:9090/sse"
    }
  }
}
```

With stdio from your MCP host

In Cursor, Claude Desktop, etc.:

```json
{
  "mcpServers": {
    "root-signals": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/root-signals/root-signals-mcp.git", "stdio"],
      "env": {
        "ROOT_SIGNALS_API_KEY": ""
      }
    }
  }
}
```

Usage Examples

1. Evaluate and improve Cursor Agent explanations

Let's say you want an explanation for a piece of code. You can simply instruct the agent to evaluate its response and improve it with Root Signals evaluators:

Use case example image 1

After the regular LLM answer, the agent can automatically evaluate and improve it:

Use case example image 2

It can then automatically evaluate the second attempt again to make sure the improved explanation is indeed higher quality:

Use case example image 3

2. Use the MCP reference client directly from code

```python
import asyncio

from root_mcp_server.client import RootSignalsMCPClient


async def main():
    mcp_client = RootSignalsMCPClient()

    try:
        await mcp_client.connect()

        evaluators = await mcp_client.list_evaluators()
        print(f"Found {len(evaluators)} evaluators")

        result = await mcp_client.run_evaluation(
            evaluator_id="eval-123456789",
            request="What is the capital of France?",
            response="The capital of France is Paris."
        )
        print(f"Evaluation score: {result['score']}")

        result = await mcp_client.run_evaluation_by_name(
            evaluator_name="Clarity",
            request="What is the capital of France?",
            response="The capital of France is Paris."
        )
        print(f"Evaluation by name score: {result['score']}")

        result = await mcp_client.run_evaluation(
            evaluator_id="eval-987654321",
            request="What is the capital of France?",
            response="The capital of France is Paris.",
            contexts=["Paris is the capital of France.", "France is a country in Europe."]
        )
        print(f"RAG evaluation score: {result['score']}")

        result = await mcp_client.run_evaluation_by_name(
            evaluator_name="Faithfulness",
            request="What is the capital of France?",
            response="The capital of France is Paris.",
            contexts=["Paris is the capital of France.", "France is a country in Europe."]
        )
        print(f"RAG evaluation by name score: {result['score']}")

    finally:
        await mcp_client.disconnect()


if __name__ == "__main__":
    asyncio.run(main())
```

3. Measure your prompt templates in Cursor

Let's say you have a prompt template in your GenAI application in some file:

```python
summarizer_prompt = """
You are an AI agent for the Contoso Manufacturing, a manufacturing that makes car batteries.
As the agent, your job is to summarize the issue reported by field and shop floor workers.
The issue will be reported in a long form text. You will need to summarize the issue and
classify what department the issue should be sent to. The three options for classification
are: design, engineering, or manufacturing.

Extract the following key points from the text:

Safety

user: {{problem}}
"""
```

You can measure it by simply asking the Cursor Agent: "Evaluate the summarizer prompt in terms of clarity and precision. Use Root Signals." You will get the scores and justifications in Cursor:

Prompt evaluation use case example image 1

For more usage examples, have a look at the demonstrations.

How to Contribute

Contributions are welcome as long as they are applicable to all users.

Minimal steps include:

  1. uv sync --extra dev
  2. pre-commit install
  3. Add your code and your tests to src/root_mcp_server/tests/
  4. docker compose up --build
  5. ROOT_SIGNALS_API_KEY=<something> uv run pytest . (all tests should pass)
  6. ruff format . && ruff check --fix

Limitations

Network Resilience

The current implementation does not include backoff and retry mechanisms for API calls.
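If your use case requires resilience to transient failures, you can wrap calls yourself. The sketch below shows exponential backoff with jitter around any awaitable call; `flaky` is a stand-in for a real call such as `mcp_client.run_evaluation(...)`, and nothing in this snippet is part of the Root Signals API:

```python
import asyncio
import random


async def with_retries(call, *, attempts=3, base_delay=0.5):
    """Retry an async callable with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Sleep 0.5s, 1s, 2s, ... plus up to 100 ms of jitter.
            await asyncio.sleep(base_delay * 2**attempt + random.uniform(0, 0.1))


async def flaky(state={"n": 0}):
    # Stand-in for a real API call: fails twice, then succeeds.
    state["n"] += 1
    if state["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = asyncio.run(with_retries(flaky))
print(result)  # prints: ok
```

For production workloads a library such as tenacity offers the same pattern with more control (retryable exception filters, max elapsed time).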

Bundled MCP client is for reference only

This repo includes a root_mcp_server.client.RootSignalsMCPClient for reference, with no support guarantees, unlike the server. We recommend using your own client, or any of the official MCP clients, for production use.