GitHub - shinpr/mcp-image: MCP server for AI image generation and editing powered by Gemini 3 Pro Image Preview (Nano Banana Pro 🍌). For Cursor, Codex & more.

🍌 MCP Image Generator

Powered by Gemini 3 Pro Image - Nano Banana Pro 🍌

A powerful MCP (Model Context Protocol) server that enables AI assistants to generate and edit images using Google's Gemini 3 Pro Image (Nano Banana Pro 🍌). Seamlessly integrate advanced image generation capabilities into Codex, Cursor, Claude Code, and other MCP-compatible AI tools.

✨ Features

🔧 Prerequisites

🚀 Quick Start

1. Get Your Gemini API Key

Get your API key from Google AI Studio

2. MCP Configuration

For Codex

Add to ~/.codex/config.toml:

[mcp_servers.mcp-image]
command = "npx"
args = ["-y", "mcp-image"]

[mcp_servers.mcp-image.env]
GEMINI_API_KEY = "your_gemini_api_key_here"
IMAGE_OUTPUT_DIR = "/absolute/path/to/images"

For Cursor

Add to your Cursor settings:

{
  "mcpServers": {
    "mcp-image": {
      "command": "npx",
      "args": ["-y", "mcp-image"],
      "env": {
        "GEMINI_API_KEY": "your_gemini_api_key_here",
        "IMAGE_OUTPUT_DIR": "/absolute/path/to/images"
      }
    }
  }
}

For Claude Code

Run in your project directory to enable for that project:

cd /path/to/your/project
claude mcp add mcp-image --env GEMINI_API_KEY=your-api-key --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image

Or add globally for all projects:

claude mcp add mcp-image --scope user --env GEMINI_API_KEY=your-api-key --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image

⚠️ Security Note: Never commit your API key to version control. Keep it secure and use environment-specific configuration.
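One way to follow this advice is to keep the key in an environment variable and generate the config entry locally instead of typing the key into a file you might commit. The sketch below builds the same Cursor entry shown above; the helper name `build_cursor_entry` is hypothetical and not part of mcp-image, and the absolute-path check is an assumption based on the `/absolute/path/to/images` placeholder.

```python
import os


def build_cursor_entry(env, output_dir):
    """Return the mcpServers entry for Cursor, taking the API key from `env`.

    `env` is any mapping of environment variables (for example os.environ).
    This helper is illustrative only; it is not shipped with mcp-image.
    """
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY is not set in the environment")
    # Assumption: the output directory must be absolute, as the placeholder suggests.
    if not os.path.isabs(output_dir):
        raise ValueError("IMAGE_OUTPUT_DIR must be an absolute path")
    return {
        "mcpServers": {
            "mcp-image": {
                "command": "npx",
                "args": ["-y", "mcp-image"],
                "env": {
                    "GEMINI_API_KEY": api_key,
                    "IMAGE_OUTPUT_DIR": output_dir,
                },
            }
        }
    }
```

Calling `build_cursor_entry(os.environ, "/absolute/path/to/images")` and writing the result with `json.dumps` into a file that is listed in `.gitignore` keeps the key out of version control.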

📁 Path Requirements:

Optional: Skip Prompt Enhancement

Set SKIP_PROMPT_ENHANCEMENT=true to disable automatic prompt optimization and send your prompts directly to the image generator. Useful when you need full control over the exact prompt wording.

Codex:

[mcp_servers.mcp-image.env]
GEMINI_API_KEY = "your_gemini_api_key_here"
SKIP_PROMPT_ENHANCEMENT = "true"
IMAGE_OUTPUT_DIR = "/absolute/path/to/images"

Cursor: Add "SKIP_PROMPT_ENHANCEMENT": "true" to the env section in your config.

Claude Code:

claude mcp add mcp-image --env GEMINI_API_KEY=your-api-key --env SKIP_PROMPT_ENHANCEMENT=true --env IMAGE_OUTPUT_DIR=/absolute/path/to/images -- npx -y mcp-image

📖 Usage Examples

Once configured, your AI assistant can generate images using natural language:

Basic Image Generation

"Generate a serene mountain landscape at sunset with a lake reflection"

The system automatically enhances this to include rich details about lighting, materials, composition, and atmosphere for optimal results.

Image Editing

"Edit this image to make the person face right"
(with inputImagePath: "/path/to/image.jpg")

Advanced Features

Character Consistency:

"Generate a portrait of a medieval knight, maintaining character consistency for future variations"
(with maintainCharacterConsistency: true)

High-Resolution 4K Generation:

"Generate a professional product photo of a smartphone with clear text on the screen"
(with imageSize: "4K")

Custom Aspect Ratio:

"Generate a cinematic landscape of a desert at golden hour"
(with aspectRatio: "21:9")
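Your AI assistant normally builds the tool call for you, but it can help to see roughly what is sent. The sketch below assembles a request in the standard MCP `tools/call` JSON-RPC shape for the `generate_image` tool; the helper name `build_tool_call` is hypothetical, and the exact message your client sends may differ.

```python
import json


def build_tool_call(request_id, prompt, **options):
    """Build an MCP `tools/call` request for the generate_image tool.

    `options` carries optional tool parameters such as aspectRatio or imageSize.
    Illustrative helper only; MCP clients construct this message themselves.
    """
    arguments = {"prompt": prompt, **options}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "generate_image", "arguments": arguments},
    }


request = build_tool_call(
    1,
    "Generate a cinematic landscape of a desert at golden hour",
    aspectRatio="21:9",
)
print(json.dumps(request, indent=2))
```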

🔧 API Reference

generate_image Tool

The MCP server exposes a single tool for all image operations. Internally, it uses a two-stage process:

  1. Prompt Optimization: Gemini 2.0 Flash analyzes and enriches your prompt
  2. Image Generation: Gemini 3 Pro Image creates the final image
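The two stages above can be pictured as a small pipeline. This is a conceptual sketch, not the server's actual code: `run_pipeline`, `fake_enhance`, and `fake_generate` are hypothetical names, and the stand-in stages only manipulate strings instead of calling Gemini. The `skip_enhancement` flag mirrors what SKIP_PROMPT_ENHANCEMENT=true is described as doing.

```python
from typing import Callable


def run_pipeline(
    prompt: str,
    enhance: Callable[[str], str],
    generate: Callable[[str], bytes],
    skip_enhancement: bool = False,
) -> bytes:
    """Run stage 1 (prompt optimization) unless skipped, then stage 2 (generation)."""
    final_prompt = prompt if skip_enhancement else enhance(prompt)
    return generate(final_prompt)


# Stand-in stages for illustration only; the real server calls Gemini models here.
def fake_enhance(prompt: str) -> str:
    return prompt + ", golden-hour lighting, wide composition"


def fake_generate(prompt: str) -> bytes:
    # Pretend the encoded prompt is the image data.
    return prompt.encode("utf-8")


enhanced = run_pipeline("a desert", fake_enhance, fake_generate)
direct = run_pipeline("a desert", fake_enhance, fake_generate, skip_enhancement=True)
```

With enhancement skipped, the generator receives the prompt exactly as written, which is the behavior to choose when you need full control over the wording.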

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Text description or editing instruction |
| inputImagePath | string | - | Absolute path to input image for editing |
| fileName | string | - | Custom filename for output (auto-generated if not specified) |
| aspectRatio | string | - | Aspect ratio for the generated image. Supported values: 1:1 (square, default), 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
| imageSize | string | - | Image resolution for high-quality output. Specify 2K or 4K for higher resolution images with better text rendering and fine details. Leave unspecified for standard quality. Supported values: 2K, 4K |
| blendImages | boolean | - | Enable multi-image blending for combining multiple visual elements naturally |
| maintainCharacterConsistency | boolean | - | Maintain character appearance consistency across different poses and scenes |
| useWorldKnowledge | boolean | - | Use real-world knowledge for accurate context (recommended for historical figures, landmarks, or factual scenarios) |
| useGoogleSearch | boolean | - | Enable Google Search grounding to access real-time web information for factually accurate image generation. Use when the prompt requires current or time-sensitive data that may have changed since the model's knowledge cutoff. Leave disabled for creative, fictional, historical, or timeless content. |
| purpose | string | - | Intended use for the image (e.g., "cookbook cover", "social media post", "presentation slide"). Helps tailor visual style, quality level, and details to match the purpose. |
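If you build tool arguments by hand, a quick client-side check against the values listed above can catch typos before a request is sent. This is a minimal sketch: `validate_arguments` is a hypothetical helper, treating `prompt` as required is an assumption, and the server's own validation may be stricter or behave differently.

```python
import os

SUPPORTED_ASPECT_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
}
SUPPORTED_IMAGE_SIZES = {"2K", "4K"}


def validate_arguments(args: dict) -> list:
    """Return a list of problems found in generate_image arguments (empty if none)."""
    errors = []
    prompt = args.get("prompt")
    # Assumption: prompt is the one required parameter.
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt must be a non-empty string")
    ratio = args.get("aspectRatio")
    if ratio is not None and ratio not in SUPPORTED_ASPECT_RATIOS:
        errors.append(f"unsupported aspectRatio: {ratio}")
    size = args.get("imageSize")
    if size is not None and size not in SUPPORTED_IMAGE_SIZES:
        errors.append(f"unsupported imageSize: {size}")
    path = args.get("inputImagePath")
    if path is not None and not os.path.isabs(path):
        errors.append("inputImagePath must be an absolute path")
    return errors
```

For example, `validate_arguments({"prompt": "a knight", "aspectRatio": "7:5"})` reports the unsupported ratio, while the same call with `"21:9"` returns an empty list.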

Response

{
  "type": "resource",
  "resource": {
    "uri": "file:///path/to/generated/image.png",
    "name": "image-filename.png",
    "mimeType": "image/png"
  },
  "metadata": {
    "model": "gemini-3-pro-image-preview",
    "processingTime": 5000,
    "timestamp": "2024-01-01T12:00:00.000Z"
  }
}
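A client that wants to open the generated file needs to turn the `file://` URI into a local path. The sketch below does that for a response shaped like the example above; `resource_path` is a hypothetical helper, and it assumes the `uri` field always uses the `file` scheme.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def resource_path(response: dict) -> Path:
    """Extract the local filesystem path from a generate_image response."""
    uri = response["resource"]["uri"]
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got: {uri}")
    # Percent-encoded characters (for example %20) are decoded back to text.
    return Path(unquote(parsed.path))


sample_response = {
    "type": "resource",
    "resource": {
        "uri": "file:///path/to/generated/image.png",
        "name": "image-filename.png",
        "mimeType": "image/png",
    },
}
image_path = resource_path(sample_response)
```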

🛠️ Troubleshooting

Common Issues

"API key not found"

"Input image file not found"

"No image data found in Gemini API response"

Performance Tips

💰 Usage Notes

📄 License

MIT License - see LICENSE for details.


Need help? Open an issue or check the troubleshooting section above.