🚀 MCP-Airflow-API

Revolutionary Open Source Tool for Managing Apache Airflow with Natural Language



📋 Overview

Have you ever wished you could manage your Apache Airflow workflows with natural language instead of complex REST API calls or web UI navigation? MCP-Airflow-API is an open-source project that makes exactly that possible.

[Screenshot: MCP-Airflow-API]


🎯 What is MCP-Airflow-API?

MCP-Airflow-API is a Model Context Protocol (MCP) server that turns Apache Airflow REST API operations into natural language tools. It hides the complexity of the API surface and enables intuitive management of Airflow clusters through plain natural language commands.

🆕 Multi-Version API Support (NEW!)

Now supports both Airflow API v1 (2.x) and v2 (3.0+) with dynamic version selection via environment variable:

Key architecture: a single MCP server provides 43 shared common tools plus 2 v2-exclusive asset tools, and dynamically loads the appropriate toolset based on the AIRFLOW_API_VERSION environment variable.
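To make the version switch concrete, here is a minimal sketch of what version-gated tool selection can look like; the function and tool names below are illustrative, not the project's actual internals:

```python
import os

def load_toolset() -> list:
    """Sketch: choose the toolset from AIRFLOW_API_VERSION (illustrative names)."""
    common_tools = [f"common_tool_{i}" for i in range(1, 44)]  # 43 shared tools
    asset_tools = ["list_assets", "list_asset_events"]         # 2 v2-only tools (hypothetical names)
    version = os.getenv("AIRFLOW_API_VERSION", "v1").strip().lower()
    return common_tools + (asset_tools if version == "v2" else [])

print(len(load_toolset()))  # 43 under v1, 45 under v2
```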

Traditional approach (example):

```bash
curl -X GET "http://localhost:8080/api/v1/dags?limit=100&offset=0" \
  -H "Authorization: Basic YWlyZmxvdzphaXJmbG93"
```

MCP-Airflow-API approach (natural language):

"Show me the currently running DAGs"


🚀 Quickstart

๐Ÿ“ Need a test Airflow cluster? Use our companion project Airflow-Docker-Compose with support for both Airflow 2.x and Airflow 3.x environments!

[Flow diagram: Quickstart/Tutorial]

For quick evaluation and testing:

```bash
git clone https://github.com/call518/MCP-Airflow-API.git
cd MCP-Airflow-API

# Configure your Airflow credentials
cp .env.example .env
# Edit .env with your Airflow API settings

# Start all services
docker-compose up -d

# Access OpenWebUI at http://localhost:3002/
# API documentation at http://localhost:8002/docs
```

Getting Started with OpenWebUI (Docker Option)

  1. Access http://localhost:3002/
  2. Log in with admin account
  3. Go to "Settings" → "Tools" from the top menu
  4. Add Tool URL: http://localhost:8002/airflow-api
  5. Configure your LLM provider (Ollama, OpenAI, etc.)

📦 MCP Server Installation Methods

Method 1: Direct Installation from PyPI

```bash
uvx --python 3.12 mcp-airflow-api
```

Method 2: Claude-Desktop MCP Client Integration

Local Access (stdio mode)

{ "mcpServers": { "mcp-airflow-api": { "command": "uvx", "args": ["--python", "3.12", "mcp-airflow-api"], "env": { "AIRFLOW_API_VERSION": "v2", "AIRFLOW_API_BASE_URL": "http://localhost:8080/api", "AIRFLOW_API_USERNAME": "airflow", "AIRFLOW_API_PASSWORD": "airflow" } } } }

Remote Access (streamable-http mode without authentication)

{ "mcpServers": { "mcp-airflow-api": { "type": "streamable-http", "url": "http://localhost:8000/mcp" } } }

Remote Access (streamable-http mode with Bearer token authentication - Recommended)

{ "mcpServers": { "mcp-airflow-api": { "type": "streamable-http", "url": "http://localhost:8000/mcp", "headers": { "Authorization": "Bearer your-secure-secret-key-here" } } } }

Multiple Airflow Clusters with Different Versions

{ "mcpServers": { "airflow-2x-cluster": { "command": "uvx", "args": ["--python", "3.12", "mcp-airflow-api"], "env": { "AIRFLOW_API_VERSION": "v1", "AIRFLOW_API_BASE_URL": "http://localhost:38080/api", "AIRFLOW_API_USERNAME": "airflow", "AIRFLOW_API_PASSWORD": "airflow" } }, "airflow-3x-cluster": { "command": "uvx", "args": ["--python", "3.12", "mcp-airflow-api"], "env": { "AIRFLOW_API_VERSION": "v2", "AIRFLOW_API_BASE_URL": "http://localhost:48080/api", "AIRFLOW_API_USERNAME": "airflow", "AIRFLOW_API_PASSWORD": "airflow" } } } }

💡 Pro Tip: Use the test clusters from Airflow-Docker-Compose for the above configuration; they run on ports 38080 (2.x) and 48080 (3.x) respectively!

Method 3: Development Installation

```bash
git clone https://github.com/call518/MCP-Airflow-API.git
cd MCP-Airflow-API
pip install -e .

# Run in stdio mode
python -m mcp_airflow_api
```


🌟 Key Features

  1. Natural Language Queries
    No need to learn complex API syntax. Just ask as you would naturally speak:
    • "What DAGs are currently running?"
    • "Show me the failed tasks"
    • "Find DAGs containing ETL"
  2. Comprehensive Monitoring Capabilities
    Real-time cluster status monitoring:
    • Cluster health monitoring
    • DAG status and performance analysis
    • Task execution log tracking
    • XCom data management
  3. Dynamic API Version Support
    Single MCP server adapts to your Airflow version:
    • API v1: 43 shared tools for Airflow 2.x compatibility
    • API v2: 43 shared tools + 2 asset management tools for Airflow 3.0+
    • Environment Variable Control: Switch versions instantly with AIRFLOW_API_VERSION
    • Zero Configuration Changes: Same tool names, enhanced capabilities
    • Efficient Architecture: Shared common codebase eliminates duplication
  4. Comprehensive Tool Coverage
    Covers almost all Airflow API functionality:
    • DAG management (trigger, pause, resume)
    • Task instance monitoring
    • Pool and variable management
    • Connection configuration
    • Configuration queries
    • Event log analysis
  5. Large Environment Optimization
    Efficiently handles large environments with 1000+ DAGs (see the pagination sketch after this list):
    • Smart pagination support
    • Advanced filtering options
    • Batch processing capabilities
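The sketch below illustrates the pagination pattern the underlying REST API supports, paging through /v1/dags with the standard limit/offset parameters; the helper function itself is illustrative and not part of this project:

```python
import requests

def list_all_dags(base_url: str, user: str, password: str, page_size: int = 100) -> list:
    """Page through /v1/dags with limit/offset until every DAG is fetched."""
    dags, offset = [], 0
    while True:
        resp = requests.get(
            f"{base_url}/v1/dags",
            params={"limit": page_size, "offset": offset},
            auth=(user, password),
            timeout=10,
        )
        resp.raise_for_status()
        page = resp.json()["dags"]
        dags.extend(page)
        if len(page) < page_size:  # a short page means the end was reached
            return dags
        offset += page_size

# Example: list_all_dags("http://localhost:8080/api", "airflow", "airflow")
```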

๐Ÿ› ๏ธ Technical Advantages


Use Cases in Action

[Screenshots: Capacity Management for Operations Teams]


โš™๏ธ Advanced Configuration

Environment Variables

Required - Dynamic API Version Selection (NEW!)

```bash
# Single server supports both v1 and v2 - just change this variable!
AIRFLOW_API_VERSION=v1   # v1 for Airflow 2.x, v2 for Airflow 3.0+
AIRFLOW_API_BASE_URL=http://localhost:8080/api
```

Test Cluster Connection Examples:

```bash
# For Airflow 2.x test cluster (from Airflow-Docker-Compose)
AIRFLOW_API_VERSION=v1
AIRFLOW_API_BASE_URL=http://localhost:38080/api

# For Airflow 3.x test cluster (from Airflow-Docker-Compose)
AIRFLOW_API_VERSION=v2
AIRFLOW_API_BASE_URL=http://localhost:48080/api
```

Authentication

```bash
AIRFLOW_API_USERNAME=airflow
AIRFLOW_API_PASSWORD=airflow
```

Optional - MCP Server Configuration

```bash
MCP_LOG_LEVEL=INFO   # DEBUG/INFO/WARNING/ERROR/CRITICAL
FASTMCP_TYPE=stdio   # stdio/streamable-http
FASTMCP_PORT=8000    # HTTP server port (Docker mode)
```

Bearer Token Authentication for streamable-http mode

```bash
# Enable authentication (recommended for production)
# Default: false (when undefined, empty, or null)
# Accepted values: true/false, 1/0, yes/no, on/off (case-insensitive)
REMOTE_AUTH_ENABLE=false
REMOTE_SECRET_KEY=your-secure-secret-key-here
```
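A minimal sketch of how such truthy parsing can be implemented; this mirrors the documented semantics but is not necessarily the project's exact code:

```python
import os

TRUTHY = {"true", "1", "yes", "on"}

def remote_auth_enabled() -> bool:
    """Undefined, empty, or unrecognized values count as false."""
    return os.getenv("REMOTE_AUTH_ENABLE", "").strip().lower() in TRUTHY
```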

API Version Comparison

| Feature | API v1 (Airflow 2.x) | API v2 (Airflow 3.0+) |
|---|---|---|
| Total Tools | 43 tools | 45 tools |
| Shared Tools | 43 (100%) | 43 (96%) |
| Exclusive Tools | 0 | 2 (Asset Management) |
| Basic DAG Operations | ✅ | ✅ Enhanced |
| Task Management | ✅ | ✅ Enhanced |
| Connection Management | ✅ | ✅ Enhanced |
| Pool Management | ✅ | ✅ Enhanced |
| Asset Management | ❌ | ✅ New |
| Asset Events | ❌ | ✅ New |
| Data-Aware Scheduling | ❌ | ✅ New |
| Enhanced DAG Warnings | ❌ | ✅ New |
| Advanced Filtering | Basic | ✅ Enhanced |

๐Ÿ” Security & Authentication

Bearer Token Authentication

For streamable-http mode, this MCP server supports Bearer token authentication to secure remote access. This is especially important when running the server in production environments.

Configuration

Enable Authentication:

```bash
# In .env file
REMOTE_AUTH_ENABLE=true
REMOTE_SECRET_KEY=your-secure-secret-key-here
```

Or via CLI:

```bash
python -m mcp_airflow_api --type streamable-http --auth-enable --secret-key your-secure-secret-key-here
```

Security Levels

  1. stdio mode (Default): Local-only access, no authentication needed
  2. streamable-http + REMOTE_AUTH_ENABLE=false: Remote access without authentication ⚠️ NOT RECOMMENDED for production
  3. streamable-http + REMOTE_AUTH_ENABLE=true: Remote access with Bearer token authentication ✅ RECOMMENDED for production

Note: REMOTE_AUTH_ENABLE defaults to false when undefined, empty, or null. Supported values are true/false, 1/0, yes/no, on/off (case insensitive).

Client Configuration

When authentication is enabled, MCP clients must include the Bearer token in the Authorization header:

{ "mcpServers": { "mcp-airflow-api": { "type": "streamable-http", "url": "http://your-server:8000/mcp", "headers": { "Authorization": "Bearer your-secure-secret-key-here" } } } }

Security Best Practices

Error Handling

When authentication fails (missing or invalid Bearer token), the server rejects the request; an HTTP 401 Unauthorized response is the expected result.
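A quick, hedged Python probe for checking this behavior; it inspects only the HTTP status code, and the exact non-401 code for a tokenless MCP handshake depends on the server implementation:

```python
import requests

URL = "http://localhost:8000/mcp"

# Without a token: expect rejection (401) when REMOTE_AUTH_ENABLE=true.
print(requests.post(URL, json={}, timeout=5).status_code)

# With the configured Bearer token: expect anything other than 401.
headers = {"Authorization": "Bearer your-secure-secret-key-here"}
print(requests.post(URL, json={}, headers=headers, timeout=5).status_code)
```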


Custom Docker Compose Setup

```yaml
version: '3.8'
services:
  mcp-server:
    build:
      context: .
      dockerfile: Dockerfile.MCP-Server
    environment:
      - FASTMCP_PORT=8000
      - AIRFLOW_API_VERSION=v1
      - AIRFLOW_API_BASE_URL=http://your-airflow:8080/api
      - AIRFLOW_API_USERNAME=airflow
      - AIRFLOW_API_PASSWORD=airflow
```



🧪 Test Airflow Cluster Deployment

For testing and development, use our companion project Airflow-Docker-Compose which supports both Airflow 2.x and 3.x environments.

Quick Setup

  1. Clone the test environment repository:

```bash
git clone https://github.com/call518/Airflow-Docker-Compose.git
cd Airflow-Docker-Compose
```

Option 1: Deploy Airflow 2.x (LTS)

For testing API v1 compatibility with stable production features:

```bash
# Navigate to Airflow 2.x environment
cd airflow-2.x

# (Optional) Customize environment variables
cp .env.template .env
# Edit .env file as needed

# Deploy Airflow 2.x cluster
./run-airflow-cluster.sh
```

Access the Web UI at http://localhost:38080 (username: airflow / password: airflow).


Option 2: Deploy Airflow 3.x (Latest)

For testing API v2 with latest features including Assets management:

```bash
# Navigate to Airflow 3.x environment
cd airflow-3.x

# (Optional) Customize environment variables
cp .env.template .env
# Edit .env file as needed

# Deploy Airflow 3.x cluster
./run-airflow-cluster.sh
```

Access the API server at http://localhost:48080 (username: airflow / password: airflow).


Option 3: Deploy Both Versions Simultaneously

For comprehensive testing across different Airflow versions:

```bash
# Start Airflow 2.x (port 38080)
cd airflow-2.x && ./run-airflow-cluster.sh

# Start Airflow 3.x (port 48080)
cd ../airflow-3.x && ./run-airflow-cluster.sh
```

Key Differences

| Feature | Airflow 2.x | Airflow 3.x |
|---|---|---|
| Authentication | Basic Auth | JWT Tokens (FabAuthManager) |
| Default Port | 38080 | 48080 |
| API Endpoints | /api/v1/* | /api/v2/* |
| Assets Support | ❌ Limited/Experimental | ✅ Full Support |
| Provider Packages | providers | distributions |
| Stability | ✅ Production Ready | 🧪 Beta/Development |

Cleanup

To stop and clean up the test environments:

```bash
# For Airflow 2.x
cd airflow-2.x && ./cleanup-airflow-cluster.sh

# For Airflow 3.x
cd airflow-3.x && ./cleanup-airflow-cluster.sh
```


🌈 Future-Ready Architecture


🎯 Who Is This Tool For?


🚀 Open Source Contribution and Community

Repository: https://github.com/call518/MCP-Airflow-API

How to Contribute

Please consider starring the project if you find it useful.


🔮 Conclusion

MCP-Airflow-API changes the paradigm of data engineering and workflow management:
No need to memorize REST API calls; just ask in natural language:

"Show me the status of currently running ETL jobs."


๐Ÿท๏ธ Tags

#Apache-Airflow #MCP #ModelContextProtocol #DataEngineering #DevOps #WorkflowAutomation #NaturalLanguage #OpenSource #Python #Docker #AI-Integration


📚 Example Queries & Use Cases

This section provides comprehensive examples of how to use MCP-Airflow-API tools with natural language queries.

Basic DAG Operations

Cluster Management & Health

Pool Management

Variable Management

Task Instance Management

XCom Management

Configuration Management

Important: Configuration tools require expose_config = True in the [webserver] section of airflow.cfg. Even admin users receive 403 errors if this setting is disabled.
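You can check this up front by hitting the config endpoint directly; GET /api/v1/config is a standard Airflow 2.x endpoint, and a 403 here indicates expose_config is disabled (the URL and credentials below are the example defaults):

```python
import requests

resp = requests.get(
    "http://localhost:8080/api/v1/config",
    auth=("airflow", "airflow"),
    timeout=10,
)
if resp.status_code == 403:
    print("expose_config is disabled in airflow.cfg [webserver]")
else:
    resp.raise_for_status()
    print(f"Config exposed: {len(resp.json().get('sections', []))} sections")
```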

DAG Analysis & Monitoring

Note: get_dags_detailed_batch returns each DAG with both configuration details (from get_dag()) and a latest_dag_run field containing the most recent execution information (run_id, state, execution_date, start_date, end_date, etc.).
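For orientation only, one entry of such a batch result can be pictured roughly as follows; the field names follow the note above, while the values are invented placeholders:

```python
example_entry = {
    "dag_id": "example_etl",            # configuration details from get_dag()
    "schedule_interval": "@daily",
    "is_paused": False,
    "latest_dag_run": {                 # most recent execution
        "run_id": "scheduled__2024-01-01T00:00:00+00:00",
        "state": "success",
        "execution_date": "2024-01-01T00:00:00+00:00",
        "start_date": "2024-01-01T00:00:05+00:00",
        "end_date": "2024-01-01T00:03:12+00:00",
    },
}
```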

Date Calculation Examples

Tools automatically base relative date calculations on the server's current date/time:

| User Input | Calculation Method | Example Format |
|---|---|---|
| "yesterday" | current_date - 1 day | YYYY-MM-DD (1 day before current) |
| "last week" | current_date - 7 days to current_date - 1 day | YYYY-MM-DD to YYYY-MM-DD (7-day range) |
| "last 3 days" | current_date - 3 days to current_date | YYYY-MM-DD to YYYY-MM-DD (3-day range) |
| "this morning" | current_date 00:00 to 12:00 | YYYY-MM-DDTHH:mm:ssZ format |

The server always uses its current date/time for these calculations.
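A standard-library sketch of the arithmetic described above (illustrative; not the project's actual implementation):

```python
from datetime import date, timedelta

today = date.today()  # the server's current date

yesterday = today - timedelta(days=1)                                # "yesterday"
last_week = (today - timedelta(days=7), today - timedelta(days=1))   # "last week"
last_3_days = (today - timedelta(days=3), today)                     # "last 3 days"

print(yesterday.isoformat())
print(f"{last_week[0].isoformat()} to {last_week[1].isoformat()}")
```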

Asset Management (API v2 Only)

Available only when AIRFLOW_API_VERSION=v2 (Airflow 3.0+):
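For orientation, listing assets straight from the v2 REST API looks roughly like this; GET /api/v2/assets is part of the Airflow 3.0 public API, while the token below is a placeholder (Airflow 3.x uses JWT authentication, so obtaining it is environment-specific):

```python
import requests

headers = {"Authorization": "Bearer <your-jwt-token>"}  # placeholder token
resp = requests.get(
    "http://localhost:48080/api/v2/assets",
    params={"limit": 20},
    headers=headers,
    timeout=10,
)
resp.raise_for_status()
for asset in resp.json()["assets"]:
    print(asset["uri"])
```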

Data-Aware Scheduling Examples:


Contributing

๐Ÿค Got ideas? Found bugs? Want to add cool features?

We're always excited to welcome new contributors! Whether you're fixing a typo, adding a new monitoring tool, or improving documentation - every contribution makes this project better.

Ways to contribute:

Pro tip: The codebase is designed to be super friendly for adding new tools. Check out the existing @mcp.tool() functions in airflow_api.py.


๐Ÿ› ๏ธ Adding Custom Tools (Advanced)

This MCP server is designed for easy extensibility. After you have explored the main features and Quickstart, you can add your own custom tools as follows:

Step-by-Step Guide

1. Add Helper Functions (Optional)

Add reusable data functions to src/mcp_airflow_api/functions.py:

```python
async def get_your_custom_data(target_resource: str = None) -> List[Dict[str, Any]]:
    """Your custom data retrieval function."""
    # Example implementation - adapt to your service
    data_source = await get_data_connection(target_resource)
    results = await fetch_data_from_source(
        source=data_source,
        filters=your_conditions,
        aggregations=["count", "sum", "avg"],
        sorting=["count DESC", "timestamp ASC"],
    )
    return results
```

2. Create Your MCP Tool

Add your tool function to src/mcp_airflow_api/airflow_api.py:

```python
@mcp.tool()
async def get_your_custom_analysis(limit: int = 50, target_name: Optional[str] = None) -> str:
    """
    [Tool Purpose]: Brief description of what your tool does

    [Exact Functionality]:

    [Required Use Cases]:

    Args:
        limit: Maximum results (1-100)
        target_name: Target resource/service name

    Returns:
        Formatted analysis results
    """
    try:
        limit = max(1, min(limit, 100))  # Always validate input
        results = await get_your_custom_data(target_resource=target_name)
        if results:
            results = results[:limit]
        return format_table_data(results, f"Custom Analysis (Top {len(results)})")
    except Exception as e:
        logger.error(f"Failed to get custom analysis: {e}")
        return f"Error: {str(e)}"
```

3. Update Imports (If Needed)

Add your helper function to imports in src/mcp_airflow_api/airflow_api.py:

```python
from .functions import (
    # ...existing imports...
    get_your_custom_data,  # Add your new function
)
```

4. Document Your Tool

Add your tool description to src/mcp_airflow_api/prompt_template.md for better natural language recognition:

```markdown
Your Custom Analysis Tool

X. get_your_custom_analysis
Purpose: Brief description of what your tool does
Usage: "Show me your custom analysis" or "Get custom analysis for database_name"
Features: Data aggregation, resource monitoring, performance metrics
Required: target_name parameter for specific resource analysis
```

5. Test Your Tool

```bash
# Local testing
./scripts/run-mcp-inspector-local.sh

# Or with Docker
docker-compose up -d
docker-compose logs -f mcp-server
```

Test with natural language:

"Show me your custom analysis"

"Get custom analysis for target_name"

That's it! Your custom tool is ready to use with natural language queries.

License

Freely use, modify, and distribute under the MIT License.