GitHub - Unstructured-IO/UNS-MCP (original) (raw)

Unstructured API MCP Server

An MCP server implementation for interacting with the Unstructured API. This server provides tools to list sources and workflows.

Available Tools

Tool Description
list_sources Lists available sources from the Unstructured API.
get_source_info Get detailed information about a specific source connector.
create_source_connector Create a source connector.)
update_source_connector Update an existing source connector by params.
delete_source_connector Delete a source connector by source id.
list_destinations Lists available destinations from the Unstructured API.
get_destination_info Get detailed info about a specific destination connector
create_destination_connector Create a destination connector by params.
update_destination_connector Update an existing destination connector by destination id.
delete_destination_connector Delete a destination connector by destination id.
list_workflows Lists workflows from the Unstructured API.
get_workflow_info Get detailed information about a specific workflow.
create_workflow Create a new workflow with source, destination id, etc.
run_workflow Run a specific workflow with workflow id
update_workflow Update an existing workflow by params.
delete_workflow Delete a specific workflow by id.
list_jobs Lists jobs for a specific workflow from the Unstructured API.
get_job_info Get detailed information about a specific job by job id.
cancel_job Delete a specific job by id.
list_workflows_with_finished_jobs Lists all workflows that have any completed job, together with information about source and destination details.

Below is a list of connectors the UNS-MCP server currently supports, please see the full list of source connectors that Unstructured platform supports here and destination list here. We are planning on adding more!

Source Destination
S3 S3
Azure Weaviate
Google Drive Pinecone
OneDrive AstraDB
Salesforce MongoDB
Sharepoint Neo4j
Databricks Volumes
Databricks Volumes Delta Table

To use the tool that creates/updates/deletes a connector, the credentials for that specific connector must be defined in your .env file. Below is the list of credentials for the connectors we support:

Credential Name Description
ANTHROPIC_API_KEY required to run the minimal_client to interact with our server.
AWS_KEY, AWS_SECRET required to create S3 connector via uns-mcp server, see how in documentation and here
WEAVIATE_CLOUD_API_KEY required to create Weaviate vector db connector, see how in documentation
FIRECRAWL_API_KEY required to use Firecrawl tools in external/firecrawl.py, sign up on Firecrawl and get an API key.
ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT required to create Astradb connector via uns-mcp server, see how in documentation
AZURE_CONNECTION_STRING required option 1 to create Azure connector via uns-mcp server, see how in documentation
AZURE_ACCOUNT_NAME+AZURE_ACCOUNT_KEY required option 2 to create Azure connector via uns-mcp server, see how in documentation
AZURE_ACCOUNT_NAME+AZURE_SAS_TOKEN required option 3 to create Azure connector via uns-mcp server, see how in documentation
NEO4J_PASSWORD required to create Neo4j connector via uns-mcp server, see how in documentation
MONGO_DB_CONNECTION_STRING required to create Mongodb connector via uns-mcp server, see how in documentation
GOOGLEDRIVE_SERVICE_ACCOUNT_KEY a string value. The original server account key (follow documentation) is in json file, run base64 < /path/to/google_service_account_key.json in terminal to get the string value
DATABRICKS_CLIENT_ID,DATABRICKS_CLIENT_SECRET required to create Databricks volume/delta table connector via uns-mcp server, see how in documentation and here
ONEDRIVE_CLIENT_ID, ONEDRIVE_CLIENT_CRED,ONEDRIVE_TENANT_ID required to create One Drive connector via uns-mcp server, see how in documentation
PINECONE_API_KEY required to create Pinecone vector DB connector via uns-mcp server, see how in documentation
SALESFORCE_CONSUMER_KEY,SALESFORCE_PRIVATE_KEY required to create salesforce source connector via uns-mcp server, see how in documentation
SHAREPOINT_CLIENT_ID, SHAREPOINT_CLIENT_CRED,SHAREPOINT_TENANT_ID required to create One Drive connector via uns-mcp server, see how in documentation
LOG_LEVEL Used to set logging level for our minimal_client, e.g. set to ERROR to get everything
CONFIRM_TOOL_USE set to true so that minimal_client can confirm execution before each tool call
DEBUG_API_REQUESTS set to true so that uns_mcp/server.py can output request parameters for better debugging

Firecrawl Source

Firecrawl is a web crawling API that provides two main capabilities in our MCP:

  1. HTML Content Retrieval: Using invoke_firecrawl_crawlhtml to start crawl jobs and check_crawlhtml_status to monitor them
  2. LLM-Optimized Text Generation: Using invoke_firecrawl_llmtxt to generate text and check_llmtxt_status to retrieve results

How Firecrawl works:

Web Crawling Process:

LLM Text Generation:

Note: A FIRECRAWL_API_KEY environment variable must be set to use these functions.

Installation & Configuration

This guide provides step-by-step instructions to set up and configure the UNS_MCP server using Python 3.12 and the uv tool.

Prerequisites

No additional installation is required when using uvx as it handles execution. However, if you prefer to install the package directly:

Configure Claude Desktop

For integration with Claude Desktop, add the following content to your claude_desktop_config.json:

Note: The file is located in the ~/Library/Application Support/Claude/ directory.

Using uvx Command:

{ "mcpServers": { "UNS_MCP": { "command": "uvx", "args": ["uns_mcp"], "env": { "UNSTRUCTURED_API_KEY": "" } } } }

Alternatively, Using Python Package:

{ "mcpServers": { "UNS_MCP": { "command": "python", "args": ["-m", "uns_mcp"], "env": { "UNSTRUCTURED_API_KEY": "" } } } }

Using Source Code

  1. Clone the repository.
  2. Install dependencies:
  3. Set your Unstructured API key as an environment variable. Create a .env file in the root directory with the following content:
    UNSTRUCTURED_API_KEY="YOUR_KEY"
    Refer to .env.template for the configurable environment variables.

You can now run the server using one of the following methods:

Using Editable Package InstallationInstall as an editable package:

Update your Claude Desktop config:

{ "mcpServers": { "UNS_MCP": { "command": "uvx", "args": ["uns_mcp"] } } }

Note: Remember to point to the uvx executable in environment where you installed the package

Using SSE Server Protocol

Note: Not supported by Claude Desktop.

For SSE protocol, you can debug more easily by decoupling the client and server:

  1. Start the server in one terminal:
    uv run python uns_mcp/server.py --host 127.0.0.1 --port 8080

or

make sse-server 2. Test the server using a local client in another terminal:
uv run python minimal_client/client.py "http://127.0.0.1:8080/sse"

or

make sse-client

Note: To stop the services, use Ctrl+C on the client first, then the server.

Using Stdio Server Protocol

Configure Claude Desktop to use stdio:

{ "mcpServers": { "UNS_MCP": { "command": "ABSOLUTE/PATH/TO/.local/bin/uv", "args": [ "--directory", "ABSOLUTE/PATH/TO/YOUR-UNS-MCP-REPO/uns_mcp", "run", "server.py" ] } } }

Alternatively, run the local client:

uv run python minimal_client/client.py uns_mcp/server.py

Additional Local Client Configuration

Configure the minimal client using environmental variables:

Debugging tools

Anthropic provides MCP Inspector tool to debug/test your MCP server. Run the following command to spin up a debugging UI. From there, you will be able to add environment variables (pointing to your local env) on the left pane. Include your personal API key there as env var. Go to tools, you can test out the capabilities you add to the MCP server.

mcp dev uns_mcp/server.py

If you need to log request call parameters to UnstructuredClient, set the environment variable DEBUG_API_REQUESTS=false. The logs are stored in a file with the format unstructured-client-{date}.log, which can be examined to debug request call parameters to UnstructuredClient functions.

Add terminal access to minimal client

We are going to use @wonderwhy-er/desktop-commander to add terminal access to the minimal client. It is built on the MCP Filesystem Server. Be careful, as the client (also LLM) now has access to private files.

Execute the following command to install the package:

npx @wonderwhy-er/desktop-commander setup

Then start client with extra parameter:

uv run python minimal_client/client.py "http://127.0.0.1:8080/sse" "@wonderwhy-er/desktop-commander"

or

make sse-client-terminal

Using subset of tools

If your client supports using only subset of tools here are the list of things you should be aware:

Known issues

CHANGELOG.md

Any new developed features/fixes/enhancements will be added to CHANGELOG.md. 0.x.x-dev pre-release format is preferred before we bump to a stable version.

Troubleshooting