Unstructured MCP
Enable extraction and utilization of content from various unstructured document formats, supporting seamless storage and retrieval via AWS S3. Process documents directly in applications to enhance data extraction capabilities for LLMs.
Author: MKhalusova
No License
Quick Info
Tools 1
Last Updated 22/5/2025
Tags: unstructured documents, s3, unstructured document, unstructured mcp, document processing
A Model Context Protocol server that provides unstructured document processing capabilities. This server enables LLMs to extract and use content from an unstructured document.
This repo is a work in progress; proceed with caution :)
Supported file types:
{".abw", ".bmp", ".csv", ".cwk", ".dbf", ".dif", ".doc", ".docm", ".docx", ".dot",
".dotm", ".eml", ".epub", ".et", ".eth", ".fods", ".gif", ".heic", ".htm", ".html",
".hwp", ".jpeg", ".jpg", ".md", ".mcw", ".mw", ".odt", ".org", ".p7s", ".pages",
".pbd", ".pdf", ".png", ".pot", ".potm", ".ppt", ".pptm", ".pptx", ".prn", ".rst",
".rtf", ".sdp", ".sgl", ".svg", ".sxg", ".tiff", ".txt", ".tsv", ".uof", ".uos1",
".uos2", ".web", ".webp", ".wk2", ".xls", ".xlsb", ".xlsm", ".xlsx", ".xlw", ".xml",
".zabw"}
Prerequisites. You'll need:
- An Unstructured API key (see the Unstructured documentation for how to obtain one)
- Claude Desktop installed locally
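Before starting the server, it can help to confirm that the API key is actually visible. The setup steps place it in a .env file under the name UNSTRUCTURED_API_KEY. The sketch below is a standalone check using only the standard library. It does not show how doc_processor.py itself loads the key, and the helper names `read_env_file` and `find_api_key` are hypothetical.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(env_file: str = ".env"):
    """Return the key from the process environment, else from the .env file, else None."""
    name = "UNSTRUCTURED_API_KEY"
    return os.environ.get(name) or read_env_file(env_file).get(name)
```

Calling `find_api_key()` from the repo root returns the key string if it is set, or `None` if it is missing from both the environment and the .env file.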
Quick TL;DR on how to add this MCP server to Claude Desktop:
- Clone the repo and set up the UV environment.
- Create a .env file in the root directory and add the following env variable: UNSTRUCTURED_API_KEY.
- Run the MCP server: uv run doc_processor.py
- Go to ~/Library/Application Support/Claude/ and create a claude_desktop_config.json. In that file add:
{
  "mcpServers": {
    "unstructured_doc_processor": {
      "command": "PATH/TO/YOUR/UV",
      "args": [
        "--directory",
        "ABSOLUTE/PATH/TO/YOUR/unstructured-mcp/",
        "run",
        "doc_processor.py"
      ],
      "disabled": false
    }
  }
}
- Restart Claude Desktop. You should now be able to use the MCP.
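The config step above can also be scripted instead of edited by hand. The sketch below is not part of the repo. It builds the same `unstructured_doc_processor` entry shown in the JSON and merges it into an existing config file without dropping other servers. The function names `build_server_entry` and `merge_into_config` are hypothetical, and the usage comments assume the script is run from the cloned repo root on macOS.

```python
import json
from pathlib import Path


def build_server_entry(uv_path: str, repo_dir: str) -> dict:
    """Build the server entry with the same keys as the JSON in the steps above."""
    return {
        "command": uv_path,
        "args": ["--directory", repo_dir, "run", "doc_processor.py"],
        "disabled": False,
    }


def merge_into_config(config_path: Path, entry: dict,
                      name: str = "unstructured_doc_processor") -> dict:
    """Add or replace one server entry, keeping any other configured servers."""
    config = {}
    if config_path.is_file():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example usage (assumptions: uv is on PATH, current directory is the repo root):
#   import shutil
#   uv = shutil.which("uv")
#   cfg = (Path.home() / "Library" / "Application Support" / "Claude"
#          / "claude_desktop_config.json")
#   merge_into_config(cfg, build_server_entry(uv, str(Path.cwd())))
```

Merging rather than overwriting matters when Claude Desktop already has other MCP servers configured. Claude Desktop still needs a restart afterwards, as in the last step.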