Document Processing MCP Repositories
326 repositories in this category.
mcp-file-merger
Combine multiple files into a single file efficiently, providing detailed reports on file sizes and merge summaries. Access is restricted to user-defined directories for security.
markitdown
Converts various file formats to Markdown, facilitating integration with LLM applications and enabling text analysis pipelines while preserving document structure. Includes features for audio transcription and document intelligence to enhance data processing capabilities.
mcp-url-fetcher
Fetch and transform web content from any URL into formats like HTML, JSON, Markdown, or plain text. This MCP server supports various input types and intelligently detects source formats for seamless content conversion.
sanity-mcp-server
Manage Sanity.io content by creating, editing, and listing documents from within an LLM interface. Integrates with Claude Desktop.
dart-mcp-server
Manage tasks, documents, and workspaces: create and update tasks, set priorities, assign team members, and handle documents.
youtube_transcriptor
Extracts transcripts from YouTube videos, including both manual and auto-generated captions, given a video URL. Integrates with MCP clients for workflows that involve video transcription.
chatterboxio-mcp-server
Integrates with online meeting platforms like Zoom and Google Meet to facilitate AI agents joining meetings, capturing transcripts, and generating concise summaries of discussions.
mcp-server-esignatures
Manage contracts and templates by drafting, querying, withdrawing, and deleting contracts efficiently.
skrape-mcp
Convert web pages into clean, structured Markdown suitable for large language model (LLM) consumption, streamlining the process of feeding web content into AI applications.
prem-mcp-server
Integrates with Prem AI's features for chat interactions and document management, supporting Retrieval-Augmented Generation with document repositories and real-time streaming responses.
docs2prompt-mcp
Transforms documentation from GitHub repositories or dedicated websites into LLM-friendly prompts for enhanced context and understanding in AI applications.
minima
Minima is an open-source RAG server that runs on-premises and integrates with ChatGPT and MCP. It stores and processes data securely on the local machine and supports querying local documents through customizable GPT interfaces.
markai
MarkAI is a platform that enables users to ask questions and receive answers derived from their documents, providing efficient data access. It supports various file formats and offers both public and private collaboration options.
Medical-report-analyzer
Analyze medical reports and symptoms to gain health insights and suggestions, providing detailed medicine information tailored to individual needs with bilingual support in English and Bengali.
mcp-terragrunt-docs
Provides access to up-to-date Terragrunt documentation and GitHub issue information for enhanced infrastructure-as-code development. Enables contextual querying and assistance for AI workflows or IDE integrations.
mcp-docling
Convert documents to Markdown, extract tables, and process multiple files.
mcp-jina-ai
Access Jina AI's web services for web page reading, web search, and fact checking. Extract and format content from web pages for use with LLMs.
markitdown_mcp_server
Converts various file formats to Markdown, utilizing the MarkItDown utility to handle documents, images, and audio files.
mcp-server-box
Integrate with the Box API to perform file operations, including file search, text extraction, and AI-based querying. Manage and process Box data efficiently with advanced AI capabilities.
mcp-server
Automates the collection of project information from GitHub for students, assisting in resume writing, interview question generation, and portfolio management. Provides tools for project-based self-introduction and interview practice to streamline career preparation.
macos-ocr-mcp
Perform Optical Character Recognition (OCR) on images with the help of macOS's Vision framework, extracting recognized text segments, confidence scores, and bounding box coordinates. Suitable for applications that require text extraction from image files.
obsidian-mcp
Interact with an Obsidian vault to read, write, and manipulate notes through a standardized interface.
mcp-video-digest
Extract audio from various video platforms like YouTube and TikTok, and convert the audio to text using multiple transcription services. Supports asynchronous processing and speaker separation for enhanced video content analysis.
mcp-doc
Integrates LLM applications with specific documentation sources, enabling access and retrieval of documentation files to enhance knowledge and responses. Provides tools for fetching documentation from specified URLs within those files.
medical-coding-reproducibility
Automates the process of assigning diagnosis and procedure codes from electronic health records. Utilizes advanced models to improve accuracy and efficiency in medical coding tasks, with tools and datasets from MIMIC-III and MIMIC-IV.
file-converter-mcp
Convert between document and image formats, such as DOCX to PDF, PDF to DOCX, and among image formats (JPG, PNG, WebP, etc.).
mcp-framework
Create custom tools to interact with large language models, facilitating web content fetching and processing of various document formats including PDF, Word, and Excel. Supports advanced features such as OCR for image content in documents and enhances workflow automation.
Append-Data-to-JSON-File-and-Display-JSON-data-to-HTML-Table-using-Ajax-Jquery-getJSON-method
Append data to a JSON file and display it in an HTML table using Ajax and jQuery's getJSON method. This enables dynamic data loading for web applications, enhancing data management and user experience.
excel-mcp-server
Read and write data in Microsoft Excel files, including text values and formulas. It supports creating new sheets and offers live editing and screen capture functionalities on Windows.
mcp-server-novacv
Connect to the NovaCV API for generating professional resumes, analyzing resume content, and converting resume text into structured formats like JSON. It provides features for creating tailored resumes in PDF format and accessing available template options.
