Document Processing MCP Repositories
326 repositories in this category.
highlight-youtube-mcp
Extract transcripts from YouTube videos by providing a video URL. The server supports multiple URL formats and returns the transcript text in a structured array format.
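Handling multiple URL formats usually comes down to normalizing each URL to a video ID before the transcript is requested. The sketch below is illustrative only: the function name `extract_video_id` and the set of recognized URL shapes (`watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`, `/live/`) are assumptions, not the server's documented behavior.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts treated as YouTube in this sketch (an assumption, not the server's list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}
# Path prefixes whose next segment is taken to be the video ID.
ID_PATH_PREFIXES = {"embed", "shorts", "live"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it is not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None

    if host in YOUTUBE_HOSTS:
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Embed-style links: https://www.youtube.com/embed/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ID_PATH_PREFIXES:
            return parts[1]

    return None
```

Returning `None` for unrecognized input lets the caller report an unsupported URL instead of requesting a transcript for a malformed ID.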
mcp-server-firecrawl
Provides capabilities for web scraping, intelligent content searching, and site crawling using the Firecrawl API, facilitating customizable data extraction and structured output.
md-webcrawl-mcp
Extracts website content and saves it as markdown files while mapping website structures and links efficiently, enabling batch processing of multiple URLs.
mcp-framework
This framework enables the creation of custom tools for interaction with large language models, facilitating web content retrieval and various file handling capabilities. It automates the processing of PDF, Word, and Excel documents for enhanced productivity.
MCP-Servers
Access and manage Microsoft OneNote content, enabling the reading and creation of notebooks, sections, and pages directly through AI assistants. Converts HTML content to text for improved retrieval-augmented generation (RAG) processing.
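As a rough illustration of the HTML-to-text step, the sketch below strips tags with Python's standard-library `html.parser` and collapses whitespace. The names `html_to_text` and `_TextExtractor` are hypothetical, and the repository may use a different library or approach.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Join the fragments, then collapse runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    """Return the plain text of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Plain text produced this way is easier to chunk and embed for retrieval than raw page markup.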
ChatPPT-MCP
AI-powered service for generating PowerPoint presentations based on themes or uploaded documents, with features for online editing and downloading final outputs.
mcp-units
Provides tools for converting cooking measurements between various volume, weight, and temperature units commonly used in cooking, such as milliliters to cups and grams to pounds.
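The conversions named here are simple arithmetic. The following sketch shows the idea; the function names are hypothetical, and the choice of the US customary cup (about 236.59 mL) is an assumption, since the listing does not say which cup definition the server uses.

```python
# Conversion factors. The pound value is exact by definition; the cup is the
# US customary cup, which is an assumption made for this sketch.
ML_PER_US_CUP = 236.5882365
GRAMS_PER_POUND = 453.59237


def ml_to_cups(milliliters: float) -> float:
    """Convert milliliters to US customary cups."""
    return milliliters / ML_PER_US_CUP


def grams_to_pounds(grams: float) -> float:
    """Convert grams to avoirdupois pounds."""
    return grams / GRAMS_PER_POUND


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32
```

For example, `celsius_to_fahrenheit(180)` gives `356.0`, the Fahrenheit equivalent of a 180 °C oven setting.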
LPS-MCP
Provides secure file system access and sequential thinking capabilities to improve Claude's interaction with its environment. Breaks complex problems down into structured steps and gives controlled access to files.
Feishu-MCP
Manage and manipulate Feishu documents with capabilities for creating, editing, and extracting structured and unstructured content, along with rich text formatting and code block handling.
confluence-wiki-mcp-server-extension
Integrate Confluence Wiki content with AI models for enhanced analysis and interaction. Convert Wiki content to Markdown format and securely manage access to your Wiki data through an easy configuration interface.
kv-extractor-mcp-server
Extracts key-value pairs from noisy or unstructured text in multiple languages, ensuring type-safe outputs in JSON, YAML, or TOML formats. Utilizes advanced LLMs and pydantic for data structuring and validation, supporting languages like Japanese, English, and Chinese.
ppt_se
The PowerPoint Presentation Automation Server allows users to easily create and edit PowerPoint presentations using Python. It streamlines the process of generating slides and incorporating various elements like text, images, and charts, making it accessible for AI models and other applications.
quill
A rich text editor that enables the creation and manipulation of formatted text content in web applications, supporting various styles and formats. It provides an intuitive interface and robust features for enhanced user engagement.
agentic-pdf-app
Automatically fills California court PDF forms by extracting and mapping data from donor documents through AI analysis. Features a minimalist interface and a modular microservices architecture for easy deployment using Docker.
MCP-Websearch-Server
Fetches relevant documentation snippets from LangChain, LlamaIndex, and OpenAI to enhance search capabilities. Provides a simple tool to retrieve information based on user queries.
hwp-mcp
Control and manage HWP (Hangul Word Processor) documents by creating, editing, and automating tasks through AI models. It offers features like text editing, table manipulation, and batch processing of documents.
mcp-storage-server
Facilitates secure storage and retrieval of files using decentralized storage via IPFS and CIDs. Enables verifiable data exchange and integration with AI frameworks, offering free storage options to users.
mcp-server-fetch-typescript
Retrieves and converts web content using various formats and rendering methods, suitable for both data extraction and web scraping tasks. It allows access to text-based resources and provides raw text content from specified URLs without additional processing.
package-documentation-mcp
Fetches npm package documentation from multiple programming ecosystems and presents it for use with LLMs, such as Claude, without the need for API keys.
ntealan-apis-mcp-server
Manage dictionary data, articles, and user contributions through a modular and extensible interface. Supports asynchronous operations for efficient integration with NTeALan REST APIs.
mcp-video-converter
Convert video, audio, and image files between various formats using FFmpeg. Check for FFmpeg installation and retrieve information on supported file formats for conversion.
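A check for an FFmpeg installation can be as small as a lookup on the system `PATH`. The sketch below assumes the binary is named `ffmpeg`; the helper names `find_ffmpeg` and `list_formats_raw` are hypothetical and are not taken from the repository.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the FFmpeg binary, or None if it is not on PATH."""
    return shutil.which(executable)


def list_formats_raw(ffmpeg_path: str) -> str:
    """Return FFmpeg's own listing of the formats it supports, as raw text."""
    result = subprocess.run(
        [ffmpeg_path, "-hide_banner", "-formats"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if FFmpeg exits with an error
    )
    return result.stdout
```

A caller would typically run `find_ffmpeg()` first and only call `list_formats_raw(path)` when a path is returned, so that a missing installation produces a clear message and not a failed subprocess call.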
cosense-mcp-server
Access and interact with the Cosense knowledge sharing platform by retrieving, listing, and searching for pages, as well as inserting text into existing pages.
mcp-docs-service
Manage markdown documentation by creating, reading, updating, and deleting files while analyzing their health and improving quality. Enhance AI assistants' interactions with documentation through natural language processing capabilities.
claudekeep
A server implementation that enables the saving and sharing of AI conversations from Claude Desktop, featuring both a private chat storage and a public chat display web app. This implementation utilizes the Model Context Protocol (MCP) to manage interactions with AI chat logs.
mcp-media-processor
A Node.js server for executing various media processing tasks, including video and image manipulation. It supports operations like video conversion, image effects, and media compression.
meeting-mcp
Manage meeting data including transcripts, recordings, and calendar events while providing search functionality for easy organization and retrieval.
TranscriptionTools-MCP
Enhances transcription workflows by automatically repairing errors, formatting transcripts naturally, and generating concise summaries. Utilizes advanced language models for intelligent processing of audio transcripts.
esa-mcp-server
Integrate Claude AI with the esa API to manage documents efficiently by performing operations such as searching, creating, and updating documents.
Deepseek_chat_rag
Utilizes advanced retrieval-augmented generation models to answer queries based on indexed documents extracted from various file formats. Engages users by providing relevant answers from a Chroma database that stores extracted text from PDF, DOCX, TXT, and CSV files.
mcp-server-ietf
Access and retrieve IETF RFC documents, enabling search by keywords and management of document pagination. Provides standardized access to essential specifications for Large Language Models.
