Document Processing MCP Repositories
326 repositories in this category.
Deep Research Mcp
Provides advanced web search, document analysis, and image processing. Extracts information from sources such as PDFs and YouTube transcripts.
Handwriting Ocr Mcp Server
Integrate applications with the Handwriting OCR service to process images and PDF documents for text extraction. Upload documents, check processing status, and retrieve OCR results in Markdown format.
Yuque Mcp Server
Integrate with the Yuque API for managing documents and user information. Supports creating, reading, updating, and deleting documents while providing access to analytics and statistics for knowledge bases.
Mcp Server Novacv
Connect to the NovaCV API to generate professional resumes, analyze resume content, and convert resume text into structured formats such as JSON. Also creates tailored resumes in PDF format and lists the available template options.
Markdownify Mcp Utf8
Converts various file types to Markdown, with robust UTF-8 support and handling optimized for multilingual content. Transforms documents and web pages accurately while addressing encoding issues, especially on Windows systems.
Semanticscholar Mcp Server
Search for academic papers, retrieve detailed information about specific papers and authors, and access citations and references through the Semantic Scholar API.
Figma Mcp
Facilitates access to Figma files and prototypes, enabling integration of design assets directly into AI coding environments. Streamlines design workflows by connecting AI agents with Figma's design resources.
Mcp Jina Ai
Access Jina AI's web services for web page reading, web search, and fact checking. Extract and format content from web pages for use with LLMs.
Mcp Accessibility Scanner
Automated web accessibility scanning using Playwright and Axe-core, enabling WCAG compliance checks and annotated screenshot capture. Generates detailed accessibility reports and interacts with web pages through browser automation.
Klavis
Generates visually appealing web reports based on simple search queries, integrating live web search results and storing reports in a database for easy access. Utilizes AI to synthesize information into interactive HTML formats.
Markitdown_mcp_server
Converts various file formats to Markdown, utilizing the MarkItDown utility to handle documents, images, and audio files.
Mcp Bibliotheque_nationale_de_france
Access the Gallica digital library to search for documents, images, maps, and other resources, and generate structured research reports that include organized bibliographies and relevant visual content.
Kv Extractor Mcp Server
Extracts key-value pairs from noisy or unstructured text in multiple languages, ensuring type-safe outputs in JSON, YAML, or TOML formats. Utilizes advanced LLMs and pydantic for data structuring and validation, supporting languages like Japanese, English, and Chinese.
Mcp Webdav Server
Enable natural language interaction with WebDAV file systems to perform CRUD operations on files and directories through a secure and configurable MCP server. Supports optional authentication and multiple transport methods.
Eigenlayer Mcp Server
Provides detailed EigenLayer documentation to AI assistants through a dedicated server interface, enabling seamless integration and querying of EigenLayer concepts and mechanisms.
Ntealan Apis Mcp Server
Manage dictionary data, articles, and user contributions through a modular and extensible interface. Supports asynchronous operations for efficient integration with NTeALan REST APIs.
Cosa Sai
Access documentation for a variety of technologies through the Gemini API, leveraging a curated knowledge base to provide accurate responses to complex queries. This server is designed to handle large context windows for improved comprehension of technical materials.
Unstructured Mcp
Extract and use content from various unstructured document formats, with storage and retrieval via AWS S3. Process documents directly in applications to enhance data extraction capabilities for LLMs.
File Converter Mcp
Convert documents between various formats using Pandoc, enabling seamless integration and automation in workflows. Supports a wide range of formats including Markdown, DOCX, HTML, PDF, and EPUB.
Meeting Mcp
Manage meeting data including transcripts, recordings, and calendar events while providing search functionality for easy organization and retrieval.
Docs2prompt Mcp
Transforms documentation from GitHub repositories or dedicated websites into LLM-friendly prompts for enhanced context and understanding in AI applications.
Mcp Doc
Create, edit, and manage Word documents using natural language commands, facilitating document operations and formatting. Also supports table processing, image insertion, and layout control.
Mcp Japanesetextanalyzer
Analyzes Japanese and English texts by counting characters and words and evaluating linguistic features such as average sentence length and lexical diversity. Accepts input as direct text or as file paths, either absolute or relative.
Mcp Server Firecrawl
Provides capabilities for web scraping, intelligent content searching, and site crawling using the Firecrawl API, facilitating customizable data extraction and structured output.
Transcriptiontools Mcp
Enhances transcription workflows by automatically repairing errors, formatting transcripts naturally, and generating concise summaries. Utilizes advanced language models for intelligent processing of audio transcripts.
Textclassifier
Provides multiple common text classification models based on CNN, RNN, and pre-trained NLP architectures for sentiment analysis and text classification. Supports data preprocessing, training word embeddings, and implementing advanced models such as Bi-LSTM, Transformer, ELMo, and BERT for improved classification accuracy.
Mcp Sefaria Server
Access and reference Jewish texts and commentaries through a standardized interface.
Autodocument
Generates comprehensive documentation, test plans, and code reviews by analyzing code repositories and directory structures. Utilizes AI to enhance development workflows with detailed insights into security and best practices.
Gemforge Mcp
Provides tools for interacting with Google's Gemini AI models, enabling intelligent model selection and advanced file handling. Facilitates AI tasks such as search, reasoning, code analysis, and file operations through a standardized MCP server interface.
Mcp Webresearch Stealthified
Connects AI models to the web for real-time information retrieval, webpage content extraction, and research session tracking, along with the ability to capture screenshots.