Document Processing MCP Repositories
326 repositories in this category.
notion-mcp-server
Create, read, and update Notion pages directly from prompts, and manage Notion databases.
finance_news_analysis
Scrapes financial data, analyzes it with NLP algorithms, and supports backtesting of quantitative strategies. Combines these components to automate financial data processing and insight extraction.
mcp-doc-scraper
Scrapes documentation from web URLs, converts it to Markdown, and saves the result to a specified output path. Integrates with the Model Context Protocol (MCP).
scrapbox-cosense-mcp
Access and interact with Scrapbox project pages, facilitating content retrieval, page listing, and full-text searching across project content.
zoom_transcript_mcp
Manage Zoom meeting transcripts by listing, downloading, and searching through them with a structured interface. Organize transcripts by month for streamlined access to discussions.
mcp-excel-reader-server
Extract data from Excel files in structured JSON format, allowing access to all sheets or specific sheets by name or index. Handles data type conversions and manages empty cells efficiently.
lance-mcp
Interact with on-disk documents through retrieval-augmented generation (RAG) and hybrid search capabilities in LanceDB.
mcp-server-diff-python
Obtain the text differences between two strings using Python's `difflib`, with output in unified diff format suitable for text comparison and version control.
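As a rough illustration of the technique this entry describes (not the repository's actual code), the sketch below produces a unified diff of two strings with the standard-library `difflib.unified_diff`. The helper name `unified_diff_text` and the default labels `a` and `b` are assumptions made for the example.

```python
import difflib


def unified_diff_text(old: str, new: str,
                      old_name: str = "a", new_name: str = "b") -> str:
    """Return the unified diff between two strings (hypothetical helper)."""
    # unified_diff works on sequences of lines; keep the line endings so the
    # pieces can be joined back into one string without extra newlines.
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    diff = difflib.unified_diff(old_lines, new_lines,
                                fromfile=old_name, tofile=new_name)
    return "".join(diff)


if __name__ == "__main__":
    print(unified_diff_text("alpha\nbeta\n", "alpha\ngamma\n"))
```

Identical inputs yield an empty string, since `unified_diff` emits no lines when there is nothing to report.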
excel-reader-mcp
Read Excel files with support for automatic chunking and pagination, enabling efficient data handling for large datasets. This server can process multiple sheet selections and provides proper handling of date formats.
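To make the chunking and pagination idea concrete, here is a minimal sketch over rows that have already been loaded into memory. It does not reflect the server's real interface: the function `paginate_rows` and the `offset`, `limit`, and `next_offset` fields are all assumptions chosen for illustration.

```python
from typing import Any, Dict, List


def paginate_rows(rows: List[List[Any]], offset: int = 0,
                  limit: int = 100) -> Dict[str, Any]:
    """Return one chunk of rows plus the offset of the next chunk, if any."""
    if offset < 0 or limit <= 0:
        raise ValueError("offset must be >= 0 and limit must be > 0")
    chunk = rows[offset:offset + limit]
    next_offset = offset + len(chunk)
    return {
        "rows": chunk,
        "offset": offset,
        # None signals that the caller has reached the end of the sheet.
        "next_offset": next_offset if next_offset < len(rows) else None,
        "total_rows": len(rows),
    }


if __name__ == "__main__":
    sheet = [[i, f"row-{i}"] for i in range(5)]
    page = paginate_rows(sheet, offset=0, limit=2)
    while True:
        print(page["rows"])
        if page["next_offset"] is None:
            break
        page = paginate_rows(sheet, offset=page["next_offset"], limit=2)
```

Returning an explicit `next_offset` lets a client request the following chunk without tracking sheet size itself.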
zotero-mcp
Access and manage your Zotero library through a Model Context Protocol server so that AI assistants can interact with it. Provides a focused set of functions for library management and integration.
paperless-mcp
Manage documents, tags, correspondents, and document types through the Paperless-NGX API. Enables efficient organization and retrieval of document-related information.
document-qa
A Streamlit app for answering questions about uploaded documents using GPT-3.5, enabling users to extract information quickly and efficiently. Ideal for enhancing productivity in document analysis.
markitdown_mcp_server
Convert various file formats to Markdown using the MarkItDown utility. Processes PDFs, Office documents, images, audio files, HTML, and more into Markdown for streamlined content handling.
base64_server
Provides efficient Base64 encoding and decoding services for both text and images, including support for Data URL formats. Features a simple API for easy integration and reusable prompt templates to simplify Base64 transformations in applications.
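For readers unfamiliar with the Data URL form mentioned above, the following sketch shows Base64 encoding and decoding with Python's standard `base64` module. The helper names `to_data_url` and `from_data_url` are hypothetical and are not taken from this server's API.

```python
import base64
from typing import Tuple


def to_data_url(data: bytes, mime_type: str = "text/plain") -> str:
    """Encode raw bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_url(url: str) -> Tuple[str, bytes]:
    """Split a base64 data URL into its MIME type and decoded bytes."""
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


if __name__ == "__main__":
    url = to_data_url(b"hello")
    print(url)
    print(from_data_url(url))
```

The same two functions work for images: pass the file's bytes and a MIME type such as `image/png`.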
AutoGuarantee
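Automatically extracts key elements and clauses from letter-of-guarantee texts, giving legal and finance professionals the information they need for analysis. Output is in JSON format; supported elements include the guarantor's SWIFT code, the issue date, and the type of guarantee.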
reference-mcp
Retrieve BibTeX-formatted citation data from CiteAs and Google Scholar to streamline citation management in research applications.
mcp-documentation-server
AI-assisted management of documentation and code improvement, supporting various frameworks with smart search capabilities. Integrates with Claude Desktop for an enhanced coding experience and improves suggestions over time.
ifly-spark-agent-mcp
Invokes the task chain of the iFlytek SparkAgent Platform through an MCP server interface, allowing users to upload files and interact with platform capabilities. Enables integration with AI models for automated workflows and task execution.
mcp-server-obsidian-jsoncanvas
Create, modify, and validate infinite canvas data structures using a comprehensive set of tools that manage nodes and edges while ensuring compliance with the official JSON Canvas specification.
documind-mcp-server
Analyzes and enhances the quality of documentation, specifically README files, by providing insights and suggestions for improvement. Utilizes advanced neural processing techniques for thorough evaluation and visual analysis of documentation elements.
tavily-mcp
Integrates real-time web search and intelligent data extraction to enhance AI assistants with up-to-date information and sophisticated filtering capabilities. Supports domain-specific data retrieval and processing features for improved AI workflows.
YTTranscipterMultilingualMCP
Transcribes YouTube videos into text across multiple languages to enhance content accessibility and audience engagement. Facilitates the conversion of spoken language into written form for improved reach.
302_file_parser_mcp
Reads, modifies, and manages files of different types, so developers can focus on building their applications instead of handling the details of each file format.
divide-and-conquer-mcp-server
Breaks down complex tasks into manageable pieces using a structured JSON format, tracks progress, and maintains context across multiple conversations.
semanticscholar-MCP-Server
Search for academic papers, retrieve detailed information about specific papers and authors, and access citations and references through the Semantic Scholar API.
pdf-reader-mcp
Enables secure reading and extraction of text, metadata, and page counts from PDF files. Processes multiple PDFs from local paths or URLs with structured JSON output for easy parsing.
