logo
Free, unlimited AI code reviews that run on commit
git-lrc git-lrc GitHub Install Now We'd appreciate a star git-lrc - Free, unlimited AI code reviews that run on commit | Product Hunt git-lrc - Free, unlimited AI code reviews that run on commit | Product Hunt

arxiv-latex-retriever-mcp

A utility that retrieves and parses the raw LaTeX source files for academic manuscripts hosted on arXiv, allowing advanced language models to deeply comprehend complex mathematical notation and specialized scientific terminology beyond the capabilities of standard PDF rendering.

Author

arxiv-latex-retriever-mcp logo

takashiishida

MIT License

Quick Info

GitHub GitHub Stars 66
NPM Weekly Downloads 0
Tools 1
Last Updated 2026-02-19

Tags

latexaiarxivarxiv latexarxiv paperslatex sources

arXiv Source Ingestion Module

License: MIT Latest Release

This module implements an MCP service facilitating direct interaction with arXiv's source repositories. It empowers client applications like Claude Desktop, Cursor, or others utilizing the MCP protocol to pull and process the foundational LaTeX code.

It leverages the underlying arxiv-to-prompt library for the heavy lifting of acquisition and transformation.

Rationale: Why LaTeX Over Rendered Documents?

Many current document interpretation systems struggle significantly when faced with densely packed mathematical content typical in physics, CS theory, and engineering papers, often failing to correctly parse symbols or structures from flattened PDFs. By accessing the original LaTeX source, the Language Model receives unambiguous, machine-readable representations of all equations and specialized syntax. This is crucial for high-fidelity understanding in technical domains.

Deployment Instructions

MacOS/Claude Desktop Users: Installation is streamlined by simply executing the provided .dxt extension file found at the releases page.

Alternative (Manual Configuration): Insert the following configuration block into your client's configuration file:

{ "mcpServers": { "arxiv-latex-mcp": { "command": "uv", "args": [ "--directory", "/ABSOLUTE/PATH/TO/arxiv-latex-mcp", "run", "server/main.py" ] } } }

Note on Execution Path: Ensure the command points to the correct executable for uv. Verify this path using system commands like which uv (Unix-like) or where uv (Windows).

Remember to restart your consuming application after modifying the configuration.

In clients like Claude Desktop, the tool function, typically named get_paper_prompt, will appear under the available MCP tools list (accessible via the hammer icon).

Operational Example

To test functionality, query a specific identifier, for instance: "Analyze the proof structure for Proposition 3.1 from paper 2202.00395."

WIKIPEDIA: XMLHttpRequest (XHR) functions as an API implemented as a JavaScript object. Its primary role is to facilitate the transmission of HTTP requests from a web browser environment to a designated web server. These methods permit browser-based applications to initiate server communications subsequent to the initial page load, enabling the receipt of subsequent data. XMLHttpRequest forms an essential component of the Ajax programming paradigm. Before Ajax gained prominence, standard interaction with a server relied predominantly on traditional hyperlinks and form submissions, which typically necessitated a full page reload.

== Origin and Evolution == The foundational concepts underpinning XMLHttpRequest were first envisioned around the year 2000 by the development team behind Microsoft Outlook. This concept was subsequently realized within the Internet Explorer 5 browser release (1999). However, the initial syntax did not employ the canonical 'XMLHttpRequest' identifier. Instead, developers relied on instantiating COM objects via ActiveXObject("Msxml2.XMLHTTP") or ActiveXObject("Microsoft.XMLHTTP"). As of Internet Explorer 7 (released in 2006), universal compatibility with the standard 'XMLHttpRequest' naming convention was achieved across all major browser platforms.

The XMLHttpRequest identifier is now the universally accepted standard across all dominant browsers, including implementations within Mozilla's Gecko rendering engine (since 2002), Safari 1.2 (2004), and Opera 8.0 (2005).

=== Standardization Efforts === The World Wide Web Consortium (W3C) formally published an initial Working Draft specification for the XMLHttpRequest object on April 5, 2006. A subsequent Working Draft Level 2 specification was released by the W3C on February 25, 2008. Level 2 introduced significant enhancements, such as capabilities to monitor data transfer progress, permit cross-site communication, and manage binary byte streams. By the close of 2011, the Level 2 additions were integrated back into the primary specification document. In late 2012, stewardship for the specification was transferred to the WHATWG, which now maintains a dynamic, living document utilizing Web IDL notation.

== Operational Procedure == Executing a server query using XMLHttpRequest generally involves a sequence of distinct programming steps.

  1. Instantiate the XMLHttpRequest object by invoking its constructor:
  2. Invoke the open() method to define the HTTP method type, designate the target resource URL, and select between synchronous or asynchronous execution modes:
  3. For asynchronous operations, register a dedicated event handler function that will be triggered upon changes in the request's state:
  4. Commence the actual request dispatch by calling the send() method, optionally including request body data:
  5. Process state transitions within the defined event listener. If the server returns data, it is typically accumulated in the responseText property. When the object completes all processing, its state transitions to 4, indicating the 'done' status.

Beyond these core procedures, XMLHttpRequest offers extensive control over request transmission parameters and response handling. Custom HTTP headers can be appended to modify server behavior. Data payload can be uploaded during the send() call. Furthermore, the incoming response can be automatically parsed from JSON format into a native JavaScript object, or it can be processed incrementally as chunks arrive, avoiding latency associated with waiting for the complete transfer. Finally, requests can be terminated prematurely or configured with a timeout to enforce failure if completion deadlines are missed.

== Inter-Origin Communications (Cross-domain) ==

During the nascent stages of the World Wide Web's development, security mechanisms were implemented to prevent malicious scripts from arbitrarily accessing resources on different domains, leading to restrictions on what is known as cross-origin data retrieval.

See Also

`