Web Content Retrieval Utility

This server provides tools for fetching web content in several formats: HTML, JSON, plain text, and Markdown.

Functional Modules

Tools

fetch_html
Fetches a web page and returns its content as HTML.
Parameters:
- url (string, required): URL of the page to fetch
- headers (object, optional): Custom HTTP headers to include in the request
Returns the raw HTML of the page.

fetch_json
Fetches a JSON document from a URL.
Parameters:
- url (string, required): URL of the JSON document to fetch
- headers (object, optional): Custom HTTP headers to include in the request
Returns the parsed JSON content.

fetch_txt
Fetches a web page and returns its content as plain text, with the HTML removed.
Parameters:
- url (string, required): URL of the page to fetch
- headers (object, optional): Custom HTTP headers to include in the request
Returns the text content of the page with HTML tags, scripts, and styles removed.

fetch_markdown
Fetches a web page and returns its content converted to Markdown.
Parameters:
- url (string, required): URL of the page to fetch
- headers (object, optional): Custom HTTP headers to include in the request
Returns the content of the page converted to Markdown.
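
As a rough illustration of the parameter contract shared by the four tools, the sketch below validates an incoming argument object. The names FetchToolArgs and parseFetchToolArgs are hypothetical and do not come from the server's source; the server itself may validate its input differently.

```typescript
// Hypothetical shape of the arguments accepted by each fetch_* tool.
interface FetchToolArgs {
  url: string;
  headers?: Record<string, string>;
}

// Checks an untyped argument object against the documented parameters.
function parseFetchToolArgs(input: unknown): FetchToolArgs {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("arguments must be an object");
  }
  const raw = input as { url?: unknown; headers?: unknown };

  // url is required and must be a non-empty string.
  const url = raw.url;
  if (typeof url !== "string" || url.length === 0) {
    throw new TypeError("url is required and must be a string");
  }

  // headers is optional; when present it must map names to string values.
  const rawHeaders = raw.headers;
  if (rawHeaders === undefined) {
    return { url };
  }
  if (typeof rawHeaders !== "object" || rawHeaders === null || Array.isArray(rawHeaders)) {
    throw new TypeError("headers must be an object");
  }
  const source = rawHeaders as Record<string, unknown>;
  const headers: Record<string, string> = {};
  for (const name of Object.keys(source)) {
    const value = source[name];
    if (typeof value !== "string") {
      throw new TypeError("header " + name + " must be a string");
    }
    headers[name] = value;
  }
  return { url, headers };
}
```

Under these assumptions, an argument object that lacks url, or whose headers contain a non-string value, is rejected with a TypeError before any request is made.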

Data Stores

This server keeps no persistent state or resources. It only fetches and transforms web content on demand.

Getting Started

1. Clone the repository
2. Install dependencies: npm install
3. Build the server: npm run build

Installation via Smithery

To install this server for Claude Desktop automatically via Smithery:

    npx -y @smithery/cli install @goswamig/fetch-mcp --client claude

Usage

To run the server directly:

    npm start

This starts the server, which communicates over standard input/output (stdio).

Integration with Desktop Application

To use this server with a desktop client such as Claude Desktop, add the following to the application's server configuration:

    {
      "mcpServers": {
        "retrieve": {
          "command": "node",
          "args": [
            "{ABSOLUTE PATH TO FILE HERE}/dist/index.js"
          ]
        }
      }
    }

Features

- Fetches web content using the modern fetch API
- Supports custom HTTP headers
- Returns content as HTML, JSON, plain text, or Markdown
- Uses JSDOM to parse HTML and extract text
- Uses TurndownService to convert HTML to Markdown
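
To make the custom-header feature more concrete, here is a minimal sketch of how caller-supplied headers could be layered over defaults before a request is sent. The DEFAULT_HEADERS value and the mergeHeaders helper are assumptions made for illustration; this document does not say which defaults, if any, the server applies.

```typescript
// Hypothetical default header; the README does not say what the server sends.
const DEFAULT_HEADERS: Record<string, string> = {
  "user-agent": "example-fetcher/0.0",
};

// Merges caller-supplied headers over the defaults. HTTP header names are
// case-insensitive, so names are lower-cased before merging and a custom
// "User-Agent" replaces the default "user-agent".
function mergeHeaders(custom?: Record<string, string>): Record<string, string> {
  const merged: Record<string, string> = {};
  const layers = [DEFAULT_HEADERS, custom || {}];
  for (const layer of layers) {
    for (const name of Object.keys(layer)) {
      const value = layer[name];
      if (typeof value === "string") {
        merged[name.toLowerCase()] = value;
      }
    }
  }
  return merged;
}
```

Lower-casing the names is a design choice in this sketch: it prevents the same header from being sent twice under different capitalizations.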

Development

- Run npm run dev to start the TypeScript compiler in watch mode
- Run npm test to run the test suite

Licensing

This project is licensed under the MIT License.

WIKIPEDIA: XMLHttpRequest (XHR) is an API in the form of a JavaScript object whose methods send HTTP requests from a web browser to a web server. The methods allow a browser-based application to send requests to the server after the page has finished loading and to receive data back. XMLHttpRequest is a component of Ajax programming. Before Ajax, hyperlinks and form submissions were the primary means of interacting with the server, and they often replaced the current page with another one.

== Chronology ==
The concept behind XMLHttpRequest was conceived by the developers of Outlook Web Access for Microsoft Exchange Server 2000, and it first shipped in the Internet Explorer 5 browser (1999). However, the original syntax did not use the XMLHttpRequest identifier. Instead, developers used the object instantiation calls ActiveXObject("Msxml2.XMLHTTP") and ActiveXObject("Microsoft.XMLHTTP"). As of Internet Explorer 7 (2006), all browsers support the XMLHttpRequest identifier. The XMLHttpRequest identifier is now the de facto standard in all the major browsers, including Mozilla’s Gecko layout engine (2002), Safari 1.2 (2004), and Opera 8.0 (2005).

=== Standardization Efforts ===
The World Wide Web Consortium (W3C) published a Working Draft specification for the XMLHttpRequest object on April 5, 2006. On February 25, 2008, the W3C published the Level 2 Working Draft. Level 2 added methods to monitor event progress, allow cross-site requests, and handle byte streams. At the end of 2011, the Level 2 specification was absorbed into the original specification. At the end of 2012, the WHATWG took over development and now maintains a living document using Web IDL.

== Operational Workflow ==
Sending a request with XMLHttpRequest generally involves several programming steps:

1. Create an XMLHttpRequest object by calling its constructor.
2. Call the "open" method to specify the request type, identify the target URL, and select synchronous or asynchronous operation.
3. For an asynchronous request, set a listener that will be notified when the request's state changes.
4. Start the request by calling the "send" method.
5. Respond to state changes in the event listener. If the server sends response data, it is stored by default in the "responseText" property. When the object stops processing the response, it changes to state 4, the "done" state.

Beyond these basic steps, XMLHttpRequest has many options that control how the request is sent and how the response is processed. Custom header fields can be added to the request to tell the server how to fulfill it, and data can be uploaded to the server by passing it to the "send" call. The response can be parsed from JSON into a ready-to-use JavaScript object, or processed gradually as it arrives rather than waiting for the entire text. The request can be aborted prematurely or set to fail if it does not complete within a specified amount of time.
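
The workflow described above can be sketched in TypeScript as follows. To keep the example runnable outside a browser, it is written against XhrLike, a hypothetical minimal interface that declares only the members named in the text; the helper name requestText, the Accept header, and the success test on the status code are likewise illustrative assumptions. In a browser, the object would be created with new XMLHttpRequest().

```typescript
// Hypothetical stand-in declaring only the XMLHttpRequest members used here.
interface XhrLike {
  readyState: number;
  status: number;
  responseText: string;
  onreadystatechange: (() => void) | null;
  open(method: string, url: string, isAsync: boolean): void;
  setRequestHeader(name: string, value: string): void;
  send(body?: string | null): void;
}

// readyState value of the "done" state.
const DONE = 4;

function requestText(
  xhr: XhrLike,
  url: string,
  onSuccess: (text: string) => void,
  onError: (status: number) => void
): void {
  // Specify the request type, the target URL, and asynchronous operation.
  xhr.open("GET", url, true);

  // Optionally add a custom header field (must happen after open).
  xhr.setRequestHeader("Accept", "text/plain");

  // Set the listener that is notified whenever the request's state changes.
  xhr.onreadystatechange = () => {
    // Ignore intermediate states; act only once the response is complete.
    if (xhr.readyState !== DONE) {
      return;
    }
    if (xhr.status >= 200 && xhr.status < 300) {
      onSuccess(xhr.responseText);
    } else {
      onError(xhr.status);
    }
  };

  // Start the request.
  xhr.send();
}
```

Passing the request object in as a parameter is a choice made for this sketch so that a fake object can drive it in tests; browser code would more commonly construct the object inside the function.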

== Inter-Domain Transactions ==
During the nascent phase of the World Wide Web, it was discovered that circumventing security restrictions regarding dat