MCP Web Discovery Engine Documentation

A Model Context Protocol (MCP) server for real-time information retrieval from the web.

Bring current web data directly into your Claude conversations to research any topic in depth.

Core Functionality

Google search integration.
Retrieval and extraction of content from arbitrary URLs.
Research session tracking (queries executed, pages visited, etc.).
Screenshot capture of viewed pages.

System Requirements

Node.js 18 or newer (includes npm and npx).
The Claude Desktop app, installed and running.

Initial Setup Procedure

Make sure Claude Desktop is installed and that npm is available on your system.

Then add the following entry to your claude_desktop_config.json file (on macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "webdiscovery": {
      "command": "npx",
      "args": ["-y", "@mzxrai/mcp-webresearch@latest"]
    }
  }
}

With this entry in place, Claude Desktop launches the server automatically whenever its tools are needed.
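If claude_desktop_config.json already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge on an in-memory object. The type names and the `addServer` helper are hypothetical, introduced only for illustration; they model just the fields that appear in the configuration above, and reading or writing the file itself is left out.

```typescript
// Hypothetical types: only the fields used by the configuration entry are modeled.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are passed through untouched
}

// Returns a new config with `entry` registered under `name`.
// The input object is not modified; existing servers are preserved.
function addServer(
  config: ClaudeDesktopConfig,
  name: string,
  entry: McpServerEntry
): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// Example: a config that already has one (made-up) server registered.
const existing: ClaudeDesktopConfig = {
  mcpServers: { other: { command: "node", args: ["other.js"] } },
};

const updated = addServer(existing, "webdiscovery", {
  command: "npx",
  args: ["-y", "@mzxrai/mcp-webresearch@latest"],
});

console.log(JSON.stringify(updated, null, 2));
```

The printed JSON can then be saved as the new contents of the config file. Restarting Claude Desktop after editing is usually needed for the change to take effect.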

Operational Guidance

Start a chat with Claude and ask a question that requires current information from the web. For structured, in-depth research, use the bundled agentic-research prompt. To select it in Claude Desktop, click the Paperclip icon, then select Choose an integration → webdiscovery → agentic-research.

[Image: example screenshot of web research]

Exposed Interfaces (Tools)

search_google
  Purpose: Runs a Google search and returns structured results.
  Parameters: { query: string }

visit_page
  Purpose: Navigates to a specified URL and extracts the page content.
  Parameters: { url: string, takeScreenshot?: boolean }

take_screenshot
  Purpose: Captures a screenshot of the currently open page.
  Parameters: none
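For readers who want to see what a call to these tools looks like on the wire, here is a minimal sketch. MCP clients invoke tools with a JSON-RPC 2.0 request whose method is `tools/call` and whose params carry the tool `name` and its `arguments`. The `ToolCall` type and the `toolCallRequest` helper are hypothetical names introduced for illustration; the argument shapes are copied from the parameter lists above. Claude Desktop builds these requests itself, so this is only needed when driving the server from a custom client.

```typescript
// Argument shapes taken from the tool list above; type names are illustrative.
type ToolCall =
  | { name: "search_google"; arguments: { query: string } }
  | { name: "visit_page"; arguments: { url: string; takeScreenshot?: boolean } }
  | { name: "take_screenshot"; arguments: Record<string, never> };

// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

let nextId = 1;

// Wraps a typed tool call in a JSON-RPC request with a fresh id.
function toolCallRequest(call: ToolCall): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: call };
}

// Example sequence: search, open a result with a screenshot, then capture again.
const requests: JsonRpcRequest[] = [
  toolCallRequest({ name: "search_google", arguments: { query: "MCP specification" } }),
  toolCallRequest({
    name: "visit_page",
    arguments: { url: "https://example.com", takeScreenshot: true },
  }),
  toolCallRequest({ name: "take_screenshot", arguments: {} }),
];

for (const request of requests) {
  console.log(JSON.stringify(request));
}
```

Modeling the calls as a discriminated union lets the compiler reject a misspelled tool name or a missing `query` before anything is sent.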

Predefined Interaction Scripts (Prompts)

`agentic-research`

A guided research prompt that helps Claude conduct thorough web research. The prompt instructs Claude to:

- Start with broad searches to understand the topic landscape.
- Prioritize high-quality, authoritative sources.
- Iteratively refine the research direction based on findings.
- Keep you informed and let you guide the research interactively.
- Always cite sources with URLs.

Data Artifacts (MCP Resources)

The server exposes two kinds of MCP resources: (1) screenshots of web content and (2) the research session record.

Visual Snapshots

Captured screenshots are exposed as MCP resources. In Claude Desktop, they are available from the Paperclip icon menu.

Research Session Ledger

This server maintains an immutable log of the entire research process, including:

- Search queries.
- Visited pages.
- Extracted page content.
- Screenshots.
- Timestamps.
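To make the shape of such a log concrete, the following sketch models an append-only session record. It is an assumption-laden illustration, not the server's actual data model: `SessionEntry`, `ResearchSession`, and every field name are hypothetical, chosen only to mirror the items the log is said to contain.

```typescript
// Hypothetical record shape: one entry per search or page visit.
interface SessionEntry {
  kind: "search" | "visit";
  value: string;          // the query string or the visited URL
  content?: string;       // extracted page content, if any
  screenshotUri?: string; // identifier of an associated screenshot, if any
  timestamp: string;      // ISO 8601 time at which the entry was recorded
}

// Append-only log: entries can be added and read, never edited or removed.
class ResearchSession {
  private readonly entries: SessionEntry[] = [];

  record(entry: Omit<SessionEntry, "timestamp">, now: Date = new Date()): void {
    // Freezing each entry keeps recorded history from being modified later.
    this.entries.push(Object.freeze({ ...entry, timestamp: now.toISOString() }));
  }

  list(): readonly SessionEntry[] {
    return this.entries;
  }
}

// Example with fixed dates so the recorded timestamps are predictable.
const session = new ResearchSession();
session.record(
  { kind: "search", value: "model context protocol" },
  new Date("2024-01-01T00:00:00Z")
);
session.record(
  { kind: "visit", value: "https://example.com", content: "Example Domain" },
  new Date("2024-01-01T00:00:05Z")
);

console.log(session.list().length); // prints 2
```

Stamping each entry at record time, rather than trusting the caller to supply a timestamp, keeps the ordering of the log consistent with the order of the calls.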

Operational Recommendations

For best results, especially when you are not using the agentic-research prompt, name high-quality sources explicitly in your requests. For example, ask for "current events from reuters or AP" rather than "current events today".

Caveats

This implementation is pre-alpha and relies on AI-generated code (AIGC), so expect instability and unexpected behavior.

If you run into problems, checking Claude Desktop's MCP logs may help:

tail -n 20 -f ~/Library/Logs/Claude/mcp*.log

Development Workflow

# Install dependencies
pnpm install

# Build the project
pnpm build

# Watch for source changes
pnpm watch

# Run in development mode
pnpm dev

Prerequisites Revisited

Runtime Environment: Node.js version 18 or higher.
Browser Automation Library: Playwright (installed automatically).

Supported Environments

[x] macOS
[ ] Linux

Licensing

This code is distributed under the terms of the MIT License.

Maintainer

mzxrai
