mcp-web-discovery-engine
Provides capabilities for performing up-to-the-minute internet lookups via Google, fetching and parsing external webpage documents, and meticulously logging all research activities including captured visual artifacts (screenshots).
Author

mzxrai
MCP Web Discovery Engine Documentation
A Model Context Protocol (MCP) service tailored for dynamic, real-time information acquisition from the World Wide Web.
Embed immediate, current data streams directly into your Claude interactions for comprehensive topic exploration.
Core Functionality
- Seamless interface with Google search infrastructure.
- Facility for retrieving and synthesizing content from arbitrary URLs.
- Persistent logging of the research trajectory (queries executed, pages accessed, etc.).
- Mechanism for capturing visual representations (screenshots) of viewed content.
System Requirements
- A foundational installation of Node.js (version 18 or newer, which incorporates npm and npx).
- The requisite Claude Desktop application must be operational.
Initial Setup Procedure
Verify that the Claude Desktop client is installed and that you possess an active npm environment.
Subsequently, integrate the following configuration stanza into your claude_desktop_config.json file (Path example for macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "webdiscovery": {
      "command": "npx",
      "args": ["-y", "@mzxrai/mcp-webresearch@latest"]
    }
  }
}
This registration enables the Claude Desktop environment to dynamically launch this information retrieval service when its functionality is invoked.
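If you script your setup rather than editing the file by hand, the registration above amounts to adding one entry under mcpServers while leaving any other registered servers in place. The TypeScript sketch below illustrates that merge on an in-memory object. The type names and the addWebDiscovery helper are assumptions made for illustration and are not part of the package; reading and writing the file itself is left out.

```typescript
// Shape of the part of claude_desktop_config.json this sketch cares about.
// These type names are illustrative assumptions.
interface ServerEntry {
  command: string;
  args: string[];
}

interface DesktopConfig {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown;
}

// Entry copied from the configuration stanza above.
const WEB_DISCOVERY_ENTRY: ServerEntry = {
  command: "npx",
  args: ["-y", "@mzxrai/mcp-webresearch@latest"],
};

// Returns a new config with the "webdiscovery" server registered,
// without mutating the input or dropping other servers.
function addWebDiscovery(config: DesktopConfig): DesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      webdiscovery: {
        command: WEB_DISCOVERY_ENTRY.command,
        args: WEB_DISCOVERY_ENTRY.args.slice(),
      },
    },
  };
}

// Usage: merge into a config that already lists another (hypothetical) server.
const existing: DesktopConfig = {
  mcpServers: { other: { command: "node", args: ["other-server.js"] } },
};
const merged = addWebDiscovery(existing);
const mergedJson = JSON.stringify(merged, null, 2);
```

The helper returns a copy so that a failed write cannot leave a half-edited configuration object behind; mergedJson is the text you would save back to claude_desktop_config.json.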
Operational Guidance
Initiate a dialogue session with Claude and submit requests necessitating current external knowledge. For structured, deep-dive investigations, we offer the pre-configured agentic-research interaction pattern. Access this template in Claude Desktop by navigating the attachment menu: Select the Paperclip icon, then choose Choose an integration → webdiscovery → agentic-research.

Exposed Interfaces (Tools)
search_google
- Purpose: Executes Google lookups and returns structured results.
- Parameters: { query: string }
visit_page
- Purpose: Navigates to a specified URL and extracts the document body.
- Parameters: { url: string, takeScreenshot?: boolean }
take_screenshot
- Purpose: Captures the visual state of the actively viewed page.
- Arguments: None required.
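To make the parameter shapes above concrete, here is a small TypeScript sketch that types each tool's arguments and wraps them in the JSON-RPC envelope MCP uses for the tools/call method. Claude Desktop issues these calls for you, so this is only meant to clarify the shapes. The ToolArgs map and the buildToolCall helper are hypothetical names, and the example URL is a placeholder.

```typescript
// Argument shapes copied from the tool list above.
interface SearchGoogleArgs {
  query: string;
}

interface VisitPageArgs {
  url: string;
  takeScreenshot?: boolean;
}

// Maps each tool name to its argument type; take_screenshot takes none.
interface ToolArgs {
  search_google: SearchGoogleArgs;
  visit_page: VisitPageArgs;
  take_screenshot: Record<string, never>;
}

// JSON-RPC 2.0 envelope for MCP's "tools/call" method.
interface ToolCallRequest<K extends keyof ToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: K; arguments: ToolArgs[K] };
}

// Hypothetical helper: builds a type-checked request for one of the three tools.
function buildToolCall<K extends keyof ToolArgs>(
  id: number,
  name: K,
  args: ToolArgs[K],
): ToolCallRequest<K> {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const searchCall = buildToolCall(1, "search_google", {
  query: "current events from reuters or AP",
});
const visitCall = buildToolCall(2, "visit_page", {
  url: "https://example.com",
  takeScreenshot: true,
});
const shotCall = buildToolCall(3, "take_screenshot", {});
```

Keying the argument type on the tool name means a call such as buildToolCall(4, "visit_page", { query: "x" }) is rejected at compile time instead of failing at the server.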
Predefined Interaction Scripts (Prompts)
agentic-research
An orchestrated inquiry sequence designed to facilitate robust web investigations by Claude. This script guides Claude to:
- Commence with broad searches to map the informational landscape.
- Prioritize data acquisition from vetted, high-authority sources.
- Progressively refine investigative avenues based on emergent findings.
- Maintain transparency by providing progress updates and soliciting interactive guidance.
- Mandate the inclusion of source URLs for all referenced information.
Data Artifacts (MCP Resources)
Two primary data entities are managed by this service for MCP access: (1) Visual captures of web content, and (2) The comprehensive session record.
Visual Snapshots
Screenshots captured via the tool are formalized as MCP resources. These artifacts are retrievable within the Claude Desktop interface via the Paperclip icon menu.
Research Session Ledger
This server sustains an immutable log documenting the entire investigation process, including:
- All executed search strings.
- All traversed web addresses.
- Parsed page narratives.
- Associated visual captures.
- Temporal markers.
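The server's actual log format is not documented here, so the following TypeScript sketch is only one plausible way to model a ledger holding the items listed above. Every name in it (LedgerEntry, SessionLedger, screenshotUri) is an assumption made for illustration. It treats "immutable" as append-only: entries are frozen when recorded, and callers receive copies.

```typescript
// Hypothetical entry shapes covering the listed items:
// search strings, visited URLs, parsed content, screenshots, timestamps.
type LedgerEntry =
  | { kind: "search"; query: string; timestamp: string }
  | {
      kind: "visit";
      url: string;
      content: string;
      screenshotUri?: string;
      timestamp: string;
    };

// Append-only log: entries are frozen on insert and never exposed by reference
// to the internal array.
class SessionLedger {
  private readonly entries: LedgerEntry[] = [];

  recordSearch(query: string, now: Date = new Date()): void {
    const entry: LedgerEntry = {
      kind: "search",
      query,
      timestamp: now.toISOString(),
    };
    this.entries.push(Object.freeze(entry));
  }

  recordVisit(
    url: string,
    content: string,
    screenshotUri?: string,
    now: Date = new Date(),
  ): void {
    const timestamp = now.toISOString();
    // Only include screenshotUri when a capture was actually taken.
    const entry: LedgerEntry =
      screenshotUri === undefined
        ? { kind: "visit", url, content, timestamp }
        : { kind: "visit", url, content, screenshotUri, timestamp };
    this.entries.push(Object.freeze(entry));
  }

  // Returns a copy so callers cannot reorder or truncate the log.
  snapshot(): readonly LedgerEntry[] {
    return this.entries.slice();
  }
}

// Usage with fixed timestamps and placeholder values.
const ledger = new SessionLedger();
ledger.recordSearch("mcp specification", new Date("2024-01-01T00:00:00Z"));
ledger.recordVisit(
  "https://example.com",
  "Example Domain",
  undefined,
  new Date("2024-01-01T00:00:05Z"),
);
const history = ledger.snapshot();
```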
Operational Recommendations
To optimize investigative outcomes, especially if deviating from the automated agentic-research script, consider explicitly naming high-caliber data origins in your queries. For instance, use constructs like "current events from reuters or AP" rather than vague requests such as "current events today".
Caveats
This implementation is currently in a preliminary, pre-alpha operational state and leverages Artificial Intelligence Generation Capabilities (AIGC); thus, instability and unexpected behavior should be anticipated.
If operational anomalies arise, examining the diagnostic output from Claude Desktop can be beneficial:
tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
Development Workflow
# Dependency installation
pnpm install
# Project compilation
pnpm build
# Continuous monitoring for source changes
pnpm watch
# Launch in debugging mode
pnpm dev
Prerequisites Revisited
- Runtime Environment: Node.js version 18 or higher.
- Browser Automation Library: Playwright (installed automatically).
Supported Environments
- [x] macOS
- [ ] Linux
Licensing
This code is distributed under the terms of the MIT License.