mcp-web-discovery-engine

Performs real-time Google searches, fetches and parses external web pages, and logs all research activity, including captured screenshots.

Author

mzxrai

MIT License

Quick Info

GitHub Stars: 281
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

webresearch, webpages, search, mcp webresearch, webresearch integrates, web research

MCP Web Discovery Engine Documentation

A Model Context Protocol (MCP) service tailored for dynamic, real-time information acquisition from the World Wide Web.

Bring current, real-time information directly into your Claude conversations to research any topic in depth.

Core Functionality

  • Seamless interface with Google search infrastructure.
  • Facility for retrieving and synthesizing content from arbitrary URLs.
  • Persistent logging of the research trajectory (queries executed, pages accessed, etc.).
  • Mechanism for capturing visual representations (screenshots) of viewed content.

System Requirements

  • Node.js version 18 or higher, with npm available.
  • The Claude Desktop client.

Initial Setup Procedure

Make sure that the Claude Desktop client is installed and that npm is available on your system.

Then add the following configuration entry to your claude_desktop_config.json file (on macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "webdiscovery": {
      "command": "npx",
      "args": ["-y", "@mzxrai/mcp-webresearch@latest"]
    }
  }
}

With this entry in place, Claude Desktop launches the server automatically whenever one of its tools is needed.
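If claude_desktop_config.json already lists other servers, the entry has to be merged into the existing mcpServers object rather than pasted over the whole file. The TypeScript sketch below shows one way to do that merge in memory. The names `registerServer`, `McpServerEntry`, and `ClaudeDesktopConfig` are hypothetical and are not part of the package; reading and writing the file itself is left out.

```typescript
// Shape of one entry under "mcpServers", matching the configuration entry above.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Assumed minimal shape of the config file; other settings pass through untouched.
interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with `entry` registered under `name`.
// The input object is not modified.
function registerServer(
  config: ClaudeDesktopConfig,
  name: string,
  entry: McpServerEntry,
): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

const existing: ClaudeDesktopConfig = { mcpServers: {} };
const updated = registerServer(existing, "webdiscovery", {
  command: "npx",
  args: ["-y", "@mzxrai/mcp-webresearch@latest"],
});

console.log(JSON.stringify(updated, null, 2));
```

Returning a new object instead of mutating the input keeps the original configuration available for comparison or rollback before anything is written to disk.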

Operational Guidance

Start a conversation with Claude and ask a question that requires current information from the web. For structured, in-depth investigations, the server provides the predefined agentic-research prompt. To use it in Claude Desktop, click the Paperclip icon, then select Choose an integration → webdiscovery → agentic-research.

Exposed Interfaces (Tools)

  1. search_google
     Purpose: Executes Google lookups and returns structured results.
     Parameters: { query: string }

  2. visit_page
     Purpose: Navigates to a specified URL and extracts the document body.
     Parameters: { url: string, takeScreenshot?: boolean }

  3. take_screenshot
     Purpose: Captures the visual state of the actively viewed page.
     Parameters: None required.
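
The parameter shapes listed above can be written down as TypeScript types. The sketch below is an illustration only: the type names and the `parseToolCall` validator are hypothetical and do not come from the server's source code. It shows how a client might check arguments before sending a tool call.

```typescript
// Argument shapes copied from the tool list above.
interface SearchGoogleArgs {
  query: string;
}
interface VisitPageArgs {
  url: string;
  takeScreenshot?: boolean;
}
type TakeScreenshotArgs = Record<string, never>; // no arguments

type ToolCall =
  | { name: "search_google"; arguments: SearchGoogleArgs }
  | { name: "visit_page"; arguments: VisitPageArgs }
  | { name: "take_screenshot"; arguments: TakeScreenshotArgs };

// Hypothetical validator: checks untyped input against the documented shapes
// and throws an Error when a required field is missing or has the wrong type.
function parseToolCall(name: string, args: unknown): ToolCall {
  const a = (args ?? {}) as Record<string, unknown>;
  switch (name) {
    case "search_google": {
      const query = a["query"];
      if (typeof query !== "string") throw new Error("query must be a string");
      return { name: "search_google", arguments: { query } };
    }
    case "visit_page": {
      const url = a["url"];
      const shot = a["takeScreenshot"];
      if (typeof url !== "string") throw new Error("url must be a string");
      const out: VisitPageArgs = { url };
      if (typeof shot === "boolean") out.takeScreenshot = shot;
      else if (shot !== undefined) throw new Error("takeScreenshot must be a boolean");
      return { name: "visit_page", arguments: out };
    }
    case "take_screenshot":
      return { name: "take_screenshot", arguments: {} };
    default:
      throw new Error(`unknown tool: ${name}`);
  }
}

const call = parseToolCall("visit_page", {
  url: "https://example.com",
  takeScreenshot: true,
});
console.log(call.name, call.arguments);
```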

Predefined Interaction Scripts (Prompts)

agentic-research

A guided prompt designed to help Claude conduct thorough web research. It directs Claude to:

  • Begin with broad searches to map the informational landscape.
  • Prioritize vetted, high-authority sources.
  • Progressively refine the line of inquiry based on what it finds.
  • Stay transparent by providing progress updates and asking for guidance.
  • Include source URLs for all referenced information.

Data Artifacts (MCP Resources)

Two primary data entities are managed by this service for MCP access: (1) Visual captures of web content, and (2) The comprehensive session record.

Visual Snapshots

Screenshots captured via the tool are formalized as MCP resources. These artifacts are retrievable within the Claude Desktop interface via the Paperclip icon menu.

Research Session Ledger

This server sustains an immutable log documenting the entire investigation process, including:

  • All executed search strings.
  • All traversed web addresses.
  • Parsed page narratives.
  • Associated visual captures.
  • Temporal markers.
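
To make the contents of the ledger concrete, here is a minimal TypeScript sketch of an append-only session log. It is an assumption about how such a record could be modeled, not the server's actual data structure: `ResearchLedger`, `SearchEntry`, `VisitEntry`, and all field names are hypothetical.

```typescript
// Hypothetical entry shapes covering the items listed above.
type SearchEntry = { kind: "search"; query: string };
type VisitEntry = {
  kind: "visit";
  url: string;
  content: string; // parsed page text
  screenshot?: string; // assumed: an identifier for the captured image
};
type LedgerEntry = (SearchEntry | VisitEntry) & { timestamp: string };

class ResearchLedger {
  private readonly entries: LedgerEntry[] = [];

  // Append an entry, stamping it with an ISO 8601 timestamp.
  // Entries are frozen so they cannot be altered after being recorded.
  record(entry: SearchEntry | VisitEntry, now: Date = new Date()): void {
    this.entries.push(
      Object.freeze({ ...entry, timestamp: now.toISOString() }),
    );
  }

  // Return a copy so callers cannot reorder or drop recorded entries.
  all(): readonly LedgerEntry[] {
    return [...this.entries];
  }
}

const ledger = new ResearchLedger();
ledger.record({ kind: "search", query: "current events from Reuters or AP" });
ledger.record({
  kind: "visit",
  url: "https://example.com",
  content: "Example Domain",
});
console.log(ledger.all().map((e) => `${e.timestamp} ${e.kind}`));
```

The class exposes only `record` and `all`, so existing entries can be read but not edited or removed, which is one simple way to approximate an immutable log.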

Operational Recommendations

For best results, especially when not using the agentic-research prompt, name high-quality sources explicitly in your requests. For instance, ask for "current events from Reuters or AP" rather than something vague such as "current events today".

Caveats

This implementation is pre-alpha and is itself AI-generated code (AIGC), so instability and unexpected behavior should be anticipated.

If problems arise, the Claude Desktop MCP logs can help with diagnosis:

tail -n 20 -f ~/Library/Logs/Claude/mcp*.log

Development Workflow

# Install dependencies
pnpm install

# Build the project
pnpm build

# Watch for source changes
pnpm watch

# Launch in debugging mode
pnpm dev

Prerequisites Revisited

  • Runtime Environment: Node.js version 18 or higher.
  • Browser Automation Library: Playwright (installed automatically).

Supported Environments

  • [x] macOS
  • [ ] Linux

Licensing

This code is distributed under the terms of the MIT License.

Maintainer

mzxrai
