
mcp-web-navigator

Server and command-line utility for AI-orchestrated web interaction, facilitating programmatic control over digital browser environments for testing, data acquisition, and digital research workflows.

Author

Saik0s

MIT License

Quick Info

GitHub Stars: 819
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

browser, browsers, automation, browser automation, automation web, web browsers


WebNavigator MCP Endpoint & CLI

Foundation Note: This execution component builds upon the architecture established by the [browser-use/web-ui] project, adapting its core automation protocols and configuration schemes.

An implementation of the Model Context Protocol (MCP) centered around autonomous browser manipulation, driven by natural language prompts. It exposes server capabilities and direct command-line tooling.

Navigator MCP Service

Core Capabilities

  • 🧠 Protocol Adherence - Complete integration with the Model Context Protocol for Agent communication.
  • 🌐 Browser Execution - Orchestrates navigation, user input submission, and interactive element manipulation via plain language directives (via the execute_browser_task tool).
  • 👁️ Visual Context - Enables optional analysis of rendered page state (screenshots) for multimodal Language Models.
  • 🔄 Session Management - Supports persistent browser contexts spanning multiple requests or direct attachment to user-controlled browser sessions.
  • 🔌 Provider Agnostic - Seamless interoperability with a wide spectrum of Large Language Model APIs, including OpenAI, Anthropic, Google, Ollama, and others.
  • 🔍 In-Depth Synthesis - A specialized function (initiate_web_synthesis) for multi-stage research synthesis and structured output generation.
  • ⚙️ Configuration Flexibility - Fully parameterizable through environment variables, managed via an underlying Pydantic schema.
  • 🔗 CDP Interface - Capability to bind to an externally launched Chromium instance utilizing the Chrome DevTools Protocol endpoint.
  • ⌨️ CLI Accessibility - Direct access to the primary automation functions (execute_browser_task, initiate_web_synthesis) from the terminal, for scripting and testing.

Initial Setup

Prerequisites

  1. Install UV (Python environment manager): curl -LsSf https://astral.sh/uv/install.sh | sh

  2. Acquire necessary browser binaries via Playwright: uvx --from mcp-web-navigator@latest python -m playwright install

Integration Schema

For MCP client applications (like desktop assistants), establish the server connection via a configuration snippet, such as:

    // Configuration Snippet A: Minimal Deployment
    "mcpServers": {
      "web-navigator": {
        "command": "uvx",
        "args": ["mcp-web-navigator@latest"],
        "env": {
          "NAV_LLM_GOOGLE_API_KEY": "YOUR_KEY_HERE_IF_USING_GOOGLE",
          "NAV_LLM_PROVIDER": "google",
          "NAV_LLM_MODEL": "gemini-2.5-flash-preview-04-17",
          "NAV_BROWSER_IS_HEADLESS": "true"
        }
      }
    }

More detailed configuration examples, including CDP integration and extensive environment variable overrides, are available in the original project documentation.

Configuration Insight: Begin with the simplest setup (Snippet A). The comprehensive list of all adjustable parameters resides in the companion .env.example manifest.
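Where the client configuration is generated rather than edited by hand, Snippet A can be built programmatically. The following is a minimal Python sketch, not part of the project: build_client_config is a hypothetical helper, and the NAV_LLM_<PROVIDER>_API_KEY naming is an assumption inferred from the two key variables that appear in this document (NAV_LLM_GOOGLE_API_KEY and NAV_LLM_OPENAI_API_KEY).

```python
import json


def build_client_config(api_key: str,
                        provider: str = "google",
                        model: str = "gemini-2.5-flash-preview-04-17",
                        headless: bool = True) -> dict:
    """Return an mcpServers entry equivalent to Snippet A (hypothetical helper)."""
    return {
        "mcpServers": {
            "web-navigator": {
                "command": "uvx",
                "args": ["mcp-web-navigator@latest"],
                "env": {
                    # Assumption: key variables follow NAV_LLM_<PROVIDER>_API_KEY.
                    f"NAV_LLM_{provider.upper()}_API_KEY": api_key,
                    "NAV_LLM_PROVIDER": provider,
                    "NAV_LLM_MODEL": model,
                    # Environment values are strings, so the boolean is lowercased.
                    "NAV_BROWSER_IS_HEADLESS": str(headless).lower(),
                },
            }
        }
    }


if __name__ == "__main__":
    # json.dumps never emits trailing commas, so the output is valid JSON.
    print(json.dumps(build_client_config("YOUR_KEY_HERE"), indent=2))
```

Serializing through json.dumps avoids the hand-editing mistakes (such as a trailing comma) that strict JSON parsers reject.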

Exposed MCP Functionality

This service publishes the following functions via the Model Context Protocol:

Synchronous Operations (Blocking Call)

  1. execute_browser_task

    • Definition: Carries out a web interaction sequence dictated by natural language input, awaiting final status. Configuration draws from NAV_AGENT_TOOL_*, NAV_LLM_*, and NAV_BROWSER_* prefixes.
    • Inputs:
      • instruction_set (text, mandatory): The objective or command sequence.
    • Output: (text) The conclusive result obtained by the agent, or an error report. History data (JSON, optional short video) is archived if NAV_AGENT_TOOL_HISTORY_ARCHIVE_DIR is specified.
  2. initiate_web_synthesis

    • Definition: Executes comprehensive, multi-step web exploration on a nominated subject, resulting in a synthesized report. Configuration uses NAV_RESEARCH_TOOL_*, NAV_LLM_*, and NAV_BROWSER_* settings. Outputs are directed to a task-specific subdirectory under NAV_RESEARCH_TOOL_OUTPUT_ROOT if defined; otherwise, processing remains ephemeral (in-memory).
    • Inputs:
      • research_topic (text, mandatory): The subject matter for deep analysis.
      • max_concurrent_sessions (number, optional): Overrides the default defined in environment variables.
    • Output: (text) The final research document rendered in Markdown format, including any persistent file reference, or an execution failure notice.
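At the protocol level, an MCP client invokes either tool with a JSON-RPC tools/call request. The sketch below only builds the request payloads, using the tool and parameter names listed above; tool_call_request is a hypothetical helper, and the instruction and topic strings are illustrative. Transport and session handling are left to whichever MCP client library is in use.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC requires each in-flight request id to be unique.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


browse = tool_call_request(
    "execute_browser_task",
    {"instruction_set": "Open https://example.com and report the page title."},
)

research = tool_call_request(
    "initiate_web_synthesis",
    {
        "research_topic": "Recent developments in quantum computing architectures.",
        "max_concurrent_sessions": 3,  # optional override of the environment default
    },
)

if __name__ == "__main__":
    print(json.dumps(browse, indent=2))
    print(json.dumps(research, indent=2))
```

Because both tools are blocking, the client should expect the response only after the agent run or research pass has finished, so generous request timeouts are advisable.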

Terminal Interface (web-navigator-cli)

This package also installs a utility, web-navigator-cli, which invokes the core logic directly, outside the MCP server context.

Global Modifiers:

  • --config-file PATH, -c PATH: Specify a file containing environment overrides.
  • --verbosity LEVEL, -v LEVEL: Adjust runtime reporting level (e.g., TRACE, INFO).

Commands:

  1. web-navigator-cli perform-action [PARAMETERS] INSTRUCTION

    • Purpose: Executes a single browser interaction procedure.
    • Argument:
      • INSTRUCTION (text, required): The action the agent must take.
    • Example: web-navigator-cli perform-action "Load the documentation page and extract all H2 headings." -c .env.config
  2. web-navigator-cli synthesize-data [PARAMETERS] TOPIC

    • Purpose: Runs multi-step web research on a topic and produces a synthesized report.
    • Argument:
      • TOPIC (text, required): The subject to research.
    • Options:
      • --concurrency INTEGER, -x INTEGER: Sets the parallelism limit for this run.
    • Example: web-navigator-cli synthesize-data "Recent developments in quantum computing architectures." --concurrency 5 -c .env.config

All auxiliary settings (API credentials, path mappings, browser rendering properties) are sourced from environment variables or the specified configuration file, as detailed in the Configuration Reference section below.
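For scripting, the two commands can be wrapped in a small argument builder. This is a sketch under the assumption that the CLI accepts options in the same order as the examples above; build_cli_args is a hypothetical helper and does not run anything itself.

```python
import shlex
from typing import List, Optional

CLI = "web-navigator-cli"
COMMANDS = ("perform-action", "synthesize-data")


def build_cli_args(command: str,
                   text: str,
                   config_file: Optional[str] = None,
                   concurrency: Optional[int] = None) -> List[str]:
    """Assemble an argv list mirroring the documented examples (hypothetical helper)."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    if concurrency is not None and command != "synthesize-data":
        # --concurrency is only documented for synthesize-data.
        raise ValueError("--concurrency applies to synthesize-data only")
    args = [CLI, command, text]
    if concurrency is not None:
        args += ["--concurrency", str(concurrency)]
    if config_file is not None:
        args += ["-c", config_file]
    return args


if __name__ == "__main__":
    argv = build_cli_args(
        "synthesize-data",
        "Recent developments in quantum computing architectures.",
        config_file=".env.config",
        concurrency=5,
    )
    # shlex.join quotes the topic so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Passing the resulting list to subprocess.run(argv, check=True) avoids shell quoting problems with instructions that contain spaces or punctuation.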

Configuration Reference (Environment Variables)

Configuration employs prefixed environment variables for logical grouping.

| Variable Group (Prefix) | Example Variable | Functionality Summary | Default Setting |
|---|---|---|---|
| Core LLM (NAV_LLM_) | | Parameters governing the principal reasoning engine. | |
| | NAV_LLM_PROVIDER | The remote LLM service provider. | openai |
| | NAV_LLM_MODEL | The designated model identifier for inference. | gpt-4.1 |
| | NAV_LLM_TEMPERATURE | Stochasticity control (0.0 to 2.0). | 0.0 |
| Browser Control (NAV_BROWSER_) | | Settings affecting the underlying browser engine (Playwright/CDP). | |
| | NAV_BROWSER_IS_HEADLESS | Execute browser operations invisibly. | false |
| | NAV_BROWSER_CDP_TARGET | URL for Chrome DevTools Protocol attachment point (for external browser linkage). | - |
| | NAV_BROWSER_VIEWPORT_W | Initial rendering width in pixels. | 1280 |
| | NAV_BROWSER_PERSIST_SESSION | Maintain the browser instance state between separate server requests. | false |
| Agent Task (NAV_AGENT_TOOL_) | | Fine-tuning parameters for the immediate task execution tool. | |
| | NAV_AGENT_TOOL_MAX_CYCLES | Upper bound on decision/action loops per run. | 100 |
| | NAV_AGENT_TOOL_ENABLE_VISUAL_INPUT | Activates analysis of screen captures by the LLM. | true |
| | NAV_AGENT_TOOL_HISTORY_ARCHIVE_DIR | Location to persistently store execution logs and trace data. | (Disabled) |
| Synthesis (NAV_RESEARCH_TOOL_) | | Parameters for the deep research function. | |
| | NAV_RESEARCH_TOOL_OUTPUT_ROOT | Root directory for saving final reports. If null, results are returned solely in memory. | None |
| System Paths (NAV_PATHS_) | | Global file system locations managed by the service. | |
| | NAV_PATHS_ARTIFACT_CACHE | Designated directory for temporary file storage or downloads. | (Disabled) |
| Service Operation (NAV_SERVER_) | | Settings related to server process management and reporting. | |
| | NAV_SERVER_VERBOSITY_LEVEL | Detail level for internal logging output. | ERROR |

Enumerated LLM Providers: openai, azure, anthropic, google, mistral, ollama, deepseek, openrouter, alibaba, moonshot, unbound
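To illustrate how prefixed variables map onto typed settings, the sketch below reads a handful of them with the standard library only. It is a stand-in for the project's Pydantic schema, not a copy of it: NavSettings and load_settings are hypothetical names, the accepted boolean spellings are an assumption, and the defaults are taken from the reference above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Provider names as enumerated in this document.
PROVIDERS = (
    "openai", "azure", "anthropic", "google", "mistral", "ollama",
    "deepseek", "openrouter", "alibaba", "moonshot", "unbound",
)


def _as_bool(raw: str) -> bool:
    # Assumption: common truthy spellings; the real schema may accept others.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class NavSettings:
    llm_provider: str = "openai"
    llm_model: str = "gpt-4.1"
    llm_temperature: float = 0.0
    browser_is_headless: bool = False
    agent_max_cycles: int = 100


def load_settings(env: Mapping[str, str] = os.environ) -> NavSettings:
    """Read a subset of NAV_* variables, falling back to the documented defaults."""
    provider = env.get("NAV_LLM_PROVIDER", "openai")
    if provider not in PROVIDERS:
        raise ValueError(f"unsupported provider: {provider}")
    temperature = float(env.get("NAV_LLM_TEMPERATURE", "0.0"))
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("NAV_LLM_TEMPERATURE must be within 0.0 to 2.0")
    return NavSettings(
        llm_provider=provider,
        llm_model=env.get("NAV_LLM_MODEL", "gpt-4.1"),
        llm_temperature=temperature,
        browser_is_headless=_as_bool(env.get("NAV_BROWSER_IS_HEADLESS", "false")),
        agent_max_cycles=int(env.get("NAV_AGENT_TOOL_MAX_CYCLES", "100")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Accepting a mapping rather than reading os.environ directly keeps the loader easy to test with a plain dictionary.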

External Browser Binding (CDP Mode)

To detach from server-managed browser instances and instead interface with a user-initiated session:

  1. Start Chromium: Execute Chrome/Chromium with the remote debugging flag: (<executable> --remote-debugging-port=9222)

  2. Configure Environment: Set variables to point to this running instance:

       NAV_BROWSER_USE_OWN_SESSION=true
       NAV_BROWSER_CDP_TARGET=http://localhost:9222

  3. Initiate Service: Launch the server or CLI as usual.

Caveat: When NAV_BROWSER_USE_OWN_SESSION=true, parameters controlling server-managed browser behavior (like headless mode or keep-alive) are disregarded.
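Before starting the service in CDP mode, it can help to confirm that the browser is actually listening. Chromium started with --remote-debugging-port serves a small HTTP endpoint at /json/version; the sketch below queries it with the standard library. cdp_version_url and probe_cdp are hypothetical helper names, and the default target is the address used in the steps above.

```python
import json
from urllib.request import urlopen


def cdp_version_url(cdp_target: str) -> str:
    """Return the version endpoint for a CDP target such as http://localhost:9222."""
    return cdp_target.rstrip("/") + "/json/version"


def probe_cdp(cdp_target: str = "http://localhost:9222", timeout: float = 3.0) -> dict:
    """Fetch browser metadata; raises URLError/OSError if nothing is listening."""
    with urlopen(cdp_version_url(cdp_target), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        info = probe_cdp()
    except OSError as exc:  # URLError is a subclass of OSError
        print(f"CDP endpoint not reachable: {exc}")
    else:
        # The response normally includes the browser name and a WebSocket URL.
        print(info.get("Browser"), info.get("webSocketDebuggerUrl"))
```

A failed probe usually means the browser was started without the debugging flag or on a different port than NAV_BROWSER_CDP_TARGET specifies.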

Development Workflow

    # Initialize dependencies
    uv sync --dev

    # Install Playwright dependencies
    uv run playwright install

    # Example of running the inspector tool using environment variables for an external CDP session:
    npx @modelcontextprotocol/inspector@latest \
      -e NAV_LLM_GOOGLE_API_KEY=$GOOGLE_API_KEY \
      -e NAV_LLM_PROVIDER=google \
      -e NAV_BROWSER_USE_OWN_SESSION=true \
      -e NAV_BROWSER_CDP_TARGET=http://localhost:9222 \
      uv --directory . run mcp-web-navigator

    # Execute a CLI agent test (ensure a config file or environment variables are present)
    uv run web-navigator-cli perform-action "Verify the primary H1 tag on https://www.wikipedia.org/" -c .env.config

Diagnostics and Error Resolution

  • Startup Failure (Missing Parameter): Verify that every setting required by the active configuration schema is present, including any paths your configuration depends on (for example NAV_RESEARCH_TOOL_OUTPUT_ROOT, which is otherwise optional).
  • CDP Binding Failures: Confirm that the browser executable was initiated with the necessary --remote-debugging-port flag active and accessible at the specified NAV_BROWSER_CDP_TARGET address.
  • API Key Errors: Revalidate the credentials stored in the environment variables (e.g., NAV_LLM_OPENAI_API_KEY).
  • Data Persistence Failure: If history logging, tracing, or file downloads are not recorded, ensure the corresponding path variables are set and the execution context possesses the necessary write permissions for those directories.
  • Troubleshooting Logging: Set NAV_SERVER_VERBOSITY_LEVEL to TRACE to increase logging verbosity for granular insights.
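The data persistence check above can be automated. The following sketch inspects the three path variables named in the configuration reference and reports whether each is unset, missing, or not writable; check_path_vars is a hypothetical helper, and the service itself may create missing directories on its own.

```python
import os
from typing import Dict, Mapping

# Path-type variables from the configuration reference.
PATH_VARS = (
    "NAV_AGENT_TOOL_HISTORY_ARCHIVE_DIR",
    "NAV_RESEARCH_TOOL_OUTPUT_ROOT",
    "NAV_PATHS_ARTIFACT_CACHE",
)


def check_path_vars(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Classify each path variable as unset, missing directory, not writable, or ok."""
    report = {}
    for name in PATH_VARS:
        value = env.get(name)
        if not value:
            report[name] = "unset"
        elif not os.path.isdir(value):
            report[name] = "missing directory"
        elif not os.access(value, os.W_OK):
            report[name] = "not writable"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for variable, status in check_path_vars().items():
        print(f"{variable}: {status}")
```

Run it in the same environment (and as the same user) as the server, since write permissions depend on the execution context.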

Licensing

Distributed under the terms of the MIT License. See the [LICENSE] file for specifics.
