WebNavigator MCP Endpoint & CLI

Foundation Note: This execution component builds upon the architecture established by the [browser-use/web-ui] project, adapting its core automation protocols and configuration schemes.

An implementation of the Model Context Protocol (MCP) centered around autonomous browser manipulation, driven by natural language prompts. It exposes server capabilities and direct command-line tooling.

Core Capabilities

🧠 Protocol Adherence - Complete integration with the Model Context Protocol for Agent communication.
🌐 Browser Execution - Orchestrates navigation, user input submission, and interactive element manipulation from plain language directives (through the execute_browser_task tool).
👁️ Visual Context - Enables optional analysis of rendered page state (screenshots) for multimodal Language Models.
🔄 Session Management - Supports persistent browser contexts spanning multiple requests or direct attachment to user-controlled browser sessions.
🔌 Provider Agnostic - Works with multiple Large Language Model APIs, including OpenAI, Anthropic, Google, Ollama, and others.
🔍 In-Depth Synthesis - A specialized function (initiate_web_synthesis) for multi-stage research synthesis and structured output generation.
⚙️ Configuration Flexibility - Fully parameterizable through environment variables, managed via an underlying Pydantic schema.
🔗 CDP Interface - Capability to bind to an externally launched Chromium instance utilizing the Chrome DevTools Protocol endpoint.
⌨️ CLI Accessibility - Direct access to primary automation functions (execute_browser_task, initiate_web_synthesis) from the terminal for scripting and validation.

Initial Setup

Prerequisites

Install UV (Python environment manager): curl -LsSf https://astral.sh/uv/install.sh | sh
Acquire necessary browser binaries via Playwright: uvx --from mcp-web-navigator@latest python -m playwright install

Integration Schema

For MCP client applications (like desktop assistants), establish the server connection via a configuration snippet, such as:

// Configuration Snippet A: Minimal Deployment
"mcpServers": {
  "web-navigator": {
    "command": "uvx",
    "args": ["mcp-web-navigator@latest"],
    "env": {
      "NAV_LLM_GOOGLE_API_KEY": "YOUR_KEY_HERE_IF_USING_GOOGLE",
      "NAV_LLM_PROVIDER": "google",
      "NAV_LLM_MODEL": "gemini-2.5-flash-preview-04-17",
      "NAV_BROWSER_IS_HEADLESS": "true"
    }
  }
}

(More detailed configuration examples, covering CDP integration and extended environment variable overrides, are available in the original project documentation.)

Configuration Insight: Begin with the simplest setup (Snippet A). The comprehensive list of all adjustable parameters resides in the companion .env.example manifest.
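As a rough sketch, the minimal entry from Snippet A can also be generated programmatically and serialized with a JSON library, which sidesteps hand-editing mistakes. The helper name build_client_config is hypothetical; the keys and variable names are the ones shown in Snippet A.

```python
import json


def build_client_config(api_key: str, headless: bool = True) -> dict:
    """Return an mcpServers entry equivalent to Snippet A (hypothetical helper)."""
    return {
        "mcpServers": {
            "web-navigator": {
                "command": "uvx",
                "args": ["mcp-web-navigator@latest"],
                "env": {
                    "NAV_LLM_GOOGLE_API_KEY": api_key,
                    "NAV_LLM_PROVIDER": "google",
                    "NAV_LLM_MODEL": "gemini-2.5-flash-preview-04-17",
                    # Environment values are strings, so the boolean is lowered.
                    "NAV_BROWSER_IS_HEADLESS": str(headless).lower(),
                },
            }
        }
    }


if __name__ == "__main__":
    # Paste the printed JSON into the MCP client's configuration file.
    print(json.dumps(build_client_config("YOUR_KEY_HERE"), indent=2))
```

Where the resulting JSON belongs depends on the MCP client in use; consult that client's documentation for the file location.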

Exposed MCP Functionality

This service publishes the following functions via the Model Context Protocol:

Synchronous Operations (Blocking Call)

execute_browser_task
- Definition: Carries out a web interaction sequence dictated by natural language input, awaiting final status. Configuration draws from NAV_AGENT_TOOL_*, NAV_LLM_*, and NAV_BROWSER_* prefixes.
- Inputs:
  - instruction_set (text, mandatory): The objective or command sequence.
- Output: (text) The conclusive result obtained by the agent, or an error report. History data (JSON, optional short video) is archived if NAV_AGENT_TOOL_HISTORY_ARCHIVE_DIR is specified.
initiate_web_synthesis
- Definition: Executes comprehensive, multi-step web exploration on a nominated subject, resulting in a synthesized report. Configuration uses NAV_RESEARCH_TOOL_*, NAV_LLM_*, and NAV_BROWSER_* settings. Outputs are directed to a task-specific subdirectory under NAV_RESEARCH_TOOL_OUTPUT_ROOT if defined; otherwise, processing remains ephemeral (in-memory).
- Inputs:
  - research_topic (text, mandatory): The subject matter for deep analysis.
  - max_concurrent_sessions (number, optional): Overrides the default defined in environment variables.
- Output: (text) The final research document rendered in Markdown format, including any persistent file reference, or an execution failure notice.
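For orientation, MCP tool invocations travel as JSON-RPC 2.0 requests using the tools/call method. MCP client libraries normally build these for you, so the following is only an illustrative sketch of what a call to each tool could look like on the wire. The helper tool_call_request and the example instruction text are hypothetical; the tool and argument names are those listed above.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def tool_call_request(tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


browse = tool_call_request(
    "execute_browser_task",
    {"instruction_set": "Open https://example.com and report the page title."},
)
research = tool_call_request(
    "initiate_web_synthesis",
    {"research_topic": "Headless browser automation", "max_concurrent_sessions": 3},
)

if __name__ == "__main__":
    print(json.dumps(browse, indent=2))
    print(json.dumps(research, indent=2))
```

Both calls block until the tool finishes, so a client should allow a generous timeout for research runs.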

Terminal Interface (`web-navigator-cli`)

This package also installs a utility, web-navigator-cli, enabling direct invocation of core logic outside the MCP server context.

Global Modifiers:
- --config-file PATH, -c PATH: Specify a file containing environment overrides.
- --verbosity LEVEL, -v LEVEL: Adjust the runtime reporting level (e.g., TRACE, INFO).

Commands:

web-navigator-cli perform-action [PARAMETERS] INSTRUCTION
- Purpose: Executes a single browser interaction procedure.
- Argument:
  - INSTRUCTION (text, required): The action the agent must take.
- Example: web-navigator-cli perform-action "Load the documentation page and extract all H2 headings." -c .env.config
web-navigator-cli synthesize-data [PARAMETERS] TOPIC
- Purpose: Runs a multi-step web synthesis on the given topic.
- Argument:
  - TOPIC (text, required): The subject to research.
- Options:
  - --concurrency INTEGER, -x INTEGER: Sets the parallelism limit for this run.
- Example: web-navigator-cli synthesize-data "Recent developments in quantum computing architectures." --concurrency 5 -c .env.config

All auxiliary settings (API credentials, path mappings, browser rendering properties) are sourced from environment variables or the specified configuration file, as detailed in the Configuration Reference section below.
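When scripting against the CLI, it can help to assemble the argument list in one place. The sketch below is a hypothetical wrapper (cli_command and run_cli are not part of the package); it uses only the subcommands and options documented above and mirrors the option placement of the examples.

```python
import subprocess
from typing import Optional

SUBCOMMANDS = ("perform-action", "synthesize-data")


def cli_command(subcommand: str, text: str,
                config_file: Optional[str] = None,
                concurrency: Optional[int] = None) -> list[str]:
    """Assemble argv for web-navigator-cli (hypothetical helper)."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    if concurrency is not None and subcommand != "synthesize-data":
        raise ValueError("--concurrency only applies to synthesize-data")
    cmd = ["web-navigator-cli", subcommand, text]
    if concurrency is not None:
        cmd += ["--concurrency", str(concurrency)]
    if config_file is not None:
        cmd += ["-c", config_file]
    return cmd


def run_cli(cmd: list[str]) -> str:
    """Run the command and return its stdout; raises CalledProcessError on failure."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


if __name__ == "__main__":
    print(cli_command("synthesize-data", "Quantum computing architectures",
                      config_file=".env.config", concurrency=5))
```

Passing the arguments as a list avoids shell quoting problems with instructions that contain quotes or spaces.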

Configuration Reference (Environment Variables)

Configuration employs prefixed environment variables for logical grouping.

Variable Group (Prefix)	Example Variable	Functionality Summary	Default Setting
Core LLM (`NAV_LLM_`)		Parameters governing the principal reasoning engine.
	`NAV_LLM_PROVIDER`	The remote LLM service provider.	`openai`
	`NAV_LLM_MODEL`	The designated model identifier for inference.	`gpt-4.1`
	`NAV_LLM_TEMPERATURE`	Stochasticity control (0.0 to 2.0).	`0.0`
Browser Control (`NAV_BROWSER_`)		Settings affecting the underlying browser engine (Playwright/CDP).
	`NAV_BROWSER_IS_HEADLESS`	Execute browser operations invisibly.	`false`
	`NAV_BROWSER_CDP_TARGET`	URL for Chrome DevTools Protocol attachment point (for external browser linkage).	-
	`NAV_BROWSER_VIEWPORT_W`	Initial rendering width in pixels.	`1280`
	`NAV_BROWSER_PERSIST_SESSION`	Maintain the browser instance state between separate server requests.	`false`
Agent Task (`NAV_AGENT_TOOL_`)		Fine-tuning parameters for the immediate task execution tool.
	`NAV_AGENT_TOOL_MAX_CYCLES`	Upper bound on decision/action loops per run.	`100`
	`NAV_AGENT_TOOL_ENABLE_VISUAL_INPUT`	Activates analysis of screen captures by the LLM.	`true`
	`NAV_AGENT_TOOL_HISTORY_ARCHIVE_DIR`	Location to persistently store execution logs and trace data.	(Disabled)
Synthesis (`NAV_RESEARCH_TOOL_`)		Parameters for the deep research function.
	`NAV_RESEARCH_TOOL_OUTPUT_ROOT`	Root directory for saving final reports. If null, results are returned solely in memory.	`None`
System Paths (`NAV_PATHS_`)		Global file system locations managed by the service.
	`NAV_PATHS_ARTIFACT_CACHE`	Designated directory for temporary file storage or downloads.	(Disabled)
Service Operation (`NAV_SERVER_`)		Settings related to server process management and reporting.
	`NAV_SERVER_VERBOSITY_LEVEL`	Detail level for internal logging output.	`ERROR`

Enumerated LLM Providers: openai, azure, anthropic, google, mistral, ollama, deepseek, openrouter, alibaba, moonshot, unbound
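To illustrate how the prefixed variables map onto grouped settings, here is a minimal standard-library sketch. It is not the project's actual Pydantic schema: the class and function names are hypothetical, and only a few variables from the table are modeled, with the defaults the table lists.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(raw: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class LLMSettings:
    provider: str = "openai"
    model: str = "gpt-4.1"
    temperature: float = 0.0


@dataclass(frozen=True)
class BrowserSettings:
    is_headless: bool = False
    viewport_w: int = 1280
    persist_session: bool = False
    cdp_target: Optional[str] = None


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    temperature = float(env.get("NAV_LLM_TEMPERATURE", "0.0"))
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("NAV_LLM_TEMPERATURE must be between 0.0 and 2.0")
    return LLMSettings(
        provider=env.get("NAV_LLM_PROVIDER", "openai"),
        model=env.get("NAV_LLM_MODEL", "gpt-4.1"),
        temperature=temperature,
    )


def load_browser_settings(env: Mapping[str, str] = os.environ) -> BrowserSettings:
    return BrowserSettings(
        is_headless=_as_bool(env.get("NAV_BROWSER_IS_HEADLESS", "false")),
        viewport_w=int(env.get("NAV_BROWSER_VIEWPORT_W", "1280")),
        persist_session=_as_bool(env.get("NAV_BROWSER_PERSIST_SESSION", "false")),
        cdp_target=env.get("NAV_BROWSER_CDP_TARGET") or None,
    )


if __name__ == "__main__":
    print(load_llm_settings())
    print(load_browser_settings())
```

Accepting the environment as a mapping parameter keeps the loaders easy to test without mutating the real process environment.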

External Browser Binding (CDP Mode)

To detach from server-managed browser instances and instead interface with a user-initiated session:

Start Chromium: Launch Chrome/Chromium with the remote debugging flag: <executable> --remote-debugging-port=9222
Configure Environment: Point the service at the running instance by setting:
  NAV_BROWSER_USE_OWN_SESSION=true
  NAV_BROWSER_CDP_TARGET=http://localhost:9222
Initiate Service: Launch the server or CLI as usual.

Caveat: When NAV_BROWSER_USE_OWN_SESSION=true, parameters controlling server-managed browser behavior (like headless mode or keep-alive) are disregarded.
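Before starting the service in CDP mode, it may be worth confirming that the browser is actually listening. Chromium serves a small HTTP endpoint at /json/version on the remote debugging port; the sketch below queries it with the standard library. The function names are hypothetical, and this check is not something the service itself is documented to perform.

```python
import http.client
import json
import urllib.request
from typing import Optional


def cdp_version_url(cdp_target: str) -> str:
    """Derive the DevTools version endpoint from a NAV_BROWSER_CDP_TARGET value."""
    return cdp_target.rstrip("/") + "/json/version"


def probe_cdp(cdp_target: str, timeout: float = 2.0) -> Optional[dict]:
    """Return the browser's version info, or None if the endpoint is unreachable."""
    try:
        with urllib.request.urlopen(cdp_version_url(cdp_target), timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return None


if __name__ == "__main__":
    info = probe_cdp("http://localhost:9222")
    if info is None:
        print("No DevTools endpoint found; was the browser started with "
              "--remote-debugging-port=9222?")
    else:
        print("Connected to:", info.get("Browser"))
```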

Development Workflow

Initialize dependencies

uv sync --dev

Install Playwright browser binaries

uv run playwright install

Example of running the inspector tool using environment variables for an external CDP session:

npx @modelcontextprotocol/inspector@latest \
  -e NAV_LLM_GOOGLE_API_KEY=$GOOGLE_API_KEY \
  -e NAV_LLM_PROVIDER=google \
  -e NAV_BROWSER_USE_OWN_SESSION=true \
  -e NAV_BROWSER_CDP_TARGET=http://localhost:9222 \
  uv --directory . run mcp-web-navigator

Execute a CLI agent test (ensure a config file or environment variables are present)

uv run web-navigator-cli perform-action "Verify the primary H1 tag on https://www.wikipedia.org/" -c .env.config

Diagnostics and Error Resolution

Startup Failure (Missing Parameter): Verify that every setting required by the active configuration schema is present, checking path settings such as NAV_RESEARCH_TOOL_OUTPUT_ROOT first.
CDP Binding Failures: Confirm that the browser was started with the --remote-debugging-port flag and that it is reachable at the address given in NAV_BROWSER_CDP_TARGET.
API Key Errors: Revalidate the credentials stored in the environment variables (e.g., NAV_LLM_OPENAI_API_KEY).
Data Persistence Failure: If history logs, traces, or file downloads are not being recorded, ensure the corresponding path variables are set and that the process has write permission for those directories.
Troubleshooting Logging: Set NAV_SERVER_VERBOSITY_LEVEL to TRACE for more granular log output.
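The data persistence check above can be scripted. The following sketch reports, for each path variable named in the configuration table, whether it is unset, writable, or not writable. The function name and the report wording are hypothetical, and note that the check creates the directory if it does not yet exist.

```python
import os
import tempfile
from typing import Mapping

# Path variables taken from the configuration reference table.
PATH_VARIABLES = (
    "NAV_AGENT_TOOL_HISTORY_ARCHIVE_DIR",
    "NAV_RESEARCH_TOOL_OUTPUT_ROOT",
    "NAV_PATHS_ARTIFACT_CACHE",
)


def check_path_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Map each path variable to 'unset', 'writable', or 'not writable'."""
    report = {}
    for name in PATH_VARIABLES:
        value = env.get(name)
        if not value:
            report[name] = "unset"
            continue
        try:
            # Side effect: creates the directory when it is missing.
            os.makedirs(value, exist_ok=True)
            # Creating a throwaway file proves write access in practice.
            with tempfile.TemporaryFile(dir=value):
                pass
            report[name] = "writable"
        except OSError:
            report[name] = "not writable"
    return report


if __name__ == "__main__":
    for variable, status in check_path_settings().items():
        print(f"{variable}: {status}")
```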

Licensing

Distributed under the terms of the MIT License. See the [LICENSE] file for specifics.

mcp-web-navigator

Author: Saik0s
