WebNavigator-MCP

A server that gives automated agents programmatic control of web browsers through structured accessibility data. It supports web interaction, data retrieval, and functional validation without requiring any visual processing.

Author

markbustamante77

Apache License 2.0

Quick Info

GitHub Stars: 1
NPM Weekly Downloads: 17624
Tools: 1
Last Updated: 2026-02-19

Tags

scraping, automation, browser, browser automation, automation web, web navigation

Playwright Model Context Protocol (MCP) Server

This MCP server is built on Playwright. It gives language models a deterministic way to interact with dynamic web content through machine-readable accessibility tree snapshots, rather than through screenshots or vision models.

Core Advantages

  • Performance Optimized: Works from the accessibility tree rather than pixel data, which makes operations faster and lighter than screenshot-based methods.
  • Vision-Agnostic: Operates purely on structured, semantic data, so no vision model is required.
  • Predictable Execution: Targets elements through the accessibility tree, avoiding the ambiguity of coordinate-based, screenshot-driven approaches.

Primary Applications

  • Executing multi-step web workflows such as form submission and site navigation.
  • Extracting data from structured page content, including deeply nested elements.
  • Running end-to-end verification of web application functionality.

Configuration Snippet (Agent Setup)

    {
      "mcpServers": {
        "web_navigator": {
          "command": "npx",
          "args": ["@playwright/mcp@latest"]
        }
      }
    }
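To make the shape of this entry concrete, the following Python sketch parses the same JSON and derives the argument vector a client would spawn for the `web_navigator` entry. The helper name `server_argv` is hypothetical, and real MCP clients may handle environment variables or other fields not shown here.

```python
import json
from typing import List

# The configuration from the snippet above, as a JSON string.
CONFIG_TEXT = """
{
  "mcpServers": {
    "web_navigator": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
"""


def server_argv(config: dict, name: str) -> List[str]:
    """Return [command, *args] for one entry under "mcpServers" (hypothetical helper)."""
    entry = config["mcpServers"][name]
    return [entry["command"], *entry.get("args", [])]


config = json.loads(CONFIG_TEXT)
argv = server_argv(config, "web_navigator")
print(argv)
```

The key under `mcpServers` (`web_navigator` here) is only a label chosen by the user; the `command` and `args` fields determine what actually runs.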

Deployment Integration in IDE Environments

To integrate this server directly into your workspace (e.g., VS Code), use the "Install in VS Code" button on the project page.

Alternatively, register it from the command line:

    code --add-mcp '{"name":"web_navigator","command":"npx","args":["@playwright/mcp@latest"]}'
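Hand-writing JSON inside shell quotes is easy to get wrong. As a minimal sketch, the Python below generates the same registration command by serializing the payload with `json.dumps` and quoting it with `shlex`. The function name `add_mcp_command` is hypothetical, and the quoting assumes a POSIX shell; Windows shells quote differently.

```python
import json
import shlex
from typing import List


def add_mcp_command(name: str, command: str, args: List[str]) -> str:
    """Build a `code --add-mcp` command line with a safely quoted JSON payload."""
    payload = json.dumps(
        {"name": name, "command": command, "args": args},
        separators=(",", ":"),  # compact form, as in the command above
    )
    # shlex.join quotes each argument for a POSIX shell.
    return shlex.join(["code", "--add-mcp", payload])


cmd = add_mcp_command("web_navigator", "npx", ["@playwright/mcp@latest"])
print(cmd)
```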

Server Runtime Customization

The Playwright MCP service supports various launch arguments to fine-tune its operation:

  • --browser <engine>: Selects the browser to launch. Options include chrome, firefox, webkit, or a specific channel (e.g., chrome-canary). Default is Chromium.
  • --caps <features>: A comma-separated list of capabilities to enable (e.g., tabs,pdf).
  • --headless: Runs the browser without a visible window.
  • --port <socket>: Sets the network port used for Server-Sent Events (SSE) communication.
  • --vision: Enables vision mode, which works from screenshots and screen coordinates instead of the accessibility tree.
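The flags above are appended to the `args` array of the agent configuration. The following Python sketch assembles that array from keyword options and wraps it in an `mcpServers` entry. The helper names `build_args` and `build_config` are hypothetical, the comma separator for `--caps` is inferred from the `tabs,pdf` example, and the port number in the usage is an arbitrary placeholder.

```python
import json
from typing import List, Optional


def build_args(
    browser: Optional[str] = None,
    caps: Optional[List[str]] = None,
    headless: bool = False,
    port: Optional[int] = None,
    vision: bool = False,
) -> List[str]:
    """Translate options into the launch arguments listed above."""
    args = ["@playwright/mcp@latest"]
    if browser:
        args += ["--browser", browser]
    if caps:
        args += ["--caps", ",".join(caps)]  # assumed comma-separated
    if headless:
        args.append("--headless")
    if port is not None:
        args += ["--port", str(port)]
    if vision:
        args.append("--vision")
    return args


def build_config(name: str, **options) -> dict:
    """Wrap the arguments in an mcpServers entry launched through npx."""
    return {"mcpServers": {name: {"command": "npx", "args": build_args(**options)}}}


config = build_config("web_navigator", browser="firefox", caps=["tabs", "pdf"], port=8931)
print(json.dumps(config, indent=2))
```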

User Profile Persistence

The server launches the browser with a dedicated, isolated profile stored at:

  • Windows: %USERPROFILE%\AppData\Local\ms-playwright\mcp-chrome-profile
  • macOS: ~/Library/Caches/ms-playwright/mcp-chrome-profile
  • Linux: ~/.cache/ms-playwright/mcp-chrome-profile

Stateful session data (like cookies/logins) is retained here unless the directory is manually cleared.
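For scripts that need to inspect or reset this state, the Python sketch below resolves the profile directory for the current platform using the three paths listed above. The function name `profile_dir` is hypothetical, and falling back to the home directory when `USERPROFILE` is unset is an assumption of this example.

```python
import os
import sys
from pathlib import Path


def profile_dir(platform: str = sys.platform) -> Path:
    """Return the profile location listed above for the given platform name."""
    if platform.startswith("win"):
        # %USERPROFILE%\AppData\Local on Windows
        base = Path(os.environ.get("USERPROFILE", str(Path.home()))) / "AppData" / "Local"
    elif platform == "darwin":
        base = Path.home() / "Library" / "Caches"
    else:
        # Treated as Linux
        base = Path.home() / ".cache"
    return base / "ms-playwright" / "mcp-chrome-profile"


path = profile_dir()
print(path, "(exists)" if path.exists() else "(not created yet)")
```

Deleting the directory returned here is the manual clearing step mentioned above; the sketch only reports whether it exists.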

Headless Execution Example

To run the browser in the background without a GUI, add --headless to the arguments:

    {
      "mcpServers": {
        "web_navigator": {
          "command": "npx",
          "args": ["@playwright/mcp@latest", "--headless"]
        }
      }
    }

API Operations (Snapshot Mode - Default)

These functions target elements identified via the accessibility hierarchy:

  • browser_click: Clicks the targeted element.
  • browser_type: Types text into an input field.
  • browser_select_option: Selects an option in a <select> element.
  • browser_snapshot: Captures an accessibility snapshot of the current page (preferred over screenshots for deciding on actions).
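MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. As a rough illustration, the Python sketch below builds such requests for two of the tools listed above. The argument names `element` and `ref`, and the reference value `e12`, are assumptions made for this example; the authoritative input schema is whatever the server reports when its tools are listed.

```python
import itertools
import json
from typing import Optional

_request_ids = itertools.count(1)  # JSON-RPC requests need distinct ids


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Take a snapshot first, then act on an element found in it.
snapshot_request = tool_call("browser_snapshot")
click_request = tool_call(
    "browser_click",
    {"element": "Submit button", "ref": "e12"},  # assumed argument names and value
)
print(json.dumps(click_request, indent=2))
```

The ordering reflects the snapshot-first workflow: an agent reads the accessibility snapshot, picks an element from it, and only then issues the interaction.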

API Operations (Vision Mode - Screenshot Dependent)

Activated via the --vision flag, these operations rely on screen coordinates:

  • browser_screen_move_mouse: Moves the mouse cursor to the given (X, Y) screen coordinates.
  • browser_screen_click: Clicks at the given screen coordinates.
  • browser_screen_type: Enters text by sending keyboard events to the currently focused area.
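Because vision-mode typing goes to whatever currently has focus, filling a field usually means clicking its coordinates first. The Python sketch below plans that sequence as a list of (tool name, arguments) pairs. The argument names `x`, `y`, and `text`, the explicit mouse move before the click, and the helper `click_and_type` are all assumptions of this example rather than a documented contract.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScreenPoint:
    """A screen coordinate, e.g. one read off a screenshot."""
    x: int
    y: int


def click_and_type(point: ScreenPoint, text: str) -> List[Tuple[str, dict]]:
    """Plan the vision-mode calls that focus a location and then type into it."""
    target = {"x": point.x, "y": point.y}  # assumed argument names
    return [
        ("browser_screen_move_mouse", dict(target)),
        ("browser_screen_click", dict(target)),  # the click gives the field focus
        ("browser_screen_type", {"text": text}),
    ]


steps = click_and_type(ScreenPoint(640, 360), "hello")
for name, arguments in steps:
    print(name, arguments)
```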
