browser-interaction-toolkit-mcp

A suite of utilities leveraging a dedicated Chrome extension to capture browser telemetry and analyze page state, empowering AI agents with advanced contextual interaction capabilities.

Author

AgentDeskAI

MIT License

Quick Info

GitHub Stars: 6656
NPM Weekly Downloads: 3555
Tools: 1
Last Updated: 2026-02-19

Tags

browser, automation, scraping, browser automation, agentdeskai, web

Web Context Augmentation Platform (BrowserTools MCP)

Elevate your artificial intelligence assistants with 10x greater perception and control over the active web browsing session.

This sophisticated utility package functions as a comprehensive monitoring and operational conduit. It employs a specialized Chrome extension to intercept and interpret browser data streams, making this information accessible to AI applications via Anthropic's Model Context Protocol (MCP).

Consult our official documentation for comprehensive setup instructions, initial deployment guides, and participation details.

Development Trajectory

View the anticipated features and milestones on our project board: GitHub Roadmap / Project Board

Recent Enhancements (v1.2.0)

Version 1.2.0 introduces significant feature upgrades:

  • New setting in DevTools: "Allow Auto-Paste into Cursor," enabling automatic injection of captured screenshots into the active Agent input field within Cursor (ensure the input field has focus).
  • Incorporation of Lighthouse analysis modules for evaluating SEO, operational performance, accessibility compliance, and adherence to general web standards.
  • Introduction of a specialized prompt template designed to optimize SEO for NextJS applications.
  • Addition of 'Debugger Mode,' which executes a predefined sequence of diagnostic tools, accompanied by an enhanced reasoning prompt.
  • Addition of 'Audit Mode,' which runs all available auditing routines sequentially.
  • Remediation of network connectivity failures specific to Windows operating systems.
  • Substantial improvements to inter-component networking (Server, Extension, MCP), including automated host/port identification, persistent reconnection logic, and orderly termination procedures.
  • A simpler exit mechanism for the Browser Tools server via Ctrl+C.

Initial Setup Guide

Deploying this MCP utility requires launching three distinct elements:

  1. Acquire and install the Chrome extension package here: v1.2.0 BrowserToolsMCP Chrome Extension
  2. Initiate the MCP server component via your integrated development environment (IDE) using: npx @agentdeskai/browser-tools-mcp@latest
  3. Launch the supporting backend service in a separate terminal window: npx @agentdeskai/browser-tools-server@latest

Note: IDE configuration methods vary; consult your specific IDE documentation for the correct setup parameters.
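
As an illustration of step 2, several MCP clients (Cursor and Claude Desktop among them) read a JSON settings file containing an `mcpServers` map. The sketch below builds such an entry in TypeScript and prints the JSON. The label `browser-tools` and the exact schema are assumptions for this example, so check your IDE's documentation before relying on it.

```typescript
// Shape of one server entry in an `mcpServers` map (assumed client schema).
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// "browser-tools" is an arbitrary label chosen for this example.
const config: McpClientConfig = {
  mcpServers: {
    "browser-tools": {
      command: "npx",
      args: ["@agentdeskai/browser-tools-mcp@latest"],
    },
  },
};

// Serialize to the JSON text you would paste into the client's MCP settings.
const configJson: string = JSON.stringify(config, null, 2);
console.log(configJson);
```

The backend service from step 3 is not part of this file; it still has to be started separately in its own terminal.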

CRITICAL DISTINCTION: You must deploy two distinct backend services:

  • browser-tools-server: A local Node.js process acting as the data acquisition middleware.
  • browser-tools-mcp: The protocol server installed within your IDE that manages communication between the extension and the local middleware.

When integrating into your IDE, use: npx @agentdeskai/browser-tools-mcp@latest

When running the supporting service in a separate terminal, use: npx @agentdeskai/browser-tools-server@latest

Once these prerequisites are met, activate the BrowserToolsMCP tab within Chrome Developer Tools.

Troubleshooting advice if functionality is impaired:

  • Completely terminate all instances of the Chrome browser application (not just closing windows).
  • Restart the local middleware (browser-tools-server).
  • Verify that only a single instance of the Chrome DevTools panel is active.

If issues persist, please report them so I can provide further diagnostic steps to collect necessary operational data!

If you encounter any obstacles or have suggestions for feature enrichment, please submit an issue ticket or connect with me directly via @tedx_ai on X.

Comprehensive Change Log Detail:

AI assistants like Cursor can now run SEO, performance, accessibility, and best-practices audits against the currently displayed webpage. By integrating Puppeteer with the Lighthouse npm library, BrowserTools MCP now supports:

  • Compliance validation against WCAG guidelines.
  • Pinpointing performance degradation sources.
  • Identification of on-page Search Engine Optimization deficiencies.
  • Review of adherence to modern web development conventions.
  • Specific vulnerability checks for NextJS implementations.

... all executable without leaving your coding environment! 🎉


🔑 Core Capabilities Added

| Assessment Type | Functionality Summary |
| --- | --- |
| Accessibility | WCAG conformance tests covering contrast ratios, missing image descriptions, keyboard navigation traps, ARIA implementation, and more. |
| Performance | Lighthouse-derived analysis targeting render-blocking assets, overly complex DOM structures, image optimization status, and other latency contributors. |
| SEO | Evaluation of critical on-page factors (metadata, heading hierarchy, link structure) with recommendations for enhanced search engine discoverability. |
| Best Practices | Verification against established conventions in contemporary web engineering. |
| NextJS Audit | Execution of a focused prompt suite tailored for NextJS framework adherence. |
| Audit Mode | A sequential execution of all available assessment routines. |
| Debugger Mode | A sequential execution of all available diagnostic routines. |

🛠️ Executing Contextual Audits

Prerequisites

Ensure you have:

  • An active browser tab loaded.
  • The BrowserTools extension activated.

▶️ Initiating Assessments

Automated Headless Browsing: Puppeteer drives a non-visual Chrome instance to render the page and gather diagnostic metrics. This ensures reliable data capture, even for pages heavily reliant on JavaScript (SPAs).

The headless browser instance stays alive for 60 seconds after the last audit call, so audits issued in quick succession reuse it instead of relaunching Chrome.
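
The 60-second keep-alive described above can be sketched as a small idle-timeout wrapper. This is a minimal illustration, not the project's actual code: the class name `IdleKeepAlive` is hypothetical, the "browser" is a stand-in object rather than a real Puppeteer instance, and the clock is passed in explicitly so the behavior is easy to follow.

```typescript
// Keeps one expensive resource (e.g. a headless browser) alive between calls
// and disposes of it once it has been idle for `idleMs` milliseconds.
class IdleKeepAlive<T> {
  private instance: T | null = null;
  private lastUsedMs = 0;
  private readonly create: () => T;
  private readonly dispose: (resource: T) => void;
  private readonly idleMs: number;

  constructor(create: () => T, dispose: (resource: T) => void, idleMs: number = 60_000) {
    this.create = create;
    this.dispose = dispose;
    this.idleMs = idleMs; // defaults to the 60 seconds mentioned in the text
  }

  // Returns the live resource, creating a new one if none exists or the old one expired.
  acquire(nowMs: number): T {
    this.sweep(nowMs);
    if (this.instance === null) {
      this.instance = this.create();
    }
    this.lastUsedMs = nowMs; // every use restarts the idle window
    return this.instance;
  }

  // Disposes of the resource if the idle window has elapsed.
  sweep(nowMs: number): void {
    if (this.instance !== null && nowMs - this.lastUsedMs >= this.idleMs) {
      this.dispose(this.instance);
      this.instance = null;
    }
  }
}

// Usage with a stand-in "browser" object.
let launches = 0;
let closes = 0;
const browsers = new IdleKeepAlive<{ id: number }>(
  () => ({ id: ++launches }),
  () => { closes++; },
);

browsers.acquire(0);      // first audit launches the browser
browsers.acquire(30_000); // 30 s later: reused, idle window restarts
browsers.acquire(95_000); // 65 s after the previous call: relaunched
console.log(launches, closes); // 2 1
```

A production version would more likely use a timer that closes the browser in the background; the explicit `sweep` call here only keeps the sketch deterministic.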

Standardized Output: Every audit returns data encapsulated in a structured JSON object, containing aggregate scores and granular issue reports. This format is optimized for interpretation by MCP-compliant client systems to generate actionable feedback.
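
To make the idea of a structured result concrete, here is a hedged sketch of what such a JSON object and a consumer might look like. The field names (`category`, `score`, `issues`, `severity`) and the sample values are invented for illustration and are not taken from the tool's real output.

```typescript
// Hypothetical shape of an audit report; the real field names may differ.
interface AuditIssue {
  id: string;
  severity: "critical" | "serious" | "moderate" | "minor";
  message: string;
}

interface AuditReport {
  category: string;
  score: number; // aggregate score, assumed to be on a 0-100 scale
  issues: AuditIssue[];
}

// Condenses a report into one line an assistant could surface to the user.
function summarize(report: AuditReport): string {
  const critical = report.issues.filter((issue) => issue.severity === "critical").length;
  return `${report.category}: score ${report.score}/100, ${report.issues.length} issue(s), ${critical} critical`;
}

// Made-up sample payload, parsed the way a client would parse a tool result.
const sample: AuditReport = JSON.parse(`{
  "category": "accessibility",
  "score": 82,
  "issues": [
    { "id": "color-contrast", "severity": "serious", "message": "Low contrast text" },
    { "id": "image-alt", "severity": "critical", "message": "Image missing alt text" }
  ]
}`);

console.log(summarize(sample));
// accessibility: score 82/100, 2 issue(s), 1 critical
```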

The MCP server exposes tools that run audits against the current page. Each tool is listed below with example prompts that trigger it:

Accessibility Scan (runAccessibilityAudit)

Verifies conformance to accessibility standards, such as WCAG.

Prompt Examples:

  • "Identify any accessibility failures on this displayed content."
  • "Execute an accessibility review."
  • "Confirm WCAG compliance status."

Performance Scan (runPerformanceAudit)

Uncovers factors impeding page loading speed and responsiveness.

Prompt Examples:

  • "Diagnose the source of slow page loading."
  • "Benchmark this page's operational speed."
  • "Initiate a performance diagnostic."

SEO Scan (runSEOAudit)

Assesses the page's optimization level for search engine indexing.

Prompt Examples:

  • "Provide recommendations for SEO improvement here."
  • "Run an SEO evaluation."
  • "Check current SEO standings."

Best Practices Scan (runBestPracticesAudit)

Checks alignment with established modern web development norms.

Prompt Examples:

  • "Initiate a best practices evaluation."
  • "Review adherence to current standards."
  • "Are there any deviations from best practices present?"

Unified Audit Routine (runAuditMode)

Executes the complete sequence of assessments, including the NextJS-specific check if the framework signature is detected.

Prompt Examples:

  • "Engage unified audit mode."
  • "Start comprehensive scanning."

NextJS Specific Review (runNextJSAudit)

Checks for framework-specific optimizations and compliance.

Prompt Examples:

  • "Execute the NextJS specific analysis."
  • "Run a NextJS check, I am using the application router."
  • "Run a NextJS check, I am using the pages router."

Debugging Routine (runDebuggerMode)

Runs all available diagnostic tools in sequence.

Prompt Examples:

  • "Activate debugger sequence now."
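
Behind each prompt above, the MCP client translates the request into a tool invocation. The sketch below shows roughly what that message looks like under the MCP specification's `tools/call` method, using the tool names from the headings above. Passing an empty `arguments` object is an assumption; the real tools may accept parameters.

```typescript
// Tool names taken from the section headings above.
const AUDIT_TOOLS = [
  "runAccessibilityAudit",
  "runPerformanceAudit",
  "runSEOAudit",
  "runBestPracticesAudit",
  "runAuditMode",
  "runNextJSAudit",
  "runDebuggerMode",
] as const;

type AuditTool = (typeof AUDIT_TOOLS)[number];

// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: AuditTool; arguments: Record<string, unknown> };
}

// Builds the message an MCP client would send to invoke one audit tool.
// Assumption: the tool is called without arguments.
function buildToolCall(id: number, tool: AuditTool): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name: tool, arguments: {} } };
}

const toolCall = buildToolCall(1, "runAccessibilityAudit");
console.log(JSON.stringify(toolCall));
// {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"runAccessibilityAudit","arguments":{}}}
```

In practice the AI client constructs this message itself; you only write the natural-language prompt.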

System Topology

Three primary components collaborate to facilitate browser data capture and analysis:

  1. Chrome Extension: Captures DOM states, console output, network payloads, and screen captures.
  2. Node Service: An intermediary layer mediating data flow between the extension and the MCP interface.
  3. MCP Server: The standardized interface enabling AI clients to invoke browser tools.

┌─────────────┐     ┌──────────────┐     ┌───────────────┐     ┌─────────────┐
│  MCP Client │ ──► │  MCP Server  │ ──► │  Node Server  │ ──► │   Chrome    │
│  (e.g.      │ ◄── │  (Protocol   │ ◄── │ (Middleware)  │ ◄── │  Extension  │
│   Cursor)   │     │  Interface)  │     │               │     │             │
└─────────────┘     └──────────────┘     └───────────────┘     └─────────────┘

Model Context Protocol (MCP) is an open protocol, introduced by Anthropic, for defining custom tools that compatible AI agents can call. An MCP server configured in a client such as Claude Desktop, Cursor, Cline, or Zed essentially "informs" that client about the availability of this new toolset.

While these tools can interface with external APIs, a key architectural decision is that all captured logs persist exclusively on your local system and are never transmitted externally. The BrowserTools MCP operates via a local NodeJS API endpoint that communicates securely with the dedicated Chrome Extension.

All entities consuming the BrowserTools MCP Server interface interact with this singular NodeJS API structure.

Chrome Extension Functions

  • Observes XHR/fetch transactions and console events.
  • Tracks the state of currently selected Document Object Model (DOM) nodes.
  • Relays all collected logs and the active element state back to the BrowserTools Connector.
  • Utilizes a Websocket connection for real-time screenshot transmission.
  • Permits user configuration for authentication tokens, data truncation thresholds, and screenshot storage locations.

Node Service Functions

  • Manages the communication bridge between the extension and the MCP server.
  • Receives telemetry (logs, selected element data) from the extension.
  • Executes commands relayed from the MCP server (e.g., capturing a screen or logs).
  • Sends WebSocket instructions to the extension for initiating screen captures.
  • Implements intelligent data reduction strategies (string truncation, duplicate object filtering) to respect token limitations.
  • Strips sensitive data, such as cookies and specific HTTP headers, prior to forwarding data to LLMs in MCP clients.
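
The data reduction and sanitization steps listed above could look roughly like the following. This is a sketch under stated assumptions, not the Node service's real implementation: the function names, the list of sensitive headers, and the truncation marker are all hypothetical.

```typescript
// Header names treated as sensitive in this sketch (an assumed list).
const SENSITIVE_HEADERS = new Set(["cookie", "set-cookie", "authorization"]);

// Drops sensitive headers, comparing names case-insensitively.
function stripSensitiveHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    if (!SENSITIVE_HEADERS.has(name.toLowerCase())) {
      safe[name] = headers[name];
    }
  }
  return safe;
}

// Cuts long strings and marks the cut so the reader knows data is missing.
function truncate(text: string, maxLength: number): string {
  return text.length <= maxLength ? text : text.slice(0, maxLength) + "... (truncated)";
}

// Removes entries whose JSON serialization has already been seen.
function dedupe<T>(entries: T[]): T[] {
  const seen = new Set<string>();
  return entries.filter((entry) => {
    const key = JSON.stringify(entry);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Usage: two identical error logs collapse into one, then messages are shortened.
const logs = dedupe([
  { level: "error", message: "x".repeat(50) },
  { level: "error", message: "x".repeat(50) },
  { level: "warn", message: "deprecated API" },
]).map((log) => ({ ...log, message: truncate(log.message, 20) }));

console.log(logs.length); // 2
console.log(stripSensitiveHeaders({ Cookie: "sid=1", Accept: "text/html" }));
```

Deduplicating by serialized JSON is simple but treats objects with different key order as distinct; a real implementation might normalize entries first.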

MCP Server Functions

  • Implements the Model Context Protocol specification.
  • Exposes a standardized set of functional tools for intelligent agents.
  • Ensures compatibility across various MCP consuming applications (Cursor, Cline, Zed, Claude Desktop, etc.).

Deployment Instructions

Detailed installation procedures are available in our official documentation.

Operationalizing the System

Upon successful installation and configuration, any MCP-compliant assistant gains the capacity to:

  • Intercept and report browser console activity.
  • Intercept and report network transaction details.
  • Capture on-demand screenshots.
  • Inspect and report on designated DOM elements.
  • Issue commands to purge stored logs within the MCP server's scope.
  • Trigger the suite of accessibility, performance, SEO, and best practices analytical routines.

Interoperability

  • Compatible with any client adhering to the MCP standard.
  • Primarily optimized for integration within the Cursor IDE environment.
  • Functions acceptably with other supporting AI editors and MCP endpoints.

WIKIPEDIA NOTE: A headless browser operates without a graphical user interface. These environments enable automated web page manipulation via command line or network communication. They are invaluable for testing because they render content (layout, styling, JS execution) identically to a full browser. Native remote control support arrived in Chrome 59 and Firefox 56, superseding earlier methods like PhantomJS.

The primary applications for headless browsers include:

  • Web application testing automation (QA).
  • Automated capture of webpage snapshots.
  • Running scripted tests for JavaScript libraries.
  • Automating user interactions with web elements.

Secondary Uses

Headless browsers are also employed in web data extraction. Google, for instance, recognized their utility in indexing content reliant on Ajax. Conversely, they have been leveraged nefariously for activities like ad impression inflation or credential stuffing. However, traffic analysis from 2018 suggests no inherent preference by malicious actors for headless environments over standard browsers for attacks like DDoS or XSS.

Automation Frameworks

As major browsers natively expose headless APIs, several frameworks provide a unified control layer. These include:

  • Selenium WebDriver: Compliant with W3C WebDriver specifications.
  • Playwright: A Node.js toolkit for automating Chromium, Firefox, and WebKit.
  • Puppeteer: A Node.js library specifically for controlling Chrome or Firefox.

Test Automation Context

Many testing harnesses incorporate headless browsing into their execution methodology. Examples:

  • Capybara utilizes Headless Chrome or WebKit to emulate user actions.
  • Jasmine defaults to Selenium but permits configuration for WebKit or Headless Chrome.
  • Cypress, a dedicated frontend testing framework.
  • QF-Test, a GUI testing tool that supports headless browser execution.

Alternative Rendering Approaches

An alternative involves utilizing environments that expose browser-like APIs without full rendering. Deno integrates browser APIs natively. For Node.js, jsdom offers the most comprehensive emulation. While these alternatives often handle parsing, cookies, and XHR efficiently, they typically lack true DOM rendering and event simulation, resulting in faster execution but limited fidelity compared to a headless browser.
