
mcp-web-auditor

MCP server for automated web compliance validation and browser automation, built on Playwright with integrated Axe-core accessibility analysis. Provides LLMs with tools for WCAG evaluation, visual artifact generation, and browser state manipulation.

Author


JustasMonkev

MIT License

Quick Info

GitHub Stars: 17
NPM Weekly Downloads: 1851
Tools: 1
Last Updated: 2026-02-19

Tags

accessibility, automation, scraping, web accessibility, browser automation, automation web

MCP Web Auditor Extension 🛡️


A Model Context Protocol (MCP) server for web page quality assurance. It gives language models tools to run Web Content Accessibility Guidelines (WCAG) audits through Axe-core, capture annotated screenshots of detected issues, and drive controlled browser sessions with Playwright.

With it, agents can vet pages for compliance, simulate user actions through browser automation, keep context across multiple pages, and produce detailed audit artifacts.

Core Capabilities

Accessibility Vetting

✅ Comprehensive WCAG audit execution (2.0, 2.1, 2.2 standards across the A, AA, and AAA conformance levels)
🖼️ Automated artifact creation: Screenshots highlighting detected accessibility faults
📄 Structured JSON output detailing violations and prescriptive remediation steps
🎯 Granular auditing across defined issues: color contrast ratios, ARIA implementation, form control accessibility, keyboard operability, etc.
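The exact JSON schema of the audit output is not documented here. As a minimal sketch, assuming the result resembles Axe-core's `violations` array (with `id`, `impact`, `help`, and `nodes` fields), a client could tally affected elements per impact level. The `Violation` type and `summarizeViolations` helper are illustrative names, not part of this server.

```typescript
// Assumed shape, modeled on Axe-core's `violations` entries; the server's
// actual output format may differ.
type Impact = "minor" | "moderate" | "serious" | "critical";

interface Violation {
  id: string; // rule identifier, e.g. "color-contrast"
  impact: Impact; // severity assigned by the rule
  help: string; // short remediation hint
  nodes: { target: string[] }[]; // affected elements, addressed by selector
}

// Count affected elements per impact level so the worst problems stand out.
function summarizeViolations(violations: Violation[]): Record<Impact, number> {
  const summary: Record<Impact, number> = { minor: 0, moderate: 0, serious: 0, critical: 0 };
  for (const violation of violations) {
    summary[violation.impact] += violation.nodes.length;
  }
  return summary;
}

// Hand-written sample data for illustration only.
const sample: Violation[] = [
  {
    id: "color-contrast",
    impact: "serious",
    help: "Elements must meet minimum color contrast ratio thresholds",
    nodes: [{ target: ["#nav a"] }, { target: [".footer p"] }],
  },
  {
    id: "label",
    impact: "critical",
    help: "Form elements must have labels",
    nodes: [{ target: ["#email"] }],
  },
];

console.log(summarizeViolations(sample));
```

Counting nodes rather than rules reflects how many elements need fixing, which is usually the better effort estimate.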

Synthetic Browser Control

🖱️ Element manipulation: Clicks, hovers, and positional drags based on accessibility tree data
⌨️ Input simulation: Text entry and complex keyboard sequence emulation
🔍 DOM structure profiling via page snapshots to identify all actionable surface elements
📸 High-fidelity raster image capture and PDF document generation
🎯 Support for both semantic element targeting and pixel-coordinate manipulation

Advanced Session Management

📑 Dedicated tab orchestration for complex, multi-view operational flows
🌐 In-flight monitoring of browser console outputs and outbound network traffic
⏱️ Intelligent synchronization controls for asynchronous content loading
📁 Managed file staging/upload mechanisms and interactive dialog interception
🔄 History manipulation (forward/backward navigation)

Deployment Instructions

Install the utility via npm:

    npm install -g mcp-web-auditor

VS Code Integration

To register this module within Visual Studio Code:

Standard Installation:

    code --add-mcp '{"name":"web-auditor","command":"npx","args":["mcp-web-auditor"]}'

Insiders Build Installation:

    code-insiders --add-mcp '{"name":"web-auditor","command":"npx","args":["mcp-web-auditor"]}'

Configuration Schema

Configuration for the Claude Desktop environment:

    {
      "mcpServers": {
        "web-auditor": {
          "command": "npx",
          "args": ["-y", "mcp-web-auditor"]
        }
      }
    }

Customizing Playwright Runtime

Runtime parameters can be injected via a custom configuration file:

    {
      "mcpServers": {
        "web-auditor": {
          "command": "npx",
          "args": ["-y", "mcp-web-auditor", "--config", "/path/to/custom_config.json"]
        }
      }
    }

Define custom_config.json with behavioral overrides:

    {
      "browser": {
        "browserName": "firefox",
        "launchOptions": {
          "headless": false,
          "channel": "firefox"
        }
      },
      "timeouts": {
        "navigationTimeout": 90000,
        "defaultTimeout": 10000
      },
      "network": {
        "allowedOrigins": ["trusted-partner.org"],
        "blockedOrigins": ["telemetry.site"]
      }
    }

Available Tuning Parameters:

  • browser.browserName: Selection of rendering engine (chromium, firefox, webkit)
  • browser.launchOptions.headless: Controls graphical output visibility (default: true unless display context dictates otherwise)
  • browser.launchOptions.channel: Browser distribution channel passed to Playwright (e.g., chrome, chrome-beta, msedge)
  • timeouts.navigationTimeout: Upper limit for page transition latency (ms)
  • timeouts.defaultTimeout: Standard wait time for individual element operations (ms)
  • network.allowedOrigins: Whitelist for network destinations (strict blocking if present)
  • network.blockedOrigins: Blacklist of domains to prevent loading
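To catch mistakes before launching the server, the parameters above can be expressed as a type and checked on the client side. The following is a sketch only: the `AuditorConfig` type and `checkConfig` helper are hypothetical, inferred from the list above, and the server's real schema may accept more fields or apply different rules.

```typescript
// Hypothetical typing inferred from the documented parameters.
type BrowserName = "chromium" | "firefox" | "webkit";

interface AuditorConfig {
  browser?: {
    browserName?: BrowserName;
    launchOptions?: { headless?: boolean; channel?: string };
  };
  timeouts?: { navigationTimeout?: number; defaultTimeout?: number };
  network?: { allowedOrigins?: string[]; blockedOrigins?: string[] };
}

// Return human-readable problems; an empty list means the config passed
// these basic sanity checks (it does not prove the server will accept it).
function checkConfig(config: AuditorConfig): string[] {
  const problems: string[] = [];
  const engines: string[] = ["chromium", "firefox", "webkit"];

  // The type already restricts browserName, but parsed JSON can bypass it.
  const name = config.browser?.browserName;
  if (name !== undefined && !engines.includes(name)) {
    problems.push(`unknown browserName: ${name}`);
  }

  for (const [key, value] of Object.entries(config.timeouts ?? {})) {
    if (typeof value !== "number" || !Number.isFinite(value) || value < 0) {
      problems.push(`timeouts.${key} must be a non-negative number of milliseconds`);
    }
  }

  const blocked = config.network?.blockedOrigins ?? [];
  for (const origin of config.network?.allowedOrigins ?? []) {
    if (blocked.includes(origin)) {
      problems.push(`origin listed as both allowed and blocked: ${origin}`);
    }
  }
  return problems;
}

// The example configuration from this section.
const exampleConfig: AuditorConfig = {
  browser: { browserName: "firefox", launchOptions: { headless: false, channel: "firefox" } },
  timeouts: { navigationTimeout: 90000, defaultTimeout: 10000 },
  network: { allowedOrigins: ["trusted-partner.org"], blockedOrigins: ["telemetry.site"] },
};

console.log(checkConfig(exampleConfig));
```

Returning a list of problems instead of throwing on the first one lets a caller report every issue in a single pass.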

Exposed Toolset

The MCP service exposes the following procedural interfaces for advanced web inspection:

Primary Audit Function

run_accessibility_check

Executes a full WCAG evaluation sweep using Axe-core on the active document state.

Arguments (Parameters):

  • targets: List of WCAG/violation designators to enforce during scanning

Recognized Designators:

  • WCAG Levels: wcag2a, wcag2aa, wcag2aaa, wcag21a, wcag21aa, wcag21aaa, wcag22a, wcag22aa, wcag22aaa
  • Mandates: section508
  • Topical Categories: cat.aria, cat.color, cat.forms, cat.keyboard, cat.language, cat.name-role-value, cat.parsing, cat.semantics, cat.sensory-and-visual-cues, cat.structure, cat.tables, cat.text-alternatives, cat.time-and-media
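MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for run_accessibility_check and rejects designators that are not in the list above before anything is sent. The `buildAccessibilityCheck` helper is an illustrative name, and the client-side validation is an added convenience, not documented server behavior.

```typescript
// Designators copied from the list above.
const KNOWN_TARGETS: ReadonlySet<string> = new Set([
  "wcag2a", "wcag2aa", "wcag2aaa",
  "wcag21a", "wcag21aa", "wcag21aaa",
  "wcag22a", "wcag22aa", "wcag22aaa",
  "section508",
  "cat.aria", "cat.color", "cat.forms", "cat.keyboard", "cat.language",
  "cat.name-role-value", "cat.parsing", "cat.semantics",
  "cat.sensory-and-visual-cues", "cat.structure", "cat.tables",
  "cat.text-alternatives", "cat.time-and-media",
]);

// Shape of an MCP "tools/call" request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build the request, failing early on typos such as "wcag2AA".
function buildAccessibilityCheck(id: number, targets: string[]): ToolCallRequest {
  const unrecognized = targets.filter((target) => !KNOWN_TARGETS.has(target));
  if (unrecognized.length > 0) {
    throw new Error(`Unrecognized designators: ${unrecognized.join(", ")}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "run_accessibility_check", arguments: { targets } },
  };
}

console.log(JSON.stringify(buildAccessibilityCheck(1, ["wcag21aa", "cat.forms"]), null, 2));
```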

go_to_location

Directs the browser viewport to a specified URI.

  • Parameters: url (string)

history_revert

Moves the session history pointer backward.

history_advance

Moves the session history pointer forward.

Interaction Subsystem

capture_dom_snapshot

Generates a structural representation of the current page DOM, ideal for element discovery (superior to pure screenshots for analysis).

activate_element

Simulates a mouse click on a target element.

  • Parameters: target_desc (description), reference_id (internal ID), is_double (optional boolean)

inject_text

Enters textual data into an input field.

  • Parameters: target_desc, reference_id, input_string, trigger_submit (optional), typing_speed_modifier (optional)

focus_hover

Triggers hover/mouseover states on an element.

  • Parameters: target_desc, reference_id

execute_drag_operation

Drags from a source element to a destination element.

  • Parameters: start_desc, start_ref, end_desc, end_ref

set_dropdown_value

Selects one or more options from a <select> element.

  • Parameters: target_desc, reference_id, selected_values (array of strings)

simulate_key_event

Sends a key press event to the active context.

  • Parameters: key_code (e.g., 'Enter', 'Tab')

Visual Capture & Output

render_page_image

Captures the current viewport content.

  • Parameters: is_raw_bytes (optional boolean), output_file_name (optional), scope_element (optional), scope_ref (optional)

export_as_pdf

Saves the entire rendered page content into a PDF file.

  • Parameters: output_file_name (defaults to a timestamped file name)

Session Administration

terminate_session

Safely closes the active browser tab/context.

adjust_viewport_size

Resizes the browser window dimensions.

  • Parameters: pixel_width, pixel_height

Tab Organization

list_active_tabs

Retrieves identifiers for all concurrently open sessions.

open_new_tab

Initializes an auxiliary browser context.

  • Parameters: initial_uri (optional)

switch_to_tab

Makes a specific tab the active context.

  • Parameters: tab_index

close_specific_tab

Terminates a specified tab (or the active one if index is omitted).

  • Parameters: tab_index (optional)

Data Stream Monitoring

get_console_log_entries

Retrieves diagnostic messages logged to the browser console.

fetch_network_transactions

Returns a record of all HTTP(S) requests initiated since page load.

Synchronization & Utility

conditional_wait

Halts execution until a condition is met: a time delay elapses, the given text appears, or the given text disappears.

  • Parameters: delay_ms (optional), text_present (optional), text_absent (optional)

manage_browser_prompt

Responds to modal system dialogs (alerts, confirmations, input prompts).

  • Parameters: action_accept (boolean), prompt_response_text (optional)

inject_file_payload

Simulates a user file selection and upload process.

  • Parameters: absolute_paths (array of local file system locations)

Coordinate-Based Precision Control (Vision Mode)

capture_visual_reference

Obtains a screenshot optimized for subsequent coordinate-based targeting.

move_cursor_to

Moves the synthetic cursor to specified coordinates within a scoped element's bounds.

  • Parameters: scope_desc, x_offset, y_offset

execute_coordinate_click

Performs a left-click action at precise XY coordinates.

  • Parameters: scope_desc, x_offset, y_offset

execute_coordinate_drag

Simulates a continuous mouse movement between two coordinate points.

  • Parameters: scope_desc, start_x, start_y, end_x, end_y

inject_text_via_coordinates

Inputs text without relying on direct DOM element references (useful for overlaid elements).

  • Parameters: input_string, trigger_submit (optional)
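The coordinate tools above take offsets scoped to an element. This README does not say how those offsets are measured, so the following sketch rests on an assumption: offsets are counted in pixels from the scoped element's top-left corner. Under that assumption, a client can sanity-check offsets against a known bounding box before sending them. The `Box` type and `toPagePoint` helper are hypothetical.

```typescript
// Bounding box of the scoped element in page coordinates (pixels).
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// ASSUMPTION: offsets are relative to the element's top-left corner.
// Converts them to page coordinates and rejects points outside the box.
function toPagePoint(box: Box, xOffset: number, yOffset: number): { x: number; y: number } {
  const inside =
    xOffset >= 0 && yOffset >= 0 && xOffset <= box.width && yOffset <= box.height;
  if (!inside) {
    throw new RangeError(
      `Offset (${xOffset}, ${yOffset}) lies outside a ${box.width}x${box.height} element`
    );
  }
  return { x: box.x + xOffset, y: box.y + yOffset };
}

// A 200x40 button whose top-left corner sits at (100, 50) on the page.
const button: Box = { x: 100, y: 50, width: 200, height: 40 };
console.log(toPagePoint(button, 10, 5));
```

Checking the offsets first avoids clicks that silently land on a neighboring element.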

Operational Guidance

Initial Audit Sequence

  1. Open the target URL with go_to_location.
  2. Invoke run_accessibility_check specifying validation levels: ["wcag2aa", "cat.color"].
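Over the MCP stdio transport, each request is a single line of JSON. As a rough sketch, the two steps above could be serialized like this. The `toStdioLines` helper and the example URL are illustrative, and the `initialize` handshake that a real session performs first is left out.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Turn tool calls into JSON-RPC 2.0 "tools/call" requests, one per line.
// IDs count up from `firstId` so each response can be matched to its request.
function toStdioLines(calls: ToolCall[], firstId: number = 1): string {
  return (
    calls
      .map((call, index) =>
        JSON.stringify({
          jsonrpc: "2.0",
          id: firstId + index,
          method: "tools/call",
          params: call,
        })
      )
      .join("\n") + "\n"
  );
}

// The initial audit sequence; the URL is a placeholder.
const initialAudit: ToolCall[] = [
  { name: "go_to_location", arguments: { url: "https://example.com" } },
  { name: "run_accessibility_check", arguments: { targets: ["wcag2aa", "cat.color"] } },
];

console.log(toStdioLines(initialAudit, 2));
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the wire.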

Visual Validation Workflow

  1. Navigate to the critical page using go_to_location.
  2. Execute capture_dom_snapshot to map interactive targets.
  3. Invoke activate_element referencing the 'Login Button' description.
  4. Apply inject_text for credentials.
  5. Re-run run_accessibility_check post-submission.
  6. Finalize with render_page_image for proof documentation.
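The workflow above is a fixed sequence in which each step depends on the previous one succeeding. A minimal sketch of such a runner follows, assuming your MCP client exposes some async function for invoking a tool (called `callTool` here, a hypothetical name). The URL, reference IDs, credentials, and file name are placeholders; real reference IDs come from the snapshot result.

```typescript
// Signature of whatever function your MCP client uses to invoke a tool.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

interface Step {
  name: string;
  args: Record<string, unknown>;
}

// Run steps strictly in order; stop at the first failure and say which step broke.
async function runWorkflow(callTool: CallTool, steps: Step[]): Promise<unknown[]> {
  const results: unknown[] = [];
  let stepNumber = 0;
  for (const step of steps) {
    stepNumber += 1;
    try {
      results.push(await callTool(step.name, step.args));
    } catch (err) {
      throw new Error(`Step ${stepNumber} (${step.name}) failed: ${String(err)}`);
    }
  }
  return results;
}

// Placeholder values throughout; "REF_FROM_SNAPSHOT" stands in for an ID
// read from the capture_dom_snapshot result.
const visualValidation: Step[] = [
  { name: "go_to_location", args: { url: "https://example.com/login" } },
  { name: "capture_dom_snapshot", args: {} },
  { name: "activate_element", args: { target_desc: "Login Button", reference_id: "REF_FROM_SNAPSHOT" } },
  { name: "inject_text", args: { target_desc: "Username field", reference_id: "REF_FROM_SNAPSHOT", input_string: "demo-user" } },
  { name: "run_accessibility_check", args: { targets: ["wcag2aa"] } },
  { name: "render_page_image", args: { output_file_name: "login-audit.png" } },
];

// Stand-in for a real client: records tool names instead of contacting a server.
const invoked: string[] = [];
const fakeCallTool: CallTool = async (name) => {
  invoked.push(name);
  return { ok: true };
};

runWorkflow(fakeCallTool, visualValidation).then(() => console.log(invoked.join(" -> ")));
```

Injecting `callTool` keeps the sequencing logic independent of any particular client library and makes it easy to dry-run.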

Handling Asynchronous Loads

  1. Load initial page.
  2. Call conditional_wait so that execution resumes only once a key piece of dynamic text is visible.
  3. Proceed with interaction tools.
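Because all three conditional_wait parameters are optional, it is easy to send a call that waits for nothing. The sketch below adds a client-side guard; the `buildWait` helper is a hypothetical name, and whether the server itself rejects an empty call is not stated in this README.

```typescript
interface WaitArgs {
  delay_ms?: number; // fixed pause in milliseconds
  text_present?: string; // resume once this text is visible
  text_absent?: string; // resume once this text has disappeared
}

// Build the tool name and arguments for a conditional_wait call, requiring
// at least one condition (a client-side rule, assumed rather than documented).
function buildWait(args: WaitArgs): { name: "conditional_wait"; arguments: WaitArgs } {
  const provided = [args.delay_ms, args.text_present, args.text_absent].filter(
    (value) => value !== undefined
  );
  if (provided.length === 0) {
    throw new Error("conditional_wait needs delay_ms, text_present, or text_absent");
  }
  if (args.delay_ms !== undefined && (!Number.isFinite(args.delay_ms) || args.delay_ms < 0)) {
    throw new Error("delay_ms must be a non-negative number");
  }
  return { name: "conditional_wait", arguments: args };
}

// Wait for dynamic content rather than sleeping for a fixed time.
console.log(buildWait({ text_present: "Dashboard loaded" }));
```

Waiting on text is generally more reliable than a fixed `delay_ms`, which is either too short on slow pages or wasteful on fast ones.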

Crucial Note: For reliable element manipulation (activate_element, inject_text, etc.), always precede interaction attempts by calling capture_dom_snapshot to secure internal reference IDs.
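One way to enforce the note above on the client side is a small guard that tracks whether a snapshot has been taken since the last navigation. This is a sketch under an assumption: that navigation invalidates previously issued reference IDs. The `SnapshotGuard` class is hypothetical, and it does not model DOM changes caused by the interactions themselves.

```typescript
// Tools that take a reference_id (or start_ref/end_ref) from a snapshot.
const INTERACTION_TOOLS: ReadonlySet<string> = new Set([
  "activate_element",
  "inject_text",
  "focus_hover",
  "execute_drag_operation",
  "set_dropdown_value",
]);

// ASSUMPTION: these tools load a new document and make old references stale.
const NAVIGATION_TOOLS: ReadonlySet<string> = new Set([
  "go_to_location",
  "history_revert",
  "history_advance",
]);

class SnapshotGuard {
  private fresh = false;

  // Call before sending each tool request; throws if references may be stale.
  check(toolName: string): void {
    if (toolName === "capture_dom_snapshot") {
      this.fresh = true;
      return;
    }
    if (NAVIGATION_TOOLS.has(toolName)) {
      this.fresh = false;
      return;
    }
    if (INTERACTION_TOOLS.has(toolName) && !this.fresh) {
      throw new Error(`${toolName} requires a fresh capture_dom_snapshot first`);
    }
  }
}

const guard = new SnapshotGuard();
guard.check("go_to_location");
guard.check("capture_dom_snapshot");
guard.check("activate_element"); // allowed: a snapshot was taken after navigating
console.log("sequence accepted");
```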

Source Repository

Source code and contribution guidelines are available here:

    git clone https://github.com/JustasMonkev/mcp-accessibility-scanner.git
    cd mcp-accessibility-scanner
    npm install

Licensing

Distributed under the MIT License.

--- WIKIPEDIA CONTEXT: Headless Browsers ---

A headless browser operates without a visible graphical user interface, controlled primarily through command lines or network protocols. This mode is invaluable for automated web verification, as it renders HTML, CSS, and executes JavaScript identically to a conventional browser, providing a true simulation environment absent in simpler testing libraries.

Since the maturation of native remote control APIs in modern browsers (Chrome v59+, Firefox v56+), legacy solutions like PhantomJS have largely been supplanted.

Primary Applications

  • Automated testing of contemporary web applications.
  • Programmatic capture of high-fidelity page screenshots.
  • Execution of unit/integration tests for JavaScript frontends.
  • Systematic automation of user-like interactions with web interfaces.

Secondary Utilities

Headless agents are often employed for large-scale content harvesting (web scraping). Google notably adopted this approach to index JavaScript-heavy content. However, this capability introduces potential misuse vectors, such as automated ad impression inflation or malicious traffic generation (DDoS). Statistical analyses suggest that malicious actors do not disproportionately favor headless tools over traditional browsers for such activities.

Automation Frameworks Utilizing Headless Mode

Browser automation is unified through several key interfaces leveraging native headless support:

  • Selenium WebDriver (W3C compliant implementation).
  • Playwright (Comprehensive automation for Chromium, Firefox, and WebKit).
  • Puppeteer (Focused automation targeting Chrome/Firefox engines).

Integrated Testing Tools

Numerous testing harnesses incorporate headless execution into their protocols:

  • Capybara (Utilizes Headless Chrome or WebKit).
  • Jasmine (Defaulting to Selenium, configurable for headless engines).
  • Cypress (A dedicated modern frontend testing framework).
  • QF-Test (GUI testing tool supporting headless environments).

Architectural Alternatives

Alternative methods bypass full rendering, opting for DOM API emulation. Deno integrates browser-like APIs natively. For Node.js environments, jsdom offers extensive HTML parsing and limited event/JavaScript support. While faster, these alternatives lack the fidelity of full DOM rendering and visual event simulation provided by true headless engines.
