mcp-web-auditor
Protocol extension for automated web compliance validation and synthetic browser interaction, built on Playwright with integrated Axe-core accessibility analysis. Provides LLMs with tools for WCAG evaluation, visual artifact generation, and browser state manipulation.
Author

JustasMonkev
MCP Web Auditor Extension 🛡️
A Model Context Protocol (MCP) module specialized in rigorous web page quality assurance. This server furnishes language models with capabilities for executing comprehensive Web Content Accessibility Guidelines (WCAG) audits via Axe-core integration, capturing rich, annotated visual proofs, and managing controlled browser sessions using Playwright.
It lets agents run compliance audits, simulate user actions through browser automation, work across multiple pages, and produce detailed audit artifacts.
Core Capabilities
Accessibility Vetting
✅ Comprehensive WCAG audit execution (versions 2.0, 2.1, and 2.2 at conformance levels A, AA, and AAA)
🖼️ Automated artifact creation: Screenshots highlighting detected accessibility faults
📄 Structured JSON output detailing violations and prescriptive remediation steps
🎯 Granular auditing across defined issues: color contrast ratios, ARIA implementation, form control accessibility, keyboard operability, etc.
Synthetic Browser Control
🖱️ Element manipulation: Clicks, hovers, and positional drags based on accessibility tree data
⌨️ Input simulation: Text entry and complex keyboard sequence emulation
🔍 DOM structure profiling via page snapshots to identify all actionable surface elements
📸 High-fidelity raster image capture and PDF document generation
🎯 Support for both semantic element targeting and pixel-coordinate manipulation
Advanced Session Management
📑 Dedicated tab orchestration for complex, multi-view operational flows
🌐 In-flight monitoring of browser console outputs and outbound network traffic
⏱️ Intelligent synchronization controls for asynchronous content loading
📁 Managed file staging/upload mechanisms and interactive dialog interception
🔄 History manipulation (forward/backward navigation)
Deployment Instructions
Install the utility via Node Package Manager:
    npm install -g mcp-web-auditor
VS Code Integration
To register this module within Visual Studio Code:
Standard Installation:
    code --add-mcp '{"name":"web-auditor","command":"npx","args":["mcp-web-auditor"]}'
Insiders Build Installation:
    code-insiders --add-mcp '{"name":"web-auditor","command":"npx","args":["mcp-web-auditor"]}'
Configuration Schema
Configuration for the Claude Desktop environment:
{
  "mcpServers": {
    "web-auditor": {
      "command": "npx",
      "args": ["-y", "mcp-web-auditor"]
    }
  }
}
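When a Claude Desktop configuration already lists other servers, the entry above needs to be merged in rather than pasted over them. The following is a minimal sketch of such a merge; the `add_web_auditor` helper and the `other-server` entry are hypothetical, and loading or saving the actual configuration file is left out because its location depends on the platform.

```python
import json

def add_web_auditor(config: dict) -> dict:
    """Return a copy of `config` with the web-auditor entry registered."""
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    servers = merged.setdefault("mcpServers", {})
    servers["web-auditor"] = {"command": "npx", "args": ["-y", "mcp-web-auditor"]}
    return merged

# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_web_auditor(existing)
print(json.dumps(updated, indent=2))
```

The helper returns a new dictionary and leaves the input untouched, so the original configuration can still be written back unchanged if the merge turns out to be unwanted.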
Customizing Playwright Runtime
Runtime parameters can be injected via a custom configuration file:
{
  "mcpServers": {
    "web-auditor": {
      "command": "npx",
      "args": ["-y", "mcp-web-auditor", "--config", "/path/to/custom_config.json"]
    }
  }
}
Define custom_config.json with behavioral overrides:
{
  "browser": {
    "browserName": "firefox",
    "launchOptions": {
      "headless": false,
      "channel": "firefox"
    }
  },
  "timeouts": {
    "navigationTimeout": 90000,
    "defaultTimeout": 10000
  },
  "network": {
    "allowedOrigins": ["trusted-partner.org"],
    "blockedOrigins": ["telemetry.site"]
  }
}
Available Tuning Parameters:
- browser.browserName: Selection of rendering engine (chromium, firefox, webkit)
- browser.launchOptions.headless: Controls graphical output visibility (default: true unless display context dictates otherwise)
- browser.launchOptions.channel: Specific browser distribution variant (e.g., stable, beta)
- timeouts.navigationTimeout: Upper limit for page transition latency (ms)
- timeouts.defaultTimeout: Standard wait time for individual element operations (ms)
- network.allowedOrigins: Whitelist for network destinations (strict blocking if present)
- network.blockedOrigins: Blacklist of domains to prevent loading
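A configuration file can be sanity-checked before it is handed to the server. The sketch below checks only the parameters listed above and assumes the value types implied by the example (booleans, positive millisecond values, lists of strings). The `validate_config` helper is hypothetical and is not part of the package.

```python
import json

ALLOWED_BROWSERS = {"chromium", "firefox", "webkit"}

def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    browser = config.get("browser", {})
    name = browser.get("browserName")
    if name is not None and name not in ALLOWED_BROWSERS:
        problems.append(f"browser.browserName must be one of {sorted(ALLOWED_BROWSERS)}, got {name!r}")
    headless = browser.get("launchOptions", {}).get("headless")
    if headless is not None and not isinstance(headless, bool):
        problems.append("browser.launchOptions.headless must be a boolean")
    for key in ("navigationTimeout", "defaultTimeout"):
        value = config.get("timeouts", {}).get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (
            isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0
        ):
            problems.append(f"timeouts.{key} must be a positive number of milliseconds")
    for key in ("allowedOrigins", "blockedOrigins"):
        value = config.get("network", {}).get(key)
        if value is not None and not (
            isinstance(value, list) and all(isinstance(v, str) for v in value)
        ):
            problems.append(f"network.{key} must be a list of strings")
    return problems

# The example configuration from this section.
config = json.loads("""
{
  "browser": {"browserName": "firefox",
              "launchOptions": {"headless": false, "channel": "firefox"}},
  "timeouts": {"navigationTimeout": 90000, "defaultTimeout": 10000},
  "network": {"allowedOrigins": ["trusted-partner.org"],
              "blockedOrigins": ["telemetry.site"]}
}
""")
problems = validate_config(config)
print(problems or "no problems found")
```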
Exposed Toolset
The MCP service exposes the following procedural interfaces for advanced web inspection:
Primary Audit Function
run_accessibility_check
Executes a full WCAG evaluation sweep using Axe-core on the active document state.
Arguments (Parameters):
- targets: List of WCAG/violation designators to enforce during scanning
Recognized Designators:
- WCAG Levels: wcag2a, wcag2aa, wcag2aaa, wcag21a, wcag21aa, wcag21aaa, wcag22a, wcag22aa, wcag22aaa
- Mandates: section508
- Topical Categories: cat.aria, cat.color, cat.forms, cat.keyboard, cat.language, cat.name-role-value, cat.parsing, cat.semantics, cat.sensory-and-visual-cues, cat.structure, cat.tables, cat.text-alternatives, cat.time-and-media
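MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for run_accessibility_check and rejects designators that are not in the lists above. The `build_accessibility_request` helper is hypothetical, and in practice an MCP client library would normally assemble and send this message for you.

```python
import json
from itertools import count

# Designators copied from the lists above.
LEVEL_TAGS = {f"wcag{v}{lvl}" for v in ("2", "21", "22") for lvl in ("a", "aa", "aaa")}
MANDATE_TAGS = {"section508"}
CATEGORY_TAGS = {"cat." + c for c in (
    "aria", "color", "forms", "keyboard", "language", "name-role-value",
    "parsing", "semantics", "sensory-and-visual-cues", "structure",
    "tables", "text-alternatives", "time-and-media")}
RECOGNIZED = LEVEL_TAGS | MANDATE_TAGS | CATEGORY_TAGS

_request_ids = count(1)  # simple incrementing JSON-RPC request ids

def build_accessibility_request(targets):
    """Build a JSON-RPC `tools/call` payload for run_accessibility_check."""
    unknown = [t for t in targets if t not in RECOGNIZED]
    if unknown:
        raise ValueError(f"Unrecognized designators: {unknown}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": "run_accessibility_check",
            "arguments": {"targets": list(targets)},
        },
    }

request = build_accessibility_request(["wcag2aa", "cat.color"])
print(json.dumps(request, indent=2))
```

Validating on the client side catches typos such as `wcag2.1aa` before a browser session is spent on a scan that would match nothing.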
Navigation Primitives
go_to_location
Directs the browser viewport to a specified URI.
- Parameters: url (string)
history_revert
Moves the session history pointer backward.
history_advance
Moves the session history pointer forward.
Interaction Subsystem
capture_dom_snapshot
Generates a structural representation of the current page DOM, ideal for element discovery (superior to pure screenshots for analysis).
activate_element
Simulates a mouse click on a target element.
- Parameters: target_desc (description), reference_id (internal ID), is_double (optional boolean)
inject_text
Enters textual data into an input field.
- Parameters: target_desc, reference_id, input_string, trigger_submit (optional), typing_speed_modifier (optional)
focus_hover
Triggers hover/mouseover states on an element.
- Parameters: target_desc, reference_id
execute_drag_operation
Performs a sequenced movement from a source coordinate/element to a destination.
- Parameters: start_desc, start_ref, end_desc, end_ref
set_dropdown_value
Selects one or more options from a <select> element.
- Parameters: target_desc, reference_id, selected_values (array of strings)
simulate_key_event
Sends a key press event to the active context.
- Parameters: key_code (e.g., 'Enter', 'Tab')
Visual Capture & Output
render_page_image
Captures the current viewport content.
- Parameters: is_raw_bytes (optional boolean), output_file_name (optional), scope_element (optional), scope_ref (optional)
export_as_pdf
Saves the entire rendered page content into a PDF file.
- Parameters: output_file_name (defaults to timestamped file)
Session Administration
terminate_session
Safely closes the active browser tab/context.
adjust_viewport_size
Resizes the browser window dimensions.
- Parameters: pixel_width, pixel_height
Tab Organization
list_active_tabs
Retrieves identifiers for all currently open tabs.
open_new_tab
Initializes an auxiliary browser context.
- Parameters: initial_uri (optional)
switch_to_tab
Makes a specific tab the active context.
- Parameters: tab_index
close_specific_tab
Terminates a specified tab (or the active one if index is omitted).
- Parameters: tab_index (optional)
Data Stream Monitoring
get_console_log_entries
Retrieves diagnostic messages logged to the browser console.
fetch_network_transactions
Returns a record of all HTTP/S requests initiated since page load.
Synchronization & Utility
conditional_wait
Halts execution until a condition is met (a fixed time delay, specified text appearing on the page, or specified text disappearing from it).
- Parameters: delay_ms (optional), text_present (optional), text_absent (optional)
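The semantics of text_present and text_absent can be illustrated with a small polling loop. This is a sketch of the general technique only and makes no claim about how the server implements it. The `wait_for_text` helper, its timeout parameters, and the simulated page are all hypothetical.

```python
import time

def wait_for_text(get_page_text, text_present=None, text_absent=None,
                  timeout_s=5.0, poll_interval_s=0.05):
    """Poll until `text_present` is on the page and `text_absent` is gone.

    Returns True once both conditions hold, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        text = get_page_text()
        present_ok = text_present is None or text_present in text
        absent_ok = text_absent is None or text_absent not in text
        if present_ok and absent_ok:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)

# Simulated page whose content changes after two polls.
states = iter(["Loading...", "Loading...", "Results: 3 items"])
last = {"text": ""}

def fake_page_text():
    last["text"] = next(states, last["text"])
    return last["text"]

ok = wait_for_text(fake_page_text, text_present="Results", text_absent="Loading...")
print(ok)
```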
manage_browser_prompt
Responds to modal system dialogs (alerts, confirmations, input prompts).
- Parameters: action_accept (boolean), prompt_response_text (optional)
inject_file_payload
Simulates a user file selection and upload process.
- Parameters: absolute_paths (array of local file system locations)
Coordinate-Based Precision Control (Vision Mode)
capture_visual_reference
Obtains a screenshot optimized for subsequent coordinate-based targeting.
move_cursor_to
Moves the synthetic cursor to specified coordinates within a scoped element's bounds.
- Parameters: scope_desc, x_offset, y_offset
execute_coordinate_click
Performs a left-click action at precise XY coordinates.
- Parameters: scope_desc, x_offset, y_offset
execute_coordinate_drag
Simulates a continuous mouse movement between two coordinate points.
- Parameters: scope_desc, start_x, start_y, end_x, end_y
inject_text_via_coordinates
Inputs text without relying on direct DOM element references (useful for overlaid elements).
- Parameters: input_string, trigger_submit (optional)
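The coordinate tools take offsets scoped to an element. One plausible reading, assumed here rather than confirmed by the source, is that x_offset and y_offset are measured from the element's top-left corner. Under that assumption, the sketch below converts an offset into a viewport point and rejects offsets that fall outside the element. The `Box` type and `to_viewport_point` helper are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Bounding box of the scoped element, in viewport pixels."""
    x: float
    y: float
    width: float
    height: float

def to_viewport_point(box: Box, x_offset: float, y_offset: float):
    """Translate an element-relative offset into viewport coordinates."""
    if not (0 <= x_offset <= box.width and 0 <= y_offset <= box.height):
        raise ValueError("offset falls outside the scoped element's bounds")
    return (box.x + x_offset, box.y + y_offset)

# Hypothetical 80x32 button whose top-left corner sits at (120, 300).
button = Box(x=120, y=300, width=80, height=32)
point = to_viewport_point(button, 40, 16)  # the button's center
print(point)
```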
Operational Guidance
Initial Audit Sequence
- Initialize browsing context via go_to_location to target URL.
- Invoke run_accessibility_check specifying validation levels: ["wcag2aa", "cat.color"].
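Once the audit returns, its JSON output can be condensed into a quick triage summary. The sketch below assumes the report follows Axe-core's standard results shape (a `violations` array whose entries carry `id`, `impact`, `help`, and `nodes`). Whether this server returns exactly that shape is an assumption, and the sample data is invented for illustration.

```python
from collections import Counter

# Hypothetical sample shaped like Axe-core's `violations` array.
SAMPLE_REPORT = {
    "violations": [
        {"id": "color-contrast", "impact": "serious",
         "help": "Elements must meet minimum color contrast ratio thresholds",
         "nodes": [{"target": ["#nav a"]}, {"target": [".footer small"]}]},
        {"id": "label", "impact": "critical",
         "help": "Form elements must have labels",
         "nodes": [{"target": ["input[name=q]"]}]},
    ]
}

def summarize_violations(report: dict) -> dict:
    """Count affected nodes per impact level."""
    counts = Counter()
    for violation in report.get("violations", []):
        impact = violation.get("impact") or "unknown"
        counts[impact] += len(violation.get("nodes", []))
    return dict(counts)

summary = summarize_violations(SAMPLE_REPORT)
print(summary)
```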
Visual Validation Workflow
- Navigate to the critical page using go_to_location.
- Execute capture_dom_snapshot to map interactive targets.
- Invoke activate_element referencing the 'Login Button' description.
- Apply inject_text for credentials.
- Re-run run_accessibility_check post-submission.
- Finalize with render_page_image for proof documentation.
Handling Asynchronous Loads
- Load initial page.
- Employ conditional_wait ensuring a key dynamic element text becomes visible.
- Proceed with interaction tools.
Crucial Note: For reliable element manipulation (activate_element, inject_text, etc.), always precede interaction attempts by calling capture_dom_snapshot to secure internal reference IDs.
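The snapshot-first rule can be checked mechanically before a plan is executed. The sketch below flags interaction steps that are not preceded by capture_dom_snapshot. It additionally assumes that navigation invalidates earlier reference IDs, which the source does not state. The `find_missing_snapshots` helper and the grouping of tool names are hypothetical.

```python
# Tool names from the Exposed Toolset section, grouped for this check.
INTERACTION_TOOLS = {
    "activate_element", "inject_text", "focus_hover",
    "execute_drag_operation", "set_dropdown_value",
}
NAVIGATION_TOOLS = {"go_to_location", "history_revert", "history_advance"}

def find_missing_snapshots(plan):
    """Return indices of interaction steps that lack a preceding snapshot."""
    fresh = False  # True while reference IDs from a snapshot are assumed valid
    missing = []
    for index, tool in enumerate(plan):
        if tool == "capture_dom_snapshot":
            fresh = True
        elif tool in NAVIGATION_TOOLS:
            fresh = False  # assumption: a new page invalidates old reference IDs
        elif tool in INTERACTION_TOOLS and not fresh:
            missing.append(index)
    return missing

# The Visual Validation Workflow from this section.
workflow = [
    "go_to_location", "capture_dom_snapshot", "activate_element",
    "inject_text", "run_accessibility_check", "render_page_image",
]
print(find_missing_snapshots(workflow))
```

A stricter variant could require a fresh snapshot before every single interaction, since any click may change the page. This sketch only resets on explicit navigation.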
Source Repository
Source code and contribution guidelines are available here:
    git clone https://github.com/JustasMonkev/mcp-accessibility-scanner.git
    cd mcp-accessibility-scanner
    npm install
Licensing
Distributed under the MIT License.
--- WIKIPEDIA CONTEXT: Headless Browsers ---
A headless browser operates without a visible graphical user interface, controlled primarily through command lines or network protocols. This mode is invaluable for automated web verification, as it renders HTML, CSS, and executes JavaScript identically to a conventional browser, providing a true simulation environment absent in simpler testing libraries.
Since the maturation of native remote control APIs in modern browsers (Chrome v59+, Firefox v56+), legacy solutions like PhantomJS have largely been supplanted.
Primary Applications
- Automated testing of contemporary web applications.
- Programmatic capture of high-fidelity page screenshots.
- Execution of unit/integration tests for JavaScript frontends.
- Systematic automation of user-like interactions with web interfaces.
Secondary Utilities
Headless agents are often employed for large-scale content harvesting (web scraping). Google notably adopted this approach to index JavaScript-heavy content. However, this capability introduces potential misuse vectors, such as automated ad impression inflation or malicious traffic generation (DDoS). Statistical analyses suggest that malicious actors do not disproportionately favor headless tools over traditional browsers for such activities.
Automation Frameworks Utilizing Headless Mode
Browser automation is unified through several key interfaces leveraging native headless support:
- Selenium WebDriver (W3C compliant implementation).
- Playwright (Comprehensive automation for Chromium, Firefox, and WebKit).
- Puppeteer (Focused automation targeting Chrome/Firefox engines).
Integrated Testing Tools
Numerous testing harnesses incorporate headless execution into their protocols:
- Capybara (Utilizes Headless Chrome or WebKit).
- Jasmine (Defaulting to Selenium, configurable for headless engines).
- Cypress (A dedicated modern frontend testing framework).
- QF-Test (GUI testing tool supporting headless environments).
Architectural Alternatives
Alternative methods bypass full rendering, opting for DOM API emulation. Deno integrates browser-like APIs natively. For Node.js environments, jsdom offers extensive HTML parsing and limited event/JavaScript support. While faster, these alternatives lack the fidelity of full DOM rendering and visual event simulation provided by true headless engines.

