Browser-Side Tool Exchange Protocol (WebMCP)
A standardized mechanism that lets websites give client-side Large Language Models (LLMs) access to tools, resources, and prompts, improving the user experience without requiring users to share API keys. An agent can also connect to several sites at once to gain broader capabilities.
Author

jasonjmcghee
WebMCP: Enabling Client-Side LLM Augmentation via Web Protocols
A standardized specification and accompanying codebase enabling interactive web platforms to provision functionality and context to resident LLM agents.
WebMCP empowers web domains to expose operational utilities, informational assets, and structured query templates directly to integrated LLMs. Essentially, it permits a website to function as a localized MCP service provider. This methodology entirely circumvents the necessity of sharing sensitive API access credentials. Users retain autonomy over which underlying computational models they employ.
A demonstration implementation of a WebMCP-integrated website is accessible here
This is packaged as an embeddable interface element that site administrators can deploy to expose local capabilities, furnishing client-side LLMs with the necessary components for delivering superior user interactions or automated agent actions.
The interface's visual presentation, user interaction paradigm, and security posture remain subjects for ongoing collaborative refinement. A primary objective is for MCP Client software to natively incorporate WebMCP handling functionality.
An end-user's computational agent can simultaneously interface with numerous WebMCP-enabled origins; tooling definitions are namespace-isolated (via naming conventions) and logically bound to their originating host domain for clarity and conflict avoidance.
Rapid Proof-of-Concept Demonstration (Approx. 20 seconds, Audio recommended 🔊)
https://github.com/user-attachments/assets/61229470-1242-401e-a7d9-c0d762d7b519
Initial Setup (Configuring Your LLM Client to Interact with WebMCP Sites)
Installation of Client Interface
Specify your desired MCP client implementation (e.g., claude, cursor, cline, windsurf, or a configuration file path):
```bash
npx -y @jason.today/webmcp@latest --config claude
```
For manual setup of the command-line interface, execute: npx -y @jason.today/webmcp@latest --mcp.
The automatic discovery feature was conceptually derived from Smithery, but due to their AGPL licensing, a distinct implementation was pursued. Should auto-discovery fail or your client not appear, please submit an issue report.
Utilizing WebMCP
To establish a secure link with an active website, instruct your LLM to generate a unique session authorization token. This token must be input into the website's designated portal. Upon successful server-side registration, the token is immediately invalidated and rendered useless for subsequent connection attempts. The website then receives its own ephemeral session credential for subsequent data exchange.
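The single-use behaviour described above can be pictured with a small in-memory sketch. The class name `RegistrationTokens`, its methods, and the hex token format are hypothetical and are not taken from the WebMCP codebase; the real server may store and validate tokens differently.

```javascript
// Hypothetical in-memory store; WebMCP's real implementation may differ.
// Uses the Web Crypto API (global in browsers and recent Node.js versions,
// with a fallback to Node's built-in crypto module).
const webCrypto = globalThis.crypto ?? require("node:crypto").webcrypto;

function randomToken(byteLength = 16) {
  const bytes = webCrypto.getRandomValues(new Uint8Array(byteLength));
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

class RegistrationTokens {
  constructor() {
    this.pending = new Set();  // issued but not yet redeemed
    this.sessions = new Map(); // session token -> domain
  }

  // Called when the LLM (or the CLI) asks for a new registration token.
  issue() {
    const token = randomToken();
    this.pending.add(token);
    return token;
  }

  // Called when a website presents a token. Returns a fresh session token,
  // or null if the registration token is unknown or was already used.
  redeem(token, domain) {
    if (!this.pending.delete(token)) return null; // delete() doubles as the check
    const sessionToken = randomToken();
    this.sessions.set(sessionToken, domain);
    return sessionToken;
  }
}
```

Because `redeem` removes the token before issuing a session credential, a second attempt with the same token fails even if the first one succeeded only moments earlier.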
If minimizing the token's exposure to the primary computational agent is preferred, the token can be generated independently via the terminal command: npx @jason.today/webmcp --new.
Certain specialized LLM interfaces, such as Claude Desktop (as of this writing), necessitate a service restart to fully recognize newly acquired capabilities.
To terminate connectivity, simply close the associated browser viewport, invoke the 'disconnect' control within the interface, or halt the local server process using: npx @jason.today/webmcp -q.
All operational parameters and state artifacts are persistently stored within the ~/.webmcp directory structure.
Integrating WebMCP into Your Web Platform
To enable WebMCP on a web page, include the webmcp.js script via a standard <script> tag, for example <script src="webmcp.js"></script> (adjust the src path to wherever the file is hosted).
This inclusion automatically bootstraps the WebMCP interaction module, which manifests as a discrete graphical element positioned in the lower-right quadrant of the viewport. Activation of this element prompts the user to supply the previously generated WebMCP authorization token.
Comprehensive Walkthrough (Approx. 3 minutes)
https://github.com/user-attachments/assets/43ad160a-846d-48ad-9af9-f6d537e78473
Underlying Architectural Details
The MCP client and the website communicate through a WebSocket server bound to localhost only, so it is not reachable from other machines on the network. Because any page open in the local browser can still reach localhost, token-based authentication is mandatory to keep potentially malicious third-party origins from connecting.
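One way to picture these two safeguards together is a defensive check applied to each incoming connection. This is only a sketch under assumptions: the function names and the shape of the `connection` object are hypothetical, and the actual project may simply bind its listener to localhost rather than inspect addresses.

```javascript
// Hypothetical check; not taken from the WebMCP codebase.
function isLoopbackAddress(address) {
  if (address === "::1") return true; // IPv6 loopback
  // IPv4 peers may appear in IPv4-mapped IPv6 form, e.g. "::ffff:127.0.0.1".
  const mapped = "::ffff:";
  const v4 = address.startsWith(mapped) ? address.slice(mapped.length) : address;
  const octets = v4.split(".");
  const allValid = octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  return octets.length === 4 && allValid && octets[0] === "127"; // 127.0.0.0/8
}

// Accept only local peers that also present a known token.
function shouldAccept(connection, validTokens) {
  return isLoopbackAddress(connection.remoteAddress) && validTokens.has(connection.token);
}
```

The token check matters even for loopback peers, since a local address says nothing about which web origin opened the socket.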
The ideal long-term solution would involve browser vendors implementing a specific, granular permission model for this communication channel, analogous to existing controls for camera or microphone access.
- The primary MCP Agent establishes a persistent connection to the /mcp path, authenticating with a server-side key (auto-generated and stored in .env).
- The server issues a unique registration artifact (instigated either by the LLM invoking a built-in utility or by the --new terminal command).
- Web client interfaces connect to the designated /register endpoint, presenting this artifact along with their domain identity. (The registration artifact is immediately retired after use.)
- The specific website is then subscribed to a dedicated communication channel scoped to its domain.
- When the LLM initiates a request to utilize a defined tool, resource, or prompt, the data transmission path is:
- MCP Client → Primary Server → WebSocket Transport Layer → Target Web Page exposing the utility/data
- (The reverse path applies when requesting metadata, such as a catalog of available resources)
- The web page executes the required operation (e.g., tool invocation) and relays the resulting data back through the established pipeline.
- Concurrent sessions from multiple distinct web pages can operate simultaneously, each utilizing its unique set of registered capabilities and session tokens.
- The MCP Client consolidates all received tool definitions into a singular, unified inventory, employing domain-specific prefixes on tool names to preempt potential naming conflicts.
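The routing and naming steps in the list above can be sketched as a small in-process router. Everything here is illustrative: the `ToolRouter` class, the `invoke` callback, and the "domain-tool" prefix format are assumptions, and the real system relays these calls over WebSocket channels rather than direct function calls.

```javascript
// Hypothetical router; names and prefix format are illustrative assumptions.
class ToolRouter {
  constructor() {
    this.pages = new Map(); // domain -> { tools: [...], invoke(name, args) }
  }

  // A page registers its tool definitions plus a callback that runs them.
  register(domain, tools, invoke) {
    this.pages.set(domain, { tools, invoke });
  }

  unregister(domain) {
    this.pages.delete(domain);
  }

  static prefix(domain) {
    return domain.replace(/[^a-zA-Z0-9]/g, "_");
  }

  // Unified catalog: every tool name is prefixed with its origin's domain.
  listTools() {
    const catalog = [];
    for (const [domain, page] of this.pages) {
      for (const tool of page.tools) {
        catalog.push({ ...tool, name: `${ToolRouter.prefix(domain)}-${tool.name}`, domain });
      }
    }
    return catalog;
  }

  // Route a call on a prefixed name back to the page that owns the tool.
  callTool(prefixedName, args) {
    for (const [domain, page] of this.pages) {
      const head = `${ToolRouter.prefix(domain)}-`;
      if (!prefixedName.startsWith(head)) continue;
      const name = prefixedName.slice(head.length);
      if (page.tools.some((t) => t.name === name)) return page.invoke(name, args);
    }
    throw new Error(`No connected page exposes tool "${prefixedName}"`);
  }
}
```

With this shape, two sites can both expose a tool called `search` without colliding, and closing a page only removes that page's entries from the catalog.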
```mermaid
sequenceDiagram
    participant User
    participant MCP as MCP Client
    participant Server as MCP Server
    participant WS as WebSocket Server
    participant Web as Website

    %% Initial connection
    MCP->>Server: Initiate link to /mcp using internal secret

    %% Website registration token
    User->>MCP: Solicit registration credential
    MCP->>Server: Query for registration credential
    Server-->>MCP: Dispatch registration credential
    MCP-->>User: Display credential for transfer

    %% Website registration
    User->>Web: Paste credential
    Web->>WS: Establish connection to /register with credential & host ID (Credential subsequently purged)
    WS-->>Web: Allocate channel and session key
    Web->>WS: Subscribe to assigned channel

    %% Tool interaction
    MCP->>Server: Query for available utility catalog
    Server->>WS: Relay query
    WS->>Web: Request tool definitions
    Web-->>WS: Return catalog structure
    WS-->>Server: Forward catalog structure
    Server-->>MCP: Return catalog structure

    %% Tool execution
    MCP->>Server: Submit request for tool operation
    Server->>WS: Relay operational request
    WS->>Web: Invoke requested utility
    Web-->>WS: Transmit operational outcome
    WS-->>Server: Forward outcome
    Server-->>MCP: Return final result

    %% Disconnection
    User->>Web: Terminate session
    Web->>WS: Terminate transport link
```
Security Considerations
This project is in its nascent phase. Significant effort is dedicated to fortifying the system against exploitation vectors, particularly those involving prompt injection attacks originating from untrustworthy browser extensions or content. Input regarding hardening measures is highly valued; please contribute via issue tracking or direct contact.
Intrinsic Utilities
Provided command-line functionalities include:
- Credential Generator (for establishing connectivity with WebMCP-enabled interfaces)
- MCP Schema Definition Helper (simplifies the articulation of tool specifications required for MCP integration)
- The corresponding JavaScript implementation code can also be requested, where relevant to the WebMCP context.
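To give a sense of what such a schema helper produces, here is a minimal sketch. The output follows the MCP tool shape (`name`, `description`, and a JSON Schema `inputSchema`); the `defineTool` function, its name-format rule, and the `add_to_cart` example are hypothetical and do not come from WebMCP itself.

```javascript
// Hypothetical helper; only the returned shape (name, description,
// inputSchema) follows MCP. The name pattern below is an assumed convention.
function defineTool({ name, description, properties = {}, required = [] }) {
  if (!/^[A-Za-z0-9_-]+$/.test(name)) {
    throw new Error(`Invalid tool name: ${name}`);
  }
  for (const key of required) {
    if (!(key in properties)) {
      throw new Error(`Required property "${key}" is not defined`);
    }
  }
  return {
    name,
    description,
    inputSchema: { type: "object", properties, required },
  };
}

// Example definition for an imaginary shopping site.
const addToCart = defineTool({
  name: "add_to_cart",
  description: "Add a product to the shopping cart",
  properties: {
    productId: { type: "string", description: "Product identifier" },
    quantity: { type: "integer", minimum: 1 },
  },
  required: ["productId"],
});
```

Catching a required property that was never declared at definition time is cheaper than discovering the mismatch when an LLM first tries to call the tool.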
Containerization (Docker)
An explicit Dockerfile is present, tailored for deployment environments mirroring Smithery's structure.
For orchestrating the WebSocket transmission layer via Docker, a sample docker-compose.yml file has been included for reference.
If the --docker flag is utilized alongside the --mcp flag within the client configuration, the system presumes the server component is containerized and running. This permits the encapsulation of the core WebSocket broker within a Docker environment, while the MCP client connects to the containerized service endpoint via WebSocket. Correspondingly, web interactions will route through this Dockerized broker.
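As a rough illustration of combining the two flags, the sketch below builds a client configuration entry. It assumes a client that reads the common `mcpServers` JSON layout; the server key `webmcp` and the helper name are hypothetical, while the package name and flags are the ones mentioned above.

```javascript
// Hypothetical helper that builds an MCP client config entry.
// The "webmcp" key is an assumed label; the flags come from the text above.
function webmcpServerEntry({ docker = false } = {}) {
  const args = ["-y", "@jason.today/webmcp@latest", "--mcp"];
  if (docker) args.push("--docker"); // server is assumed to already run in a container
  return { mcpServers: { webmcp: { command: "npx", args } } };
}

// Print the entry as JSON, ready to merge into a client's config file.
console.log(JSON.stringify(webmcpServerEntry({ docker: true }), null, 2));
```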
WIKIPEDIA: XMLHttpRequest (XHR) is an API realized as a JavaScript object designed to transmit Hypertext Transfer Protocol (HTTP) requests from a web user agent to a remote server. Its methods permit web-based applications to dispatch asynchronous queries post-page load and subsequently process incoming data. XMLHttpRequest is a foundational element of asynchronous JavaScript and XML (Ajax) programming methodologies. Prior to Ajax adoption, standard hyperlink navigation and form submissions constituted the principal means of server communication, frequently necessitating a complete page refresh.
== Chronology ==
The XMLHttpRequest concept was conceived by developers working on Microsoft Outlook, for the web client that shipped with Exchange 2000, and first appeared in the Internet Explorer 5 browser release (1999). However, the initial syntax did not use the XMLHttpRequest identifier; instead, developers instantiated the COM objects ActiveXObject("Msxml2.XMLHTTP") and ActiveXObject("Microsoft.XMLHTTP"). With the launch of Internet Explorer 7 (2006), all major browsers supported the XMLHttpRequest identifier.
The XMLHttpRequest identifier has since solidified its position as the ubiquitous standard across all contemporary web browsers, including those powered by Mozilla's Gecko rendering engine (2002), Safari 1.2 (2004), and Opera 8.0 (2005).
=== Standardization Efforts ===
The World Wide Web Consortium (W3C) issued the initial Working Draft specification for the XMLHttpRequest object on April 5, 2006. On February 25, 2008, the W3C published the Level 2 specification draft, which introduced enhanced functionality such as progress monitoring methods, mechanisms for facilitating cross-origin requests, and support for processing raw byte streams. By the close of 2011, the Level 2 feature set had been formally integrated back into the primary specification document. At the conclusion of 2012, development stewardship transitioned to the WHATWG, which maintains the living document using Web IDL definitions.
== Operational Usage ==
Sending a request with XMLHttpRequest generally involves several programming steps:
1. Create an XMLHttpRequest object by calling its constructor.
2. Call the open method to specify the request method, the target Uniform Resource Identifier (URI), and whether the request runs synchronously or asynchronously.
3. For an asynchronous request, assign an event listener that is notified whenever the request's state changes.
4. Start the request by calling the send method.
5. Respond to state changes in the event listener. When the server sends response data, it is stored in the responseText property by default. When the object finishes processing the response, its state changes to 4, the "done" state.
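The steps above can be sketched as a single helper. The function name `getText`, the callback signature, and the `/api/example` URL are placeholders for illustration; the injectable `XHR` parameter exists only so the sketch can also be exercised outside a browser.

```javascript
// Minimal sketch of the request steps; names and URL are placeholders.
function getText(url, onDone, XHR = globalThis.XMLHttpRequest) {
  const request = new XHR();                 // create the object
  request.open("GET", url, true);            // method, URI, asynchronous flag
  request.onreadystatechange = function () { // listener for state changes
    if (request.readyState !== 4) return;    // 4 is the "done" state
    onDone(request.status, request.responseText);
  };
  request.send();                            // start the request
  return request;
}

// Usage in a browser, where XMLHttpRequest is available as a global.
if (typeof XMLHttpRequest !== "undefined") {
  getText("/api/example", (status, text) => {
    console.log(status, text);
  });
}
```

The listener fires for every state change, so the early return keeps the callback from running before the response is complete.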
Beyond these basic steps, XMLHttpRequest offers many options that control how a request is sent and how its response is handled. Custom header fields can be added to the request to tell the server how to fulfill it, and data can be uploaded to the server by passing it to the send call. The response can be parsed from JSON into a ready-to-use JavaScript object, or processed incrementally as it arrives rather than after the full body has been received. A request can also be aborted early or given a timeout.
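A few of these options can be combined in one short sketch: a custom header, a request body, JSON response parsing, a timeout, and cancellation. The helper name `sendJson`, its callback names, and the default timeout are hypothetical; the `XHR` parameter is again only there so the sketch can run outside a browser.

```javascript
// Illustrative helper; function and option names are assumptions.
function sendJson(url, payload, options) {
  const { onSuccess, onFailure, timeoutMs = 5000, XHR = globalThis.XMLHttpRequest } = options;
  const request = new XHR();
  request.open("POST", url, true);
  request.setRequestHeader("Content-Type", "application/json"); // custom header
  request.responseType = "json"; // ask for the body parsed as JSON
  request.timeout = timeoutMs;   // in milliseconds; 0 means no timeout
  request.onload = () => onSuccess(request.status, request.response);
  request.onerror = () => onFailure("network error");
  request.ontimeout = () => onFailure("timeout");
  request.onabort = () => onFailure("aborted");
  request.send(JSON.stringify(payload)); // request body
  return request; // the caller can cancel with request.abort()
}
```

Returning the request object keeps cancellation in the caller's hands, which is useful when a newer request makes an older one obsolete.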
