Cloud Browser Execution Agent (BrowserCat Backend Interface)

This Model Context Protocol (MCP) server provides an interface to BrowserCat's cloud-based browser service. It lets Large Language Models (LLMs) interact with web pages, take screenshots, and run JavaScript in a fully rendered browser environment, with no local browser installation or setup required.

Available Operations (Tools)

`browsercat_navigate`

Function: Navigate the browser to a given URL.
Parameter: url (Type: String) - The address of the page to open.
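
MCP clients normally build tool requests for you, but seeing the message on the wire can help when debugging. The sketch below builds a JSON-RPC 2.0 `tools/call` request, the envelope the MCP specification defines for tool invocations, for `browsercat_navigate`. The helper name `make_tool_call`, the id counter, and the example URL are assumptions made for illustration; they are not part of this server.

```python
import json
from itertools import count

# Hypothetical helper state: a simple counter for JSON-RPC request ids.
_request_ids = count(1)


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The URL is a placeholder chosen for illustration.
request = make_tool_call("browsercat_navigate", {"url": "https://example.com"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library sends this message over the server's transport; the sketch only shows the shape of the payload.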

`browsercat_screenshot`

Function: Capture a screenshot of the entire page or of a specific element.
Parameters:
- name (Type: String, Required): A unique name under which the screenshot is stored.
- selector (Type: String, Optional): A CSS selector identifying the element to capture.
- width (Type: Number, Optional, Default: 800): The width of the screenshot, in pixels.
- height (Type: Number, Optional, Default: 600): The height of the screenshot, in pixels.
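
As a minimal sketch of how the parameters above fit together, the hypothetical helper below assembles the arguments for `browsercat_screenshot`, applies the documented defaults, and leaves `selector` out when the whole page is wanted. The validation rules (non-empty name, positive dimensions) are assumptions; the server may check its input differently.

```python
from typing import Optional


def screenshot_args(name: str, selector: Optional[str] = None,
                    width: int = 800, height: int = 600) -> dict:
    """Assemble arguments for browsercat_screenshot (hypothetical helper)."""
    if not name:
        raise ValueError("name is required")
    if width <= 0 or height <= 0:
        # Assumed rule: dimensions must be positive.
        raise ValueError("width and height must be positive")
    args = {"name": name, "width": width, "height": height}
    if selector is not None:
        # Only include the selector when capturing a single element.
        args["selector"] = selector
    return args


# Placeholder names and selector, chosen for illustration.
full_page = screenshot_args("homepage")
hero_only = screenshot_args("hero", selector="#hero", width=1280, height=720)
print(full_page)
print(hero_only)
```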

`browsercat_click`

Function: Click an element on the page.
Parameter: selector (Type: String) - The CSS selector of the element to click.

`browsercat_hover`

Function: Hover the mouse cursor over an element on the page.
Parameter: selector (Type: String) - The CSS selector of the element to hover over.

`browsercat_fill`

Function: Fill in an input field.
Parameters:
- selector (Type: String) - The CSS selector of the input field.
- value (Type: String) - The text to enter.

`browsercat_select`

Function: Select an option from an HTML dropdown.
Parameters:
- selector (Type: String) - The CSS selector of the <select> element.
- value (Type: String) - The value of the option to select.
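
The interaction tools are usually chained. The sketch below lays out one plausible sequence for filling in and submitting a form as an ordered list of tool names and arguments. The function name, the page URL, and every selector are hypothetical; a real page would need its own selectors, and dispatching each step is left to the MCP client.

```python
def plan_form_submission(url: str, email: str, plan: str) -> list:
    """Return an ordered list of (tool name, arguments) steps."""
    submit = "button[type=submit]"  # hypothetical selector
    return [
        ("browsercat_navigate", {"url": url}),
        ("browsercat_fill", {"selector": "#email", "value": email}),
        ("browsercat_select", {"selector": "#plan", "value": plan}),
        ("browsercat_hover", {"selector": submit}),
        ("browsercat_click", {"selector": submit}),
    ]


# Placeholder values, chosen for illustration.
steps = plan_form_submission("https://example.com/signup", "user@example.com", "pro")
for tool, arguments in steps:
    print(tool, arguments)
```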

`browsercat_evaluate`

Function: Execute JavaScript in the context of the current page.
Parameter: script (Type: String) - The JavaScript code to run.
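
Because `script` is passed as a plain string, values spliced into it need careful quoting. The sketch below uses `json.dumps` to turn a Python string into a JavaScript string literal, so quotes inside a selector cannot break the script. The helper name and the selectors are assumptions, and how the server reports the script's result is not specified here.

```python
import json


def text_content_script(selector: str) -> str:
    """Build a script that reads the text of the first matching element."""
    # json.dumps emits a double-quoted, escaped string that is also a
    # valid JavaScript string literal.
    return f"document.querySelector({json.dumps(selector)}).textContent"


# Hypothetical selectors, chosen for illustration.
simple = {"script": text_content_script("h1")}
quoted = {"script": text_content_script('a[title="x"]')}
print(simple["script"])
print(quoted["script"])
```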

Accessible Artifacts (Resources)

The server exposes two types of resources:

Console Logs (console://logs)
- Content: The browser's console output as plain text.
- Scope: Includes all console messages (e.g., info, warnings, errors).

Screenshots (screenshot://<name>)
- Content: PNG images.
- Access: Retrieved by the name given when the screenshot was taken.
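
Resources are fetched with the MCP `resources/read` method. The sketch below builds such a request and unpacks a result, following the MCP convention that text content arrives in a `text` field and binary content as base64 in a `blob` field. The helper names are hypothetical, the sample results are fabricated for illustration (including the log line), and it is an assumption that this server returns screenshots as `blob` entries.

```python
import base64


def make_resource_read(uri: str, request_id: int) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `resources/read` method."""
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "resources/read", "params": {"uri": uri}}


def extract_contents(result: dict):
    """Return the first content entry as bytes (blob) or str (text)."""
    entry = result["contents"][0]
    if "blob" in entry:
        return base64.b64decode(entry["blob"])
    return entry["text"]


PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file

# Fabricated sample results, shaped like MCP resource contents.
sample_logs = {"contents": [{"uri": "console://logs",
                             "mimeType": "text/plain",
                             "text": "[log] page loaded"}]}
sample_shot = {"contents": [{"uri": "screenshot://homepage",
                             "mimeType": "image/png",
                             "blob": base64.b64encode(PNG_SIGNATURE).decode("ascii")}]}

logs_request = make_resource_read("console://logs", 1)
image_bytes = extract_contents(sample_shot)
print(logs_request)
print(extract_contents(sample_logs))
print(image_bytes.startswith(PNG_SIGNATURE))
```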

Core Capabilities Summary

- Runs entirely in the cloud
- No local browser installation required
- Console log monitoring
- Screenshot capture
- JavaScript execution
- Basic web interaction (navigation, clicking, form filling)

Setup Requirements for Cloud Browser Execution Agent

Environmental Configuration

The server requires one environment variable:

BROWSERCAT_API_KEY: Your BrowserCat API key (Required). You can get one from https://browsercat.xyz/mcp.

Deployment Configuration (NPX Example)

{
  "mcpServers": {
    "browsercat": {
      "command": "npx",
      "args": ["-y", "@browsercatco/mcp-server"],
      "env": {
        "BROWSERCAT_API_KEY": "your-secret-api-key-here"
      }
    }
  }
}
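
To keep the API key out of hand-edited files, the same configuration can be generated from the environment. The following is a minimal sketch: `build_config` is a hypothetical helper that reproduces the structure above, and where the resulting JSON belongs depends on the MCP client in use, so the sketch only prints it.

```python
import json
import os


def build_config(api_key: str) -> dict:
    """Return the NPX configuration shown above with the given API key."""
    if not api_key:
        raise ValueError("BROWSERCAT_API_KEY must not be empty")
    return {
        "mcpServers": {
            "browsercat": {
                "command": "npx",
                "args": ["-y", "@browsercatco/mcp-server"],
                "env": {"BROWSERCAT_API_KEY": api_key},
            }
        }
    }


# Fall back to the placeholder from the example when the variable is unset.
api_key = os.environ.get("BROWSERCAT_API_KEY") or "your-secret-api-key-here"
print(json.dumps(build_config(api_key), indent=2))
```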

Licensing Terms

This MCP server is licensed under the MIT License. You are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. See the LICENSE file for details.