
This repository provides a browser automation server built on Playwright. It runs on Cloudflare's infrastructure, using Browser Rendering and the @cloudflare/playwright package for edge deployment.
Run the following commands to build and deploy the service. Each `cd` path is relative to the repository root:
- Install dependencies:
npm ci
- Build the project:
cd cloudflare
npm run build
- Deploy to Cloudflare Workers:
cd cloudflare/example
npx wrangler deploy
To connect this browser automation server with the Cloudflare AI Playground for LLM interaction:
- Navigate to https://playground.ai.cloudflare.com/
- Set the active model to llama-3.3-70b-instruct-fp8-fast for optimal performance.
- Within the MCP Servers section, supply the endpoint URL:
https://[my-mcp-url].workers.dev/sse
- Initiate the connection by clicking Connect.
- The connection has succeeded when the status changes to Connected and 14 available tools are listed.
Once linked, the assistant can execute specified browser actions based on user requests.
> [!TIP]
> For best results, keep each prompt to a single, simple instruction, e.g., "Mark the first item as complete," "Load the homepage," or "Capture the current screen state."
Illustrative Interaction Sequence:
| Participant | Utterance |
| --- | --- |
| User | Navigate to the TodoMVC demo at demo.playwright.dev/todomvc |
| Assistant | *invokes browser_navigate tool* Page transition successful to https://demo.playwright.dev/todomvc/. Page title is "React • TodoMVC". Snapshot confirms page structure: header, input field, and navigation links are present. |
| User | Add an item to the list in a pirate accent |
| Assistant | *invokes browser_type tool* New todo "Shiver me timbers, I need grog" added. Snapshot reflects the update, including the new list entry. |
| User | Now remove all completed entries |
| Assistant | *invokes browser_click tool* The "Clear completed" control has been activated. Page snapshot confirms removal of any previously finished tasks. |
| User | Capture the screen now |
| Assistant | *invokes browser_take_screenshot tool* A visual record of the present state has been secured. Image format is PNG, size approx. 52KB. |
Support for remote MCP endpoints in Claude Desktop requires tunneling via the mcp-remote utility. Configure Claude Desktop using the following settings under the mcpServers configuration block:
{
  "mcpServers": {
    "cloudflare-playwright-remote": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://[my-mcp-url].workers.dev/sse"
      ]
    }
  }
}
Remember to reload Claude Desktop after modifying the configuration file.
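If you manage several deployments, the configuration entry can be generated rather than edited by hand. The following is a minimal sketch only: `buildClaudeConfig` is a hypothetical helper, and `example-mcp.workers.dev` is a placeholder hostname standing in for your own Worker URL.

```typescript
// Shape of one entry in the "mcpServers" block shown above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: builds the Claude Desktop config for a deployed Worker.
// `sseUrl` is the /sse endpoint of your own deployment.
function buildClaudeConfig(sseUrl: string): ClaudeDesktopConfig {
  // Reject anything that is not an https:// URL ending in /sse.
  if (!/^https:\/\/.+\/sse$/.test(sseUrl)) {
    throw new Error(`Expected an https:// URL ending in /sse, got: ${sseUrl}`);
  }
  return {
    mcpServers: {
      "cloudflare-playwright-remote": {
        command: "npx",
        args: ["mcp-remote", sseUrl],
      },
    },
  };
}

// Placeholder hostname; replace with your own Worker URL.
const claudeConfig = buildClaudeConfig("https://example-mcp.workers.dev/sse");
console.log(JSON.stringify(claudeConfig, null, 2));
```

The printed JSON has the same structure as the block above and can be pasted into the Claude Desktop configuration file.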
To integrate this server directly within the Visual Studio Code environment for Copilot agent use:
# Standard VSCode
code --add-mcp '{"name":"cloudflare-playwright","type":"sse","url":"https://[my-mcp-url].workers.dev/sse"}'
# VSCode Insiders
code-insiders --add-mcp '{"name":"cloudflare-playwright","type":"sse","url":"https://[my-mcp-url].workers.dev/sse"}'
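The two commands differ only in the binary name, so the command line can be assembled from one payload. This is a rough sketch: `addMcpCommand` is a hypothetical helper, and the hostname is a placeholder for your own deployment.

```typescript
// Payload passed to `--add-mcp`, mirroring the commands above.
interface VsCodeMcpServer {
  name: string;
  type: "sse";
  url: string;
}

// Hypothetical helper: renders the full command for a given VS Code binary.
function addMcpCommand(binary: "code" | "code-insiders", sseUrl: string): string {
  const payload: VsCodeMcpServer = {
    name: "cloudflare-playwright",
    type: "sse",
    url: sseUrl,
  };
  // Single quotes keep the JSON intact in POSIX shells; this sketch assumes
  // the URL itself contains no single quote.
  return `${binary} --add-mcp '${JSON.stringify(payload)}'`;
}

// Placeholder hostname; replace with your own Worker URL.
console.log(addMcpCommand("code", "https://example-mcp.workers.dev/sse"));
```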
The available tools function across two distinct operational modes:
- Accessibility Snapshot Mode (Default): Prioritizes ARIA-based accessibility trees for superior throughput and stability.
- Visual (Vision) Mode: Relies on rendered screen captures, best suited for models capable of coordinate-based interaction.
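As a rough illustration of how the two modes divide the element-interaction tools, the sketch below groups the tool names documented in this README by mode. The `Mode` type and the `interactionTools` function are hypothetical names introduced for this example, not part of the server's API.

```typescript
type Mode = "snapshot" | "vision";

// Interaction tools documented under both modes in this README.
const SHARED_TOOLS: string[] = [
  "browser_press_key",
  "browser_wait_for",
  "browser_file_upload",
  "browser_handle_dialog",
];

// Interaction tools documented under only one mode.
const MODE_TOOLS: Record<Mode, string[]> = {
  snapshot: [
    "browser_snapshot",
    "browser_click",
    "browser_drag",
    "browser_hover",
    "browser_type",
    "browser_select_option",
  ],
  vision: [
    "browser_screen_capture",
    "browser_screen_move_mouse",
    "browser_screen_click",
    "browser_screen_drag",
    "browser_screen_type",
  ],
};

// Returns the interaction tools a client could expect in the given mode.
function interactionTools(mode: Mode): string[] {
  return MODE_TOOLS[mode].concat(SHARED_TOOLS);
}

console.log(interactionTools("vision"));
```

Snapshot-mode tools address elements by `ref`, while the vision-mode `browser_screen_*` tools address them by coordinates.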
Interaction Primitives
- **browser_snapshot**
- Title: Retrieve Accessibility Tree
- Description: Capture an accessibility snapshot of the current page; preferable to a screenshot when the goal is to interact with elements.
- Parameters: None
- Read-only: **true**
- **browser_click**
- Title: Execute Click Action
- Description: Initiate a mouse click event on a specified webpage element.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `ref` (string): Unique token referencing the precise element within the current snapshot.
- `doubleClick` (boolean, optional): Designates if a dual-click sequence should be performed.
- Read-only: **false**
- **browser_drag**
- Title: Perform Drag Operation
- Description: Execute a simulated mouse drag-and-drop sequence between a source and destination component.
- Parameters:
- `startElement` (string): Descriptive label for the source element.
- `startRef` (string): Reference token for the source element.
- `endElement` (string): Descriptive label for the destination element.
- `endRef` (string): Reference token for the destination element.
- Read-only: **false**
- **browser_hover**
- Title: Initiate Mouse Hover
- Description: Position the cursor over a designated element.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `ref` (string): Unique token referencing the precise element.
- Read-only: **true**
- **browser_type**
- Title: Input Text Sequence
- Description: Enter a string of characters into an interactive input field.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `ref` (string): Unique token referencing the precise element.
- `text` (string): The characters to be entered.
- `submit` (boolean, optional): If true, simulates pressing 'Enter' post-entry.
- `slowly` (boolean, optional): If true, types one character at a time, which helps trigger key handlers on the page.
- Read-only: **false**
- **browser_select_option**
- Title: Choose Dropdown Option(s)
- Description: Select one or more values from a list presented in a standard dropdown control.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `ref` (string): Unique token referencing the precise element.
- `values` (array): List of values intended for selection.
- Read-only: **false**
- **browser_press_key**
- Title: Emit Keyboard Event
- Description: Press a single key on the keyboard (e.g., 'Escape', 'Control', or a character such as 'k').
- Parameters:
- `key` (string): The identifier for the key to be pressed.
- Read-only: **false**
- **browser_wait_for**
- Title: Pause Execution
- Description: Wait for the given text to appear or disappear, or for a specified amount of time to pass.
- Parameters:
- `time` (number, optional): Duration to pause, measured in seconds.
- `text` (string, optional): Text content expected to become visible.
- `textGone` (string, optional): Text content expected to vanish from view.
- Read-only: **true**
- **browser_file_upload**
- Title: Inject Local Files
- Description: Transfer one or multiple local files into an active web form input.
- Parameters:
- `paths` (array): Absolute file system paths pointing to the uploads.
- Read-only: **false**
- **browser_handle_dialog**
- Title: Manage Browser Alert/Prompt
- Description: Respond to modal dialog boxes (alerts, confirms, prompts) displayed by the page.
- Parameters:
- `accept` (boolean): Determines whether to confirm (true) or dismiss (false) the dialog.
- `promptText` (string, optional): The input string to supply if the dialog is a prompt type.
- Read-only: **false**
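MCP clients invoke these tools through the JSON-RPC `tools/call` method. The sketch below types two of the argument shapes listed above and wraps them in that envelope. The `toolCall` helper, the ref value `"e12"`, and the element description are made up for illustration; real refs come from a prior `browser_snapshot` result.

```typescript
// Argument shapes taken from the parameter lists above.
interface ClickArgs {
  element: string;
  ref: string;
  doubleClick?: boolean;
}

interface TypeArgs {
  element: string;
  ref: string;
  text: string;
  submit?: boolean;
  slowly?: boolean;
}

// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Hypothetical helper: wraps tool arguments in a tools/call request.
function toolCall<A>(id: number, toolName: string, args: A): ToolCallRequest<A> {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// "e12" is a made-up ref used only for this example.
const typeArgs: TypeArgs = {
  element: "New todo input",
  ref: "e12",
  text: "Buy milk",
  submit: true,
};
const typeRequest = toolCall(1, "browser_type", typeArgs);
console.log(JSON.stringify(typeRequest, null, 2));
```

Typing the arguments this way lets the compiler catch a missing `ref` or a misspelled optional flag before the request is sent.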
Session Control
- **browser_navigate**
- Title: Direct to Web Address
- Description: Command the browser to load a specified Uniform Resource Locator (URL).
- Parameters:
- `url` (string): The destination URL for navigation.
- Read-only: **false**
- **browser_navigate_back**
- Title: Step Backwards
- Description: Execute the browser's history 'Back' function.
- Parameters: None
- Read-only: **true**
- **browser_navigate_forward**
- Title: Step Forwards
- Description: Execute the browser's history 'Forward' function.
- Parameters: None
- Read-only: **true**
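In the example conversation, the user supplies a scheme-less address (`demo.playwright.dev/todomvc`) and the navigation lands on an `https://` URL. A client could normalize such input before calling `browser_navigate`. The sketch below shows one way to do that; `toNavigateArgs` is a hypothetical pre-processing step, and defaulting to `https://` is an assumption of this example, not documented server behavior.

```typescript
// Hypothetical client-side helper, not part of the server itself.
function toNavigateArgs(input: string): { url: string } {
  const trimmed = input.trim();
  if (trimmed.length === 0) {
    throw new Error("URL must not be empty");
  }
  // Keep an explicit scheme if one is present; otherwise assume https.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  return { url: hasScheme ? trimmed : "https://" + trimmed };
}

console.log(toNavigateArgs("demo.playwright.dev/todomvc").url);
// prints "https://demo.playwright.dev/todomvc"
```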
Artifact Generation
- **browser_take_screenshot**
- Title: Capture Screen Image
- Description: Take a screenshot of the current page. To act on elements, use browser_snapshot instead.
- Parameters:
- `raw` (boolean, optional): If true, output is uncompressed PNG; otherwise, default is compressed JPEG.
- `filename` (string, optional): Desired output filename; defaults to a timestamped name.
- `element` (string, optional): Descriptive text for an element to focus the capture on.
- `ref` (string, optional): Reference token corresponding to the element capture focus.
- Read-only: **true**
- **browser_pdf_save**
- Title: Output Content as PDF
- Description: Render the current document content and save it as a Portable Document Format (PDF) file.
- Parameters:
- `filename` (string, optional): Desired output filename; defaults to a timestamped PDF name.
- Read-only: **true**
- **browser_network_requests**
- Title: Inventory Network Traffic
- Description: Return all network requests made since the page was loaded.
- Parameters: None
- Read-only: **true**
- **browser_console_messages**
- Title: Extract Console Logs
- Description: Fetch all messages recorded in the browser's developer console.
- Parameters: None
- Read-only: **true**
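The `raw` flag of `browser_take_screenshot` decides the image format. The sketch below types the documented parameters and derives the format from them. Treating `element` and `ref` as a pair that must be supplied together is an assumption of this example; both function names are hypothetical.

```typescript
// Parameters of browser_take_screenshot, as documented above.
interface ScreenshotArgs {
  raw?: boolean;
  filename?: string;
  element?: string;
  ref?: string;
}

// raw: true yields uncompressed PNG; otherwise the default is JPEG.
function screenshotFormat(args: ScreenshotArgs): "png" | "jpeg" {
  return args.raw === true ? "png" : "jpeg";
}

// Assumption of this sketch: element and ref describe the same target,
// so supplying only one of them is treated as a mistake.
function checkScreenshotArgs(args: ScreenshotArgs): void {
  const hasElement = args.element !== undefined;
  const hasRef = args.ref !== undefined;
  if (hasElement !== hasRef) {
    throw new Error("Provide both element and ref, or neither");
  }
}

console.log(screenshotFormat({}));            // prints "jpeg"
console.log(screenshotFormat({ raw: true })); // prints "png"
```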
Maintenance & Environment
- **browser_install**
- Title: Install Required Browser Binaries
- Description: Manually trigger the download and setup of the browser environment specified in the configuration, useful when encountering missing binary errors.
- Parameters: None
- Read-only: **false**
- **browser_close**
- Title: Terminate Browser Session
- Description: Shut down the current active browser instance/page.
- Parameters: None
- Read-only: **true**
- **browser_resize**
- Title: Adjust Viewport Dimensions
- Description: Dynamically modify the pixel dimensions of the browser's visible area.
- Parameters:
- `width` (number): Target horizontal dimension in pixels.
- `height` (number): Target vertical dimension in pixels.
- Read-only: **true**
Tab Management
- **browser_tab_list**
- Title: Enumerate Open Tabs
- Description: Produce a roster of all currently accessible browser tabs.
- Parameters: None
- Read-only: **true**
- **browser_tab_new**
- Title: Initiate New Tab
- Description: Open an additional browser tab, optionally directing it to a specific URL upon creation.
- Parameters:
- `url` (string, optional): The destination to load in the new tab; defaults to a blank page.
- Read-only: **true**
- **browser_tab_select**
- Title: Focus Specific Tab
- Description: Make a tab active based on its numerical position in the tab list.
- Parameters:
- `index` (number): The zero-based index of the target tab.
- Read-only: **true**
- **browser_tab_close**
- Title: Shut Down Tab
- Description: Close a specified tab, or the currently active one if no index is supplied.
- Parameters:
- `index` (number, optional): The index of the tab to terminate.
- Read-only: **false**
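The index semantics above (zero-based selection, and closing the active tab when no index is given) can be illustrated with a small in-memory model. This is only a sketch: `TabState`, `selectTab`, and `closeTab` are hypothetical, and the rule for which tab becomes active after a close is an assumption rather than documented behavior.

```typescript
interface TabState {
  urls: string[]; // one entry per open tab
  active: number; // zero-based index of the active tab, -1 if none remain
}

// Mirrors browser_tab_select: activate the tab at a zero-based index.
function selectTab(state: TabState, index: number): TabState {
  if (index < 0 || index >= state.urls.length) {
    throw new Error(`No tab at index ${index}`);
  }
  return { urls: state.urls, active: index };
}

// Mirrors browser_tab_close: close the given tab, or the active one
// when no index is supplied.
function closeTab(state: TabState, index?: number): TabState {
  const target = index === undefined ? state.active : index;
  if (target < 0 || target >= state.urls.length) {
    throw new Error(`No tab at index ${target}`);
  }
  const urls = state.urls.filter((_, i) => i !== target);
  // Assumption: focus stays on the same tab where possible, otherwise it
  // moves to the tab now occupying the closed position (or the last tab).
  let active = state.active;
  if (target < state.active) {
    active = state.active - 1;
  } else if (target === state.active) {
    active = Math.min(target, urls.length - 1);
  }
  return { urls: urls, active: active };
}

const tabs: TabState = { urls: ["a.example", "b.example", "c.example"], active: 1 };
console.log(closeTab(tabs));    // closes "b.example"; "c.example" becomes active
console.log(closeTab(tabs, 0)); // closes "a.example"; "b.example" stays active
```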
Code Generation
- **browser_generate_playwright_test**
- Title: Draft Playwright Test Script
- Description: Generate a Playwright test for a described user scenario.
- Parameters:
- `name` (string): Identifier for the generated test case.
- `description` (string): Narrative summary of the test's purpose.
- `steps` (array): A sequenced list detailing the actions the test must perform.
- Read-only: **true**
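To make the three parameters concrete, here is a sketch of arguments for a scenario based on the TodoMVC demo used earlier. It assumes each entry in `steps` is a plain-language string, which the parameter list does not state; `buildGenerateTestArgs` and the test name are hypothetical.

```typescript
interface GenerateTestArgs {
  name: string;
  description: string;
  steps: string[]; // assumption: each step is a plain-language string
}

// Hypothetical helper: trims steps and rejects an empty scenario.
function buildGenerateTestArgs(
  testName: string,
  description: string,
  steps: string[]
): GenerateTestArgs {
  const cleaned = steps
    .map((step) => step.trim())
    .filter((step) => step.length > 0);
  if (cleaned.length === 0) {
    throw new Error("At least one step is required");
  }
  return { name: testName, description: description, steps: cleaned };
}

const todoTestArgs = buildGenerateTestArgs(
  "todomvc-add-and-clear",
  "Add a todo, complete it, then clear completed items",
  [
    "Navigate to https://demo.playwright.dev/todomvc",
    "Add a todo item",
    "Mark the first item as complete",
    "Click 'Clear completed'",
  ]
);
console.log(JSON.stringify(todoTestArgs, null, 2));
```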
Vision Mode Overrides
- **browser_screen_capture**
- Title: Render Current Viewport
- Description: Capture a raw pixel image of the browser window.
- Parameters: None
- Read-only: **true**
- **browser_screen_move_mouse**
- Title: Recalibrate Pointer Position
- Description: Instruct the cursor to move to specific (X, Y) coordinates on the screen.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `x` (number): Horizontal coordinate position.
- `y` (number): Vertical coordinate position.
- Read-only: **true**
- **browser_screen_click**
- Title: Activate Pixel Location
- Description: Initiate a left-button click event at precise screen coordinates.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `x` (number): Horizontal coordinate of the click target.
- `y` (number): Vertical coordinate of the click target.
- Read-only: **false**
- **browser_screen_drag**
- Title: Execute Screen Drag
- Description: Simulate dragging an object across the screen by defining start and end coordinates.
- Parameters:
- `element` (string): Human-readable description of the element, used to obtain permission to interact with it.
- `startX` (number): Initial horizontal coordinate.
- `startY` (number): Initial vertical coordinate.
- `endX` (number): Final horizontal coordinate.
- `endY` (number): Final vertical coordinate.
- Read-only: **false**
- **browser_screen_type**
- Title: Inject Text Visually
- Description: Enter text input directly, relying on visual feedback.
- Parameters:
- `text` (string): The character sequence to input.
- `submit` (boolean, optional): If true, simulates pressing 'Enter' post-input.
- Read-only: **false**
- **browser_press_key**
- Title: Emit Key Press (Vision)
- Description: Press a single key on the keyboard.
- Parameters:
- `key` (string): The identifier for the key to be pressed.
- Read-only: **false**
- **browser_wait_for**
- Title: Visual Pause Timer
- Description: Wait for a set duration, or for specific text to appear on or disappear from the screen.
- Parameters:
- `time` (number, optional): Duration to pause, in seconds.
- `text` (string, optional): Text content expected to be visible.
- `textGone` (string, optional): Text content expected to vanish.
- Read-only: **true**
- **browser_file_upload**
- Title: Vision Mode File Injection
- Description: Transfer local files into the active web element.
- Parameters:
- `paths` (array): Absolute file paths for the upload(s).
- Read-only: **false**
- **browser_handle_dialog**
- Title: Process Dialog (Vision)
- Description: Confirm or deny a visible dialog box.
- Parameters:
- `accept` (boolean): Confirmation decision.
- `promptText` (string, optional): Text to use if responding to a prompt dialog.
- Read-only: **false**
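Vision-mode tools take screen coordinates rather than element refs. Assuming the model has located an element as a bounding box in a screenshot, a client might click its center. The `Box` type and the `clickCenter` helper are hypothetical and serve only to show how the `x` and `y` arguments of `browser_screen_click` could be derived.

```typescript
// Hypothetical bounding box of an element located in a screenshot, in pixels.
interface Box {
  x: number; // left edge
  y: number; // top edge
  width: number;
  height: number;
}

// Arguments of browser_screen_click, as documented above.
interface ScreenClickArgs {
  element: string;
  x: number;
  y: number;
}

// Targets the center of the box, rounded to whole pixels.
function clickCenter(element: string, box: Box): ScreenClickArgs {
  return {
    element: element,
    x: Math.round(box.x + box.width / 2),
    y: Math.round(box.y + box.height / 2),
  };
}

const clickArgs = clickCenter("Clear completed button", {
  x: 100,
  y: 40,
  width: 80,
  height: 20,
});
console.log(clickArgs); // x is 140, y is 50
```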