web-interaction-orchestrator
Facilitate programmatic control over web browser sessions for tasks like URL navigation, visual capture (screenshots), and dynamic script execution within a true browser context. This framework streamlines monitoring of console outputs and content harvesting from rendered web pages.
Author: rdvo
Steel Browser Automation Agent
A Model Context Protocol (MCP) service leveraging Puppeteer and Steel to endow large language models with advanced web interaction capabilities. This agent allows AI systems to interface directly with live web environments.
Integrated Capabilities (Tools)
navigate_to_page
Directs the browser instance to a specified Uniform Resource Locator.
- Parameters:
  - target_url (string, mandatory): The destination web address.
  - load_timeout_ms (number, optional, default: 60000): Maximum wait time, in milliseconds, for the page load event.
  - trigger_condition (string, optional, default: "domcontentloaded"): Defines when navigation is considered complete. Acceptable values: "load", "domcontentloaded", "networkidle0", "networkidle2".
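As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP "tools/call" request for navigate_to_page. The parameter names and defaults are taken from the list above; the helper name buildNavigateRequest, the request id, and the example URL are illustrative assumptions, not part of this server.

```typescript
// Sketch: build a JSON-RPC "tools/call" request for navigate_to_page.
// buildNavigateRequest is a hypothetical helper, not part of the server.

type TriggerCondition = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

interface NavigateArgs {
  target_url: string;
  load_timeout_ms?: number;
  trigger_condition?: TriggerCondition;
}

function buildNavigateRequest(id: number, args: NavigateArgs) {
  if (args.target_url.trim() === "") {
    throw new Error("target_url is mandatory");
  }
  const trigger: TriggerCondition = args.trigger_condition ?? "domcontentloaded";
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "navigate_to_page",
      arguments: {
        target_url: args.target_url,
        // Documented defaults, spelled out so the request is explicit.
        load_timeout_ms: args.load_timeout_ms ?? 60000,
        trigger_condition: trigger,
      },
    },
  };
}

const navigateRequest = buildNavigateRequest(1, { target_url: "https://example.com" });
console.log(JSON.stringify(navigateRequest, null, 2));
```

Omitting the optional fields and letting the server apply its defaults should be equivalent; they are filled in here only to make the request self-describing.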
capture_visual_snapshot
Generates an image file (PNG) of the current viewport or a specified DOM region.
- Parameters:
  - asset_identifier (string, mandatory): A unique name assigned to the resulting image file.
  - element_locator (string, optional): A CSS selector that restricts the capture to a single element.
perform_user_click
Simulates a standard mouse click event on a target element.
- Parameters:
  - target_selector (string, mandatory): The CSS selector identifying the clickable element.
inject_input_data
Populates form fields or text areas with specified data.
- Parameters:
  - field_selector (string, mandatory): The CSS selector for the input control.
  - input_payload (string, mandatory): The text string to insert into the field.
set_select_value
Selects a specific option within an HTML <select> element.
- Parameters:
  - select_locator (string, mandatory): The CSS selector targeting the dropdown element.
  - option_value (string, mandatory): The value attribute of the option to be chosen.
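The click, input, and select tools are typically combined to fill in and submit a form. The sketch below assembles such a sequence as plain data. The tool and parameter names come from this document; the helper planFormSubmission, the selectors, and the values are made up for illustration, and the ordering (inputs, then selects, then the click) is one reasonable choice rather than a requirement.

```typescript
// Sketch: an ordered plan of tool calls that fills and submits a form.
// planFormSubmission is a hypothetical helper; selectors below are invented.

interface ToolCall {
  name: string;
  arguments: Record<string, string>;
}

function planFormSubmission(
  fields: Record<string, string>,  // CSS selector -> text to type
  selects: Record<string, string>, // CSS selector -> option value
  submitSelector: string
): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const selector of Object.keys(fields)) {
    calls.push({
      name: "inject_input_data",
      arguments: { field_selector: selector, input_payload: fields[selector] },
    });
  }
  for (const selector of Object.keys(selects)) {
    calls.push({
      name: "set_select_value",
      arguments: { select_locator: selector, option_value: selects[selector] },
    });
  }
  // Click last, once every control has its value.
  calls.push({ name: "perform_user_click", arguments: { target_selector: submitSelector } });
  return calls;
}

const formPlan = planFormSubmission(
  { "#email": "user@example.com" },
  { "#country": "de" },
  "button[type=submit]"
);
console.log(formPlan.map((call) => call.name).join(" -> "));
```

Each entry would then be sent to the server as the name and arguments of one tool call, in order.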
simulate_mouse_hover
Triggers mouseover events on an element.
- Parameters:
  - target_selector (string, mandatory): The CSS selector identifying the element to hover over.
execute_browser_script
Runs arbitrary ECMAScript directly within the browser's execution context.
- Parameters:
  - js_code_block (string, mandatory): The JavaScript code snippet to be executed.
retrieve_page_data
Extracts inner content or HTML from the current page state.
- Parameters:
  - data_locator (string, optional): A selector to target specific content blocks. If omitted, the entire document body content is returned (subject to token limits).
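The source does not say how the token limit is enforced. As one possible approach, the sketch below caps extracted text at a character budget and appends a notice so the caller knows content was dropped. The function name capContent, the budget, and the notice format are all assumptions; the real server may count tokens rather than characters or truncate differently.

```typescript
// Sketch: cap extracted page text so a tool response stays within a size budget.
// capContent and its notice format are hypothetical, not the server's behavior.

function capContent(content: string, maxChars: number): string {
  if (maxChars < 0) {
    throw new Error("maxChars must be non-negative");
  }
  if (content.length <= maxChars) {
    return content; // already within budget
  }
  const omitted = content.length - maxChars;
  // Keep the head of the document and state how much was cut.
  return content.slice(0, maxChars) + "\n[truncated " + omitted + " characters]";
}

console.log(capContent("<main>Hello, world</main>", 12));
```

When a page is too large, passing a narrower data_locator is usually preferable to relying on truncation, since it keeps the relevant content intact.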
trigger_page_scroll
Initiates repeated downward scrolling to provoke dynamic content loading (lazy loading).
- Parameters:
  - scroll_interval_ms (number, optional, default: 100): Pause duration, in milliseconds, between successive scroll actions.
  - maximum_scroll_iterations (number, optional, default: 50): Limit on the total number of scroll steps performed.
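To make the two parameters concrete, here is a sketch of a bounded scroll loop. It runs against a fake page, synchronously, so it works without a browser; a Puppeteer-based version would await each step instead. The early stop when the page height stops growing is an assumption, since the source documents only the pause and the iteration cap. The names ScrollTarget, scrollUntilStable, and makeFakePage are illustrative.

```typescript
// Sketch: bounded scroll loop, simulated without a browser.
// Assumption: stop early once scrolling no longer increases the page height.

interface ScrollTarget {
  scrollToBottom(): void;
  pageHeight(): number;
}

function scrollUntilStable(
  page: ScrollTarget,
  pause: (ms: number) => void,
  scrollIntervalMs: number = 100,        // documented default
  maximumScrollIterations: number = 50   // documented default
): number {
  let previousHeight = page.pageHeight();
  let iterations = 0;
  while (iterations < maximumScrollIterations) {
    page.scrollToBottom();
    pause(scrollIntervalMs); // give lazy-loaded content time to arrive
    iterations++;
    const currentHeight = page.pageHeight();
    if (currentHeight === previousHeight) {
      break; // nothing new was loaded
    }
    previousHeight = currentHeight;
  }
  return iterations;
}

// Fake page that grows by 500px on each of its first `growthSteps` scrolls.
function makeFakePage(growthSteps: number): ScrollTarget {
  let height = 1000;
  let remaining = growthSteps;
  return {
    scrollToBottom() {
      if (remaining > 0) {
        height += 500;
        remaining--;
      }
    },
    pageHeight() {
      return height;
    },
  };
}

// Three growing scrolls plus one that detects no change.
console.log(scrollUntilStable(makeFakePage(3), () => {})); // 4
```

The iteration cap is what keeps the loop finite on feeds that never stop loading.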
Accessible Artifacts (Resources)
This service exposes two primary artifact streams:
- Browser Diagnostics Stream (console://logs): A continuous textual feed comprising all output emitted to the browser's standard console (errors, warnings, info messages).
- Visual Assets (screenshot://<asset_identifier>): Retrieved PNG image data, referenced by the unique name provided during the capture_visual_snapshot operation.
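The two URI shapes above can be built and recognized with a few lines of string handling, sketched below. The helper names screenshotUri and parseResourceUri are hypothetical, and the sketch assumes the asset identifier appears in the URI verbatim, with no encoding applied.

```typescript
// Sketch: build and parse the resource URIs described above.
// Assumption: asset identifiers are embedded verbatim (no URL encoding).

const CONSOLE_LOGS_URI = "console://logs";
const SCREENSHOT_PREFIX = "screenshot://";

function screenshotUri(assetIdentifier: string): string {
  if (assetIdentifier === "") {
    throw new Error("asset_identifier is mandatory");
  }
  return SCREENSHOT_PREFIX + assetIdentifier;
}

type ResourceRef =
  | { kind: "console" }
  | { kind: "screenshot"; assetIdentifier: string }
  | { kind: "unknown" };

function parseResourceUri(uri: string): ResourceRef {
  if (uri === CONSOLE_LOGS_URI) {
    return { kind: "console" };
  }
  const hasPrefix = uri.indexOf(SCREENSHOT_PREFIX) === 0;
  if (hasPrefix && uri.length > SCREENSHOT_PREFIX.length) {
    return { kind: "screenshot", assetIdentifier: uri.slice(SCREENSHOT_PREFIX.length) };
  }
  return { kind: "unknown" };
}

console.log(screenshotUri("homepage"));           // screenshot://homepage
console.log(parseResourceUri("console://logs"));  // { kind: 'console' }
```

A client would pass such a URI to the MCP resource-read request to fetch the log text or the PNG data.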
Core Capabilities Summary
- Leverages Puppeteer for reliable browser automation.
- Manages browser sessions via the Steel communication layer.
- Provides real-time console output telemetry.
- Supports high-fidelity visual capture.
- Enables execution of custom JavaScript logic.
- Handles fundamental web interactions: navigation, manipulation, and data entry.
- Offers content extraction with built-in response size management.
- Includes mechanisms for managing infinitely scrolling pages.
Operational Setup & Integration
Integration with Claude Desktop Environment
To enable this functionality within your local Claude Desktop setup, integrate the following definition into your configuration file (e.g., ~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "web-interaction-orchestrator": {
      "command": "node",
      "args": ["path/to/steel-puppeteer/dist/index.js"],
      "env": {
        "STEEL_LOCAL": "true"
      }
    }
  }
}
Ensure the path in args points to the compiled server entry point (dist/index.js) on your machine.
Environmental Configuration Variables
The server behavior is controlled by these environment settings:
- STEEL_LOCAL (Boolean string, default: "false"): Determines whether to connect to a locally running Steel instance or the managed cloud endpoint.
- STEEL_API_KEY (required if STEEL_LOCAL is "false"): Authentication token necessary for accessing the remote Steel gateway.
- STEEL_URL (optional): Override for the default connection endpoint if your Steel deployment is custom-hosted.
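The rules above, in particular the API key being required only in cloud mode, can be expressed as a small validation step. The sketch below is an illustration of those rules and not the server's actual startup code. The names SteelSettings and resolveSteelSettings are hypothetical, and treating only the exact string "true" as local mode is an assumption.

```typescript
// Sketch: resolve and validate the Steel-related environment settings.
// Assumption: only the exact string "true" enables local mode.

interface SteelSettings {
  mode: "local" | "cloud";
  apiKey: string | undefined;
  urlOverride: string | undefined;
}

function resolveSteelSettings(env: Record<string, string | undefined>): SteelSettings {
  // STEEL_LOCAL defaults to "false", which means the cloud endpoint.
  const local = (env["STEEL_LOCAL"] ?? "false") === "true";
  const apiKey = env["STEEL_API_KEY"];
  const urlOverride = env["STEEL_URL"];

  if (!local && !apiKey) {
    throw new Error('STEEL_API_KEY is required when STEEL_LOCAL is "false"');
  }
  return { mode: local ? "local" : "cloud", apiKey, urlOverride };
}

console.log(resolveSteelSettings({ STEEL_LOCAL: "true" }).mode); // local
```

In a Node process the function would typically be called with process.env. Failing fast on a missing key gives a clearer message than a later authentication error.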
Configuration File Guidance (Local Execution)
When launching the process independently, configuration can be managed via environment variables or a .env file placed in the root directory.
Example for Local Steel Connection:
STEEL_LOCAL=true
Example for Cloud Steel Connection:
STEEL_API_KEY=your-secret-key-here
STEEL_LOCAL=false
Launch Procedure
- Dependency Installation:
npm install
- Compilation/Bundling:
npm run build
- Server Initialization:
npm start
Upon successful startup, the service will listen on the configured network port (defaulting to 3000), ready for MCP communication.
Troubleshooting Notes
- Puppeteer Failures: Verify that all required system-level dependencies for Puppeteer (e.g., browser binaries) are present. Consult the official Puppeteer troubleshooting documentation.
- Cloud Connectivity: If using the remote Steel service, confirm that the API key is valid and has the required scope.
- Local Connectivity: Confirm that the local Steel server process is active and reachable at the expected address.
Consult the comprehensive Steel documentation and the underlying Puppeteer API references for further customization and advanced functionality.
