🖥️ desktop-visual-analyzer-mcp

A Model Context Protocol (MCP) tool that lets AI agents capture the desktop and analyze the image with Claude Vision. Take screenshots, have the model interpret what is on screen, and receive AI-generated diagnostics on your graphical user interface.

✨ Core Capabilities

📸 Immediate acquisition of the entire display buffer
🧠 Computer Vision analysis powered by Claude's multimodal models
🤖 Smooth integration pathway for other MCP-conforming digital assistants
⚙️ Simplified provisioning and deployment procedures
📡 Native support across both standard I/O (stdio) and Server-Sent Events (SSE) communication channels

🎯 Primary Applications

Visual auditing and interpretation of the active desktop state
Deconstruction and assessment of User Interface layouts and components
Visual debugging based on the captured screen state
Extracting semantic meaning and context from screen imagery
Automated documentation of graphical elements and spatial arrangements
Visual feedback loops for automated desktop workflows

🚀 Installation Guide

Preferred Method: npm Distribution

The most robust path for integration is via the Node Package Manager:

```bash
# Install the package globally
npm install -g desktop-visual-analyzer-mcp

# For strict reproducibility, pin the exact release version
npm install -g desktop-visual-analyzer-mcp@2.0.15 # Substitute with current version
```

After installation, configure your AI client as described in the "Configuration Protocol" section.

Configuration Protocol

Once installed via npm, the tool must be registered within your AI client's configuration manifest:

For stdio Transport (Default Mode)

Claude Desktop Client Paths:
- Windows: %APPDATA%/Claude/claude_desktop_config.json
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Cursor Client Paths:
- Windows: %APPDATA%/Cursor/mcp.json or ~/.cursor/mcp.json
- macOS: ~/Library/Application Support/Cursor/mcp.json

Cline Path:
- ~/.config/cline/mcp.json

Windsurf Path:
- ~/.config/windsurf/mcp.json
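
The path list above can be expressed as a small lookup. This is a minimal sketch, not part of the package: the names `configPaths` and `expand` are hypothetical, and because the list gives no per-platform split for Cline and Windsurf, the sketch returns the same path for both platforms.

```typescript
// Hypothetical helper (not shipped with the package): maps a client and
// platform to the configuration paths listed in this README.
type Client = "claude-desktop" | "cursor" | "cline" | "windsurf";
type Platform = "windows" | "macos";

function configPaths(client: Client, platform: Platform): string[] {
  switch (client) {
    case "claude-desktop":
      return platform === "windows"
        ? ["%APPDATA%/Claude/claude_desktop_config.json"]
        : ["~/Library/Application Support/Claude/claude_desktop_config.json"];
    case "cursor":
      return platform === "windows"
        ? ["%APPDATA%/Cursor/mcp.json", "~/.cursor/mcp.json"]
        : ["~/Library/Application Support/Cursor/mcp.json"];
    case "cline":
      // Assumption: the README lists a single path without naming a platform.
      return ["~/.config/cline/mcp.json"];
    case "windsurf":
      // Assumption: same as above.
      return ["~/.config/windsurf/mcp.json"];
  }
}

// Expands a leading "~" or "%APPDATA%" using values supplied by the caller,
// so the function stays free of any runtime-specific environment lookups.
function expand(path: string, home: string, appData: string): string {
  return path.replace(/^~/, home).replace(/^%APPDATA%/, appData);
}
```

Passing the home and application-data directories in explicitly keeps the sketch testable; a real script would read them from the environment.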

Configuration JSON structure for stdio (pinning the version ensures execution stability):

```json
{
  "mcpServers": {
    "desktop-visual-analyzer-mcp": {
      "command": "npx",
      "args": [
        "desktop-visual-analyzer-mcp@2.0.15"
      ],
      "transport": "stdio",
      "env": {
        "ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
      }
    }
  }
}
```

For SSE Transport

Applicable for clients featuring SSE compatibility or when establishing a remote communication pipeline:

```json
{
  "mcpServers": {
    "desktop-visual-analyzer-mcp": {
      "command": "npx",
      "args": [
        "desktop-visual-analyzer-mcp@2.0.15",
        "--sse",
        "--port",
        "8080",
        "--host",
        "localhost"
      ],
      "env": {
        "ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
      }
    }
  }
}
```

To connect to a server instance hosted externally:

```json
{
  "mcpServers": {
    "desktop-visual-analyzer-mcp": {
      "url": "http://your-server-ip:8080/sse",
      "transport": "sse",
      "env": {
        "ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
      }
    }
  }
}
```
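
The three configuration shapes above differ only in a few fields, so they can be generated rather than hand-edited. The following is a rough sketch under that assumption; `localEntry`, `remoteEntry`, and `toConfig` are hypothetical names that do not come from the package, and the output simply mirrors the JSON shown in this section.

```typescript
// Hypothetical generator for the "mcpServers" entries shown in this README.
const PACKAGE = "desktop-visual-analyzer-mcp";

type ServerEntry =
  | { command: string; args: string[]; transport?: "stdio"; env: Record<string, string> }
  | { url: string; transport: "sse"; env: Record<string, string> };

// Entry for a server launched locally through npx. Without `sse` it matches
// the stdio example; with `sse` it appends the --sse/--port/--host flags.
function localEntry(
  version: string,
  apiKey: string,
  sse?: { port: number; host: string }
): ServerEntry {
  const args = [`${PACKAGE}@${version}`];
  const env = { ANTHROPIC_API_KEY: apiKey };
  if (sse) {
    args.push("--sse", "--port", String(sse.port), "--host", sse.host);
    return { command: "npx", args, env };
  }
  return { command: "npx", args, transport: "stdio", env };
}

// Entry for connecting to an externally hosted instance over SSE.
function remoteEntry(host: string, port: number, apiKey: string): ServerEntry {
  return {
    url: `http://${host}:${port}/sse`,
    transport: "sse",
    env: { ANTHROPIC_API_KEY: apiKey },
  };
}

// Wraps an entry in the top-level structure and pretty-prints it as JSON.
function toConfig(entry: ServerEntry): string {
  return JSON.stringify({ mcpServers: { [PACKAGE]: entry } }, null, 2);
}
```

For example, `toConfig(localEntry("2.0.15", "your-anthropic-api-key-here"))` produces text equivalent to the stdio configuration.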

🛠️ Exposed Functionality

analyzeScreenContent

Initiates a capture of the current visual state and feeds it into the Vision model for subsequent interpretation.

Parameters (Schema):

```typescript
{
  prompt?: string;          // Customized instructional query for the AI
  modelName?: string;       // Specification of the Claude model instance to invoke
  saveScreenshot?: boolean; // Local persistence flag for the captured image file
}
```
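
A caller may want to check arguments against this schema before invoking the tool. The sketch below is one way to do that; the interface name `AnalyzeScreenContentArgs` and the function `parseArgs` are assumptions for illustration, not identifiers taken from the package.

```typescript
// Hypothetical validation of analyzeScreenContent arguments against the
// schema documented above. All three fields are optional.
interface AnalyzeScreenContentArgs {
  prompt?: string;
  modelName?: string;
  saveScreenshot?: boolean;
}

function parseArgs(input: unknown): AnalyzeScreenContentArgs {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("arguments must be an object");
  }
  const raw = input as Record<string, unknown>;
  const out: AnalyzeScreenContentArgs = {};

  const prompt = raw["prompt"];
  if (prompt !== undefined) {
    if (typeof prompt !== "string") throw new TypeError("prompt must be a string");
    out.prompt = prompt;
  }

  const modelName = raw["modelName"];
  if (modelName !== undefined) {
    if (typeof modelName !== "string") throw new TypeError("modelName must be a string");
    out.modelName = modelName;
  }

  const saveScreenshot = raw["saveScreenshot"];
  if (saveScreenshot !== undefined) {
    if (typeof saveScreenshot !== "boolean") throw new TypeError("saveScreenshot must be a boolean");
    out.saveScreenshot = saveScreenshot;
  }

  // Keys outside the schema are ignored rather than rejected (a design choice
  // of this sketch, not documented behavior of the tool).
  return out;
}
```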

Demonstration invocation within an AI dialogue context:

Analyze the entirety of my active display and generate a summary focusing on the primary workflow components.

🛑 Diagnostic Procedures

Common Faults

Permission Denied for Capture: Verify that the invoking application possesses requisite operating system permissions for screen recording.
API Credential Failure: Confirm the validity and correct environment variable placement of your Anthropic access token.
Tool Resolution Error: Check global installation status (npm list -g desktop-visual-analyzer-mcp).
Version Skew: Explicitly define the package version in the configuration file to circumvent unexpected dependency caching behaviors.
Communication Mismatch: Ensure the selected transport mechanism aligns with the host client's capabilities. Claude Desktop mandates stdio; SSE is available for compatible external consumers.
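
Two of the faults above (credential placement and version skew) can be caught by inspecting the configuration entry before launching a client. The following is a minimal sketch of such a check; `diagnose` is a hypothetical helper, and the placeholder string it looks for is the one used in this README's examples.

```typescript
// Hypothetical pre-flight check for a locally launched entry.
interface LocalEntry {
  args: string[];
  env?: Record<string, string>;
}

function diagnose(entry: LocalEntry): string[] {
  const issues: string[] = [];

  // API Credential Failure: key absent, empty, or still the README placeholder.
  const key = entry.env?.["ANTHROPIC_API_KEY"] ?? "";
  if (key === "" || key === "your-anthropic-api-key-here") {
    issues.push("ANTHROPIC_API_KEY is missing or still the placeholder");
  }

  // Version Skew: the first argument should look like name@MAJOR.MINOR.PATCH.
  const spec = entry.args[0] ?? "";
  if (!/.@\d+\.\d+\.\d+$/.test(spec)) {
    issues.push("package version is not pinned");
  }

  return issues;
}
```

An empty result means neither of these two problems was detected; it says nothing about screen-recording permissions or transport support, which have to be checked separately.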

Transport Compatibility Matrix

Client Application	Supported Protocols
Claude Desktop	stdio
Cursor	stdio, SSE
Cline	stdio, SSE
Windsurf	stdio, SSE

Connection failures usually point to a configuration mismatch in the transport type.
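
The compatibility matrix can be encoded as a lookup so a setup script can reject an unsupported combination early. This is a sketch only; `SUPPORTED` and `supports` are hypothetical names, and the data is copied from the table above.

```typescript
// Transport support per client, transcribed from the compatibility matrix.
type Transport = "stdio" | "sse";

const SUPPORTED: Record<string, readonly Transport[]> = {
  "Claude Desktop": ["stdio"],
  Cursor: ["stdio", "sse"],
  Cline: ["stdio", "sse"],
  Windsurf: ["stdio", "sse"],
};

// Returns false for clients that are not in the matrix.
function supports(client: string, transport: Transport): boolean {
  return SUPPORTED[client]?.includes(transport) ?? false;
}

console.log(supports("Claude Desktop", "sse")); // false
console.log(supports("Cursor", "sse"));         // true
```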

🧑‍💻 Development Workflow

Source Acquisition & Setup:

```bash
git clone https://github.com/yourusername/screen-view-mcp.git
cd screen-view-mcp
npm install
```

Compilation:

```bash
npm run build
```

Local Execution Testing:

```bash
# Testing via standard I/O (stdio)
node dist/screen-capture-mcp.js --api-key=your-anthropic-api-key

# Testing via network streaming (SSE)
node dist/screen-capture-mcp.js --sse --port 8080 --host localhost --api-key=your-anthropic-api-key
```
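
To make the flag layout in these commands concrete, here is a rough sketch of how such an argument list could be interpreted. It is not the project's actual parser: `parseFlags` and `CliOptions` are hypothetical, and only the four flags that appear in the commands above are handled.

```typescript
// Hypothetical interpretation of the flags used in the test commands.
interface CliOptions {
  sse: boolean;
  port?: number;
  host?: string;
  apiKey?: string;
}

function parseFlags(argv: string[]): CliOptions {
  const opts: CliOptions = { sse: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    if (arg === "--sse") {
      opts.sse = true;
    } else if (arg === "--port") {
      // The value is the next token, as in "--port 8080".
      opts.port = Number(argv[++i]);
    } else if (arg === "--host") {
      opts.host = argv[++i];
    } else if (arg.startsWith("--api-key=")) {
      // The value follows "=", as in "--api-key=...".
      opts.apiKey = arg.slice("--api-key=".length);
    }
  }
  return opts;
}
```

Note the two value styles in the commands: `--port` and `--host` take a separate token, while `--api-key` uses `=`.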

📜 Licensing

MIT

🚀 Smithery Deployment Protocol

This package is optimized for deployment onto the Smithery platform, enabling secure hosting of the MCP server over a WebSocket connection.

Deployment Prerequisites

Repository must contain a functional Dockerfile
Repository must contain a configuration file named smithery.yaml
The Anthropic API Key must be supplied during the configuration phase

Deployment Sequence

Integrate the server definition into the Smithery registry.
Navigate to the Deployment Management interface.
Input your required Anthropic API credentials.
Initiate the deployment process.

Configuration Variables

anthropicApiKey (Mandatory): Your unique Anthropic access credential.
verbose (Optional): Activates extended logging output (Default: false).
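
The two variables above, one mandatory and one with a default, can be normalized as follows. This is a minimal sketch assuming the configuration arrives as a plain object; `SmitheryConfig` and `parseSmitheryConfig` are hypothetical names and do not describe how Smithery or this package actually processes the values.

```typescript
// Hypothetical normalization of the documented Smithery variables.
interface SmitheryConfig {
  anthropicApiKey: string;
  verbose: boolean;
}

function parseSmitheryConfig(raw: {
  anthropicApiKey?: string;
  verbose?: boolean;
}): SmitheryConfig {
  // anthropicApiKey is mandatory; an empty string is treated as missing.
  if (!raw.anthropicApiKey) {
    throw new Error("anthropicApiKey is required");
  }
  // verbose is optional and defaults to false, as documented.
  return { anthropicApiKey: raw.anthropicApiKey, verbose: raw.verbose ?? false };
}
```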

Available Functions

helloWorld: A basic diagnostic function returning an echoed message.
analyzeScreenContent: Captures a display image and invokes Claude Vision analysis.

💡 Operational Examples

Executing Visual Analysis

```javascript
const analysisResult = await mcpClient.invoke("analyzeScreenContent", {
  prompt: "Examine my screen. What critical information is presented, and how is the application focused?",
  modelName: "claude-3-opus-20240229"
});
console.log(analysisResult);
```

Author

hemenge133
