oai-visual-synthesis-mcp-adapter

A Model Context Protocol (MCP) server for creating and editing images with OpenAI's GPT-4o and gpt-image-1 models. Images are controlled through text prompts and are either saved to the local filesystem or returned as Base64 payloads for MCP-compliant clients.

Author

SureScaleAI

MIT License

Quick Info

GitHub Stars: 74
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

openai, gpt, mcp, openai gpt, gpt image, image mcp

oai-visual-synthesis-mcp-adapter


This component implements the Model Context Protocol (MCP) interface to proxy interactions with OpenAI's latest visual processing endpoints, including GPT-4o and gpt-image-1 capabilities for generation and transformation.

  • Synthesize Imagery: Produce novel visual content from descriptive textual inputs leveraging state-of-the-art OpenAI architectures.
  • Modify Existing Visuals: Execute complex image manipulations (inpainting, expansion, merging) governed by precise narrative specifications.
  • Compatibility: Fully operational with primary MCP clients such as Claude Desktop, Cursor IDE integration, VSCode extensions, Windsurf, and any system adhering to the MCP standard.

✨ Core Capabilities

  • create-image: Initiate image synthesis based on a command prompt, supporting configurable parameters (resolution, fidelity, atmospheric context, etc.).
  • edit-image: Apply targeted alterations or extensions to supplied images using a prompt and an optional positional mask; input sources accept filesystem references or embedded Base64 data.
  • Output Handling: Images are either persisted to the local filesystem or relayed back to the caller as a Base64 string.
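
MCP clients invoke tools through JSON-RPC `tools/call` requests, so the two tools above can be pictured as the payloads below. This is a minimal sketch, not the server's documented schema: only `n` and `file_output` are named in this README, while `prompt`, `size`, `quality`, `image`, and `mask` are assumed argument names, and `buildToolCall` is a hypothetical helper. The authoritative parameter list lives in src/index.ts.

```typescript
// Shape of an MCP `tools/call` request sent over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in a request.
function buildToolCall(
  id: number,
  name: "create-image" | "edit-image",
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumed argument names (`prompt`, `size`, `quality`); `n` is from this README.
const createRequest = buildToolCall(1, "create-image", {
  prompt: "A watercolor lighthouse at dusk",
  size: "1024x1024",
  quality: "high",
  n: 1,
});

// Assumed argument names (`image`, `mask`); both may be a path or Base64 data.
const editRequest = buildToolCall(2, "edit-image", {
  image: "/absolute/path/to/input.png",
  mask: "/absolute/path/to/mask.png", // optional
  prompt: "Replace the sky with a starry night",
});
```

In normal use the MCP client (Claude Desktop, Cursor, and so on) builds these requests itself; the sketch only illustrates what travels over the wire.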

🚀 Deployment Guide

    git clone https://github.com/SureScaleAI/openai-gpt-image-mcp.git
    cd openai-gpt-image-mcp
    yarn install
    yarn build


🔑 Configuration Directives

Configure this service within your target MCP environment (e.g., Claude Desktop or VSCode/Cursor/Windsurf settings):

    {
      "mcpServers": {
        "oai-visual-synthesis-mcp-adapter": {
          "command": "node",
          "args": ["/absolute/path/to/dist/index.js"],
          "env": { "OPENAI_API_KEY": "sk-your-key-here" }
        }
      }
    }

Alternatively, for Azure OpenAI deployments:

    {
      "mcpServers": {
        "oai-visual-synthesis-mcp-adapter": {
          "command": "node",
          "args": ["/absolute/path/to/dist/index.js"],
          "env": {
            "AZURE_OPENAI_API_KEY": "sk-azure-key",
            "AZURE_OPENAI_ENDPOINT": "my.azure.service.net",
            "OPENAI_API_VERSION": "2024-12-01-preview"
          }
        }
      }
    }

Environment variables can also be sourced from an external file:

    {
      "mcpServers": {
        "oai-visual-synthesis-mcp-adapter": {
          "command": "node",
          "args": ["/absolute/path/to/dist/index.js", "--env-file", "./deployment/.env"]
        }
      }
    }
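
To make the `--env-file` option concrete, here is a minimal sketch of how such a flag could be located in the argument list and how a simple `.env` file could be parsed. It is an illustration under stated assumptions, not the server's actual loader: `findEnvFileArg` and `parseEnvFile` are hypothetical names, and only plain `KEY=VALUE` lines, `#` comments, and optional surrounding quotes are handled.

```typescript
// Return the path that follows `--env-file` in an argument list, if any.
function findEnvFileArg(argv: string[]): string | undefined {
  const i = argv.indexOf("--env-file");
  return i >= 0 && i + 1 < argv.length ? argv[i + 1] : undefined;
}

// Parse the text of a .env file into key/value pairs (simplified rules).
function parseEnvFile(contents: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // blank line or comment
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", so skip the line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}

// Example: the same key used in the configuration above.
const exampleEnv = parseEnvFile('# secrets\nOPENAI_API_KEY="sk-your-key-here"\n');
```

Keeping the key in a separate file means the client configuration itself contains no secrets and can be shared or committed more safely.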


⚡ Advanced Parameters

  • For create-image operations, the n parameter permits batch generation up to ten distinct outputs.
  • When utilizing edit-image, supplying a mask image (via path reference or embedded data) precisely defines the modification locus.
  • Pass the --env-file path/to/file/.env argument to load secrets such as API keys from a file instead of embedding them in the client configuration.
  • Comprehensive parameter listings are detailed within the src/index.ts source file.
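
The batch limit on `n` can be checked before a request is sent. The sketch below assumes a lower bound of 1 and a default of 1, neither of which is stated in this README; `validateImageCount` is a hypothetical helper, not part of the server.

```typescript
// Upper bound stated above: at most ten images per create-image call.
const MAX_IMAGES_PER_REQUEST = 10;

// Return `n` unchanged if it is a valid batch size, otherwise throw.
// The default of 1 and the minimum of 1 are assumptions.
function validateImageCount(n: number = 1): number {
  if (!Number.isInteger(n) || n < 1 || n > MAX_IMAGES_PER_REQUEST) {
    throw new RangeError(
      `n must be an integer between 1 and ${MAX_IMAGES_PER_REQUEST}, got ${n}`,
    );
  }
  return n;
}
```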

🧑‍💻 Development Workflow

  • Source Code Location: src/index.ts
  • Compilation Process: yarn build
  • Execution Command: node dist/index.js

📝 Licensing

MIT License


🩺 Debugging Guidance

  • Verify the validity of your OPENAI_API_KEY and confirm it has requisite image generation permissions.
  • Access to the image APIs often requires a validated OpenAI organization; allow 15–20 minutes post-verification for enablement.
  • Filesystem paths must be absolute:
      • Unix/macOS/Linux: must start with / (e.g., /system/data/asset.jpg)
      • Windows: must start with a drive letter followed by a colon (e.g., D:/assets/pic.png or D:\assets\pic.png)
  • Ensure write permissions exist for the intended file saving destination.
  • Image format errors usually indicate a mismatch between the provided file extension and the actual MIME type.
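
The absolute-path rule above can be checked before a path is handed to the tool. This is a minimal sketch of that rule only; `isAbsolutePath` is a hypothetical helper and may not match the server's own validation.

```typescript
// True if `p` is absolute in the sense described above: it starts with "/"
// (Unix/macOS/Linux) or with a drive letter, a colon, and a slash or
// backslash (Windows).
function isAbsolutePath(p: string): boolean {
  const unixAbsolute = p.startsWith("/");
  const windowsAbsolute = /^[A-Za-z]:[\\/]/.test(p);
  return unixAbsolute || windowsAbsolute;
}
```

Node's built-in `path.isAbsolute` applies only the rules of the platform it runs on, which is why this sketch tests both forms explicitly.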

⚠️ Constraints & Large Data Strategy

  • Response Size Ceiling: MCP interfaces, notably Claude Desktop, enforce a strict 1MB constraint on tool response payloads. High-resolution or multiple generated visuals, when returned as Base64, frequently breach this ceiling.
  • Automatic Fallback: Should the collective size of generated assets exceed 1MB, the tool transparently shifts output mode to filesystem saving, returning file path(s) instead of encoded binary data. This mitigates errors like result exceeds maximum length of 1048576.
  • Default Storage: If no explicit file_output location is defined, assets are written to /tmp (or the directory specified by the MCP_HF_WORK_DIR environment variable) using uniquely generated names.
  • Environment Variable for Control:
      • MCP_HF_WORK_DIR: Designate a custom path for saving large outputs. Example: export MCP_HF_WORK_DIR=/secure/storage/location
  • Recommended Practice: For high-volume or mission-critical visual assets, utilize file output exclusively and confirm consuming clients are configured to resolve path references.
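
The fallback described above amounts to a size check followed by a choice of output directory. The sketch below illustrates that decision and is not the server's actual code: `planOutput` and `OutputPlan` are hypothetical names, and it assumes the limit is compared against the combined length of the Base64 strings. The environment is passed in as a plain object so the function stays free of side effects.

```typescript
// 1MB ceiling, matching the 1048576 quoted in the error message above.
const MAX_RESPONSE_BYTES = 1048576;

type OutputPlan =
  | { mode: "base64"; images: string[] }
  | { mode: "file_output"; directory: string };

// Decide whether Base64 images fit in a tool response or must be saved to disk.
function planOutput(
  base64Images: string[],
  env: Record<string, string | undefined>,
): OutputPlan {
  // Base64 text is ASCII, so string length equals byte length.
  const totalBytes = base64Images.reduce((sum, img) => sum + img.length, 0);
  if (totalBytes <= MAX_RESPONSE_BYTES) {
    return { mode: "base64", images: base64Images };
  }
  // Too large: fall back to files in MCP_HF_WORK_DIR, or /tmp if unset.
  return { mode: "file_output", directory: env["MCP_HF_WORK_DIR"] ?? "/tmp" };
}
```

A caller could pass `process.env` as the second argument when running under Node.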

📚 Supplementary Information


🙏 Acknowledgment

