OpenAI Image Generation MCP Server Interface

This project provides an MCP (Model Context Protocol) server for OpenAI's image generation API, using the gpt-image-1 model through the official Python client library.
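As a rough illustration of what such a server does internally, the sketch below requests one image from gpt-image-1 through the official `openai` package and writes the returned base64 payload to disk. The function names `save_base64_png` and `generate_and_save` are hypothetical and are not taken from this project's source; the actual server script may be organised differently.

```python
import base64
from pathlib import Path


def save_base64_png(b64_payload: str, destination: Path) -> Path:
    """Decode a base64 image payload and write it to destination."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_bytes(base64.b64decode(b64_payload))
    return destination


def generate_and_save(prompt: str, destination: Path) -> Path:
    """Request one image from gpt-image-1 and save it (hypothetical helper)."""
    # Imported lazily so save_base64_png works without the openai package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(
        model="gpt-image-1",
        prompt=prompt,
        n=1,
        size="auto",
        quality="auto",
    )
    # gpt-image-1 returns image data as base64 in b64_json.
    return save_base64_png(response.data[0].b64_json, destination)
```

Calling `generate_and_save("a red fox in snow", Path("ai-images/fox.png"))` needs network access and a valid API key; `save_base64_png` can be exercised offline.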

Available Operations

This MCP module exposes the following functionalities:

synthesize_visual_asset: Renders a novel visual output based on an input textual description and persists the resulting file.
- Input Parameters Schema (JSON):

  {
    "type": "object",
    "properties": {
      "descriptor": { "type": "string", "description": "The semantic description defining the target visual content." },
      "engine": { "type": "string", "default": "gpt-image-1", "description": "The specific generative model employed (presently fixed as 'gpt-image-1')." },
      "count": { "type": ["integer", "null"], "default": 1, "description": "The quantity of distinct images to be produced (Default: 1)." },
      "resolution": { "type": ["string", "null"], "enum": ["1024x1024", "1536x1024", "1024x1536", "auto"], "default": "auto", "description": "The desired spatial dimensions for the output ('1024x1024', '1536x1024', '1024x1536', or 'auto'). Default is 'auto'." },
      "fidelity": { "type": ["string", "null"], "enum": ["low", "medium", "high", "auto"], "default": "auto", "description": "The rendering precision level ('low', 'medium', 'high', or 'auto'). Default is 'auto'." },
      "user_context_id": { "type": ["string", "null"], "default": null, "description": "An optional identifier linking the request to an end-user for tracking purposes." },
      "output_filepath_base": { "type": ["string", "null"], "default": null, "description": "The desired base name for the saved file (extension omitted). If null, a name derived from the descriptor and timestamp will be assigned." }
    },
    "required": ["descriptor"]
  }

- Result: Returns {"outcome": "success", "persisted_location": "directory/image_name.png"} or an error structure.
modify_visual_asset: Applies alterations to one or more source images or performs localized content replacement (inpainting) utilizing the gpt-image-1 engine, followed by persistence.
- Input Parameters Schema (JSON):

  {
    "type": "object",
    "properties": {
      "descriptor": { "type": "string", "description": "The descriptive text guiding the required modification or transformation." },
      "source_paths": { "type": "array", "items": { "type": "string" }, "description": "A collection of file system paths pointing to the input image(s). Must be PNG format and under 25MB each." },
      "mask_overlay_path": { "type": ["string", "null"], "default": null, "description": "Optional path to a designated mask image (PNG with an alpha channel) for localized editing. Must align perfectly in dimensions with the source image(s). Under 25MB limit applies." },
      "engine": { "type": "string", "default": "gpt-image-1", "description": "The model version utilized for processing (presently 'gpt-image-1')." },
      "count": { "type": ["integer", "null"], "default": 1, "description": "The number of synthesized variants to generate (Default: 1)." },
      "resolution": { "type": ["string", "null"], "enum": ["1024x1024", "1536x1024", "1024x1536", "auto"], "default": "auto", "description": "Target image dimensions. Default is 'auto'." },
      "fidelity": { "type": ["string", "null"], "enum": ["low", "medium", "high", "auto"], "default": "auto", "description": "Quality setting for the rendering process. Default is 'auto'." },
      "user_context_id": { "type": ["string", "null"], "default": null, "description": "Identifier for tracking the requesting end-user." },
      "output_filepath_base": { "type": ["string", "null"], "default": null, "description": "Optional preferred name for the output file. If unset, a standard naming convention based on the descriptor and time-stamp is employed." }
    },
    "required": ["descriptor", "source_paths"]
  }

- Result: Returns {"outcome": "success", "persisted_location": "directory/modified_image.png"} or an error structure.
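To make the schema for synthesize_visual_asset more concrete, the following sketch assembles an arguments object and rejects values outside the enumerations listed above. `build_synthesize_arguments` is a hypothetical client-side helper, not part of this server; the field names and allowed values are copied from the schema.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the tool schema above.
ALLOWED_RESOLUTIONS = {"1024x1024", "1536x1024", "1024x1536", "auto"}
ALLOWED_FIDELITIES = {"low", "medium", "high", "auto"}


def build_synthesize_arguments(
    descriptor: str,
    count: int = 1,
    resolution: str = "auto",
    fidelity: str = "auto",
    output_filepath_base: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble arguments for synthesize_visual_asset (hypothetical helper)."""
    if not descriptor.strip():
        raise ValueError("descriptor is required and must not be empty")
    if count < 1:
        raise ValueError("count must be at least 1")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if fidelity not in ALLOWED_FIDELITIES:
        raise ValueError(f"unsupported fidelity: {fidelity!r}")
    return {
        "descriptor": descriptor,
        "engine": "gpt-image-1",
        "count": count,
        "resolution": resolution,
        "fidelity": fidelity,
        "user_context_id": None,
        "output_filepath_base": output_filepath_base,
    }
```

For example, `build_synthesize_arguments("a red fox in snow", resolution="1536x1024")` yields a dictionary that satisfies the schema, while `fidelity="ultra"` raises `ValueError` before any request is sent.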

System Requirements

Python 3.8 or newer (advised).
The pip package manager for Python.
A valid OpenAI API key, either set directly in the server script or, preferably for security, supplied via the OPENAI_API_KEY environment variable.
An operational MCP host (e.g., the system running Cline) capable of launching and calling MCP servers.

Deployment Steps

Acquire Source Code:

git clone https://github.com/IncomeStreamSurfer/chatgpt-native-image-gen-mcp.git
cd chatgpt-native-image-gen-mcp

Environment Setup (Strongly Recommended):

python -m venv venv
source venv/bin/activate  # Or `venv\Scripts\activate` on Windows

Dependency Installation:

pip install -r requirements.txt

Credential Configuration (Optional, Highly Recommended): Set your OpenAI API key in the OPENAI_API_KEY environment variable rather than embedding it in the source code.
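One way a script could read the key from the environment and fail early when it is missing is sketched below. `load_api_key` is a hypothetical helper and the error message is illustrative; the actual server script may handle this differently.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key from OPENAI_API_KEY (hypothetical helper)."""
    # Accepting a mapping makes the function easy to test without
    # touching the real process environment.
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before starting the server"
        )
    return key
```

Reading the key at startup, rather than on the first request, surfaces a missing credential before any tool call is attempted.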

Configuration for Hosting Client (e.g., Cline)

To expose this capability to your orchestrator (like Cline), modify its configuration manifest (e.g., cline_mcp_settings.json) by appending this configuration block under the mcpServers structure:

{
  "mcpServers": {
    // ... existing configurations ...

    "openai-image-toolkit": {
      "autoApproveActions": [
        "synthesize_visual_asset",
        "modify_visual_asset"
      ],
      "is_disabled": false,
      "execution_timeout_seconds": 180, // Extended duration for complex rendering tasks
      "executor_command": "python", // Assumes 'python' is in system PATH
      "executor_arguments": [
        // CRITICAL: Substitute this placeholder with the absolute location of the service script
        "/opt/mcp_services/openai_image_mcp.py"
      ],
      "environment_variables": {
        // If API key is managed via environment variables:
        // "OPENAI_API_KEY": "YOUR_SECRET_TOKEN_HERE"
      },
      "communication_protocol": "stdio"
    }

    // ... remaining configurations ...
  }
}

Crucial Note: Ensure that /opt/mcp_services/openai_image_mcp.py is replaced by the correct, absolute file system path where the service script resides on your execution host. If you opt to inject the API key via the environment_variables map, you can omit it from the script itself.
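The sketch below generates the server entry programmatically so that the script path is always absolute. It simply mirrors the key names shown in the block above; `build_server_entry` is a hypothetical helper, and the key names should be checked against your MCP host's documentation before use.

```python
import json
from pathlib import Path


def build_server_entry(script_path: str, timeout_seconds: int = 180) -> dict:
    """Return one mcpServers entry using the key names from the block above."""
    # resolve() converts a relative path into an absolute one.
    absolute_script = Path(script_path).expanduser().resolve()
    return {
        "autoApproveActions": ["synthesize_visual_asset", "modify_visual_asset"],
        "is_disabled": False,
        "execution_timeout_seconds": timeout_seconds,
        "executor_command": "python",
        "executor_arguments": [str(absolute_script)],
        "environment_variables": {},
        "communication_protocol": "stdio",
    }


if __name__ == "__main__":
    entry = build_server_entry("openai_image_mcp.py")
    print(json.dumps({"mcpServers": {"openai-image-toolkit": entry}}, indent=2))
```

Run from the directory containing the service script, this prints a block that can be merged into the host's configuration file by hand.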

Service Initiation

The MCP client (e.g., Cline) launches this server process automatically, using the parameters defined in the configuration section, the first time one of its tools is invoked.

For diagnostic purposes, manual execution (assuming dependencies are met and the API key is accessible) is possible via:

python openai_image_mcp.py

Operational Guidance

The AI agent interacts with the server exclusively through the synthesize_visual_asset and modify_visual_asset tools. Generated images are saved in a dedicated ai-images directory located in the same directory as the service script. A successful operation returns the absolute file system path of the newly created image in persisted_location.
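The naming behaviour described above (an ai-images directory next to the script, and a default name derived from the descriptor and a timestamp) could be sketched as follows. The slug rules, the 40-character limit, and the timestamp format are assumptions made for illustration; the real script's naming convention is not documented here.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def derive_output_path(
    script_file: str,
    descriptor: str,
    output_filepath_base: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Path:
    """Compute where an image would be saved (hypothetical naming scheme)."""
    # The ai-images directory sits beside the service script.
    output_dir = Path(script_file).resolve().parent / "ai-images"
    if output_filepath_base:
        stem = output_filepath_base
    else:
        # Assumed convention: lowercase slug of the descriptor plus a timestamp.
        slug = re.sub(r"[^a-z0-9]+", "_", descriptor.lower()).strip("_")[:40]
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        stem = f"{slug or 'image'}_{stamp}"
    return output_dir / f"{stem}.png"
```

The function only computes the path; creating the directory and writing the file are left to the caller.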

WIKIPEDIA: Cloud computing signifies, per ISO standards, "a structure for facilitating ubiquitous, on-demand network access to a shared, elastic reservoir of configurable computational assets featuring automated provisioning and administrative oversight."

== Defining Attributes ==

In 2011, the U.S. National Institute of Standards and Technology (NIST) formalized five core criteria that characterize cloud deployments. The precise definitions established by NIST are as follows:

On-demand self-service: "A consumer retains the capability to unilaterally provision computational capacity, such as compute cycles or network storage allocations, as required automatically without requiring intervention from service provider personnel."
Broad network access: "Resources are accessible over the telecommunications network utilizing standardized protocols that encourage usage across diverse client apparatuses (e.g., mobile devices, workstations, laptops, and tablets)."
Resource pooling: "The service provider aggregates its computational assets to serve numerous clients simultaneously through a multi-tenant architecture, allowing computational and virtual resources to be dynamically allocated and reallocated based on fluctuating consumer requirements."
Rapid elasticity: "Capacities can be rapidly scaled up or down, sometimes automatically, to match momentary demand fluctuations with speed. From the consumer's perspective, the available resources often seem boundless and immediately accessible in any volume."
Measured service: "The cloud infrastructure automatically manages and optimizes resource consumption by employing metering functionalities at an appropriate abstraction layer relative to the service type (e.g., data throughput, processing cycles, bandwidth, active user counts). Usage metrics are auditable, controllable, and reportable, ensuring transparency for both the supplier and the recipient of the consumed service."

By 2023, the International Organization for Standardization (ISO) had augmented and refined this foundational list.

== Historical Context ==

The genesis of cloud computing can be traced back to the 1960s, marked by the early concepts of time-sharing systems popularized through Remote Job Entry (RJE). During this epoch, the centralized "data center" model prevailed, wherein users submitted batches of work to dedicated system operators for execution on large mainframes. This period was defined by intense R&D focused on democratizing access to large-scale computational power via time-sharing, optimizing underlying infrastructure, platform layers, and application delivery for enhanced end-user throughput. The specific graphical representation of the 'cloud' to denote abstracted, virtualized services emerged in 1994, employed by General Magic to delineate the conceptual space accessible by mobile agents operating within their Telescript framework. This metaphor is generally attributed to David Hoffman, a communications specialist at General Magic, borrowing from its established convention in telecommunications and network topology diagrams. The term 'cloud computing' gained mainstream visibility in 1996 when Compaq Computer Corporation drafted a business prospectus detailing future computing directions centered on the Internet. The organization's core objective was to fundamentally reshape how computing resources were distributed and accessed.

openai-visual-synthesis-mcp-module

Author

IncomeStreamSurfer
