Image and Video Generation MCP Repositories
135 repositories in this category.
MCP-LOGO-GEN
→
Generates logos using AI tools, with features for image creation, background removal, and automatic scaling to produce high-quality outputs in various sizes.
ComfyUI
→
A visual graph-based interface for designing and executing advanced Stable Diffusion pipelines, enabling users to create complex workflows without coding. It features smart memory management and asynchronous processing, runs on either GPU or CPU, and works offline.
omniparser-autogui-mcp
→
Uses OmniParser to analyze the screen and automatically operate graphical user interfaces. It interprets visual content and executes GUI actions based on that analysis.
CnOCR-TextExtractor
→
A Python toolkit for Optical Character Recognition (OCR) of Chinese scripts, the Latin alphabet, and numerical sequences. It can use pre-trained recognition models or user-trained custom models, providing text extraction for computer vision pipelines.
flux-schnell-server
→
Provides an MCP-based API for generating images from text prompts, with customizable dimensions and reproducible results via a specified random seed. Supports asynchronous streaming responses and integration with Hugging Face model services.
mcp-mavae
→
A Model Context Protocol (MCP) server for interacting with image media tools, providing capabilities for image generation, editing, and management of collections and models.
mcp_media_generator
→
Create images using the Amazon Nova Canvas model and videos using the Amazon Nova Reel model. Connects to existing tools for media generation and storage.
mcp-image-recognition
→
Leverages image recognition capabilities to analyze and describe images using advanced vision APIs. Supports multiple formats and allows for optional text extraction from images.
cos-mcp
→
Integrate large language models with Tencent Cloud Object Storage (COS) and Cloud Infinite (CI), enabling file management, automated cloud data handling, and various image and video processing tasks. Supports natural language-based metadata search and efficient backup workflows.
ImageOnC
→
Implement vehicle license plate recognition using C/C++ on FPGA, utilizing OpenCV for image display and Eigen for optimized matrix operations. The project includes code for training neural networks and processing license plate images.
mcp-veo2
→
Generates high-quality videos from text prompts or images using Google's Veo 2 model and provides access to the generated videos through MCP resources.
gpt-image-1-mcp
→
Enables AI assistants to generate images from text prompts and edit existing images using specified masks. Integrates with various MCP clients and provides flexible workflows for image handling, including automatic file saving and comprehensive error reporting.
tinypng-mcp-server
→
Compress images efficiently using the TinyPNG API. Supports both local and remote image compression with minimal setup required.
game-asset-mcp
→
Generates 2D and 3D game assets from text prompts using AI models. Integrates with Hugging Face Spaces for asset generation, facilitating rapid prototyping for game developers.
image-generator-mcp-server
→
Generates images based on prompts using OpenAI's DALL-E model, saving them in a specified directory on the user's desktop.
imagen3-mcp
→
Generate high-quality images using Google's Imagen 3.0 model through an MCP interface, facilitating integration with tools like Cherry Studio or Cursor. Supports configurable deployment options using a Google Gemini API key.
mcp-templateio
→
Generates customized visuals by creating images based on templates using the Templated.io API. Supports dynamic graphics creation through user-provided text and image URLs.
mcp-server-gemini-image-generator
→
Generate high-quality images from text prompts using the Gemini AI model, manage local image storage, and facilitate creative modifications of existing images.
DiffuGen
→
Generate AI images directly within development environments using local Stable Diffusion models, with precise control over parameters. Integrates with MCP-compatible IDEs so that image generation does not disrupt development work.
tupianyasuo
→
A front-end image compression tool supporting formats such as PNG and JPG. Users can customize compression ratios, preview results in real time, compare file sizes before and after compression, and download the optimized images.
mcp-server-amazon-bedrock
→
Integrates with Amazon Bedrock's Nova Canvas model to generate high-quality images based on text descriptions. Provides advanced features for refining image composition through negative prompts and allows control over image dimensions and quality.
mcp-imagegen
→
Generate images from text prompts using advanced AI models. Supports both local and SSE endpoint configurations with specific provider requirements.
pixabay-mcp
→
Connect to the Pixabay API to search for images and retrieve formatted results that include image URLs and metadata. Handles errors from API interactions gracefully.
StyleCLIP
→
A CLIP-based fashion recommendation system that enables users to upload clothing images and receive similar clothing tag recommendations through an interactive web interface. It utilizes YOLO for clothing detection and integrates seamlessly with an MCP framework.
aws-nova-canvas-mcp
→
Generate and edit images with advanced features such as text-to-image generation, image inpainting, and background removal, using the Nova Canvas model from Amazon Bedrock.
mcp-image-downloader
→
Provides tools for downloading images from URLs and performing basic image optimization tasks such as resizing, quality adjustment, and format conversion.
jina-ai-mcp-multimodal-search
→
Seamless integration with Jina AI's neural search capabilities enables semantic, image, and cross-modal searches through a simple interface. Perform searches based on natural language queries, visual similarities, and text-to-image or image-to-text conversions.
GarbageSorting
→
Identify and classify waste using image and voice recognition techniques to streamline the recycling process and enhance environmental awareness.
MCPollinations
→
Generates images, text, and audio from prompts using the Pollinations APIs. It supports returning images as base64-encoded data and allows listing available models for image and text generation.
mcp-flux-schnell
→
Generate images from text descriptions using the Flux Schnell model through an MCP interface. This server connects with Cloudflare's Flux Schnell worker API to deliver image generation capabilities.
