
mcp-llm-service-deployer

Creates custom LLM server endpoints configured through YAML manifests, providing controlled access to resources and structured prompts. It extends the model's capabilities through calls to external tools while adhering to the Model Context Protocol (MCP).

Author


phil65

MIT License

Quick Info

GitHub Stars: 5
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

llmling, llm, servers, llm servers, server llmling, custom llm

mcp-server-llmling


Read the documentation!

LLMling Server Manual

Overview

mcp-server-llmling is a server for the Model Context Protocol (MCP) that provides a YAML-based configuration system for LLM applications.

LLMLing, the underlying engine, provides a YAML-based configuration system for LLM workflows. It lets you set up custom MCP servers that serve content defined in YAML files.

  • Declarative Setup: Define the LLM's environment in YAML; no code is required for basic definitions.
  • MCP Compliance: Built on the Model Context Protocol (MCP) for standardized interaction with LLM clients.
  • Component Taxonomy:
    • Artifacts (Resources): Sources of consumable data (e.g., documents, literal text blocks, external command outputs).
    • Templates (Prompts): Predefined message structures with variable placeholders.
    • Capabilities (Tools): Python functions that the language model can call.

The YAML configuration defines a complete environment that gives the LLM:

  • Access to the defined data artifacts.
  • Standardized interaction through prompt templates.
  • Extended functionality through callable utilities.

Key Features

1. Artifact Management

  • Loading and management of several artifact types:
    • Local file system entries (PathResource)
    • Embedded textual content (TextResource)
    • Shell command output (CLIResource)
    • Source code files (SourceResource)
    • Results of Python calls (CallableResource)
    • Images (ImageResource)
  • Watching artifacts for changes, with hot-reloading.
  • Support for artifact preprocessing pipelines.
  • URI-based access to artifacts.

2. Utility System

  • Registration and invocation of arbitrary Python functions as LLM-callable utilities (tools).
  • Support for utilities defined by OpenAPI specifications.
  • Automatic discovery of utilities via entry points.
  • Validation of utility signatures and input arguments.
  • Support for structured (machine-readable) utility results.
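
The validation of signatures and arguments can be pictured with the standard library alone. The following is a minimal sketch, not LLMLing's actual implementation; `validate_call` and `word_count` are hypothetical names introduced for illustration, and only plain class annotations such as `str` or `int` are checked.

```python
import inspect
from typing import Any, Callable


def validate_call(func: Callable[..., Any], arguments: dict[str, Any]) -> list[str]:
    """Return the problems found when calling func with these keyword arguments."""
    signature = inspect.signature(func)
    try:
        # bind() raises TypeError for missing or unexpected arguments.
        bound = signature.bind(**arguments)
    except TypeError as exc:
        return [str(exc)]

    problems = []
    for name, value in bound.arguments.items():
        parameter = signature.parameters[name]
        annotation = parameter.annotation
        if parameter.kind in (parameter.VAR_POSITIONAL, parameter.VAR_KEYWORD):
            continue  # *args / **kwargs are not checked in this sketch
        if annotation is inspect.Parameter.empty:
            continue  # no annotation, nothing to check
        # Only plain classes are checked; generic annotations are skipped.
        if isinstance(annotation, type) and not isinstance(value, annotation):
            problems.append(
                f"{name}: expected {annotation.__name__}, got {type(value).__name__}"
            )
    return problems


def word_count(text: str, min_length: int = 1) -> int:
    """Hypothetical utility: count words with at least min_length characters."""
    return sum(1 for word in text.split() if len(word) >= min_length)
```

A caller would run `validate_call(word_count, arguments)` before invoking the function and reject the request if the returned list is non-empty.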

3. Phrasing System

  • Static prompt definitions featuring templating capabilities.
  • Dynamic prompt generation sourced from Python routines.
  • Prompt definitions persisted in external files.
  • Validation logic applied to prompt substitution variables.
  • Suggestion engine for populating prompt argument fields.
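
To illustrate what validating prompt variables involves, here is a minimal sketch that uses Python's `str.format` placeholders. It is an assumption made for illustration: LLMLing's own template syntax and validation rules may differ, and `template_variables` and `render_prompt` are hypothetical helpers.

```python
from string import Formatter


def template_variables(template: str) -> set[str]:
    """Collect the placeholder names used in a str.format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def render_prompt(template: str, arguments: dict[str, str]) -> str:
    """Render the template, rejecting missing or unexpected arguments."""
    expected = template_variables(template)
    provided = set(arguments)
    missing = expected - provided
    unexpected = provided - expected
    if missing or unexpected:
        raise ValueError(
            f"missing={sorted(missing)}, unexpected={sorted(unexpected)}"
        )
    return template.format(**arguments)
```

Checking the argument names before rendering gives the client a clear error message instead of a bare `KeyError` from `str.format`.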

4. Multiple Communication Pathways

  • Standard input/output stream communication (the default mechanism).
  • Server-Sent Events (SSE) or streamable HTTP for web-based clients.
  • Extensibility hooks for integrating novel transport layer implementations.
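
For the default stdio pathway, the MCP specification describes JSON-RPC messages exchanged as single lines over standard input and output. The sketch below shows that framing in isolation, under that assumption; `write_message` and `read_messages` are hypothetical helpers and not part of mcp-server-llmling.

```python
import io
import json
from typing import Any, Iterator, TextIO


def write_message(stream: TextIO, message: dict[str, Any]) -> None:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()


def read_messages(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield one decoded message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Round trip through an in-memory stream standing in for stdin/stdout.
buffer = io.StringIO()
write_message(buffer, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
buffer.seek(0)
received = list(read_messages(buffer))
```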

Operation Instructions

Integration with Zed Editor

Add LLMLing as a context server in your settings.json:

{
  "context_servers": {
    "llmling": {
      "command": {
        "env": {},
        "label": "llmling",
        "path": "uvx",
        "args": [
          "mcp-server-llmling",
          "start",
          "path/to/your/config.yml"
        ]
      },
      "settings": {}
    }
  }
}

Configuration for Claude Desktop

Add the LLMLing server to your claude_desktop_config.json:

{
  "mcpServers": {
    "llmling": {
      "command": "uvx",
      "args": [
        "mcp-server-llmling",
        "start",
        "path/to/your/config.yml"
      ],
      "env": {}
    }
  }
}
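
If the file already lists other servers, the entry can be merged in with a script instead of by hand. This is a small sketch under stated assumptions: `add_llmling_entry` is a hypothetical helper, and the location of claude_desktop_config.json depends on your platform, so reading and writing the file is left to the caller.

```python
from typing import Any


def add_llmling_entry(settings: dict[str, Any], config_path: str) -> dict[str, Any]:
    """Return a copy of the settings with an "llmling" server entry added."""
    # Copy the existing server table so the input dictionary is not modified.
    servers = dict(settings.get("mcpServers", {}))
    servers["llmling"] = {
        "command": "uvx",
        "args": ["mcp-server-llmling", "start", config_path],
        "env": {},
    }
    return {**settings, "mcpServers": servers}


existing = {"mcpServers": {"other": {"command": "npx"}}}
updated = add_llmling_entry(existing, "path/to/your/config.yml")
```

Load the current file with `json.load`, pass the result through the helper, and write it back with `json.dump(..., indent=2)`.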

Direct Server Initiation

Start the server directly from the command line:

# Run the latest version
uvx mcp-server-llmling@latest

1. Programmatic Invocation

import asyncio

from llmling import RuntimeConfig
from mcp_server_llmling import LLMLingServer

# Placeholder path to the YAML configuration file
config = "path/to/your/config.yml"

async def main() -> None:
    # Open the runtime described by the config and serve it until shutdown
    async with RuntimeConfig.open(config) as runtime:
        service_instance = LLMLingServer(runtime, enable_injection=True)
        await service_instance.start()

asyncio.run(main())

2. Utilizing a Custom Transport Mechanism

import asyncio

from llmling import RuntimeConfig
from mcp_server_llmling import LLMLingServer

# Placeholder path to the YAML configuration file
config = "path/to/your/config.yml"

async def main() -> None:
    async with RuntimeConfig.open(config) as runtime:
        # Pass the opened runtime (not the config path), as in the example above
        service_instance = LLMLingServer(
            runtime,
            transport="sse",
            transport_options={
                "host": "localhost",
                "port": 3001,
                "cors_origins": ["http://localhost:3000"]
            }
        )
        await service_instance.start()

asyncio.run(main())

3. Artifact Definition in Configuration

resources:
  python_code:
    type: path
    path: "./src/**/*.py"
    watch: 
      enabled: true
      patterns: 
        - "*.py"
        - "!**/__pycache__/**"

  api_docs:
    type: text
    content: | 
      API Documentation 
      ================
      ...
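
The watch patterns above combine an include pattern with a "!"-prefixed exclusion. One way to read such a list is sketched below with the standard `fnmatch` module. This is an assumption for illustration only: LLMLing's watcher may apply different (for example gitignore-style) matching rules, and `is_watched` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_watched(path: str, patterns: list[str]) -> bool:
    """Apply include patterns and "!"-prefixed exclude patterns to a POSIX-style path."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    # An exclusion wins over any include pattern.
    if any(fnmatchcase(path, pattern) for pattern in excludes):
        return False
    return any(fnmatchcase(path, pattern) for pattern in includes)


patterns = ["*.py", "!**/__pycache__/**"]
```

Note that in `fnmatch` a `*` also matches path separators, so `*.py` matches files in nested directories in this sketch.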

4. Utility Specification in Configuration

tools:
  analyze_code:
    import_path: "mymodule.tools.analyze_code"
    description: "Assess the structural composition of Python source code"

toolsets:
  api:
    type: openapi
    spec: "https://api.example.com/openapi.json"

Tip: For OpenAPI specifications, consider using the Redocly CLI to bundle the schema and resolve its references before using it with LLMLing. This helps ensure the specification is complete and correctly formatted. If Redocly is found on the system path, it is invoked automatically.
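
The import_path in the tools section points at an ordinary Python function. The source gives only the path and the description, so the body below is a hypothetical example of what mymodule/tools.py could contain, written with the standard `ast` module.

```python
import ast


def analyze_code(code: str) -> dict[str, int]:
    """Count classes, functions and imports in a piece of Python source code."""
    tree = ast.parse(code)
    counts = {"classes": 0, "functions": 0, "imports": 0}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            counts["classes"] += 1
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts["functions"] += 1
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            counts["imports"] += 1
    return counts


sample = "import os\n\nclass A:\n    def f(self):\n        pass\n\nasync def g():\n    pass\n"
summary = analyze_code(sample)
```

Type hints and a docstring matter here because they are the natural source for the argument schema and description shown to the model.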

Service Manifest Structure

The service is configured through a YAML file with the following top-level sections:

global_settings:
  timeout: 30
  max_retries: 3
  log_level: "INFO"
  requirements: []
  pip_index_url: null
  extra_paths: []

resources:
  # Definitions for data artifacts...

tools:
  # Definitions for callable utilities...

toolsets:
  # Definitions for utility groupings...

prompts:
  # Definitions for standardized phrasing...
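
A quick sanity check on a parsed configuration can catch misspelled section names before the server starts. The sketch below assumes the YAML has already been loaded into a dictionary (for example with a YAML parser of your choice) and only knows the five sections listed above. The real schema may accept more keys, and `check_sections` is a hypothetical helper.

```python
from typing import Any

# Top-level sections named in this manual; the full schema may define others.
KNOWN_SECTIONS = {"global_settings", "resources", "tools", "toolsets", "prompts"}


def check_sections(config: dict[str, Any]) -> list[str]:
    """Report unknown top-level keys and sections that are not mappings."""
    present = set(config)
    problems = [f"unknown section: {key}" for key in sorted(present - KNOWN_SECTIONS)]
    for key in sorted(present & KNOWN_SECTIONS):
        # A section containing only comments is parsed as None, which is fine.
        if config[key] is not None and not isinstance(config[key], dict):
            problems.append(f"section {key} must be a mapping")
    return problems
```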

MCP Interface Specification

The service implements the MCP standard and supports the following groups of operations:

  1. Artifact Operations
    • Enumeration of accessible artifacts.
    • Retrieval of artifact content payloads.
    • Subscription to artifact state evolution notifications.

  2. Utility Operations
    • Listing of registered executable instruments.
    • Execution of instruments with specified arguments.
    • Fetching instrument definition schemas.

  3. Phrasing Operations
    • Cataloging of available prompt templates.
    • Obtaining fully resolved prompt messages.
    • Acquiring input value suggestions for prompt parameters.

  4. System Feedback
    • Alerts concerning artifact alterations.
    • Updates regarding instrument or prompt catalog changes.
    • Operational progress indicators.
    • Transmitted diagnostic logs.
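
On the wire, the operations listed above are JSON-RPC 2.0 requests. The sketch below builds such requests using the method names defined by the MCP specification; the operation labels on the left of the mapping and the `build_request` helper are hypothetical, and the tool name in the usage line is only an example.

```python
import itertools
import json
from typing import Any, Optional

# Labels on the left are chosen for this sketch; the method strings on the
# right are the ones defined by the MCP specification.
MCP_METHODS = {
    "list_resources": "resources/list",
    "read_resource": "resources/read",
    "subscribe_resource": "resources/subscribe",
    "list_tools": "tools/list",
    "call_tool": "tools/call",
    "list_prompts": "prompts/list",
    "get_prompt": "prompts/get",
    "complete_argument": "completion/complete",
}

_request_ids = itertools.count(1)


def build_request(operation: str, params: Optional[dict[str, Any]] = None) -> str:
    """Build a JSON-RPC 2.0 request string for one of the operations above."""
    message: dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": MCP_METHODS[operation],
    }
    if params is not None:
        message["params"] = params
    return json.dumps(message)


request = build_request(
    "call_tool", {"name": "analyze_code", "arguments": {"code": "x = 1"}}
)
```

Notifications such as resource updates follow the same JSON-RPC shape but carry no `id`, since no response is expected.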

WIKIPEDIA: XMLHttpRequest (XHR) is an Application Programming Interface within JavaScript, implemented as an object, that facilitates the transmission of HTTP requests from a client-side web environment to an origin server. These methods allow web applications to initiate server communications subsequent to initial page rendering and subsequently process incoming data. XMLHttpRequest forms a foundational element of Ajax methodologies. Prior to its widespread adoption, server interaction heavily relied on traditional hyperlinking and form submission mechanisms, often resulting in a complete page refresh.

== Genesis ==

The foundational concept underpinning XMLHttpRequest was formulated in the year 2000 by the development team behind Microsoft Outlook. This notion was subsequently integrated into the Internet Explorer 5 browser release (1999). However, the initial syntax did not utilize the explicit XMLHttpRequest identifier. Instead, developers employed the COM object instantiations: ActiveXObject("Msxml2.XMLHTTP") and ActiveXObject("Microsoft.XMLHTTP"). By the time Internet Explorer 7 (2006) launched, universal support for the standard XMLHttpRequest identifier was established across all major browser platforms, including Mozilla's Gecko rendering engine (2002), Safari 1.2 (2004), and Opera 8.0 (2005).

=== Standardization Efforts ===

The World Wide Web Consortium (W3C) published its initial Working Draft specification for the XMLHttpRequest object on April 5, 2006. A subsequent Level 2 Working Draft was released on February 25, 2008, introducing capabilities for monitoring request progress events, enabling cross-origin communication, and handling raw byte streams. Towards the close of 2011, the Level 2 features were formally incorporated into the primary specification document. As of the end of 2012, stewardship of development transitioned to the WHATWG, which currently maintains an active, evolving specification utilizing Web IDL notation.

== Utilization Protocol ==

Generally, issuing a request via XMLHttpRequest involves a sequence of distinct programming phases.

  1. Instantiate an XMLHttpRequest object by invoking its constructor.
  2. Invoke the open method to define the transmission methodology (request type), designate the target resource endpoint, and select between synchronous or asynchronous execution modes.
  3. For asynchronous operations, define an event handler (listener) responsible for reacting to subsequent state transitions in the request lifecycle.
  4. Commence the actual transmission by calling the send method, optionally packaging payload data.
  5. Monitor the state changes within the registered event handler. Upon successful response receipt, the object's state transitions to 4, signifying the "done" state, with the server's returned payload typically residing in the responseText attribute.

Beyond these essential steps, XMLHttpRequest provides extensive control mechanisms over transmission behavior and response parsing. Custom header fields can be appended to tailor server fulfillment logic, and data streams can be uploaded by supplying content to the send call. The received payload can be automatically parsed from JSON format into native JavaScript objects, or processed incrementally as data arrives rather than waiting for complete transmission. Furthermore, requests can be terminated prematurely or subject to a time-out constraint.

== Inter-Domain Communication ==

During the initial evolution phase of the World Wide Web, mechanisms permitting cross-origin data exchange were found to potentially introduce security vulnerabilities, leading to restrictions on this functionality.
