Ollama-MCP-Gateway: Local LLM Integration Hub

This MCP Gateway establishes a secure, standardized conduit between MCP-compliant clients and locally hosted Ollama instances, offering advanced capabilities for task decomposition, result validation, and operational pipeline management.

Core Capabilities:

- Task Segmentation: Breaking down monolithic computational requests into manageable sub-units.
- Verification & Scoring: Assessing generated artifacts against predefined success metrics.
- Model Lifecycle Control: Managing and invoking accessible Ollama models.
- Protocol Adherence: Ensuring strict communication via the Model Context Protocol.
- Resilience: Implementing structured exception handling with explicit diagnostic reporting.
- Efficiency Tuning: Employing resource optimization techniques (e.g., persistent connection management, Least Recently Used caching).

Resource Manifestation

The gateway exposes the following logical resource endpoints via distinct URI schemes:

- task://: Access point for individual processing assignments.
- result://: Access point for evaluated computational outputs.
- model://: Access point for querying currently available Ollama models.

Each resource path is associated with requisite metadata and content type indicators crucial for optimal LLM interaction.
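As an illustration of how a client might handle these URIs, the sketch below splits a resource URI into its scheme and identifier. The `ResourceRef` type and `parse_resource_uri` function are hypothetical names introduced for this example; they are not part of the gateway's documented API.

```python
from typing import NamedTuple

# Schemes listed in this section; anything else is rejected.
KNOWN_SCHEMES = {"task", "result", "model"}


class ResourceRef(NamedTuple):
    kind: str        # "task", "result", or "model"
    identifier: str  # everything after "scheme://"


def parse_resource_uri(uri: str) -> ResourceRef:
    """Split a gateway URI such as 'task://123' into kind and identifier."""
    scheme, separator, rest = uri.partition("://")
    if not separator or scheme not in KNOWN_SCHEMES:
        raise ValueError(f"Unsupported resource URI: {uri!r}")
    return ResourceRef(scheme, rest)


print(parse_resource_uri("task://123"))
print(parse_resource_uri("result://456"))
```

Rejecting unknown schemes early keeps malformed identifiers from reaching a tool handler.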

Conceptual Mapping: Prompts vs. Tools

Within this gateway's architecture, prompts and tools maintain distinct, yet interdependent, functions:

Prompt (Schema Role): Dictates the required cognitive structure or input format for the LLM's inference process.
Tool (Handler Role): Represents the executable function or system capability invoked by the protocol.

Every operative tool mandates an associated schema (prompt) to effectively bridge the LLM's reasoning capacity with tangible system actions.

Predefined Prompts (Schemas)

- decompose-task
  - Purpose: To segment complex objectives into granular, executable steps.
  - Input: Task description and level of detail (granularity).
  - Output: A structured breakdown detailing dependencies and estimated complexity.
- evaluate-result
  - Purpose: To measure an artifact's quality against specified validation criteria.
  - Input: The output artifact and evaluation parameters.
  - Output: A graded assessment accompanied by actionable refinement suggestions.
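Because a decomposition reports dependencies between subtasks, a client will usually want to run them in dependency order. The following is a minimal sketch under an assumed output shape: the `breakdown` mapping (subtask name to the names it depends on) and its contents are hypothetical, since the exact format returned by the gateway is not specified here.

```python
from graphlib import TopologicalSorter

# Hypothetical breakdown: each subtask maps to the subtasks it depends on.
breakdown = {
    "design-schema": [],
    "write-migrations": ["design-schema"],
    "build-api": ["design-schema"],
    "write-tests": ["write-migrations", "build-api"],
}


def execution_order(subtasks: dict[str, list[str]]) -> list[str]:
    """Return subtask names ordered so that every dependency comes first."""
    # TopologicalSorter expects node -> predecessors, which matches our mapping.
    # It raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(subtasks).static_order())


order = execution_order(breakdown)
print(order)
```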

Implemented Tools (Handlers)

- add-task
  - Parameters: name (string, required), description (string, required); optional priority (number), deadline (string), tags (array).
  - Action: Registers a new assignment in the internal system, returning its unique ID.
- decompose-task
  - Parameters: task_id (string, required), granularity (enum: "high"|"medium"|"low", required); optional max_subtasks (number).
  - Action: Leverages Ollama to segment the referenced complex task.
- evaluate-result
  - Parameters: result_id (string, required), criteria (object, required); optional detailed (boolean).
  - Action: Executes the evaluation protocol on the specified output.
- run-model
  - Parameters: model (string, required), prompt (string, required); optional temperature (number), max_tokens (number).
  - Action: Executes a direct inference call on the specified Ollama backend.
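The parameter lists above can be checked on the client side before a tool is invoked. This is a sketch only: the `TOOL_PARAMS` table is transcribed from the lists in this section, while `validate_arguments` and its messages are hypothetical helpers, not the gateway's own validation logic.

```python
# Parameter tables transcribed from the tool list: name -> (accepted types, required).
NUMBER = (int, float)
TOOL_PARAMS = {
    "add-task": {
        "name": (str, True), "description": (str, True),
        "priority": (NUMBER, False), "deadline": (str, False), "tags": (list, False),
    },
    "decompose-task": {
        "task_id": (str, True), "granularity": (str, True),
        "max_subtasks": (NUMBER, False),
    },
    "evaluate-result": {
        "result_id": (str, True), "criteria": (dict, True), "detailed": (bool, False),
    },
    "run-model": {
        "model": (str, True), "prompt": (str, True),
        "temperature": (NUMBER, False), "max_tokens": (NUMBER, False),
    },
}
GRANULARITY_LEVELS = {"high", "medium", "low"}


def validate_arguments(tool: str, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    spec = TOOL_PARAMS.get(tool)
    if spec is None:
        return [f"unknown tool: {tool}"]
    problems = []
    for name, (expected, required) in spec.items():
        if name not in arguments:
            if required:
                problems.append(f"missing required parameter: {name}")
        elif not isinstance(arguments[name], expected):
            problems.append(f"wrong type for parameter: {name}")
    problems += [f"unexpected parameter: {n}" for n in arguments if n not in spec]
    level = arguments.get("granularity")
    if tool == "decompose-task" and isinstance(level, str) and level not in GRANULARITY_LEVELS:
        problems.append(f"invalid granularity: {level}")
    return problems


print(validate_arguments("run-model", {"model": "llama3"}))
```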

Operational Enhancements

Advanced Error Reporting

The gateway yields enriched, structured error payloads, enabling client applications to implement precise recovery logic. Example structure for failure notification:

```json
{
  "error": {
    "message": "Assignment identifier not located: task-123",
    "status_code": 404,
    "details": {
      "provided_id": "task-123"
    }
  }
}
```
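To show how a client could act on this structure, here is a small sketch that builds a payload of the same shape and maps the status code to a recovery hint. Both `make_error` and `describe_failure` are hypothetical helpers, and the recovery rules are assumptions chosen for illustration.

```python
import json


def make_error(message: str, status_code: int, **details) -> dict:
    """Build an error payload in the shape shown in this section."""
    return {"error": {"message": message, "status_code": status_code, "details": details}}


def describe_failure(payload: dict) -> str:
    """Client-side sketch: choose a recovery hint based on the status code."""
    error = payload["error"]
    if error["status_code"] == 404:
        return f"not found: {error['details'].get('provided_id', 'unknown id')}"
    if error["status_code"] >= 500:
        return "server error: retry later"
    return f"request rejected: {error['message']}"


payload = make_error("Assignment identifier not located: task-123", 404, provided_id="task-123")
print(json.dumps(payload, indent=2))
print(describe_failure(payload))
```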

Performance Tuning Parameters

Configuration via config.py allows fine-grained control over throughput:

```python
# Performance Configuration Settings
cache_size: int = 100  # Maximum entries retained in response cache
max_connections: int = 10  # Global limit for concurrent outbound HTTP sessions
max_connections_per_host: int = 10  # Per-origin limit for concurrent sessions
request_timeout: int = 60  # Session time-out threshold in seconds
```
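For readers unfamiliar with Least Recently Used caching, the sketch below shows the eviction behavior that a `cache_size` limit implies. It is a generic illustration built on `collections.OrderedDict`; the `LRUCache` class is hypothetical and is not taken from the gateway's source.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Minimal LRU cache: the least recently used entry is evicted first."""

    def __init__(self, cache_size: int = 100) -> None:
        if cache_size < 1:
            raise ValueError("cache_size must be at least 1")
        self._max_entries = cache_size
        self._entries: OrderedDict = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry


cache = LRUCache(cache_size=2)
cache.put("prompt-a", "response-a")
cache.put("prompt-b", "response-b")
cache.get("prompt-a")                # touch "prompt-a" so "prompt-b" is now oldest
cache.put("prompt-c", "response-c")  # evicts "prompt-b"
```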

Model Specification Flexibility

Resolution Hierarchy

The active LLM architecture is determined via a strict priority mechanism:

1. Explicit parameter within a Tool invocation (model argument).
2. Configuration entry within the MCP configuration file's env segment.
3. System Environment Variable (OLLAMA_DEFAULT_MODEL).
4. Hardcoded fallback (llama3).
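The priority order can be sketched as a single function that returns the first non-empty candidate. The `resolve_model` function and its arguments are hypothetical; in particular, passing the configuration file's env segment as a separate mapping with a `model` key is an assumption made to keep the four levels visible.

```python
import os
from typing import Mapping, Optional

FALLBACK_MODEL = "llama3"


def resolve_model(
    explicit: Optional[str] = None,
    config_env: Optional[Mapping[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the model name chosen by the documented priority order."""
    config_env = config_env or {}
    environ = os.environ if environ is None else environ
    candidates = (
        explicit,                             # 1. tool invocation argument
        config_env.get("model"),              # 2. env segment of the MCP config (assumed key)
        environ.get("OLLAMA_DEFAULT_MODEL"),  # 3. system environment variable
        FALLBACK_MODEL,                       # 4. hardcoded fallback
    )
    return next(name for name in candidates if name)


print(resolve_model("mistral", {"model": "llama3:latest"}, {}))
print(resolve_model(None, None, {}))
```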

Configuration File Injection

For integrated environments (e.g., Claude Desktop), model specification can be injected via the client's MCP configuration JSON:

```json
{
  "mcpServers": {
    "ollama-MCP-server": {
      "command": "python",
      "args": ["-m", "ollama_mcp_server"],
      "env": {
        "model": "llama3:latest"
      }
    }
  }
}
```

Validation and Discovery

Upon startup, the gateway validates the existence of configured models, logging warnings if any are absent. The run-model tool can be queried to return a current manifest of discoverable models, aiding users in selecting valid targets.
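A startup check of this kind can be approximated by comparing the configured names against the models Ollama reports through its `/api/tags` endpoint. The helper names below are hypothetical, and the gateway's real check may differ; the sketch treats a name without a tag as `:latest`, which is how Ollama resolves untagged names.

```python
import json
import urllib.request


def normalize(name: str) -> str:
    """Treat a model name without a tag as ':latest'."""
    return name if ":" in name else f"{name}:latest"


def missing_models(configured: list[str], available: list[str]) -> list[str]:
    """Return the configured models that are not installed locally."""
    installed = {normalize(name) for name in available}
    return [name for name in configured if normalize(name) not in installed]


def list_local_models(host: str = "http://localhost:11434") -> list[str]:
    """Fetch installed model names from Ollama's /api/tags endpoint."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=10) as response:
        data = json.load(response)
    return [model["name"] for model in data.get("models", [])]


# Offline example; with a running Ollama, pass list_local_models() instead.
print(missing_models(["llama3", "mistral"], ["llama3:latest"]))
```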

Validation Protocol

The repository includes a testing suite covering:

- Unit Tests: Verification of isolated functional units.
- Integration Tests: End-to-end workflow simulation.

Execution commands:

```bash
# Full Test Execution
python -m unittest discover

# Targeted Integration Test Execution
python -m unittest tests.test_integration
```

Configuration Directives

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `OLLAMA_HOST` | `http://localhost:11434` | Base URL for the Ollama API service. |
| `DEFAULT_MODEL` | `llama3` | Fallback model name for direct calls. |
| `LOG_LEVEL` | `info` | Verbosity setting for operational logging. |
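One way to read these variables with their defaults is shown below. The `Settings` dataclass and `load_settings` function are hypothetical and only mirror the table; how the gateway actually loads its configuration is not documented here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    ollama_host: str
    default_model: str
    log_level: str


def load_settings(environ: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the table's defaults."""
    env = os.environ if environ is None else environ
    return Settings(
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/"),
        default_model=env.get("DEFAULT_MODEL", "llama3"),
        log_level=env.get("LOG_LEVEL", "info").lower(),
    )


print(load_settings({}))
```

Accepting the environment as an argument makes the function easy to test without touching the real process environment.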

Ollama Prerequisites

Ensure Ollama runtime is installed and necessary models are pre-fetched:

```bash
# Install Ollama if missing
curl -fsSL https://ollama.com/install.sh | sh

# Download primary models
ollama pull llama3
ollama pull mistral
```

Deployment Guide (Quick Start)

Installation

```bash
pip install ollama-mcp-server
```

Client Configuration Paths (e.g., Claude Desktop)

macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

Non-Released/Development Server Registration

```json
"mcpServers": {
  "ollama-MCP-server": {
    "command": "uv",
    "args": [
      "--directory",
      "/path/to/ollama-MCP-server",
      "run",
      "ollama-MCP-server"
    ],
    "env": {
      "model": "deepseek:r14B"
    }
  }
}
```

Production Server Registration

```json
"mcpServers": {
  "ollama-MCP-server": {
    "command": "uvx",
    "args": ["ollama-MCP-server"]
  }
}
```

Usage Examples

Task Decomposition Invocation

To partition a complex requirement:

```python
result = await mcp.use_mcp_tool({
    "server_name": "ollama-MCP-server",
    "tool_name": "decompose-task",
    "arguments": {
        "task_id": "task://123",
        "granularity": "medium",
        "max_subtasks": 5
    }
})
```

Result Evaluation Call

To subject an output to quality assessment:

```python
evaluation = await mcp.use_mcp_tool({
    "server_name": "ollama-MCP-server",
    "tool_name": "evaluate-result",
    "arguments": {
        "result_id": "result://456",
        "criteria": {
            "accuracy": 0.4,
            "completeness": 0.3,
            "clarity": 0.3
        },
        "detailed": True  # Python boolean, not JSON's lowercase true
    }
})
```
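The criteria values in this call sum to 1.0, which suggests they act as weights. Assuming that reading, and assuming each criterion receives a score between 0 and 1, an overall grade could be combined as in the sketch below. The `weighted_score` function and the sample scores are hypothetical; the gateway's actual scoring method is not described in this document.

```python
def weighted_score(weights: dict[str, float], scores: dict[str, float]) -> float:
    """Combine per-criterion scores (0 to 1) into one weighted average."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    missing = sorted(weights.keys() - scores.keys())
    if missing:
        raise KeyError(f"no score for criteria: {missing}")
    # Dividing by the total keeps the result in range even if weights do not sum to 1.
    return sum(weights[name] * scores[name] for name in weights) / total_weight


criteria = {"accuracy": 0.4, "completeness": 0.3, "clarity": 0.3}
sample_scores = {"accuracy": 1.0, "completeness": 0.5, "clarity": 0.0}  # made-up values
overall = weighted_score(criteria, sample_scores)
print(round(overall, 2))
```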

Direct Model Inference

For direct queries against an Ollama backend:

```python
response = await mcp.use_mcp_tool({
    "server_name": "ollama-MCP-server",
    "tool_name": "run-model",
    "arguments": {
        "model": "llama3",
        "prompt": "Explain quantum computing in layman's terms",
        "temperature": 0.7
    }
})
```
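For orientation, a run-model call like this presumably ends up as a request to Ollama's `/api/generate` endpoint. The sketch below builds such a request body. The `build_generate_payload` helper is hypothetical, and mapping `max_tokens` to Ollama's `num_predict` option is an assumption about how the gateway translates its parameters.

```python
from typing import Optional


def build_generate_payload(
    model: str,
    prompt: str,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate."""
    options = {}
    if temperature is not None:
        options["temperature"] = temperature
    if max_tokens is not None:
        options["num_predict"] = max_tokens  # assumed mapping for max_tokens
    payload = {"model": model, "prompt": prompt, "stream": False}
    if options:
        payload["options"] = options
    return payload


body = build_generate_payload(
    "llama3", "Explain quantum computing in layman's terms", temperature=0.7
)
print(body)
```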

Development Workflow

Project Initialization

1. Repository acquisition:

   ```bash
   git clone https://github.com/yourusername/ollama-MCP-server.git
   cd ollama-MCP-server
   ```

2. Virtual environment setup and activation:

   ```bash
   python -m venv venv
   source venv/bin/activate  # Or venv\Scripts\activate on Windows
   ```

3. Install development dependencies (assuming uv is the package manager):

   ```bash
   uv sync --dev --all-extras
   ```

Local Execution Scripts

Server Launch:

```bash
./run_server.sh  # Options: --debug, --log=LEVEL
```

Testing Execution:

```bash
./run_tests.sh  # Options: --unit, --integration, --all (default), --verbose
```

Packaging and Distribution

1. Dependency synchronization:

   ```bash
   uv sync
   ```

2. Artifact creation (output is written to the dist/ directory as an sdist and a wheel):

   ```bash
   uv build
   ```

3. Publication to PyPI (requires configured credentials):

   ```bash
   uv publish
   ```

Debugging

Since the MCP server often operates via stdio pipes, direct debugging can be challenging. We highly recommend utilizing the official MCP Inspector for the best debugging experience.

To launch the Inspector using npx:

```bash
npx @modelcontextprotocol/inspector uv --directory /path/to/ollama-MCP-server run ollama-mcp-server
```

The Inspector will subsequently output a URL accessible via a web browser for interactive debugging.
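When a server communicates over stdio, stdout carries the protocol messages, so diagnostic output should go to stderr to avoid corrupting the stream. The sketch below shows one way to set that up with the standard `logging` module. The logger name and the `configure_logging` function are hypothetical and are not taken from the gateway's code.

```python
import logging
import sys


def configure_logging(level: str = "info") -> logging.Logger:
    """Send log records to stderr so stdout stays free for protocol traffic."""
    logger = logging.getLogger("ollama_mcp_gateway")  # hypothetical logger name
    logger.setLevel(level.upper())
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep records away from root handlers
    return logger


log = configure_logging("debug")
log.debug("server starting")
```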

Architecture Overview

(Placeholder for detailed architectural diagram/description)

Collaboration Guidelines

Contributions are actively encouraged! Initiate development by submitting a Pull Request (PR).

Fork the repository.
Create a feature branch (git checkout -b feat/new-capability).
Commit changes clearly (git commit -m 'Feat: Implemented widget serialization').
Push the branch (git push origin feat/new-capability).
Open a formal PR.

Licensing

This software is distributed under the MIT License. Refer to the LICENSE file for comprehensive terms.

Acknowledgements

The Model Context Protocol team for furnishing the foundational protocol specification.
The Ollama project for democratizing accessible local LLM execution.
All developers who have contributed to this gateway.

ollama-context-protocol-gateway

Author

NewAITees
