mcp-advanced-reasoning-augmentation-engine
Leverages alternative LLMs to refine user inputs by applying structured inference methodologies like Chain of Draft (CoD) or Chain of Thought (CoT), generating concise, high-signal intermediate artifacts to optimize subsequent task resolution and minimize computational overhead.
Author: brendancopley
Model Context Protocol Reasoning Accelerator (CoD/CoT)
System Functionality Summary
The Reasoning Accelerator functions as an MCP utility designed to preemptively structure complex queries for Large Language Models (LLMs) using advanced decomposition techniques, specifically Chain of Draft (CoD) or Chain of Thought (CoT). The primary objective is boosting cognitive fidelity of the LLM's output while aggressively curtailing associated resource consumption (e.g., API tokens).
Workflow Mechanics:
- Input Pre-processing: Original user instruction is syntactically refactored into a sequential reasoning scaffold (CoD/CoT).
- Inference Execution: The refactored prompt is dispatched to the designated LLM provider (e.g., Claude, GPT, Ollama).
- Refined Derivation: The model executes the task following the imposed structured thinking process.
- Output Normalization: The elaborated result is distilled back into a clean, final answer format.
This methodology ensures superior reasoning depth with marked efficiency gains, often achieving significant token savings relative to conventional CoT application.
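The pre-processing and normalization steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names, the prompt wording, and the `####` answer separator are all assumptions made for the example.

```python
def build_cod_prompt(problem: str, max_words_per_step: int = 5) -> str:
    """Wrap a raw problem in a Chain of Draft style instruction scaffold."""
    return (
        "Think step by step, but keep only a minimal draft for each step, "
        f"with at most {max_words_per_step} words per step.\n"
        "Return the final answer after the separator ####.\n\n"
        f"Problem: {problem}"
    )


def extract_final_answer(response: str) -> str:
    """Return the text after the last '####' separator (assumed convention).

    Falls back to the whole response when no separator is present.
    """
    _, separator, tail = response.rpartition("####")
    return tail.strip() if separator else response.strip()


if __name__ == "__main__":
    print(build_cod_prompt("A shop had 20 apples and sold 12. How many remain?"))
    print(extract_final_answer("20 - 12 = 8\n#### 8"))
```

Keeping prompt construction and answer extraction as two small pure functions makes each one easy to test without calling an LLM provider.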
Bring Your Own LLM (BYOLLM) Interoperability
This tool champions a flexible 'Bring Your Own LLM' paradigm, supporting seamless integration with diverse foundational models across cloud and local deployments.
Supported Model Endpoints
- Cloud Providers:
- Anthropic Claude series
- OpenAI GPT architecture
- Mistral AI models
- Self-Hosted/Local Inference:
- Ollama (universal model support)
- Local LLaMA derivatives
- Any model exposing a compatible chat completion API interface
Configuration Directives
- Cloud API Key Setup

  ```bash
  # Anthropic Credentials
  export ANTHROPIC_API_KEY=your_key_here

  # OpenAI Credentials
  export OPENAI_API_KEY=your_key_here

  # Mistral Credentials
  export MISTRAL_API_KEY=your_key_here
  ```

- Ollama Local Deployment

  ```bash
  # Install Ollama runtime prerequisite
  curl https://ollama.ai/install.sh | sh

  # Acquire desired local model
  ollama pull mistral

  # Configure tool to target Ollama service
  export MCP_LLM_PROVIDER=ollama
  export MCP_OLLAMA_MODEL=mistral
  ```

- Custom API Gateway Targeting

  ```bash
  # Direct tool to a custom inference endpoint
  export MCP_LLM_PROVIDER=custom
  export MCP_CUSTOM_LLM_ENDPOINT=http://localhost:your_port/api/generate
  ```
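To show how these variables might fit together, here is a hedged sketch of a settings resolver. Only the environment variable names come from this README. The function name, the returned dictionary shape, and the default provider (`anthropic`) are assumptions made for illustration, so the real server may resolve its configuration differently.

```python
import os
from typing import Mapping, Optional


def resolve_llm_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Derive provider settings from environment variables (illustrative only)."""
    env = os.environ if env is None else env
    # Assumption: fall back to Anthropic when no provider is set.
    provider = env.get("MCP_LLM_PROVIDER", "anthropic").lower()
    settings = {"provider": provider}

    if provider == "ollama":
        settings["model"] = env.get("MCP_OLLAMA_MODEL", "mistral")
    elif provider == "custom":
        endpoint = env.get("MCP_CUSTOM_LLM_ENDPOINT")
        if not endpoint:
            raise ValueError("MCP_CUSTOM_LLM_ENDPOINT must be set for the custom provider")
        settings["endpoint"] = endpoint
    else:
        # Cloud providers: expect e.g. ANTHROPIC_API_KEY, OPENAI_API_KEY, MISTRAL_API_KEY.
        key_name = f"{provider.upper()}_API_KEY"
        if key_name not in env:
            raise ValueError(f"{key_name} is not set")
        settings["api_key_env"] = key_name

    return settings


if __name__ == "__main__":
    print(resolve_llm_settings({"MCP_LLM_PROVIDER": "ollama", "MCP_OLLAMA_MODEL": "mistral"}))
```

Passing the environment in as a mapping keeps the resolver testable without touching the real process environment.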
Acknowledgment and Attribution
This software implements the Chain of Draft (CoD) inference schema, originally conceptualized as an MCP utility for Claude. The foundational CoD mechanism is directly derived from the innovative work published by stat-guy. We extend profound appreciation for their seminal contributions to this streamlined reasoning methodology.
Source Repository of Origin: https://github.com/stat-guy/chain-of-draft
Principal Advantages
- Resource Optimization: Substantial reduction in token expenditure (approaching 92.4% savings versus conventional CoT).
- Latency Improvement: Expedited response times attributed to shorter cumulative generation steps.
- Operational Cost Reduction: Lower overhead associated with external LLM API interactions.
- Fidelity Preservation: Output accuracy remains comparable to, or surpasses, standard CoT evaluations.
- Versatility: Adaptable across a broad spectrum of analytical challenges and subject areas.
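For reference, a savings percentage of this kind is simply the relative difference between the two token counts. The sketch below shows the arithmetic. The token counts are made-up illustrative values chosen to reproduce the 92.4% figure, not measurements from this tool.

```python
def token_savings_percent(cot_tokens: int, cod_tokens: int) -> float:
    """Percentage of tokens saved by CoD relative to a CoT baseline."""
    if cot_tokens <= 0:
        raise ValueError("cot_tokens must be positive")
    return round(100.0 * (cot_tokens - cod_tokens) / cot_tokens, 1)


if __name__ == "__main__":
    # Illustrative numbers only: a 1000-token CoT trace vs. a 76-token CoD trace.
    print(token_savings_percent(1000, 76))
```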
Core Feature Set
- CoD Synthesis Core
  - Generation of highly succinct deliberation segments (often sub-five words).
  - Rigid adherence to structural output protocols.
  - Precise extraction of the ultimate resolution.
- Performance Telemetry
  - Granular tracking of token consumption rates.
  - Systematic measurement of solution correctness ratios.
  - Chronometric logging of end-to-end execution duration.
  - Collection of domain-specific efficiency metrics.
- Adaptive Constraint Management
  - Algorithmic estimation of problem complexity.
  - Dynamic tuning of maximum allowed words per step.
  - Calibration profiles tailored to specific problem domains.
- Knowledge Repository
  - Automated conversion templates from standard CoT formats to CoD.
  - Indexed examples covering mathematics, software engineering, natural sciences, and logical puzzles.
  - Similarity-based retrieval mechanism for relevant instructional examples.
- Structural Integrity Assurance
  - Post-generation validation against length restrictions.
  - Maintenance of the prescribed step-wise sequence.
  - Compliance auditing for generation scaffolding.
- Reasoning Strategy Orchestration
  - Automated inference method switching between CoD and CoT.
  - Strategy selection optimized per problem domain.
  - Selection informed by historical comparative performance data.
- OpenAI Protocol Adherence
  - Functionality as a direct, backward-compatible substitute for standard OpenAI clients.
  - Support for both legacy completions and modern chat interfaces.
  - Frictionless embedding into existing software pipelines.
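As an illustration of the adaptive constraint idea listed above, the sketch below maps a crude complexity score to a per-step word limit. The heuristic (word count, numeric literals, operators) and every threshold in it are assumptions chosen for the example. The tool's actual estimator is not documented here and may work quite differently.

```python
import re


def estimate_word_limit(problem: str, base_limit: int = 5, max_limit: int = 10) -> int:
    """Pick a max-words-per-step value from a rough complexity score (hypothetical heuristic)."""
    words = len(problem.split())
    numbers = len(re.findall(r"\d+(?:\.\d+)?", problem))
    operators = len(re.findall(r"[+\-*/^=]", problem))
    # Longer problems with more quantities and operators get a looser limit.
    score = words / 20 + numbers / 2 + operators / 2
    return min(max_limit, base_limit + int(score))


if __name__ == "__main__":
    print(estimate_word_limit("2 + 2 = ?"))
```

Capping the result at `max_limit` keeps even very long inputs from drifting back toward verbose CoT-style steps.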
Deployment Prerequisites and Initialization
System Requirements
- Runtime Environment: Python version 3.10 or later (for Python build)
- Runtime Environment: Node.js version 22 or later (for JavaScript build)
- Build Toolchain: Nx (for managing unified Single Executable Application builds)
Python Environment Setup
- Obtain repository source code via cloning.
- Install necessary dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure access secrets within the .env file:

  ```
  ANTHROPIC_API_KEY=your_api_key_here
  ```

- Initiate the backend service:

  ```bash
  python server.py
  ```
JavaScript/TypeScript Environment Setup
- Obtain repository source code via cloning.
- Install project dependencies:

  ```bash
  npm install
  ```

- Configure secrets in the .env file:

  ```
  ANTHROPIC_API_KEY=your_api_key_here
  ```

- Execute build and run sequence:

  ```bash
  # Compile TypeScript sources using Nx workspace manager
  npm run nx build

  # Launch the compiled server instance
  npm start

  # For active development with live reloading:
  npm run dev
  ```
Key Nx Scripts:
- npm run nx build: Manages TypeScript to JavaScript compilation via Nx.
- npm run build:sea: Artifact generation for platform-agnostic executables.
- npm run start: Boots the production-ready server from the dist directory.
- npm test: Executes a benchmark query against the running service.
- npm run dev: Starts the TypeScript server directly using ts-node for rapid iteration.
The adoption of Nx ensures robust build lifecycle management, featuring incremental compilation, dependency visualization, and unified cross-platform packaging.
Single Executable Application (SEA) Packaging
Support is integrated for generating Single Executable Applications (SEA) leveraging Node.js 22+ and the @getlarge/nx-node-sea extension. This feature permits deployment of standalone binaries that negate the target machine's requirement for a pre-installed Node.js runtime.
SEA Build Commands
```bash
# Comprehensive build across supported operating systems
npm run build:sea

# Targeted OS builds
npm run build:macos    # For Apple Silicon/Intel Macs
npm run build:linux    # For Linux distributions
npm run build:windows  # For Windows systems
```
SEA Build Configuration Synopsis
The building process is centrally managed by Nx, utilizing the nx-node-sea plugin for streamlined bundling. This approach guarantees:
- Broad OS compatibility.
- Automatic inclusion of all required runtime dependencies.
- Binary size optimization.
- Zero external runtime dependencies upon execution.
Executable Deployment Context
Post-build artifacts reside within the dist folder. These resulting executables are entirely self-contained and directly runnable, preserving full application functionality.
When configuring integration with Claude Desktop, update the server configuration path to point to the newly generated binary:
```json
{
  "mcpServers": {
    "chain-of-draft-prompt-tool": {
      "command": "/path/to/mcp-chain-of-draft-prompt-tool",
      "env": {
        "ANTHROPIC_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
Claude Desktop Integration Protocol
To establish communication with Claude Desktop:
- Obtain the Claude Desktop client from claude.ai/download.
- Modify or create the configuration file at:
~/Library/Application Support/Claude/claude_desktop_config.json
- Specify the tool configuration (Python variant):

  ```json
  {
    "mcpServers": {
      "chain-of-draft-prompt-tool": {
        "command": "python3",
        "args": ["/absolute/path/to/cod/server.py"],
        "env": {
          "ANTHROPIC_API_KEY": "your_api_key_here"
        }
      }
    }
  }
  ```

  Or for the JavaScript/Node.js variant:

  ```json
  {
    "mcpServers": {
      "chain-of-draft-prompt-tool": {
        "command": "node",
        "args": ["/absolute/path/to/cod/index.js"],
        "env": {
          "ANTHROPIC_API_KEY": "your_api_key_here"
        }
      }
    }
  }
  ```
- Relaunch the Claude Desktop application.
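If the configuration file already lists other MCP servers, the new entry has to be merged in rather than pasted over the file. The following is a small sketch of that merge using only the Python standard library. The helper name `register_server` is hypothetical and is not part of this project.

```python
import json
from pathlib import Path


def register_server(config_path: Path, name: str, command: str,
                    args: list[str], env: dict[str, str]) -> dict:
    """Add or update one entry under "mcpServers", preserving existing entries."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args, "env": env}
    # Create the parent directory if the client has not created it yet.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

It could be called with the path from the step above, for example `register_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "chain-of-draft-prompt-tool", "python3", ["/absolute/path/to/cod/server.py"], {"ANTHROPIC_API_KEY": "your_api_key_here"})`.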
You can alternatively register the tool directly via the Claude Command Line Interface (CLI):
```bash
# Python implementation registration
claude mcp add chain-of-draft-prompt-tool -e ANTHROPIC_API_KEY="your_api_key_here" "python3 /absolute/path/to/cod/server.py"

# JavaScript implementation registration
claude mcp add chain-of-draft-prompt-tool -e ANTHROPIC_API_KEY="your_api_key_here" "node /absolute/path/to/cod/index.js"
```
Interfacing via Dive GUI Host
Dive, an open-source MCP Host Desktop Application, offers an intuitive graphical interface for interacting with this utility across various LLMs (ChatGPT, Claude, Ollama, etc.).
Dive Integration Steps
- Securely install Dive from its official releases page.
- Configure the Reasoning Accelerator within Dive's MCP configuration panel:
  ```json
  {
    "mcpServers": {
      "chain-of-draft-prompt-tool": {
        "command": "/path/to/mcp-chain-of-draft-prompt-tool",
        "enabled": true,
        "env": {
          "ANTHROPIC_API_KEY": "your_api_key_here"
        }
      }
    }
  }
  ```

  If utilizing the source/non-SEA execution path:

  ```json
  {
    "mcpServers": {
      "chain-of-draft-prompt-tool": {
        "command": "node",
        "args": ["/path/to/dist/index.js"],
        "enabled": true,
        "env": {
          "ANTHROPIC_API_KEY": "your_api_key_here"
        }
      }
    }
  }
  ```
Dive Interface Value Proposition
- 🌐 Unified LLM gateway with sophisticated credential management.
- 💻 Native cross-platform execution capability.
- 🔄 Fluid integration supporting both stdio and server-sent events (SSE) modes.
- 🌍 Multilingual user interface support.
- 💡 Advanced features for system prompts and custom directives.
- 📈 Automated lifecycle updates.
Dive provides a streamlined user experience for applying the CoD technique through the MCP framework while retaining its processing advantages.
Debugging Utility: MCP Inspector
The system facilitates integration with the MCP Inspector, a specialized visual debugging tool ideal for testing and validating the operation of MCP endpoints.
Launching the Inspector
Use the dedicated npm command to initialize the Inspector alongside the service:
```bash
# Launch Inspector pre-configured with the tool
npm run test-inspector

# Manual initiation via CLI
npx @modelcontextprotocol/inspector -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY -- node dist/index.js
```
This sequence performs:
1. Background initiation of the MCP server component.
2. Automatic opening of the MCP Inspector dashboard in the default web browser.
3. Establishing connection parameters with the live server instance for interaction testing.
Inspector Interface Capabilities
The Inspector environment offers:
- 🔍 Immediate visual feedback on all tool invocations and resulting outputs.
- 📝 An interactive sandbox for simulating various MCP function calls.
- 🔄 Comprehensive logging of all request/response payloads.
- 🐛 Detailed diagnostic data for troubleshooting interaction failures.
- 📊 Metrics collection for latency and throughput analysis.
This tool is indispensable for development cycles, behavior validation, input permutation testing, ensuring MCP specification compliance, and targeted performance tuning.
The Inspector interface is typically served at http://localhost:5173.
Exposed MCP Functions
The Reasoning Accelerator service exposes the following callable operations via the Model Context Protocol:
| Function Signature | Purpose |
|---|---|
| chain_of_draft_solve | Executes a general problem resolution using CoD methodology. |
| math_solve | Specialized arithmetic problem resolution leveraging CoD. |
| code_solve | Specialized software problem resolution leveraging CoD. |
| logic_solve | Specialized logical deduction problem resolution leveraging CoD. |
| get_performance_stats | Retrieves comparative telemetry between CoD and CoT executions. |
| get_token_reduction | Fetches quantitative metrics on token savings achieved. |
| analyze_problem_complexity | Assesses input difficulty to inform constraint settings. |
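MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request payload as a plain dictionary. The helper name is hypothetical, and the `problem` argument key is an assumption: the actual input schema of each tool should be taken from the server's `tools/list` response.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def make_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "problem" is an assumed argument name, used for illustration only.
    request = make_tool_call("math_solve", {"problem": "247 * 394"})
    print(json.dumps(request, indent=2))
```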
Client-Side Utilization Guide
Python Client Integration Example
Direct integration into Python projects via the dedicated client module:
```python
import asyncio

from client import ChainOfDraftClient

# Instantiate client, specifying the runtime environment
cod_client = ChainOfDraftClient(
    llm_provider="ollama",  # Other options: "anthropic", "openai", "mistral", "custom"
    model_name="llama2",    # Model identifier for the chosen provider
)


async def main():
    # Initiate reasoning process (a coroutine, so it must be awaited inside an async function)
    result = await cod_client.solve_with_reasoning(
        problem="Calculate the product: 247 times 394 equals ?",
        domain="math",
    )

    print(f"Final Result: {result['final_answer']}")
    print(f"Derivation Trace: {result['reasoning_steps']}")
    print(f"Resource Cost (Tokens): {result['token_count']}")


asyncio.run(main())
```
JavaScript/TypeScript Client Integration Example
Usage within Node.js or TypeScript environments:
```typescript
import { ChainOfDraftClient } from './lib/chain-of-draft-client';

// Initialize client object with configuration parameters
const client = new ChainOfDraftClient({
  provider: 'ollama',                 // Other options: 'anthropic', 'openai', 'mistral', 'custom'
  model: 'llama2',                    // Model identity
  endpoint: 'http://localhost:11434'  // Required for custom provider type
});

// Execute a problem-solving routine
async function processQuery() {
  const result = await client.solveWithReasoning({
    problem: "Determine the sum: 247 + 394 = ?",
    domain: "math",
    max_words_per_step: 5 // Enforce strict brevity on steps
  });

  console.log(`Answer: ${result.final_answer}`);
  console.log(`Reasoning Path: ${result.reasoning_steps}`);
  console.log(`Token Expenditure: ${result.token_count}`);
}

processQuery();
```
Underlying Component Architecture
The server component is engineered in parallel across Python and JavaScript stacks, each incorporating modular services that align functionally:
Python Implementation Modules
- AnalyticsService: Centralized logging and aggregation of performance statistics across diverse reasoning pathways and problem categories.
- ComplexityEstimator: Heuristic module for assessing input challenge level to dictate optimal inference constraints.
- ExampleDatabase: Persistent store for reasoning examples, including utilities to convert standard CoT data structures into the more compact CoD format.
- FormatEnforcer: Post-processing layer dedicated to rigorously ensuring generated steps respect defined length parameters.
- ReasoningSelector: Intelligent routing mechanism that dynamically pivots between CoD and CoT based on problem context or historical success rates.
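To make the FormatEnforcer role concrete, here is a minimal sketch of a length check that trims each reasoning step to a word budget. It assumes one step per line and truncation as the enforcement strategy. Both are assumptions for illustration, and the actual module may instead reject or regenerate over-long steps.

```python
def enforce_step_length(reasoning: str, max_words: int = 5) -> str:
    """Trim every non-empty line (one reasoning step) to at most max_words words."""
    steps = [line.strip() for line in reasoning.splitlines() if line.strip()]
    trimmed = [" ".join(step.split()[:max_words]) for step in steps]
    return "\n".join(trimmed)


if __name__ == "__main__":
    draft = "20 apples at the start of day\nsold 12\n20 - 12 = 8"
    print(enforce_step_length(draft, max_words=5))
```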
JavaScript Implementation Modules
- analyticsDb: Volatile, in-memory data structure for live metric capturing.
- complexityEstimator: Logic unit for complexity profiling, guiding step-length configuration.
- formatEnforcer: Utility responsible for output validation against stipulated word count thresholds.
- reasoningSelector: Automated decision engine choosing the best inference strategy (CoD vs. CoT) based on contextual factors.
Both implementations strictly adhere to the same conceptual framework, ensuring functional parity for consumers utilizing the MCP interface.
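The selection logic shared by ReasoningSelector and reasoningSelector could look something like the sketch below. The data shape, the tolerance value, and the rule itself (prefer CoD unless CoT has been clearly more accurate in that domain) are assumptions for illustration, not the project's documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class DomainStats:
    """Historical accuracy per strategy for one problem domain (hypothetical shape)."""
    cod_accuracy: float
    cot_accuracy: float


def select_strategy(domain: str, history: dict[str, DomainStats],
                    tolerance: float = 0.05) -> str:
    """Return "CoD" unless CoT beat it by more than `tolerance` in this domain."""
    stats = history.get(domain)
    if stats is None:
        return "CoD"  # no history yet: default to the cheaper strategy
    if stats.cod_accuracy >= stats.cot_accuracy - tolerance:
        return "CoD"
    return "CoT"


if __name__ == "__main__":
    history = {"math": DomainStats(0.90, 0.93), "logic": DomainStats(0.70, 0.90)}
    print(select_strategy("math", history), select_strategy("logic", history))
```

Defaulting to CoD when no history exists reflects the tool's stated goal of minimizing token usage, at the cost of possibly lower accuracy on unfamiliar domains.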
Licensing
This software package is distributed under the permissive MIT License.