local-repo-code-indexing-service
A Model Context Protocol (MCP) server designed to ingest local git repositories, segment their contents into meaningful blocks, generate vector representations (embeddings) for these segments, and facilitate advanced semantic querying across the codebase for enhanced development assistance.
Author: fkesheh
Local Codebase Context Engine (LCCE)
This is an MCP server dedicated to structuring and querying private, locally-stored source code from Git repositories. It bypasses external code APIs by operating directly on cloned file systems.
Core Capabilities
- Local Git Synchronization: Clones and maintains up-to-date local copies of specified Git projects.
- Code Segmentation: Analyzes source files, dividing them into semantically cohesive textual units.
- Vectorization: Computes high-dimensional embeddings for all segmented code units using a locally running Ollama instance.
- Semantic Retrieval: Enables sophisticated querying against the stored embeddings to find contextually relevant code snippets.
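The segmentation step can be pictured with a simple line-window splitter. This is only a minimal sketch: the README does not describe the server's actual chunking strategy, so the `CodeChunk` shape, the `chunkByLines` function, and the window and overlap sizes are all hypothetical.

```typescript
// Hypothetical chunk shape; the server's real representation is not documented here.
interface CodeChunk {
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

// Split source text into windows of `maxLines` lines. Consecutive windows
// share `overlap` lines so that context is not lost at chunk boundaries.
function chunkByLines(source: string, maxLines = 40, overlap = 5): CodeChunk[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new RangeError("require maxLines > 0 and 0 <= overlap < maxLines");
  }
  const lines = source.split(/\r?\n/);
  const chunks: CodeChunk[] = [];
  const step = maxLines - overlap;
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      startLine: start + 1,
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // the last window reached the end of the file
  }
  return chunks;
}

// Ten numbered lines, split into windows of 4 lines with 1 line of overlap.
const sampleSource = Array.from({ length: 10 }, (_, i) => `line ${i + 1}`).join("\n");
const sampleChunks = chunkByLines(sampleSource, 4, 1);
```

A syntax-aware splitter (for example, one that cuts at function boundaries) would likely produce more cohesive units; fixed windows are used here only because they are easy to reason about.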
Architectural Highlights
- Leverages local disk storage for speed and privacy (no reliance on GitHub/GitLab APIs).
- Persistence layer built upon SQLite for metadata and embeddings indexing.
- Code chunking optimized for contextual relevance.
- Embedding generation powered by Ollama (e.g., using unclemusclez/jina-embeddings-v2-base-code).
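Retrieval over stored embeddings is commonly done by ranking chunks on cosine similarity to the query embedding. The README does not state which metric the server uses, so the following is a sketch under that assumption; `EmbeddedChunk`, `cosineSimilarity`, and `topK` are illustrative names, and the two-dimensional vectors stand in for real embeddings.

```typescript
// Hypothetical in-memory shape of a stored chunk; real rows live in SQLite.
interface EmbeddedChunk {
  path: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Score every chunk against the query and keep the k best matches.
function topK(query: number[], chunks: EmbeddedChunk[], k: number) {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const demoChunks: EmbeddedChunk[] = [
  { path: "a.ts", text: "rate limiter", embedding: [1, 0] },
  { path: "b.ts", text: "logger", embedding: [0, 1] },
  { path: "c.ts", text: "throttle helper", embedding: [1, 1] },
];
const ranked = topK([1, 0], demoChunks, 2);
```

A linear scan like this is adequate for a single repository's worth of chunks; larger indexes usually call for an approximate nearest-neighbour structure instead.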
Prerequisites and Setup
Ensure the following runtime dependencies are satisfied:
- Node.js (version 16 or newer)
- Git command-line utility
- Ollama running locally with a suitable model installed.
Initialization Steps
```bash
# Obtain the server software
git clone
# Install required Node packages
npm install
# Compile the TypeScript/modern JavaScript source
npm run build
```
Configuration Directives
Environment variables govern persistent data locations:
- DATA_DIR: Location for the SQLite database file. (Default: ~/.lcce/data)
- REPO_CACHE_DIR: Root directory where Git repositories will be cloned and cached. (Default: ~/.lcce/repos)
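One way to read these two variables with their documented defaults is sketched below. The `resolveStoragePaths` helper and the `StoragePaths` type are hypothetical, and expanding a leading `~` by hand is an assumption about how the defaults are meant to be interpreted (Node does not expand it automatically).

```typescript
interface StoragePaths {
  dataDir: string;
  repoCacheDir: string;
}

// Resolve storage locations from an environment map, falling back to the
// documented defaults. `home` is substituted for a leading "~".
function resolveStoragePaths(
  env: Record<string, string | undefined>,
  home: string,
): StoragePaths {
  const expand = (p: string): string =>
    p === "~" || p.startsWith("~/") ? home + p.slice(1) : p;
  return {
    dataDir: expand(env["DATA_DIR"] || "~/.lcce/data"),
    repoCacheDir: expand(env["REPO_CACHE_DIR"] || "~/.lcce/repos"),
  };
}

// Only REPO_CACHE_DIR is overridden here; DATA_DIR falls back to its default.
const examplePaths = resolveStoragePaths({ REPO_CACHE_DIR: "/srv/repos" }, "/home/dev");
```

In a Node process the same function could be called as `resolveStoragePaths(process.env, os.homedir())`.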
Configuring the Embedding Provider (Ollama)
Install Ollama from https://ollama.ai/, then fetch a recommended model:
```bash
# Download the suggested code embedding model
ollama pull unclemusclez/jina-embeddings-v2-base-code
```
Integration with Client Applications (e.g., Claude Desktop)
To expose this service via the MCP framework, update your client configuration (e.g., claude_desktop_config.json):
```json
{
  "mcpServers": {
    "local-repo-code-indexing-service": {
      "command": "/path/to/your/node",
      "args": ["/path/to/local-repo-code-indexing-service/dist/index.js"]
    }
  }
}
```
Available Tool: codeSearch
This tool executes the full pipeline: clone/update, process files, and perform the vector search.
```json
{
  "repoLocation": "https://github.com/user/project.git",
  "targetBranch": "development", // Optional; defaults to the repo's HEAD branch
  "semanticQuery": "How is the request throttling implemented?",
  "tagFilters": ["security", "performance"], // Optional: Narrow results using metadata tags
  "fileGlobMatchers": ["*.py", "config/**/*.toml"], // Optional: Limit file parsing scope via glob
  "fileGlobExclusions": ["**/test/**", "**/vendor/**"], // Optional: Files to ignore during indexing/search
  "resultCountLimit": 15 // Optional: Max context snippets to return, default is 10
}
```
Parameter Details:
- targetBranch: If omitted, the repository's default branch pointer is utilized.
- tagFilters: Results are restricted to code segments associated with these specified keywords (case-insensitive checks).
- fileGlobMatchers/fileGlobExclusions: Glob patterns control which source files are included or discarded during the indexing phase.
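The parameter rules above can be expressed as a typed argument object plus two small helpers. This is a sketch, not the server's code: the field names come from the request shown earlier, the types are inferred from it, and `withDefaults` and `matchesTags` are hypothetical. In particular, treating a chunk as a match when it carries at least one of the requested tags is an assumption; the README only says the comparison is case-insensitive.

```typescript
// Field names mirror the codeSearch request; the types are inferred from it.
interface CodeSearchArgs {
  repoLocation: string;
  semanticQuery: string;
  targetBranch?: string; // omitted: use the repository's default branch
  tagFilters?: string[];
  fileGlobMatchers?: string[];
  fileGlobExclusions?: string[];
  resultCountLimit?: number;
}

// Apply the documented default of 10 results when no limit is given.
function withDefaults(args: CodeSearchArgs): CodeSearchArgs & { resultCountLimit: number } {
  return { ...args, resultCountLimit: args.resultCountLimit ?? 10 };
}

// Case-insensitive tag check. Assumption: one matching tag is enough.
function matchesTags(chunkTags: string[], tagFilters?: string[]): boolean {
  if (!tagFilters || tagFilters.length === 0) return true; // no filter requested
  const wanted = new Set(tagFilters.map((t) => t.toLowerCase()));
  return chunkTags.some((t) => wanted.has(t.toLowerCase()));
}

const resolvedArgs = withDefaults({
  repoLocation: "https://github.com/user/project.git",
  semanticQuery: "How is the request throttling implemented?",
  tagFilters: ["security", "performance"],
});
const keepsSecurityChunk = matchesTags(["Security"], resolvedArgs.tagFilters);
const keepsDocsChunk = matchesTags(["docs"], resolvedArgs.tagFilters);
```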
Internal Data Structure
The service maintains relational data within its SQLite backend across several interconnected tables:
- repositories: Metadata for each tracked Git project.
- branches: Records specific branch states.
- files: Inventory of source files discovered.
- branch_file_association: Mapping linking files to branches.
- file_chunk: The core table holding textual code segments and their associated high-dimensional vector embeddings.
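Since file_chunk stores an embedding next to each text segment, the vector has to be serialized in some form. The README does not say how, so the following sketch assumes one common approach: packing the vector as little-endian float32 bytes that would fit a BLOB column. The `FileChunkRow` fields and both helper functions are hypothetical.

```typescript
// Hypothetical row shape for file_chunk; column names are illustrative only.
interface FileChunkRow {
  fileId: number;
  chunkNumber: number;
  content: string;
  embedding: Uint8Array; // little-endian float32 bytes
}

// Pack a vector into 4 bytes per component, little-endian.
function encodeEmbedding(vector: number[]): Uint8Array {
  const bytes = new Uint8Array(vector.length * 4);
  const view = new DataView(bytes.buffer);
  vector.forEach((value, i) => view.setFloat32(i * 4, value, true));
  return bytes;
}

// Inverse of encodeEmbedding. Values come back at float32 precision.
function decodeEmbedding(bytes: Uint8Array): number[] {
  if (bytes.byteLength % 4 !== 0) throw new Error("byte length must be a multiple of 4");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const vector: number[] = [];
  for (let offset = 0; offset < bytes.byteLength; offset += 4) {
    vector.push(view.getFloat32(offset, true));
  }
  return vector;
}

const exampleRow: FileChunkRow = {
  fileId: 1,
  chunkNumber: 0,
  content: "export function throttle() {}",
  embedding: encodeEmbedding([0.5, -1, 2]),
};
const roundTripped = decodeEmbedding(exampleRow.embedding);
```

Storing float32 rather than JSON text keeps each component at four bytes, at the cost of rounding values that are not exactly representable in single precision.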
Troubleshooting Notes
ARM Architecture (Apple Silicon) SQLite Binding Errors
If better-sqlite3 compilation fails on M-series Macs with errors referencing incompatible architectures (e.g., finding x86_64 when arm64e is required), this indicates an architecture mismatch in the Node environment.
- Verify the running Node version's architecture:
  ```bash
  node -p "process.arch"
  ```
- If the mismatch persists, force a rebuild for the correct architecture:
  ```bash
  npm rebuild better-sqlite3 --build-from-source
  ```
- For a complete reset, clear and reinstall:
  ```bash
  npm uninstall better-sqlite3
  export npm_config_arch=arm64
  export npm_config_target_arch=arm64
  npm install better-sqlite3 --build-from-source
  ```
To ensure consistency across shell sessions, add the architecture exports to your shell initialization file (.zshrc or .bashrc).
Verifying Ollama Endpoint Connectivity
Test the embedding API directly:
```bash
curl http://localhost:11434/api/embed -d '{"model":"unclemusclez/jina-embeddings-v2-base-code","input":"Verifying connectivity for vector generation."}'
```
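The same check can be made from TypeScript. This is a sketch rather than the server's own client code: `FetchLike`, `buildEmbedRequest`, `parseEmbedResponse`, and `embedText` are hypothetical names, and the fetch implementation is passed in as a parameter so the helpers can be exercised without a running Ollama instance. It relies on Ollama's `/api/embed` endpoint answering with an `embeddings` array holding one vector per input.

```typescript
// Minimal structural type so the sketch does not depend on a specific fetch implementation.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the POST request that the curl command sends.
function buildEmbedRequest(model: string, input: string) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input }),
  };
}

// Extract the first vector from an /api/embed response body.
function parseEmbedResponse(payload: unknown): number[] {
  const embeddings = (payload as { embeddings?: unknown } | null)?.embeddings;
  if (!Array.isArray(embeddings) || !Array.isArray(embeddings[0])) {
    throw new Error("response does not contain an embeddings array");
  }
  return embeddings[0] as number[];
}

// Request an embedding for `text` from a local Ollama instance.
async function embedText(
  fetchFn: FetchLike,
  text: string,
  model = "unclemusclez/jina-embeddings-v2-base-code",
  baseUrl = "http://localhost:11434",
): Promise<number[]> {
  const response = await fetchFn(`${baseUrl}/api/embed`, buildEmbedRequest(model, text));
  if (!response.ok) throw new Error(`Ollama returned HTTP ${response.status}`);
  return parseEmbedResponse(await response.json());
}
```

On Node 18 or newer, the built-in `fetch` can be passed directly, as in `embedText(fetch, "Verifying connectivity for vector generation.")`. A non-empty numeric array in the result indicates that the model is installed and reachable.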
Licensing
This software is distributed under the MIT License.
