codebase-analyzer-mcp
A utility that generates summaries of source code repositories while respecting version control ignore files. It writes the summaries to plain text files and makes them available to LLM clients through a dedicated Model Context Protocol (MCP) server.
Author

nicobailon
Repository Code Digest Generator (MCP Enabled)
This command-line application walks a specified directory, inventories its source files, and summarizes each one using Gemini Flash 2.0. It also includes a built-in Model Context Protocol (MCP) server for integration with external AI tooling.
Core Capabilities
- Traverses code repositories recursively.
- Honors exclusion patterns defined in `.gitignore`.
- Explicitly bypasses boilerplate or dependency folders (e.g., `node_modules`, build artifacts).
- Utilizes Gemini Flash 2.0 to distill file content into summaries.
- Persists generated digests into structured plain text artifacts.
- Offers tunable granularity and output length constraints for summaries.
- Exposes an MCP gateway for consumption by assistants like Claude Desktop and other compatible LLM clients.
- Features a modular architecture conducive to embedded deployment.
- Manages sensitive credentials securely.
- Implements endpoint authentication for the MCP listener.
- Incorporates resilient retrial logic with progressive backoff for external API calls.
- Contains built-in safeguards against service overload via rate limiting.
Prerequisites
- Requires Node.js runtime, version 18 or newer.
Setup Procedure
- Obtain the source code repository:
  ```bash
  git clone https://github.com/nicobailon/code-summarizer.git
  cd code-summarizer
  ```
- Install necessary runtime dependencies:
  ```bash
  npm install
  ```
- Establish environment configuration via a `.env` file:
  ```
  GOOGLE_API_KEY=your_secure_gemini_key_here
  ```
- Compile the project assets:
  ```bash
  npm run build
  ```
Initiating the MCP Service Layer
The integrated Model Context Protocol (MCP) daemon permits tools such as Cursor AI, Cline, and Claude Desktop to query file contents and pre-computed summaries directly from your local workspace.
Launching the Server Instance
```bash
# Activate the MCP service listener
npm start -- server
```
The server listens on port 24312 by default. To use a different port, change it in the configuration settings:
```bash
# Specify an alternate service port
npm start -- config set --port 8080
```
Client Integration: Claude Desktop
- Ensure the analyzer MCP listener is active.
- Access settings within Claude Desktop (Claude Menu -> Settings...).
- Navigate to the 'Developer' configuration area.
- Create a configuration file at `~/.claude/claude_desktop_config.json` (or the equivalent Windows path) with the following structure:
  ```json
  {
    "code-summarizer": {
      "command": "npx",
      "args": ["-y", "your-path-to-code-summarizer/bin/code-summarizer.js", "server"],
      "env": {
        "GOOGLE_API_KEY": "your_api_key_here"
      }
    }
  }
  ```
- Relaunch Claude Desktop.
- You can now issue codebase navigation requests, e.g., "Generate abstracts for all source files in this repository."
Sample Claude Interactions:
- "Provide a high-level architectural summary of the current project."
- "Detail the purpose of 'src/config/config.ts'."
- "Locate all code blocks associated with user session management."
Client Integration: Cursor AI
- Start the analyzer's MCP listener.
- Establish a `.cursor/mcp.json` file within your project's root directory:
  ```json
  {
    "mcpServers": {
      "code-summarizer": {
        "transport": "sse",
        "url": "http://localhost:24312/sse",
        "headers": {
          "x-api-key": "your_api_key_here"
        }
      }
    }
  }
  ```
- Refresh Cursor or reload the workspace context.
- Query Cursor, for instance: "Can you synthesize an overview of my entire source tree?"
Sample Cursor Interactions:
- "Characterize the overall project layout."
- "Identify the core functional modules."
- "Elucidate the internal workings of the MCP service implementation."
Client Integration: Cline
- Activate the code-summarizer MCP listener.
- Register the MCP endpoint using the CLI:
/mcp add code-summarizer http://localhost:24312/sse
- Authenticate the connection:
/mcp config code-summarizer headers.x-api-key your_api_key_here
- Request an action, e.g., "Generate summaries for every file in the repository."
Sample Cline Interactions:
- "What is the responsibility of each source file?"
- "Produce consolidated abstracts for all TypeScript assets."
- "Detail the security credential handling procedures."
Utility of the MCP Interface
The Model Context Protocol enables LLM agents to:
- Retrieve File Abstractions: Request succinct explanations of component functions.
- Traverse Structure: Navigate the repository hierarchy programmatically.
- Bulk Analysis: Summarize collections of files concurrently.
- Specific Information Retrieval: Execute targeted searches for functionality.
- Constraint Adjustment: Dynamically modify summary depth and length.
- System State Modification: Update operational parameters via the interface.
The MCP server structures access to your codebase, enabling AI agents to introspect, navigate, and summarize code assets without manual copy-pasting.
MCP Endpoint Specification
Resource Paths
- `code://file/*` - Direct access to the contents of a single file
- `code://directory/*` - Enumeration of files within a folder structure
- `summary://file/*` - Retrieval of a pre-generated file digest
- `summary://batch/*` - Retrieval of multiple file digests simultaneously
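As an illustration of how a client or server might route these URIs, here is a minimal TypeScript sketch. The `ResourceKind` labels, the `ParsedResource` type, and the `parseResourceUri` function are hypothetical names chosen for this example; they are not taken from the project's source.

```typescript
// Hypothetical labels for the four resource paths listed above.
type ResourceKind = "code-file" | "code-directory" | "summary-file" | "summary-batch";

interface ParsedResource {
  kind: ResourceKind;
  path: string; // whatever the trailing "*" matched
}

const RESOURCE_PREFIXES: ReadonlyArray<[string, ResourceKind]> = [
  ["code://file/", "code-file"],
  ["code://directory/", "code-directory"],
  ["summary://file/", "summary-file"],
  ["summary://batch/", "summary-batch"],
];

// Returns the resource kind and the remaining path, or null for an unknown URI.
function parseResourceUri(uri: string): ParsedResource | null {
  for (const [prefix, kind] of RESOURCE_PREFIXES) {
    if (uri.startsWith(prefix)) {
      return { kind, path: uri.slice(prefix.length) };
    }
  }
  return null;
}
```

For example, `parseResourceUri("summary://file/src/config/config.ts")` yields a `summary-file` entry whose path is `src/config/config.ts`.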
Available Tool Methods
- `summarize_file`: Execute summarization on one file, subject to options.
- `summarize_directory`: Process and abstract an entire directory subtree.
- `set_config`: Modify runtime settings via the protocol.
Prompt Templates Exposed
- `code_summary`: Template for single-file textual abstraction.
- `directory_summary`: Template for repository-wide structural abstraction.
Debugging Common MCP Failures
Connection Refused
- Confirm the service is active (`npm start -- server`).
- Validate the configured network port.
- Investigate local firewall rules obstructing the port.
Authentication Failures
- Ensure the `x-api-key` header contains the correct credential.
- Confirm key validity and format compliance.
- Verify environment variable propagation is successful.
Transport Layer Issues
- Verify the client utilizes the expected transport method (SSE).
- Check that the URI correctly terminates at `/sse`.
- Test general network reachability between client and server.
Access Privileges
- Confirm the running service process possesses read permissions on the target codebase.
- Check file system permissions if specific files fail processing.
Claude Desktop Discovery Problems
- Double-check the executable path specified in `claude_desktop_config.json`.
- Ensure the execution arguments (`args`) accurately map to the server launch command.
- Review Claude Desktop logs for configuration loading errors.
Service Throttling
- If receiving "Too many requests" responses, pause client activity.
- Review and potentially modify the server's configured rate limits.
Consult server diagnostics or report an issue on the repository if problems persist.
Operational Usage
Command Line Interface (CLI)
```bash
# Default execution mode (invokes summarization)
npm start -- summarize [target_directory] [output_destination] [parameters]

# Analyze the current working directory, writing results to summaries.txt
npm start -- summarize

# Specify summary complexity and maximum character output
npm start -- summarize --detail high --max-length 1000

# Access CLI help documentation
npm start -- --help
```
Configuration Administration
```bash
# Set the primary credential
npm start -- config set --api-key "a-new-secret-key"

# Define default abstraction settings
npm start -- config set --detail-level high --max-length 1000

# Reconfigure the MCP listener socket (default: 24312)
npm start -- config set --port 8080

# Display current operational settings
npm start -- config show

# Revert all settings to factory defaults
npm start -- config reset
```
Credential Transmission for MCP
When interfacing with the MCP service, the secret key must be conveyed in request headers:
x-api-key: your_api_key_here
Authentication is mandatory for all service pathways, excluding the dedicated /health endpoint.
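The rule above can be sketched in TypeScript as follows. This is a minimal illustration, not the server's actual code: the function names `buildAuthHeaders`, `requiresAuth`, and `isAuthorized` are hypothetical, and the plain string comparison stands in for whatever check the project really performs.

```typescript
// Client side: the header a request would carry (hypothetical helper).
function buildAuthHeaders(apiKey: string): Record<string, string> {
  if (apiKey.trim() === "") {
    throw new Error("API key must not be empty");
  }
  return { "x-api-key": apiKey };
}

// Per the text above, only /health is exempt from authentication.
function requiresAuth(pathname: string): boolean {
  return pathname !== "/health";
}

// Server side: assumes header names are already lowercased, as Node's HTTP
// server does for incoming requests. A production check would likely use a
// constant-time comparison instead of ===.
function isAuthorized(
  pathname: string,
  headers: Record<string, string | undefined>,
  expectedKey: string,
): boolean {
  if (!requiresAuth(pathname)) {
    return true;
  }
  return headers["x-api-key"] === expectedKey;
}
```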
Execution Parameters
- `--detail`, `-d`: Defines the depth of the generated abstracts. Permitted values: 'low', 'medium', or 'high'. Default is 'medium'.
- `--max-length`, `-l`: Sets the ceiling for character count per summary artifact. Default is 500.
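A small TypeScript sketch of how these two parameters and their documented defaults could be validated. The names `SummaryOptions`, `isDetailLevel`, and `resolveSummaryOptions` are hypothetical and only illustrate the documented behavior; the CLI's real parsing code is not shown in this document.

```typescript
type DetailLevel = "low" | "medium" | "high";

interface SummaryOptions {
  detail: DetailLevel;
  maxLength: number;
}

// Defaults as documented above.
const DEFAULT_OPTIONS: SummaryOptions = { detail: "medium", maxLength: 500 };

function isDetailLevel(value: string): value is DetailLevel {
  return value === "low" || value === "medium" || value === "high";
}

// Takes raw string values (as a CLI parser would supply them) and applies defaults.
function resolveSummaryOptions(raw: { detail?: string; maxLength?: string }): SummaryOptions {
  const detail = raw.detail === undefined ? DEFAULT_OPTIONS.detail : raw.detail;
  if (!isDetailLevel(detail)) {
    throw new Error(`Invalid --detail value: ${detail}`);
  }
  const maxLength = raw.maxLength === undefined ? DEFAULT_OPTIONS.maxLength : Number(raw.maxLength);
  if (!Number.isInteger(maxLength) || maxLength <= 0) {
    throw new Error(`Invalid --max-length value: ${raw.maxLength}`);
  }
  return { detail, maxLength };
}
```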
Security Posture
Credential Handling
- API keys supplied through environment variables take precedence over keys stored in local configuration files.
- Keys undergo format validation prior to service initialization.
- Credentials are strictly omitted from application logs and error responses.
- When an environment variable supplies the key, the key is not written to the configuration file.
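The precedence and redaction rules above might look like this in TypeScript. This is a sketch under stated assumptions: `ASSUMED_KEY_PATTERN` is an invented format rule used purely for illustration (the project's real validation is not documented here), and `resolveApiKey` and `redactSecrets` are hypothetical names.

```typescript
interface KeySources {
  env: Record<string, string | undefined>;
  configFileKey?: string;
}

interface ResolvedKey {
  key: string;
  origin: "env" | "config";
}

// ASSUMPTION: an illustrative format check, not the project's actual rule.
const ASSUMED_KEY_PATTERN = /^[A-Za-z0-9_-]{20,}$/;

// The environment variable wins over the configuration file when both are set.
function resolveApiKey(sources: KeySources): ResolvedKey {
  const fromEnv = sources.env["GOOGLE_API_KEY"];
  const useEnv = fromEnv !== undefined && fromEnv !== "";
  const candidate = useEnv ? fromEnv : sources.configFileKey;
  if (candidate === undefined || candidate === "") {
    throw new Error("No API key configured");
  }
  if (!ASSUMED_KEY_PATTERN.test(candidate)) {
    // The message deliberately omits the key itself.
    throw new Error("API key has an unexpected format");
  }
  return { key: candidate, origin: useEnv ? "env" : "config" };
}

// Removes every occurrence of the secret from a log or error message.
function redactSecrets(message: string, secret: string): string {
  if (secret === "") {
    return message;
  }
  return message.split(secret).join("[REDACTED]");
}
```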
Endpoint Authorization
- All MCP endpoints, save for service liveness checks, mandate API key verification.
- Authorization relies exclusively on the `x-api-key` HTTP header.
- Unauthorized access attempts are logged for forensic review.
Traffic Management
- Integral rate limiting prevents service exploitation.
- Default policy: Sixty requests permitted per minute per originating network address.
- Thresholds are modifiable within the server configuration.
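To make the default policy concrete, here is a minimal fixed-window limiter in TypeScript. The document does not say which algorithm the server uses, so the fixed-window approach, the class name `FixedWindowRateLimiter`, and its parameters are assumptions made for this sketch.

```typescript
interface WindowState {
  start: number; // timestamp (ms) when the current window opened
  count: number; // requests seen in the current window
}

// ASSUMPTION: a fixed-window counter keyed by client address.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  // Defaults mirror the documented policy: 60 requests per minute per address.
  constructor(limit: number = 60, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  allow(clientAddress: string, now: number = Date.now()): boolean {
    const state = this.windows.get(clientAddress);
    if (state === undefined || now - state.start >= this.windowMs) {
      this.windows.set(clientAddress, { start: now, count: 1 });
      return true;
    }
    if (state.count >= this.limit) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

A rejected request would typically be answered with the "Too many requests" response mentioned in the troubleshooting section. The `now` parameter exists only so the behavior can be exercised without waiting for real time to pass.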
Failure Reporting
- Errors are systematically categorized and returned.
- No sensitive internal details are leaked in user-facing error messages.
- Appropriate HTTP status codes map to specific failure modes.
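One way to picture the mapping from failure modes to status codes is the TypeScript sketch below. The category names, the public messages, and the `toErrorResponse` function are all hypothetical; the document does not list the server's actual categories, so treat this only as an illustration of the idea.

```typescript
// ASSUMPTION: illustrative categories, not the server's real taxonomy.
type ErrorCategory =
  | "invalid_request"
  | "unauthorized"
  | "not_found"
  | "rate_limited"
  | "upstream_failure"
  | "internal";

const STATUS_BY_CATEGORY: Record<ErrorCategory, number> = {
  invalid_request: 400,
  unauthorized: 401,
  not_found: 404,
  rate_limited: 429,
  upstream_failure: 502,
  internal: 500,
};

// Generic messages only, so no internal detail reaches the caller.
const PUBLIC_MESSAGE: Record<ErrorCategory, string> = {
  invalid_request: "The request was malformed.",
  unauthorized: "A valid API key is required.",
  not_found: "The requested resource was not found.",
  rate_limited: "Too many requests.",
  upstream_failure: "The summarization backend is unavailable.",
  internal: "An unexpected error occurred.",
};

interface ErrorResponse {
  status: number;
  body: { error: ErrorCategory; message: string; requestId: string };
}

function toErrorResponse(category: ErrorCategory, requestId: string): ErrorResponse {
  return {
    status: STATUS_BY_CATEGORY[category],
    body: { error: category, message: PUBLIC_MESSAGE[category], requestId },
  };
}
```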
LLM Call Robustness
- Transitory network issues trigger automatic retries utilizing an exponentially increasing delay schedule.
- Retry parameters (maximum attempts, initial delay, backoff factor) are user-tunable.
- Random jitter is added to retry delays so that many clients do not retry in lockstep (the thundering herd problem).
- A unique request identifier facilitates end-to-end tracing of service calls.
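The retry behavior described above can be sketched as follows. This is a generic exponential-backoff-with-jitter illustration, not the project's implementation: `RetryOptions`, `computeDelay`, and `retryWithBackoff` are hypothetical names, and the symmetric jitter formula is one common choice among several.

```typescript
interface RetryOptions {
  maxAttempts: number;    // total attempts, including the first
  initialDelayMs: number; // delay before the second attempt
  backoffFactor: number;  // multiplier applied after each failure
  jitter: number;         // fraction of the delay to randomize, e.g. 0.2
}

// Delay to wait after the given failed attempt (1-based).
function computeDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const base = opts.initialDelayMs * Math.pow(opts.backoffFactor, attempt - 1);
  const spread = base * opts.jitter;
  // Uniformly distributed between base - spread and base + spread.
  return Math.max(0, base - spread + 2 * spread * random());
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
  isTransient: (error: unknown) => boolean = () => true,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Give up immediately on non-transient errors or after the final attempt.
      if (!isTransient(error) || attempt === opts.maxAttempts) {
        break;
      }
      const delayMs = computeDelay(attempt, opts);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

With `initialDelayMs: 1000` and `backoffFactor: 2`, the un-jittered delays grow as 1000 ms, 2000 ms, 4000 ms, and so on; the jitter then shifts each one by up to the configured fraction.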
Recognized Source Code Extensions
- TypeScript (.ts, .tsx)
- JavaScript (.js, .jsx)
- Python (.py)
- Java (.java)
- C++ (.cpp)
- C (.c)
- Go (.go)
- Ruby (.rb)
- PHP (.php)
- C# (.cs)
- Swift (.swift)
- Rust (.rs)
- Kotlin (.kt)
- Scala (.scala)
- Vue (.vue)
- HTML (.html)
- Styling: (.css, .scss, .less)
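A minimal TypeScript sketch of extension-based filtering using the list above. The `LANGUAGE_BY_EXTENSION` map and the `detectLanguage` function are hypothetical names, and the labels for the styling extensions ("CSS", "SCSS", "Less") are this example's own choice.

```typescript
// Extensions taken from the list above; map and function names are illustrative.
const LANGUAGE_BY_EXTENSION = new Map<string, string>([
  [".ts", "TypeScript"], [".tsx", "TypeScript"],
  [".js", "JavaScript"], [".jsx", "JavaScript"],
  [".py", "Python"], [".java", "Java"],
  [".cpp", "C++"], [".c", "C"],
  [".go", "Go"], [".rb", "Ruby"],
  [".php", "PHP"], [".cs", "C#"],
  [".swift", "Swift"], [".rs", "Rust"],
  [".kt", "Kotlin"], [".scala", "Scala"],
  [".vue", "Vue"], [".html", "HTML"],
  [".css", "CSS"], [".scss", "SCSS"], [".less", "Less"],
]);

// Returns the language name for a supported file, or null if it should be skipped.
function detectLanguage(filePath: string): string | null {
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return null; // no extension, or a dotfile such as ".gitignore"
  }
  const extension = fileName.slice(dot).toLowerCase();
  return LANGUAGE_BY_EXTENSION.get(extension) ?? null;
}
```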
Operational Flow
- The utility scans the designated directory tree, respecting all `.gitignore` directives.
- Files are filtered based on the set of supported language extensions.
- Content is read from each compliant file, and its language is identified.
- The data is transmitted to the Gemini Flash 2.0 endpoint, accompanied by instructions detailing the required abstraction level and length constraint.
- Aggregated abstracts are compiled and written to the designated output stream/file.
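The step that sends content to the model along with the detail level and length constraint could be sketched like this. The prompt wording, the `SummaryRequest` type, and the `buildSummaryPrompt` function are invented for illustration; the project's actual prompt is not reproduced in this document, and the call to the Gemini API itself is intentionally left out.

```typescript
interface SummaryRequest {
  filePath: string;
  language: string;
  content: string;
  detail: "low" | "medium" | "high";
  maxLength: number; // maximum characters in the summary
}

// ASSUMPTION: illustrative prompt text, not the project's real prompt.
function buildSummaryPrompt(request: SummaryRequest): string {
  return [
    `Summarize the following ${request.language} file at a ${request.detail} level of detail.`,
    `Keep the summary under ${request.maxLength} characters.`,
    `File: ${request.filePath}`,
    "",
    request.content,
  ].join("\n");
}
```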
Output Artifact Structure
The resultant documentation file adheres to this delimited format:
relative/path/to/file
Content summary generated here.

relative/path/to/next/file
Abstract for the subsequent file.
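A short TypeScript sketch of a writer for this kind of output, assuming each entry is a relative path followed by its summary, with a blank line between entries. The exact delimiters are a guess based on the sample above, and `FileSummary` and `formatDigest` are hypothetical names.

```typescript
interface FileSummary {
  relativePath: string;
  summary: string;
}

// ASSUMPTION: path on one line, summary on the next, blank line between entries.
function formatDigest(entries: FileSummary[]): string {
  const blocks = entries.map((entry) => `${entry.relativePath}\n${entry.summary.trim()}`);
  return blocks.join("\n\n") + "\n";
}
```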
Project Layout
- `index.ts`: Primary entry point for the Command Line Interface.
- `src/`: Primary source code location
  - `summarizer/`: Logic underpinning the code abstraction process.
  - `mcp/`: Implementation of the Model Context Protocol service layer.
  - `config/`: Module responsible for configuration persistence and retrieval.
- `bin/`: Executable script wrappers.
- `config.json`: File storing application defaults.
- `tsconfig.json`: TypeScript compiler settings.
- `package.json`: Manifest detailing project metadata and scripts.
- `.env.example`: Template demonstrating required environment variables.
- `.gitignore`: Definitions for files and folders to exclude from processing.
- `__tests__`: Directories containing unit and end-to-end test suites.
- `__mocks__/mock-codebase`: A controlled subset of code for validation purposes.
Environment Variables Reference
The following environment variables influence execution:
| Variable | Purpose | Default Setting |
|---|---|---|
| `GOOGLE_API_KEY` | Necessary key for Google Gemini access | Undefined (Mandatory) |
| `PORT` | TCP port binding for the MCP listener | 24312 |
| `ALLOWED_ORIGINS` | Comma-separated list for CORS policy enforcement | http://localhost:3000 |
| `LOG_LEVEL` | Verbosity of internal system logging (error, warn, info, debug) | info |
Consult .env.example for a setup template.
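As a rough illustration of how these variables and their defaults might be read, here is a TypeScript sketch. The `EnvConfig` type and `loadEnvConfig` function are hypothetical; the function takes a plain object so it can be tried without touching the real environment, and in practice the caller would pass `process.env`.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";

const LOG_LEVELS: readonly LogLevel[] = ["error", "warn", "info", "debug"];

interface EnvConfig {
  googleApiKey: string;
  port: number;
  allowedOrigins: string[];
  logLevel: LogLevel;
}

// Applies the documented defaults and rejects obviously invalid values.
function loadEnvConfig(env: Record<string, string | undefined>): EnvConfig {
  const googleApiKey = env["GOOGLE_API_KEY"];
  if (googleApiKey === undefined || googleApiKey === "") {
    throw new Error("GOOGLE_API_KEY is required");
  }
  const port = Number(env["PORT"] ?? "24312");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }
  const allowedOrigins = (env["ALLOWED_ORIGINS"] ?? "http://localhost:3000")
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
  const rawLevel = env["LOG_LEVEL"] ?? "info";
  const logLevel = LOG_LEVELS.find((level) => level === rawLevel);
  if (logLevel === undefined) {
    throw new Error(`Invalid LOG_LEVEL: ${rawLevel}`);
  }
  return { googleApiKey, port, allowedOrigins, logLevel };
}
```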
Development Workflow
Executing Tests
```bash
# Run the complete test suite
npm test

# Execute tests and generate coverage reports
npm test -- --coverage

# Specific test routine for MCP service initialization
npm run test:setup
```
Roadmap for Future Enhancements
- Extension of supported source file extensions.
- Capability to interface with alternative AI model providers.
- Development of a graphical user interface utilizing Electron.
- Expansion of available MCP functionalities.
- Implementation of precise token consumption metrics.
- Integration of OpenTelemetry standards for system observability.
- Enhancement of audit logging capabilities.
- Introduction of automated vulnerability scanning checks.
