
codebase-analyzer-mcp

A utility that generates summaries of the source files in a repository while honoring version control ignore files. It writes the summaries to a documentation file and exposes them to LLM clients through a dedicated Model Context Protocol (MCP) server.

Author

nicobailon

No License

Quick Info

GitHub Stars: 2
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19


Repository Code Digest Generator (MCP Enabled)

This command-line tool inventories and summarizes the source files in a given directory tree, using Gemini Flash 2.0 to generate the summaries. It also includes a built-in Model Context Protocol (MCP) server for integration with external AI tools.

Core Capabilities

  • Traverses code repositories recursively.
  • Honors exclusion patterns defined in .gitignore.
  • Explicitly bypasses boilerplate or dependency folders (e.g., node_modules, build artifacts).
  • Utilizes Gemini Flash 2.0 to distill file content into summaries.
  • Persists generated digests into structured plain text artifacts.
  • Offers tunable granularity and output length constraints for summaries.
  • Exposes an MCP gateway for consumption by assistants like Claude Desktop and other compatible LLM clients.
  • Has a modular architecture that can be embedded in other tools.
  • Reads API keys from environment variables in preference to configuration files and keeps them out of logs.
  • Requires API-key authentication on the MCP server endpoints.
  • Retries external API calls with exponential backoff.
  • Rate-limits incoming requests to protect the server from overload.
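The traversal rules in the list above (skipping dependency and build folders regardless of .gitignore) could look roughly like the sketch below. It is not the project's actual implementation: only node_modules is named in this document, so the other entries in the skip list and the function name `isInSkippedDirectory` are assumptions made for illustration.

```typescript
// Hypothetical skip list: only "node_modules" comes from the text above;
// the remaining entries are assumptions about what "build artifacts" covers.
const ALWAYS_SKIPPED_DIRECTORIES: ReadonlySet<string> = new Set([
  "node_modules",
  "dist",
  "build",
  ".git",
]);

// Returns true when any directory segment of the path is on the skip list.
function isInSkippedDirectory(relativePath: string): boolean {
  // Normalize Windows separators, then drop the final segment (the file name).
  const segments = relativePath.replace(/\\/g, "/").split("/");
  const directories = segments.slice(0, -1);
  return directories.some((segment) => ALWAYS_SKIPPED_DIRECTORIES.has(segment));
}
```

Only directory segments are checked, so a file named src/build.ts is still processed.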

Prerequisites

  • Requires Node.js runtime, version 18 or newer.

Setup Procedure

  1. Clone the repository:

    git clone https://github.com/nicobailon/code-summarizer.git
    cd code-summarizer

  2. Install the runtime dependencies:

    npm install

  3. Create a .env file containing your Gemini API key:

    GOOGLE_API_KEY=your_secure_gemini_key_here

  4. Build the project:

    npm run build

Initiating the MCP Service Layer

The integrated Model Context Protocol (MCP) daemon permits tools such as Cursor AI, Cline, and Claude Desktop to query file contents and pre-computed summaries directly from your local workspace.

Launching the Server Instance

    # Start the MCP server
    npm start -- server

The server listens on port 24312 by default. You can change the port in the configuration settings:

    # Use a different port
    npm start -- config set --port 8080

Client Integration: Claude Desktop

  1. Ensure the analyzer MCP listener is active.
  2. Access settings within Claude Desktop (Claude Menu -> Settings...).
  3. Navigate to the 'Developer' configuration area.
  4. Create or edit claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows: %APPDATA%\Claude\claude_desktop_config.json) so that it contains the following structure:

    {
      "mcpServers": {
        "code-summarizer": {
          "command": "npx",
          "args": ["-y", "your-path-to-code-summarizer/bin/code-summarizer.js", "server"],
          "env": { "GOOGLE_API_KEY": "your_api_key_here" }
        }
      }
    }

  5. Relaunch Claude Desktop.
  6. You can now ask Claude about your codebase, e.g., "Generate abstracts for all source files in this repository."

Sample Claude Interactions:

  • "Provide a high-level architectural summary of the current project."
  • "Detail the purpose of 'src/config/config.ts'."
  • "Locate all code blocks associated with user session management."

Client Integration: Cursor AI

  1. Start the analyzer's MCP listener.
  2. Establish a .cursor/mcp.json file within your project's root directory:

    {
      "mcpServers": {
        "code-summarizer": {
          "transport": "sse",
          "url": "http://localhost:24312/sse",
          "headers": { "x-api-key": "your_api_key_here" }
        }
      }
    }

  3. Refresh Cursor or reload the workspace context.
  4. Query Cursor, for instance: "Can you synthesize an overview of my entire source tree?"

Sample Cursor Interactions:

  • "Characterize the overall project layout."
  • "Identify the core functional modules."
  • "Explain the internal workings of the MCP service implementation."

Client Integration: Cline

  1. Activate the code-summarizer MCP listener.
  2. Register the MCP endpoint using the CLI:

/mcp add code-summarizer http://localhost:24312/sse

  3. Authenticate the connection:

/mcp config code-summarizer headers.x-api-key your_api_key_here

  4. Request an action, e.g., "Generate summaries for every file in the repository."

Sample Cline Interactions:

  • "What is the responsibility of each source file?"
  • "Produce consolidated abstracts for all TypeScript assets."
  • "Detail the security credential handling procedures."

Utility of the MCP Interface

The Model Context Protocol enables LLM agents to:

  1. Retrieve File Abstractions: Request succinct explanations of component functions.
  2. Traverse Structure: Navigate the repository hierarchy programmatically.
  3. Bulk Analysis: Summarize collections of files concurrently.
  4. Specific Information Retrieval: Execute targeted searches for functionality.
  5. Constraint Adjustment: Dynamically modify summary depth and length.
  6. System State Modification: Update operational parameters via the interface.

The MCP server structures access to your codebase, enabling AI agents to introspect, navigate, and summarize code assets without manual copy-pasting.

MCP Endpoint Specification

Resource Paths

  • code://file/* - Direct access to contents of a singular file
  • code://directory/* - Enumeration of files within a folder structure
  • summary://file/* - Retrieval of a pre-generated file digest
  • summary://batch/* - Retrieval of multiple file digests simultaneously
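To make the four resource paths above concrete, here is a minimal sketch of how a client or server might split such a URI into its parts. The `ResourceRef` type and `parseResourceUri` function are hypothetical names introduced for illustration; the project's own parsing may differ.

```typescript
// Hypothetical representation of one of the four documented resource URIs.
interface ResourceRef {
  scheme: "code" | "summary";
  kind: "file" | "directory" | "batch";
  path: string;
}

// Returns null for anything other than the four documented scheme/kind pairs.
function parseResourceUri(uri: string): ResourceRef | null {
  const match = /^(code|summary):\/\/(file|directory|batch)\/(.+)$/.exec(uri);
  if (match === null) return null;
  const [, scheme, kind, path] = match;
  const valid =
    (scheme === "code" && (kind === "file" || kind === "directory")) ||
    (scheme === "summary" && (kind === "file" || kind === "batch"));
  if (!valid) return null;
  return {
    scheme: scheme as ResourceRef["scheme"],
    kind: kind as ResourceRef["kind"],
    path,
  };
}
```

For example, `parseResourceUri("summary://file/src/config/config.ts")` yields the scheme `summary`, the kind `file`, and the path `src/config/config.ts`, while `code://batch/x` is rejected because that combination is not listed.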

Available Tool Methods

  • summarize_file: Execute summarization on one file, subject to options.
  • summarize_directory: Process and abstract an entire directory subtree.
  • set_config: Modify runtime settings via the protocol.
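MCP clients invoke tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for the tools listed above. The envelope follows the MCP specification, but the argument names (`path`, `detail`, `maxLength`) are assumptions, since this document does not list each tool's parameters.

```typescript
type ToolName = "summarize_file" | "summarize_directory" | "set_config";

// Shape of a JSON-RPC 2.0 "tools/call" request as defined by MCP.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical argument names; check the server's tool schema for the real ones.
const exampleRequest = buildToolCall(1, "summarize_file", {
  path: "src/config/config.ts",
  detail: "high",
  maxLength: 1000,
});
```

In practice an MCP client library constructs this envelope for you; the sketch only shows what travels over the wire.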

Prompt Templates Exposed

  • code_summary: Template for single-file textual abstraction.
  • directory_summary: Template for repository-wide structural abstraction.

Debugging Common MCP Failures

Connection Refused

  1. Confirm the service is active (npm start -- server).
  2. Validate the configured network port.
  3. Investigate local firewall rules obstructing the port.

Authentication Failures

  1. Ensure the x-api-key header contains the correct credential.
  2. Confirm key validity and format compliance.
  3. Verify environment variable propagation is successful.

Transport Layer Issues

  1. Verify the client utilizes the expected transport method (SSE).
  2. Check that the URI correctly terminates at /sse.
  3. Test general network reachability between client and server.

Access Privileges

  1. Confirm the running service process possesses read permissions on the target codebase.
  2. Check file system permissions if specific files fail processing.

Claude Desktop Discovery Problems

  1. Double-check the executable path specified in claude_desktop_config.json.
  2. Ensure the execution arguments (args) accurately map to the server launch command.
  3. Review Claude Desktop logs for configuration loading errors.

Service Throttling

  1. If receiving "Too many requests" responses, pause client activity.
  2. Review and potentially modify the server's configured rate limits.

Consult server diagnostics or report an issue on the repository if problems persist.

Operational Usage

Command Line Interface (CLI)

    # Default execution mode (invokes summarization)
    npm start -- summarize [target_directory] [output_destination] [parameters]

    # Analyze the current working directory, writing results to summaries.txt
    npm start -- summarize

    # Specify the detail level and the maximum summary length in characters
    npm start -- summarize --detail high --max-length 1000

    # Show the CLI help
    npm start -- --help

Configuration Administration

    # Set the API key
    npm start -- config set --api-key "a-new-secret-key"

    # Set the default summary options
    npm start -- config set --detail-level high --max-length 1000

    # Change the MCP server port (default: 24312)
    npm start -- config set --port 8080

    # Show the current settings
    npm start -- config show

    # Reset all settings to their defaults
    npm start -- config reset

Credential Transmission for MCP

When interfacing with the MCP service, the secret key must be conveyed in request headers:

x-api-key: your_api_key_here

Authentication is mandatory for all service pathways, excluding the dedicated /health endpoint.
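The rule above (every path requires the x-api-key header except /health) can be sketched as a single predicate. The `IncomingRequest` shape and the `isAuthorized` name are hypothetical; the real server presumably does this in middleware of whatever HTTP framework it uses.

```typescript
// Hypothetical request shape: only the path and lower-cased headers matter here.
interface IncomingRequest {
  path: string;
  headers: Record<string, string | undefined>;
}

function isAuthorized(request: IncomingRequest, expectedKey: string): boolean {
  // The health endpoint is the one documented exception.
  if (request.path === "/health") return true;
  const provided = request.headers["x-api-key"];
  // A missing or empty header is rejected outright.
  if (provided === undefined || provided === "") return false;
  return provided === expectedKey;
}
```

A production implementation would likely prefer a constant-time comparison over `===` to avoid leaking key contents through timing.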

Execution Parameters

  • --detail, -d: Defines the depth of the generated abstracts. Permitted values: 'low', 'medium', or 'high'. Default is 'medium'.
  • --max-length, -l: Sets the ceiling for character count per summary artifact. Default is 500.
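As an illustration of how these two flags and their defaults fit together, here is a small hand-rolled parser. It is a sketch only: the project may well use a CLI library instead, and `parseSummaryOptions` is a hypothetical name.

```typescript
type DetailLevel = "low" | "medium" | "high";

interface SummaryOptions {
  detail: DetailLevel;
  maxLength: number;
}

function parseSummaryOptions(argv: string[]): SummaryOptions {
  // Documented defaults: medium detail, 500 characters.
  const options: SummaryOptions = { detail: "medium", maxLength: 500 };
  for (let i = 0; i < argv.length; i++) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (flag === "--detail" || flag === "-d") {
      if (value !== "low" && value !== "medium" && value !== "high") {
        throw new Error(`Invalid detail level: ${String(value)}`);
      }
      options.detail = value;
      i++; // skip the consumed value
    } else if (flag === "--max-length" || flag === "-l") {
      const parsed = Number(value);
      if (!Number.isInteger(parsed) || parsed <= 0) {
        throw new Error(`Invalid max length: ${String(value)}`);
      }
      options.maxLength = parsed;
      i++; // skip the consumed value
    }
  }
  return options;
}
```

Unrecognized arguments (such as the target directory) are ignored by this sketch rather than rejected.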

Security Posture

Credential Handling

  • API keys are read from environment variables in preference to local configuration files.
  • Keys undergo format validation before the service starts.
  • Credentials are never written to application logs or error responses.
  • When an environment variable supplies the key, the key is not written to the configuration file.
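The precedence and validation rules above might be implemented along these lines. This is a sketch under assumptions: the format rule (at least 20 URL-safe characters) is invented for illustration, and `resolveApiKey` and `redactKey` are hypothetical names.

```typescript
interface KeySources {
  env: Record<string, string | undefined>;
  configFileKey?: string;
}

// Environment variable first, configuration file second.
function resolveApiKey(sources: KeySources): string {
  const fromEnv = (sources.env["GOOGLE_API_KEY"] ?? "").trim();
  const fromFile = (sources.configFileKey ?? "").trim();
  const key = fromEnv !== "" ? fromEnv : fromFile;
  if (key === "") throw new Error("No API key configured");
  // Hypothetical format rule; the real validation is not described in this document.
  if (!/^[A-Za-z0-9_-]{20,}$/.test(key)) {
    // The message deliberately omits the key itself.
    throw new Error("API key has an unexpected format");
  }
  return key;
}

// Replaces every occurrence of the key in a message before it is logged.
function redactKey(message: string, key: string): string {
  if (key === "") return message;
  return message.split(key).join("[REDACTED]");
}
```

Passing the environment in as a parameter, rather than reading `process.env` directly, keeps the function easy to test.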

Endpoint Authorization

  • All MCP endpoints, save for service liveness checks, mandate API key verification.
  • Authorization relies exclusively on the x-api-key HTTP header.
  • Unauthorized access attempts are logged for forensic review.

Traffic Management

  • Integral rate limiting prevents service exploitation.
  • Default policy: Sixty requests permitted per minute per originating network address.
  • Thresholds are modifiable within the server configuration.
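A per-address limit of sixty requests per minute can be enforced in several ways. The sketch below uses a fixed-window counter, which is one common choice and not necessarily the algorithm this server uses; the class name is hypothetical.

```typescript
interface WindowState {
  start: number; // timestamp (ms) at which the current window began
  count: number; // requests seen in the current window
}

class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, WindowState>();

  // Defaults mirror the documented policy: 60 requests per 60 seconds.
  constructor(limit: number = 60, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and records it.
  allow(address: string, nowMs: number): boolean {
    const current = this.windows.get(address);
    if (current === undefined || nowMs - current.start >= this.windowMs) {
      this.windows.set(address, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

A fixed window is simple but allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.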

Failure Reporting

  • Errors are systematically categorized and returned.
  • No sensitive internal details are leaked in user-facing error messages.
  • Appropriate HTTP status codes map to specific failure modes.
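One way to realize this is a fixed table from error category to HTTP status and a generic public message. The category names below are assumptions chosen to match failure modes mentioned elsewhere in this document; they are not taken from the project's source.

```typescript
// Hypothetical error categories.
type ErrorCategory =
  | "invalid_request"
  | "unauthorized"
  | "not_found"
  | "rate_limited"
  | "upstream_failure"
  | "internal";

const STATUS_BY_CATEGORY: Record<ErrorCategory, number> = {
  invalid_request: 400,
  unauthorized: 401,
  not_found: 404,
  rate_limited: 429,
  upstream_failure: 502,
  internal: 500,
};

// Fixed, generic messages so that no internal detail reaches the client.
const PUBLIC_MESSAGE: Record<ErrorCategory, string> = {
  invalid_request: "The request was malformed.",
  unauthorized: "A valid API key is required.",
  not_found: "The requested resource was not found.",
  rate_limited: "Too many requests.",
  upstream_failure: "The upstream model service failed.",
  internal: "An internal error occurred.",
};

interface ErrorResponse {
  status: number;
  body: { error: ErrorCategory; message: string };
}

function toErrorResponse(category: ErrorCategory): ErrorResponse {
  return {
    status: STATUS_BY_CATEGORY[category],
    body: { error: category, message: PUBLIC_MESSAGE[category] },
  };
}
```

Because the response is built only from the category, stack traces and upstream error text cannot leak by accident.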

LLM Call Robustness

  • Transient network failures trigger automatic retries with exponentially increasing delays.
  • Retry parameters (maximum attempts, initial delay, backoff factor) are configurable.
  • Random jitter is added to retry delays to avoid thundering-herd effects.
  • A unique request identifier facilitates end-to-end tracing of service calls.
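The retry behavior described above can be sketched as follows. The option names mirror the three tunable parameters in the list, but the exact jitter formula (a random factor between 50% and 100% of the exponential delay) is an assumption, and for brevity this sketch retries on every error instead of only transient ones.

```typescript
interface RetryOptions {
  maxAttempts: number;
  initialDelayMs: number;
  backoffFactor: number;
}

// Delay before retry number `retry` (1-based): exponential growth with jitter.
function backoffDelayMs(
  retry: number,
  options: RetryOptions,
  random: () => number = Math.random
): number {
  const ceiling = options.initialDelayMs * Math.pow(options.backoffFactor, retry - 1);
  // Jitter scales the delay to between 50% and 100% of the ceiling.
  return Math.round(ceiling * (0.5 + 0.5 * random()));
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelayMs(attempt, options));
      }
    }
  }
  throw lastError;
}
```

With an initial delay of 1000 ms and a factor of 2, the delay ceilings for successive retries are 1000, 2000, and 4000 ms; jitter then spreads concurrent clients across the lower half of each interval.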

Recognized Source Code Extensions

  • TypeScript (.ts, .tsx)
  • JavaScript (.js, .jsx)
  • Python (.py)
  • Java (.java)
  • C++ (.cpp)
  • C (.c)
  • Go (.go)
  • Ruby (.rb)
  • PHP (.php)
  • C# (.cs)
  • Swift (.swift)
  • Rust (.rs)
  • Kotlin (.kt)
  • Scala (.scala)
  • Vue (.vue)
  • HTML (.html)
  • Stylesheets (.css, .scss, .less)
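The list above maps naturally onto a lookup table keyed by file extension. The sketch below shows one way to identify a file's language from its path; the `detectLanguage` name and the labels for the stylesheet formats are assumptions.

```typescript
const LANGUAGE_BY_EXTENSION: Record<string, string> = {
  ".ts": "TypeScript", ".tsx": "TypeScript",
  ".js": "JavaScript", ".jsx": "JavaScript",
  ".py": "Python", ".java": "Java",
  ".cpp": "C++", ".c": "C",
  ".go": "Go", ".rb": "Ruby",
  ".php": "PHP", ".cs": "C#",
  ".swift": "Swift", ".rs": "Rust",
  ".kt": "Kotlin", ".scala": "Scala",
  ".vue": "Vue", ".html": "HTML",
  ".css": "CSS", ".scss": "SCSS", ".less": "Less",
};

// Returns the language name, or null when the extension is not supported.
function detectLanguage(filePath: string): string | null {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return null;
  const extension = filePath.slice(dot).toLowerCase();
  return LANGUAGE_BY_EXTENSION[extension] ?? null;
}
```

Files for which `detectLanguage` returns null would simply be left out of the summary run.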

Operational Flow

  1. The utility scans the designated directory tree, respecting all .gitignore directives.
  2. Files are filtered based on the set of supported language extensions.
  3. Content is read from each compliant file, and its language is identified.
  4. The data is transmitted to the Gemini Flash 2.0 endpoint, accompanied by instructions detailing the required abstraction level and length constraint.
  5. Aggregated abstracts are compiled and written to the designated output stream/file.
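Step 4 above combines the file content with instructions about detail level and length. A minimal sketch of that prompt assembly follows; the wording of the prompt is purely illustrative, since the actual prompt text sent to Gemini is not shown in this document.

```typescript
type DetailLevel = "low" | "medium" | "high";

interface PromptInput {
  relativePath: string;
  language: string;
  content: string;
  detail: DetailLevel;
  maxLength: number;
}

// Builds a single prompt string: instructions first, then the file content.
function buildSummaryPrompt(input: PromptInput): string {
  return [
    `Summarize the following ${input.language} file at a ${input.detail} level of detail.`,
    `Keep the summary under ${input.maxLength} characters.`,
    `File: ${input.relativePath}`,
    "",
    input.content,
  ].join("\n");
}
```

Keeping prompt construction in a pure function makes it straightforward to unit-test without calling the model.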

Output Artifact Structure

The resulting output file uses the following format:

    relative/path/to/file
    Content summary generated here.

    relative/path/to/next/file
    Abstract for the subsequent file.
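A formatter for this output might look like the sketch below. It assumes each entry is the relative path on one line, the summary on the next, and a blank line between entries; the exact delimiters used by the tool are not confirmed by this document, and `formatDigest` is a hypothetical name.

```typescript
interface FileSummary {
  relativePath: string;
  summary: string;
}

// Joins entries as "path\nsummary", separated by blank lines.
function formatDigest(summaries: FileSummary[]): string {
  if (summaries.length === 0) return "";
  const entries = summaries.map(
    (entry) => `${entry.relativePath}\n${entry.summary.trim()}`
  );
  return entries.join("\n\n") + "\n";
}
```

The result can be written to the output file (summaries.txt by default) in a single call.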

Project Layout

  • index.ts: Primary entry point for the Command Line Interface.
  • src/: Primary source code location.
      • summarizer/: Logic underpinning the code abstraction process.
      • mcp/: Implementation of the Model Context Protocol service layer.
      • config/: Module responsible for configuration persistence and retrieval.
  • bin/: Executable script wrappers.
  • config.json: File storing application defaults.
  • tsconfig.json: TypeScript compiler settings.
  • package.json: Manifest detailing project metadata and scripts.
  • .env.example: Template demonstrating required environment variables.
  • .gitignore: Definitions for files and folders to exclude from processing.
  • __tests__: Directories containing unit and end-to-end test suites.
  • __mocks__/mock-codebase: A controlled subset of code for validation purposes.

Environment Variables Reference

The following environmental parameters influence execution:

| Variable | Purpose | Default Setting |
| --- | --- | --- |
| GOOGLE_API_KEY | Necessary key for Google Gemini access | Undefined (Mandatory) |
| PORT | TCP port binding for the MCP listener | 24312 |
| ALLOWED_ORIGINS | Comma-separated list for CORS policy enforcement | http://localhost:3000 |
| LOG_LEVEL | Verbosity of internal system logging (error, warn, info, debug) | info |

Consult .env.example for a setup template.
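For illustration, the four variables and their defaults could be loaded and validated as shown below. This is a sketch, not the project's configuration module: `loadEnvConfig` and the `ServerEnvConfig` shape are hypothetical, and the environment is passed in as a plain object (in Node.js you would pass `process.env`).

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";

interface ServerEnvConfig {
  googleApiKey: string;
  port: number;
  allowedOrigins: string[];
  logLevel: LogLevel;
}

function loadEnvConfig(env: Record<string, string | undefined>): ServerEnvConfig {
  const googleApiKey = env["GOOGLE_API_KEY"];
  if (googleApiKey === undefined || googleApiKey === "") {
    throw new Error("GOOGLE_API_KEY is required");
  }
  const rawPort = env["PORT"];
  const port = rawPort === undefined ? 24312 : Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  // Split the comma-separated origin list and drop empty entries.
  const allowedOrigins = (env["ALLOWED_ORIGINS"] ?? "http://localhost:3000")
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin !== "");
  const logLevel = env["LOG_LEVEL"] ?? "info";
  if (logLevel !== "error" && logLevel !== "warn" && logLevel !== "info" && logLevel !== "debug") {
    throw new Error("LOG_LEVEL must be one of: error, warn, info, debug");
  }
  return { googleApiKey, port, allowedOrigins, logLevel };
}
```

Failing fast on an invalid value at startup is usually easier to diagnose than a server that silently binds to the wrong port.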

Development Workflow

Executing Tests

    # Run the complete test suite
    npm test

    # Run the tests and generate coverage reports
    npm test -- --coverage

    # Run the MCP service initialization test
    npm run test:setup

Roadmap for Future Enhancements

  • Extension of supported source file extensions.
  • Capability to interface with alternative AI model providers.
  • Development of a graphical user interface utilizing Electron.
  • Expansion of available MCP functionalities.
  • Implementation of precise token consumption metrics.
  • Integration of OpenTelemetry standards for system observability.
  • Enhancement of audit logging capabilities.
  • Introduction of automated vulnerability scanning checks.
