
search-engine-semantic-service-mcp-agent

A utility for semantic search over content stored in an Elasticsearch cluster, specifically the technical articles indexed from Search Labs. It covers both ingesting the content and retrieving it by semantic similarity.

Author

jedrazb

No License

Quick Info

GitHub Stars: 3
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

elasticsearch, blog, search, semantic search, indexed elasticsearch, elasticsearch facilitating

MCP Service Module: Elasticsearch Vector Search Orchestrator

Reference implementation for the architecture detailed here: https://j.blaszyk.me/tech-blog/mcp-server-elasticsearch-semantic-search/


System Introduction

This package provides a Python MCP server that runs semantic search over the Search Labs technical posts stored in Elasticsearch.

It assumes the posts have already been crawled with Elastic Open Crawler and indexed into the search-labs-posts index.


Deployment and Activation

Add the Elasticsearch URL (ES_URL) and an API key (ES_API_KEY) to the .env file. See Phase 2 of the Pre-indexing Phase below for generating an API key with minimal permissions.
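As a quick sanity check before starting the server, the .env file can be validated with a few lines of Python. This is a minimal sketch, not part of the project: `parse_env` and `missing_keys` are hypothetical helpers, and the API key variable name is assumed to be ES_API_KEY, so adjust the list to match your .env.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since base64 keys may end in "=".
        key, value = line.split("=", 1)
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip("'\"")
    return env


def missing_keys(env: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [k for k in required if not env.get(k)]


# Sample content with the key left blank on purpose.
sample = "ES_URL=https://localhost:9200\n# fill this in\nES_API_KEY=\n"
print(missing_keys(parse_env(sample), ["ES_URL", "ES_API_KEY"]))
```

To check a real file, pass `open(".env").read()` to `parse_env`.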

To start the server locally in MCP Inspector:

    make dev

Once it is running, MCP Inspector is available at: http://localhost:5173


Integration with Client Application (Claude)

To register the server with Claude Desktop:

    make install-claude-config

This updates the configuration file at ~/claude_desktop_config.json. The next time Claude Desktop starts, the semantic search tool will be available.
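For readers curious what such an update amounts to, Claude Desktop reads MCP servers from an "mcpServers" object in its config file. The sketch below shows one way to merge an entry into that object without disturbing existing ones. It is an illustration only: what the make target actually writes is not shown here, and the server name, command, and arguments are hypothetical.

```python
import json
from typing import Any, Dict, List


def add_mcp_server(
    config: Dict[str, Any], name: str, command: str, args: List[str]
) -> Dict[str, Any]:
    """Return a copy of config with one entry added under "mcpServers"."""
    updated = dict(config)
    # Copy the nested dict too, so the caller's config is left untouched.
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


existing = {"mcpServers": {"other": {"command": "node", "args": ["other.js"]}}}
# Hypothetical server name, command, and arguments, for illustration only.
merged = add_mcp_server(existing, "search-labs", "uv", ["run", "server.py"])
print(json.dumps(merged, indent=2))
```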


Pre-indexing Phase: Content Harvesting

Phase 1: Validating the Harvesting Utility

Run a test crawl with the Elastic Open Crawler test configuration:

    docker run --rm \
      --entrypoint /bin/bash \
      -v "$(pwd)/crawler-config:/app/config" \
      --network host \
      docker.elastic.co/integrations/crawler:latest \
      -c "bin/crawler crawl config/test-crawler.yml"

This should print the crawled content of a single page.
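The same docker invocation is used again later with a different config file, so it can be convenient to build it programmatically. The following is a sketch under that assumption; `crawler_command` is a hypothetical helper that only mirrors the flags shown above.

```python
import os
import shlex
from typing import List

CRAWLER_IMAGE = "docker.elastic.co/integrations/crawler:latest"


def crawler_command(config_file: str, config_dir: str) -> List[str]:
    """Build the docker argv that runs `bin/crawler crawl` on one config file."""
    return [
        "docker", "run", "--rm",
        "--entrypoint", "/bin/bash",
        "-v", f"{config_dir}:/app/config",
        "--network", "host",
        CRAWLER_IMAGE,
        "-c", f"bin/crawler crawl config/{config_file}",
    ]


cmd = crawler_command("test-crawler.yml", os.path.join(os.getcwd(), "crawler-config"))
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

To actually run the crawl from Python, pass the list to `subprocess.run(cmd, check=True)`.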


Phase 2: Configuring the Data Persistence Layer

Set the Elasticsearch URL and API key.

Generate an API key with the minimum permissions the crawler needs:

    POST /_security/api_key
    {
      "name": "crawler-search-labs",
      "role_descriptors": {
        "crawler-search-labs-role": {
          "cluster": ["monitor"],
          "indices": [
            {
              "names": ["search-labs-posts"],
              "privileges": ["all"]
            }
          ]
        }
      },
      "metadata": {
        "application": "crawler"
      }
    }

Copy the encoded value from the response and set it as the API_KEY environment variable.
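The encoded value is the Base64 form of `id:api_key`, and Elasticsearch accepts it in an `Authorization: ApiKey ...` header. As a small sketch of how a client might handle the response, the hypothetical helpers below pull out the credential and build that header. The response values are made up.

```python
import base64
import json
from typing import Dict


def encoded_from_response(payload: str) -> str:
    """Return the `encoded` credential, deriving it from id/api_key if absent."""
    body = json.loads(payload)
    if "encoded" in body:
        return body["encoded"]
    # Equivalent form: Base64 of "<id>:<api_key>" in UTF-8.
    raw = f"{body['id']}:{body['api_key']}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def auth_header(encoded: str) -> Dict[str, str]:
    """Build the Authorization header Elasticsearch expects for API keys."""
    return {"Authorization": f"ApiKey {encoded}"}


# Made-up response values, for illustration only.
response = '{"id": "abc123", "name": "crawler-search-labs", "api_key": "s3cret"}'
print(auth_header(encoded_from_response(response)))
```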


Phase 3: Schema Adaptation for Vectorization

Make sure the search-labs-posts index exists. If it does not, create it:

    PUT search-labs-posts

Update the mapping to enable semantic search:

    PUT search-labs-posts/_mappings
    {
      "properties": {
        "body": {
          "type": "text",
          "copy_to": "semantic_body"
        },
        "semantic_body": {
          "type": "semantic_text",
          "inference_id": ".elser-2-elasticsearch"
        }
      }
    }

With this mapping, the content of the body field is copied into semantic_body at index time, where it is embedded using Elasticsearch's built-in ELSER model.


Phase 4: Commencing Data Ingestion

Run the full crawl to populate the index:

    docker run --rm \
      --entrypoint /bin/bash \
      -v "$(pwd)/crawler-config:/app/config" \
      --network host \
      docker.elastic.co/integrations/crawler:latest \
      -c "bin/crawler crawl config/elastic-search-labs-crawler.yml"

Note: On a fresh Elasticsearch cluster, wait for the ELSER model to start before indexing.
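One way to script that wait is a simple polling loop. The sketch below is generic on purpose: how to check ELSER readiness is not specified here, so the probe is a hypothetical callable that you would replace with a real check against your cluster.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool], timeout: float = 300.0, interval: float = 5.0
) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in readiness probe: reports ready on the third call.
calls = {"n": 0}


def fake_elser_ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(fake_elser_ready, timeout=1.0, interval=0.01))
```

The function returns False instead of raising when the timeout is reached, so the caller decides whether to abort the crawl.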


Phase 5: Confirming Data Persistence Status

Check that documents were indexed:

    GET search-labs-posts/_count

This returns the total number of indexed documents. You can also verify the data in Kibana.
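The same check can be made outside Kibana over plain HTTP. The sketch below builds the request with the standard library and parses a `_count` response. The URL and key are placeholders, the response body uses made-up numbers, and nothing is sent over the network.

```python
import json
import urllib.request


def count_request(
    es_url: str, encoded_key: str, index: str = "search-labs-posts"
) -> urllib.request.Request:
    """Build (but do not send) a GET request for <index>/_count."""
    url = f"{es_url.rstrip('/')}/{index}/_count"
    headers = {"Authorization": f"ApiKey {encoded_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_count(body: str) -> int:
    """Extract the document total from a _count response body."""
    return int(json.loads(body)["count"])


# Placeholder URL and key, for illustration only.
req = count_request("http://localhost:9200/", "PLACEHOLDER_KEY")
print(req.full_url)

# Example response body with made-up numbers.
sample_body = '{"count": 42, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}'
print(parse_count(sample_body))
```

Sending the request would look like `urllib.request.urlopen(req).read().decode()`, with the result passed to `parse_count`.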


Done! You can now run semantic search over the Search Labs posts.
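As an illustration of what such a query can look like, the sketch below builds a request body for `POST search-labs-posts/_search` using Elasticsearch's `semantic` query against the semantic_body field. This is an assumption about a reasonable query shape, not necessarily the one the server sends. `semantic_search_body` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def semantic_search_body(question: str, size: int = 5) -> Dict[str, Any]:
    """Build a _search body that runs a `semantic` query on semantic_body."""
    if not question.strip():
        raise ValueError("question must not be empty")
    return {
        # Number of top-scoring posts to return.
        "size": size,
        "query": {
            "semantic": {
                "field": "semantic_body",
                "query": question,
            }
        },
    }


print(json.dumps(semantic_search_body("How does semantic_text work?"), indent=2))
```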
