web-knowledge-retriever-mcp
An MCP service engineered to facilitate deep, current information retrieval from the internet via the Tavily search engine. It features robust result appraisal, data caching mechanisms, and persistent logging of search interactions, furnishing access to detailed findings including document titles, URIs, content excerpts, and synthesized AI summaries.
Author

it-beard
Quick Info
Actions
Tags
Tavily MCP Service: Advanced Web Intelligence Access
This implementation acts as a Model Context Protocol (MCP) intermediary, leveraging the Tavily API to empower AI agents with sophisticated web reconnaissance capabilities. It allows for the acquisition of timely and pertinent data from the global network.
Key Capabilities
- Sophisticated, AI-driven information sourcing
- Support for granular control over search depth (shallow vs. exhaustive)
- Delivery of comprehensive result metadata (URLs, headlines, snippet text)
- Automated summarization of discovered web content
- Systematic tracking of result quality metrics and latency profiling
- Durable persistence for all executed searches, augmented by an internal cache
- Integration with MCP abstraction layers for data plumbing
Prerequisites for Deployment
- A functional Node.js environment (v16 minimum)
- The Node Package Manager (npm)
- A valid authorization credential for the Tavily service (obtainable from Tavily's portal)
- A compliant MCP consumer application (e.g., Cline, Claude Desktop, custom tooling)
Setup Procedure
-
Acquire the source code repository: bash git clone https://github.com/it-beard/tavily-server.git cd tavily-server
-
Install necessary software components: bash npm install
-
Compile the source code: bash npm run build
Configuration Guidance
This server is designed for broad compatibility across MCP clients. Below are examples for common setups:
Configuration for Cline (VSCode Extension)
Adjust or establish the MCP configuration file at:
- macOS: ~/Library/Application Support/Cursor/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json
- Windows: %APPDATA%\Cursor\User\globalStorage\saoudrizwan.claude-dev\settings\cline_mcp_settings.json
- Linux: ~/.config/Cursor/User/globalStorage/saoudrizwan.claude-dev\settings\cline_mcp_settings.json
Inject the following structure, substituting placeholder values for execution path and secret key:
{ "mcpServers": { "tavily": { "command": "node", "args": ["/absolute/path/to/tavily-server/build/index.js"], "env": { "TAVILY_API_KEY": "your-secret-key" } } } }
Configuration for Claude Desktop Application
Modify the settings file located at:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
Utilize the identical structure demonstrated for Cline.
Integration with Other MCP Consumers
Consult the specific client's documentation for configuration file placement. Essential elements required are:
1. The execution utility (usually node).
2. The compiled entry point's file path.
3. Necessary environment variables, crucially the Tavily access token.
Operational Usage
Available Tooling
The server exposes one primary operational command: search.
Mandatory Input
query(string): The textual assertion or question to be investigated.
Optional Input Parameters
search_depth(string): Dictates the scope; use "basic" for speed or "advanced" for thoroughness.
Execution Example
typescript // Illustrative usage via the MCP client SDK const findings = await mcpClient.callTool("tavily", "search", { query: "recent advancements in quantum computing", search_depth: "advanced" });
Data Access via Resources
The service offers both fixed and template-based data access endpoints:
Fixed (Static) Endpoints
tavily://last-search/result: Retrieves the outcome of the immediately preceding search operation.- This record is persisted locally on disk, surviving server restarts.
- Returns an informative message if no search has yet been executed.
Dynamic (Templated) Endpoints
tavily://search/{query}: Fetches results for any specified query.- The
{query}segment must be URL-encoded. - Example URI:
tavily://search/machine%20learning%20trends - Prioritizes cached data if available for the term.
- Initiates a new search and caches the result if no prior record exists.
- The output format mirrors that of the
searchtool invocation.
Resources contrast with Tools in their primary function: - Tools execute actions (e.g., launching a new web crawl). - Resources fetch stateful or cached information (e.g., viewing historical data). - Resource URIs are referenceable and storable identifiers.
Expected Response Structure (Interface Definition)
typescript interface SearchResponse { query: string; answer: string; // AI-generated synthesis results: Array<{ title: string; url: string; content: string; score: number; // Relevance metric }>; response_time: number; // Milliseconds taken }
Data Durability and Caching Strategy
The server ensures long-term retention of search history:
Storage Configuration
- The designated storage area is the local
datafolder. - Historical data and cache are serialized in
data/searches.json. - Data integrity is maintained across process shutdowns and restarts.
- The necessary directory structure is automatically provisioned upon initial execution.
Persistence Features
- Complete log of all search operations.
- In-memory and disk-based caching of all search outputs.
- Seamless, automatic serialization of new findings.
- Reliance on disk storage for non-volatile memory.
- Use of JSON serialization for transparent data inspection.
- Robust mechanisms for handling I/O failures.
- Automated tracking of the most recent successful query.
Caching Logic
- Every successful retrieval is automatically added to the cache.
- Repeat requests for identical inputs are served directly from the cache.
- Caching dramatically minimizes latency and conserves external API quotas.
- Cache state survives server lifecycle events.
Development Lifecycle
Directory Layout
tavily-server/ ├── src/ # Source code location │ └── index.ts # Core server logic ├── data/ # Persistent state repository │ └── searches.json # History and caching file ├── build/ # Output directory for compiled JavaScript ├── package.json # Project manifest and dependency list └── tsconfig.json # TypeScript compiler settings
Available Execution Commands
npm run build: Transpiles TypeScript sources into runnable JavaScript artifacts.npm run start: Launches the MCP server instance (requires prior build).npm run dev: Executes the server in a watch/development environment.
Exception Management
The service is configured to provide informative feedback for common faults: - Malformed or absent API credentials. - Issues related to network connectivity. - Input validation failures on search arguments. - Encountering external API rate limits. - Failure to locate a requested data resource. - Incorrect formatting in resource URI paths. - Errors during file system read/write operations.
Collaboration Guidelines
- Clone the official repository.
- Establish a new feature branch (e.g.,
git checkout -b enhancement/new-feature). - Commit your work (
git commit -m 'Implement feature X optimization'). - Push the branch to the remote repository (
git push origin enhancement/new-feature). - Submit a Merge/Pull Request for review.
Licensing
This software is distributed under the terms of the MIT License. Consult the accompanying LICENSE file for specifics.
Attributions
Gratitude extended to: - The Model Context Protocol (MCP) specification for defining the integration framework. - The Tavily API for powering the external search capabilities.
WIKIPEDIA NOTE: XMLHttpRequest (XHR) represents an interface, implemented as a JavaScript object, designed for dispatching HTTP requests from a client environment (browser) to a server. Its methods permit script-driven interaction with the server subsequent to initial page load, allowing for asynchronous data retrieval. XHR is foundational to the Ajax paradigm. Pre-Ajax, server synchronization primarily relied on navigation via hyperlinks or form submissions, often resulting in full-page refreshes.
== Chronology ==
The conceptual underpinning for XMLHttpRequest originated in the year 2000, conceived by Microsoft Outlook developers. This concept first materialized within the Internet Explorer 5 browser release (1999). Importantly, the initial syntax did not utilize the XMLHttpRequest identifier; rather, developers instantiated objects via ActiveXObject("Msxml2.XMLHTTP") or ActiveXObject("Microsoft.XMLHTTP"). By the time Internet Explorer 7 arrived (2006), universal support for the standardized XMLHttpRequest identifier was established across browsers.
The XMLHttpRequest identifier has since become the conventional standard utilized by all major browser engines, including Mozilla's Gecko (2002), Safari 1.2 (2004), and Opera 8.0 (2005).
=== Standardization Efforts === The World Wide Web Consortium (W3C) published its initial Working Draft specification for the XMLHttpRequest object on April 5, 2006. A Level 2 specification followed on February 25, 2008, introducing enhancements such as event progress monitoring, cross-origin access controls, and byte stream handling capabilities. By the close of 2011, the Level 2 additions were integrated back into the primary specification document. As of late 2012, responsibility for maintenance transitioned to the WHATWG, which now sustains a live document utilizing Web IDL notation.
== Implementation Pattern == The standard procedure for dispatching an asynchronous request using XMLHttpRequest involves several sequential programming steps:
- Instantiate the XMLHttpRequest agent via its constructor.
- Invoke the
openmethod to define the request methodology, target endpoint URI, and operation mode (synchronous vs. asynchronous). - If operating asynchronously, register a callback function to handle state transitions upon response arrival.
- Execute the transaction initiation by calling the
sendmethod, optionally including payload data. - Monitor state changes via the registered listener. Upon successful completion of the server transaction (state transitioning to 4, the "done" state), the retrieved data is accessible, typically in the
responseTextproperty.
Beyond these core phases, XMLHttpRequest offers extensive configuration options for controlling request behavior and response parsing. Custom HTTP headers can be prepended to direct server processing, and data can be transmitted to the server within the send argument. The resulting payload can be immediately parsed from JSON format into native JavaScript structures, or processed incrementally as data streams in. Furthermore, requests can be intentionally terminated mid-flight or constrained by a timeout duration.
== Cross-Origin Communications ==
In the nascent era of the World Wide Web, the security model inherently restricted inter-domain data fetching, which created hurdles for applications attempting to aggregate information from disparate sources, leading to the development of workarounds such as...
