knowledge-sync-nexus-mcp
A standardized interface for orchestrating data acquisition and modification across diverse information repositories. Enables advanced semantic retrieval operations and controls connectivity to external platforms within a unified knowledge orchestration framework.
Author: scmdr
KnowledgeSync Nexus MCP Adapter
This package implements a Model Context Protocol (MCP) server for the SourceSync.ai service. It gives AI agents a consistent way to work with the SourceSync.ai knowledge management platform.
Core Capabilities
- Organize data assets via dedicated logical partitions (namespaces).
- Incorporate data from varied origins (e.g., raw text, web links, external vendor systems).
- Facilitate the retrieval, modification, and lifecycle management of data entities within the centralized knowledge store.
- Execute sophisticated content discovery via vector-based (semantic) and combined (hybrid) search methodologies.
- Directly access textual representations derived from parsed Uniform Resource Locators (URLs).
- Administer configurations for external system integrations.
- Provide inherent default settings optimized for immediate AI integration.
Deployment Procedures
Execution via npx
Set up and launch, supplying the required API key (optional variables such as SOURCESYNC_TENANT_ID can be set the same way):
env SOURCESYNC_API_KEY=your_api_key npx -y sourcesyncai-mcp
Installation via Smithery
For automated setup within Claude Desktop environments leveraging Smithery:
npx -y @smithery/cli install @pbteja1998/sourcesyncai-mcp --client claude
Manual Compilation
Clone the repository:
git clone https://github.com/yourusername/sourcesyncai-mcp.git
cd sourcesyncai-mcp
Install dependencies:
npm install
Compile the source code:
npm run build
Start the server:
node dist/index.js
Configuration in Cursor IDE
To integrate the KnowledgeSync Nexus MCP adapter within the Cursor IDE:
- Access Cursor Preferences/Settings.
- Navigate to Features > MCP Servers.
- Select + Add New MCP Server.
- Populate the fields:
  - Name: knowledge-sync-nexus-mcp (or a preferred label)
  - Type: command
  - Command: env SOURCESYNC_API_KEY=your-api-key npx -y sourcesyncai-mcp
Once registered, leverage the capabilities of SourceSync.ai by articulating your knowledge management requirements to Cursor's reasoning engine.
Windsurf Integration
Modify your ~/.codeium/windsurf/mcp_config.json file as follows:
{
  "mcpServers": {
    "knowledge-sync-nexus-mcp": {
      "command": "npx",
      "args": ["-y", "sourcesyncai-mcp"],
      "env": {
        "SOURCESYNC_API_KEY": "your_api_key",
        "SOURCESYNC_NAMESPACE_ID": "your_namespace_id",
        "SOURCESYNC_TENANT_ID": "your_tenant_id"
      }
    }
  }
}
Claude Desktop Configuration
To enable this MCP endpoint in Claude Desktop:
- Locate your Claude Desktop configuration file:
  - macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  - Windows: %APPDATA%\Claude\claude_desktop_config.json
  - Linux: ~/.config/Claude/claude_desktop_config.json
- Add the SourceSync.ai MCP server definition to this JSON file:
{
  "mcpServers": {
    "knowledge-sync-nexus-mcp": {
      "command": "npx",
      "args": ["-y", "sourcesyncai-mcp"],
      "env": {
        "SOURCESYNC_API_KEY": "your_api_key",
        "SOURCESYNC_NAMESPACE_ID": "your_namespace_id",
        "SOURCESYNC_TENANT_ID": "your_tenant_id"
      }
    }
  }
}
- Save the configuration changes and relaunch Claude Desktop.
Configuration Parameters
Mandatory Environment Variables
- SOURCESYNC_API_KEY: The requisite authentication token for SourceSync.ai access.
Optional Environment Variables
- SOURCESYNC_NAMESPACE_ID: The default container context for operational tasks.
- SOURCESYNC_TENANT_ID: Your designated organizational identifier.
Configuration Initialization Examples
Setting up default operational parameters:
export SOURCESYNC_API_KEY=your_api_key
export SOURCESYNC_TENANT_ID=your_tenant_id
export SOURCESYNC_NAMESPACE_ID=your_namespace_id
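The split between mandatory and optional variables can be sketched as a small startup check. The variable names come from this README; the `loadConfig` helper and the `SourceSyncConfig` shape are hypothetical and are not taken from the project's source.

```typescript
// Hypothetical helper: validates the environment variables named in this README.
interface SourceSyncConfig {
  apiKey: string;
  tenantId?: string;
  namespaceId?: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): SourceSyncConfig {
  const apiKey = env.SOURCESYNC_API_KEY?.trim();
  if (!apiKey) {
    // The API key is the only mandatory variable.
    throw new Error("SOURCESYNC_API_KEY is required but not set");
  }
  return {
    apiKey,
    // Empty strings are treated as "not set" for the optional variables.
    tenantId: env.SOURCESYNC_TENANT_ID || undefined,
    namespaceId: env.SOURCESYNC_NAMESPACE_ID || undefined,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`, so that a missing key fails early with a clear message.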
Available Service Functions (Tools)
Authorization
validate_api_key: Verifies the validity of a provided SourceSync.ai credential.
{ "name": "validate_api_key", "arguments": {} }
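The name/arguments objects shown in this section are tool-call payloads. In MCP, a client delivers such a payload as the `params` of a JSON-RPC 2.0 request whose method is `tools/call`. MCP clients such as Claude Desktop or Cursor normally build this envelope themselves, so the sketch below is only meant to show where the payload sits; the `toToolsCallRequest` helper and its id counter are hypothetical.

```typescript
// Payload shape used by the examples in this README.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface JsonRpcToolRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

let nextRequestId = 1;

// Hypothetical helper: wraps a tool-call payload in its JSON-RPC envelope.
function toToolsCallRequest(call: ToolCall): JsonRpcToolRequest {
  return { jsonrpc: "2.0", id: nextRequestId++, method: "tools/call", params: call };
}

const request = toToolsCallRequest({ name: "validate_api_key", arguments: {} });
console.log(JSON.stringify(request, null, 2));
```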
Logical Partitions (Namespaces)
- create_namespace: Establishes a new knowledge partition.
- list_namespaces: Retrieves a manifest of all existing partitions.
- get_namespace: Fetches detailed configuration for a specified partition.
- update_namespace: Modifies the settings of an existing partition.
- delete_namespace: Permanently removes a knowledge partition.
{
  "name": "create_namespace",
  "arguments": {
    "name": "project-archive-v2",
    "fileStorageConfig": {
      "provider": "S3_COMPATIBLE",
      "config": {
        "endpoint": "s3.amazonaws.com",
        "accessKey": "your_access_key",
        "secretKey": "your_secret_key",
        "bucket": "your_bucket",
        "region": "us-east-1"
      }
    },
    "vectorStorageConfig": {
      "provider": "PINECONE",
      "config": {
        "apiKey": "your_pinecone_api_key",
        "environment": "your_environment",
        "index": "your_index"
      }
    },
    "embeddingModelConfig": {
      "provider": "OPENAI",
      "config": {
        "apiKey": "your_openai_api_key",
        "model": "text-embedding-3-small"
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{ "name": "list_namespaces", "arguments": { "tenantId": "tenant_XXX" } }
{ "name": "get_namespace", "arguments": { "namespaceId": "namespace_XXX", "tenantId": "tenant_XXX" } }
{ "name": "update_namespace", "arguments": { "namespaceId": "namespace_XXX", "tenantId": "tenant_XXX", "name": "renamed-partition-context" } }
{ "name": "delete_namespace", "arguments": { "namespaceId": "namespace_XXX", "tenantId": "tenant_XXX" } }
Data Import Operations
- ingest_text: Introduce data from raw textual input.
- ingest_urls: Batch-import content identified by a list of web addresses.
- ingest_sitemap: Import content indexed via a site map file.
- ingest_website: Crawl and import content from a specified web domain.
- ingest_notion: Integrate data from a configured Notion workspace.
- ingest_google_drive: Synchronize data from Google Drive via a connection.
- ingest_dropbox: Synchronize data from Dropbox via a connection.
- ingest_onedrive: Synchronize data from Microsoft OneDrive via a connection.
- ingest_box: Synchronize data from Box storage via a connection.
- get_ingest_job_run_status: Query the outcome status of a background data loading task.
{ "name": "ingest_text", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "TEXT", "config": { "name": "sample-text-asset", "text": "This constitutes the primary data payload for incorporation.", "metadata": { "type": "example", "author": "AssistantAgent" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_urls", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "URLS", "config": { "urls": ["https://reference.site/docA", "https://reference.site/docB"], "metadata": { "origin": "web_crawl", "classification": "guidance" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_sitemap", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "SITEMAP", "config": { "url": "https://corporate.org/sitemap.xml", "metadata": { "origin": "sitemap_feed", "domain": "corporate.org" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_website", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "WEBSITE", "config": { "url": "https://public-docs.io", "maxDepth": 4, "maxPages": 150, "metadata": { "origin": "website_scrape", "domain": "public-docs.io" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_notion", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "NOTION", "config": { "connectionId": "notion_link_identifier", "metadata": { "origin": "notion_workspace", "workspace": "Engineering Team Files" } } }, "tenantId": "your_tenant_id" } }
{ "name": "ingest_google_drive", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "GOOGLE_DRIVE", "config": { "connectionId": "gdrive_conn_id_456", "metadata": { "origin": "gdrive", "owner": "devops@enterprise.net" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_dropbox", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "DROPBOX", "config": { "connectionId": "dropbox_conn_id_789", "metadata": { "origin": "dropbox", "account": "marketing@enterprise.net" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_onedrive", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "ONEDRIVE", "config": { "connectionId": "onedrive_conn_id_101", "metadata": { "origin": "onedrive", "account": "finance@enterprise.net" } } }, "tenantId": "tenant_XXX" } }
{ "name": "ingest_box", "arguments": { "namespaceId": "your_namespace_id", "ingestConfig": { "source": "BOX", "config": { "connectionId": "box_conn_id_202", "metadata": { "origin": "box", "owner": "hr@enterprise.net" } } }, "tenantId": "tenant_XXX" } }
{ "name": "get_ingest_job_run_status", "arguments": { "namespaceId": "your_namespace_id", "ingestJobRunId": "job_run_ABC123", "tenantId": "tenant_XXX" } }
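Because ingestion runs as a background task, a caller usually has to poll get_ingest_job_run_status until the job finishes. The following is a minimal sketch of such a loop. It assumes a `callTool` function supplied by whatever MCP client library is in use (hypothetical here). This README does not document the status response, so the caller supplies the `isFinished` predicate rather than the sketch guessing at field names.

```typescript
// Hypothetical signature for "invoke an MCP tool and return its result".
type ToolCaller = (name: string, args: Record<string, unknown>) => Promise<unknown>;

interface PollOptions {
  maxAttempts: number;
  delayMs: number;
  // Caller-supplied check, since the status payload is not documented here.
  isFinished: (result: unknown) => boolean;
  // Injectable for testing; defaults to a timer-based sleep.
  sleep?: (ms: number) => Promise<void>;
}

const timerSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForIngestJob(
  callTool: ToolCaller,
  args: { namespaceId: string; ingestJobRunId: string; tenantId?: string },
  options: PollOptions,
): Promise<unknown> {
  const sleep = options.sleep ?? timerSleep;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const result = await callTool("get_ingest_job_run_status", args);
    if (options.isFinished(result)) return result;
    // Wait between checks, but not after the final one.
    if (attempt < options.maxAttempts) await sleep(options.delayMs);
  }
  throw new Error(
    `Ingest job ${args.ingestJobRunId} did not finish after ${options.maxAttempts} checks`,
  );
}
```

A fixed delay keeps the sketch short; a production loop might prefer exponential backoff to reduce load on long-running jobs.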
Data Retrieval & Querying
- getDocuments: Fetch data entities based on defined criteria (filters).
- updateDocuments: Apply modifications to the metadata attributes of existing data entities.
- deleteDocuments: Erase specified data entities.
- resyncDocuments: Initiate a re-indexing or update synchronization for selected entities.
- fetchUrlContent: Retrieve the raw textual payload pointed to by a document's URL reference.
{ "name": "getDocuments", "arguments": { "namespaceId": "partition_ref_ABC", "tenantId": "tenant_XXX", "filterConfig": { "documentTypes": ["PDF", "DOCX"] }, "includeConfig": { "parsedTextFileUrl": true } } }
{ "name": "updateDocuments", "arguments": { "namespaceId": "partition_ref_ABC", "tenantId": "tenant_XXX", "documentIds": ["entity_001", "entity_002"], "filterConfig": { "documentIds": ["entity_001", "entity_002"] }, "data": { "metadata": { "validation_state": "verified", "department": "R&D" } } } }
{ "name": "deleteDocuments", "arguments": { "namespaceId": "partition_ref_ABC", "tenantId": "tenant_XXX", "documentIds": ["entity_001", "entity_002"], "filterConfig": { "documentIds": ["entity_001", "entity_002"] } } }
{ "name": "resyncDocuments", "arguments": { "namespaceId": "partition_ref_ABC", "tenantId": "tenant_XXX", "documentIds": ["entity_001", "entity_002"], "filterConfig": { "documentIds": ["entity_001", "entity_002"] } } }
{ "name": "fetchUrlContent", "arguments": { "url": "https://api.sourcesync.ai/v1/assets/entity_001/content?format=text", "apiKey": "provided_key", "tenantId": "tenant_XXX" } }
Information Discovery (Search)
- semantic_search: Execute a conceptual similarity query against the knowledge index.
- hybrid_search: Execute a combined search prioritizing both conceptual similarity and keyword matching.
{ "name": "semantic_search", "arguments": { "namespaceId": "knowledge_corpus_XYZ", "query": "advancements in neural network architectures", "topK": 10, "tenantId": "tenant_XXX" } }
{ "name": "hybrid_search", "arguments": { "namespaceId": "knowledge_corpus_XYZ", "query": "financial reports Q4 2023", "topK": 7, "tenantId": "tenant_XXX", "hybridConfig": { "semanticWeight": 0.85, "keywordWeight": 0.15 } } }
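The hybrid_search example above splits its weight between the semantic and keyword components (0.85 and 0.15). The sketch below builds such a payload from a single semantic weight. It assumes the two weights are meant to sum to 1, as they do in the example; this README does not state that rule, and the `buildHybridSearch` helper is hypothetical.

```typescript
interface HybridSearchCall {
  name: "hybrid_search";
  arguments: {
    namespaceId: string;
    query: string;
    topK: number;
    tenantId?: string;
    hybridConfig: { semanticWeight: number; keywordWeight: number };
  };
}

// Hypothetical helper. ASSUMPTION: weights are complementary and sum to 1.
function buildHybridSearch(
  namespaceId: string,
  query: string,
  semanticWeight: number,
  topK = 10,
): HybridSearchCall {
  if (!(semanticWeight >= 0 && semanticWeight <= 1)) {
    throw new RangeError("semanticWeight must be between 0 and 1");
  }
  // Round to four decimals to avoid floating-point noise such as 0.15000000000000002.
  const keywordWeight = Number((1 - semanticWeight).toFixed(4));
  return {
    name: "hybrid_search",
    arguments: {
      namespaceId,
      query,
      topK,
      hybridConfig: { semanticWeight, keywordWeight },
    },
  };
}

const call = buildHybridSearch("knowledge_corpus_XYZ", "financial reports Q4 2023", 0.85, 7);
console.log(JSON.stringify(call));
```

A higher semantic weight favors conceptually related passages, while a higher keyword weight favors exact term matches such as report names or identifiers.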
External Service Linkages (Connections)
- create_connection: Establish a new linkage to a third-party platform.
- list_connections: Generate a directory of existing configured linkages.
- get_connection: Retrieve specific credentials and settings for a linkage.
- update_connection: Modify parameters for an active linkage.
- revoke_connection: Deactivate and remove a configured linkage.
{ "name": "create_connection", "arguments": { "tenantId": "tenant_XXX", "namespaceId": "namespace_XXX", "name": "Team Dropbox Link", "connector": "DROPBOX", "clientRedirectUrl": "https://my-app.corp/oauth/finish" } }
{ "name": "list_connections", "arguments": { "tenantId": "tenant_XXX", "namespaceId": "namespace_XXX" } }
{ "name": "get_connection", "arguments": { "tenantId": "tenant_XXX", "namespaceId": "namespace_XXX", "connectionId": "link_ID_456" } }
{ "name": "update_connection", "arguments": { "tenantId": "tenant_XXX", "namespaceId": "namespace_XXX", "connectionId": "link_ID_456", "name": "Updated Dropbox Link Name", "clientRedirectUrl": "https://new-app.corp/oauth/complete" } }
{ "name": "revoke_connection", "arguments": { "tenantId": "tenant_XXX", "namespaceId": "namespace_XXX", "connectionId": "link_ID_456" } }
Illustrative User Instructions
Following successful deployment of this MCP adapter, use directives such as:
- "Query the SourceSync repository for all material pertaining to quantum entanglement theory."
- "Incorporate the contents of this external webpage into my primary knowledge partition: [External Web Link]"
- "Provision a new logical space named 'Compliance Audits' within the SourceSync environment."
- "Generate a roster of all indexed items residing in the 'R&D' namespace."
- "Retrieve the fully parsed textual data associated with entity identifier [entity_id] from my current SourceSync context."
Diagnostic and Remediation Guide
Connectivity Difficulties
If linkage failures are observed:
- Path Verification: Confirm that all file system paths in your configuration are absolute rather than relative.
- Execution Rights: Ensure the server module has executable permissions (chmod +x dist/index.js).
- Client Debug Mode: For Claude Desktop users, activate Developer Mode and examine the MCP activity log.
- Direct Execution Test: Manually launch the server from the terminal:
  node /full/path/to/sourcesyncai-mcp/dist/index.js
- Client Restart: Always fully quit and restart the AI client application (Claude/Cursor) after changing the configuration.
- Environment Variable Check: Verify that all mandatory environment variables are set correctly.
Verbose Logging
To activate detailed operational logging, set the DEBUG environment flag:
Project Engineering
Directory Organization
- src/index.ts: Primary execution entry point and server initialization logic.
- src/schemas.ts: Definitions for all function argument structures.
- src/sourcesync.ts: Core client module interfacing with the SourceSync.ai API layer.
- src/sourcesync.types.ts: Static type definitions for enhanced code safety.
Build and Validation
Compile the application:
npm run build
Run the unit and integration tests:
npm test
Licensing
This project is distributed under the MIT License.
Useful References
- SourceSync.ai Official Documentation
- SourceSync.ai API Specification
- Model Context Protocol Specification
Content Extraction Protocol Flow:
- First, invoke getDocuments with includeConfig.parsedTextFileUrl set to true to obtain document references that include content URLs.
- Extract the content retrieval URL from the returned document response object.
- Then call fetchUrlContent with the extracted URL to pull the primary textual payload:
{ "name": "fetchUrlContent", "arguments": { "url": "https://example.com/retrieved/content" } }
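The flow above can be sketched as two chained tool calls. The tool names and the includeConfig.parsedTextFileUrl flag come from this README. Two things are assumptions and are not documented here: the `callTool` function, which stands in for whatever MCP client library is in use, and the response shape, taken to be a `documents` array whose entries carry a `parsedTextFileUrl` field.

```typescript
// Hypothetical signature for "invoke an MCP tool and return its result".
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// ASSUMPTION: response shape of getDocuments; adjust to the real payload.
interface DocumentsResult {
  documents: Array<{ parsedTextFileUrl?: string }>;
}

async function fetchParsedTexts(callTool: CallTool, namespaceId: string): Promise<string[]> {
  // Step 1: list documents and ask for their parsed-text URLs.
  const listed = (await callTool("getDocuments", {
    namespaceId,
    includeConfig: { parsedTextFileUrl: true },
  })) as DocumentsResult;

  const texts: string[] = [];
  for (const doc of listed.documents) {
    // Step 2: skip documents without a content URL.
    if (!doc.parsedTextFileUrl) continue;
    // Step 3: fetch the text behind each URL.
    const content = await callTool("fetchUrlContent", { url: doc.parsedTextFileUrl });
    texts.push(String(content));
  }
  return texts;
}
```

The calls run sequentially to keep the sketch simple; for large namespaces a filterConfig on getDocuments would limit how many URLs need to be fetched.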
WIKIPEDIA_NOTE: XMLHttpRequest (XHR) represents an application programming interface embodied as a JavaScript object, designed to facilitate HTTP requests from a client browser to a server. This methodology permits browser-based applications to initiate server communication post-page load and subsequently process incoming data. XHR forms a fundamental element of Asynchronous JavaScript and XML (Ajax) techniques. Before Ajax, the main methods for server interaction relied heavily on standard hyperlink navigations and form submissions, frequently resulting in full page replacement.
== Genesis ==
The concept behind XMLHttpRequest originated with the Microsoft Outlook development team, who built it for Outlook Web Access in Exchange Server 2000. It first shipped in Internet Explorer 5 (released 1999). However, the initial syntax did not use the XMLHttpRequest identifier. Instead, the proprietary identifiers ActiveXObject("Msxml2.XMLHTTP") and ActiveXObject("Microsoft.XMLHTTP") were used. With Internet Explorer 7 (2006), all major browsers supported the standardized XMLHttpRequest identifier.
The XMLHttpRequest identifier has since become the de facto standard across all major browser engines, including Mozilla's Gecko (adopted 2002), Apple's Safari 1.2 (2004), and Opera 8.0 (2005).
=== Formalization ===
The World Wide Web Consortium (W3C) formalized the object specification with a Working Draft on April 5, 2006. A subsequent Level 2 specification draft emerged on February 25, 2008, introducing enhanced features such as progress event monitoring, capabilities for cross-site data exchange, and facilities for managing raw byte streams. By the close of 2011, the Level 2 additions were merged back into the primary specification document. Since late 2012, the maintenance and evolution of the standard have been transferred to the WHATWG, which currently sustains a living document defined using Web IDL.
== Operational Use ==
Typically, executing a request using XMLHttpRequest involves several sequential steps.
- Instantiate an XMLHttpRequest object by invoking its constructor.
- Invoke the "open" method to define the request method (e.g., GET, POST), specify the target resource, and select synchronous or asynchronous execution.
- For asynchronous requests, define an event listener that reacts to changes in the request's state.
- Start the request by calling the "send" method.
- Monitor state transitions via the event listener. If the server returns data, it is available in the responseText property by default. The object signals completion when its state changes to 4 (the "done" state).
Beyond these fundamental operations, XMLHttpRequest offers extensive controls over request construction and response parsing. Custom header fields can be added to instruct the server on how to fulfill the request, and data payloads can be uploaded via arguments passed to the "send" call. Responses can be automatically deserialized from JSON into native JavaScript objects or processed incrementally as data arrives, rather than blocking until the full response has been received. Furthermore, requests can be aborted early or configured with a timeout.
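The steps above can be sketched as a small helper. To keep the sketch usable outside a browser, it accepts a factory for the request object and describes only the members it touches through a local `XhrLike` type. Both `XhrLike` and `getText` are illustrative names rather than part of any standard API; in a browser the factory would simply return a new XMLHttpRequest.

```typescript
// Minimal structural type covering only the XMLHttpRequest members used below.
interface XhrLike {
  readyState: number;
  status: number;
  responseText: string;
  onreadystatechange: ((...args: any[]) => unknown) | null;
  open(method: string, url: string, async: boolean): void;
  send(body?: string | null): void;
}

const DONE = 4; // readyState value that signals completion

function getText(
  url: string,
  createXhr: () => XhrLike,
  onDone: (status: number, text: string) => void,
): void {
  const xhr = createXhr();            // 1. construct the request object
  xhr.open("GET", url, true);         // 2. method, URL, asynchronous mode
  xhr.onreadystatechange = () => {    // 3. listen for state changes
    if (xhr.readyState === DONE) {    // 5. state 4 means the response is complete
      onDone(xhr.status, xhr.responseText);
    }
  };
  xhr.send();                         // 4. start the transfer
}
```

In a browser, a call might look like `getText("/api/status", () => new XMLHttpRequest(), (status, text) => console.log(status, text))`, where the URL is a placeholder.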
== Cross-Origin Communication ==
In the nascent phase of the World Wide Web, limitations were identified that permitted unintended data exposure betw
