keboola-mcp-gateway

Give AI agents programmatic access to Keboola Connection data: manage table metadata, run SQL queries, and export data for downstream analytics pipelines.

Author

keboola

MIT License

Quick Info

GitHub GitHub Stars 79
NPM Weekly Downloads 0
Tools 1
Last Updated 2026-02-19

Keboola MCP (Model Context Protocol) Server

Establish a robust conduit between your Keboola environment and sophisticated AI agents/clients (e.g., Cursor, Claude, Windsurf, VS Code). Expose critical functionalities like underlying data structures, procedural SQL operations, and asynchronous job initiation without requiring intermediary integration code. Ensure precise data delivery to assistants exactly when and where required.

High-Level Summary

Keboola MCP Server functions as an open-source intermediary layer connecting your Keboola workspace with contemporary artificial intelligence frameworks. It transforms native Keboola capabilities—such as storage access control, in-database SQL processing, and job orchestration—into callable functions for platforms like Claude, Cursor, CrewAI, LangChain, Amazon Q, and others.

Core Capabilities

Leveraging the AI Agent and MCP Server synergy permits the following actions:

  • Data Repository: Directly interrogate data structures and manipulate documentation pertaining to tables or storage containers.
  • Configuration Artifacts: Instantiate, enumerate, and examine definitions for extractors, writers, data applications, and transformation blueprints.
  • SQL Execution: Generate intricate SQL transformations using conversational natural language input.
  • Job Orchestration: Initiate component and transformation processes, retrieving detailed outcomes of execution records.
  • Workflow Management: Define and govern sequential pipelines utilizing Conditional Flow mechanisms and Orchestrator Flows.
  • Data Applications: Provision, deploy, and govern Keboola Streamlit Data Applications that visualize storage data via custom queries.
  • Metadata Manipulation: Perform searches, retrievals, and modifications on project documentation and object attributes via semantic queries.
  • Development Sandboxing: Safely iterate within isolated development branches, constraining all operations to the selected branch context.

🚀 Remote Access Initialization (Fastest Path)

To expedite deployment, utilize the Hosted MCP Server option. This managed service removes all local installation, configuration, or environment setup burdens.

What is the Hosted Server?

This server resides within every multi-tenant Keboola instance and utilizes OAuth for secure identity verification. It is connectable from any AI assistant supporting remote Server-Sent Events (SSE) communication and OAuth authentication.

Connection Procedure

  1. Acquire Server Endpoint: Navigate to Project Settings → MCP Server section in your Keboola instance.
  2. Capture URL: Copy the endpoint, typically formatted as https://mcp.<YOUR_REGION>.keboola.com/sse.
  3. Configure Agent: Input this endpoint address into your AI assistant's designated MCP settings.
  4. Authorization: You will be redirected to authenticate with your Keboola credentials and select the target project.

Compatible Agents

  • Cursor: Integration via the "Install In Cursor" prompt in your project's MCP Server settings or dedicated deep link.
  • Claude Desktop: Integrate through Settings → Integrations menu.
  • Windsurf: Setup requires inputting the remote endpoint URL.
  • Make: Integration configured using the remote server URL.
  • Other MCP Interfaces: Configure using the provided remote endpoint address.

For granular setup instructions and region-specific URLs, consult the Remote Server Setup documentation.

Utilizing Dev Branches

Development work can be conducted safely within Keboola development branches without affecting production assets. Remote MCP servers scope all operations to a branch when each request carries the header X-Branch-Id: <branchId> (the remote counterpart of the KBC_BRANCH_ID parameter); requests without it run against the production branch. The branch identifier appears in the UI URL when you navigate to the branch (e.g., .../admin/projects/PROJECT_ID/branch/BRANCH_ID/dashboard). Ideally, the connecting AI client manages this header for you.
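
As a minimal sketch of the branch scoping described above, the helper below builds the extra HTTP header a client would attach to each request. The function name branch_headers is hypothetical; only the X-Branch-Id header name comes from this document.

```python
from typing import Optional


def branch_headers(branch_id: Optional[str] = None) -> dict[str, str]:
    """Return extra HTTP headers that scope a request to a dev branch.

    An empty dict means "no header", which the text above says
    falls back to the production branch.
    """
    if branch_id is None or not str(branch_id).strip():
        return {}
    return {"X-Branch-Id": str(branch_id).strip()}


# Example: merge the branch header into whatever headers the client already sends.
headers = {"Accept": "text/event-stream", **branch_headers("123456")}
print(headers)
```

A client would merge this dict into every request it sends, since the header is evaluated per request.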


Local Environment Configuration (Custom or Development Instances)

Execute the MCP server directly on your local machine to gain comprehensive operational oversight and facilitate rapid development cycles. Select this option when customization, local debugging, or swift iteration is paramount. This involves cloning the source repository, providing Keboola access credentials via environment variables or request headers (depending on the chosen communication method), installing prerequisites, and initiating the service. This path grants maximum adaptability (custom tooling, local diagnostics, offline iteration) but mandates manual setup, credential management, and update handling.

The server supports several transport protocols, selectable via the command line argument --transport <protocol>:

  • stdio: Default mode if no transport is specified. Uses standard input/output streams, primarily suited for local deployment interacting with a singular client.
  • streamable-http: Facilitates remote communication over HTTP utilizing a bidirectional streaming channel, enabling continuous message exchange between client and server. Connect via <url>/mcp (e.g., http://localhost:8000/mcp).
  • sse: Deprecated. Transition to streamable-http. Relies on Server-Sent Events (SSE) for unidirectional event streaming from server to client. Connect via <url>/sse (e.g., http://localhost:8000/sse).
  • http-compat: A legacy transport supporting both SSE and streamable-http. Currently deployed on remote Keboola services but scheduled for replacement by streamable-http exclusively.
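
The mapping from transport to connection URL can be sketched as follows. This is an illustration only: the helper name local_endpoint is hypothetical, the paths /mcp and /sse and the localhost:8000 default are taken from the examples above, and http-compat is left out because it serves both paths.

```python
from typing import Optional

# URL path per HTTP-based transport, as given in the transport descriptions.
_PATHS = {"streamable-http": "/mcp", "sse": "/sse"}


def local_endpoint(
    transport: str = "stdio", host: str = "localhost", port: int = 8000
) -> Optional[str]:
    """Return the URL a client should connect to, or None for stdio."""
    if transport == "stdio":
        # stdio uses the process's standard streams, so there is no URL.
        return None
    try:
        path = _PATHS[transport]
    except KeyError:
        raise ValueError(f"Unsupported transport: {transport!r}") from None
    return f"http://{host}:{port}{path}"


print(local_endpoint("streamable-http"))
```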

To interact with your project, the server needs Keboola credentials for your specific Keboola region. KBC_STORAGE_TOKEN and KBC_STORAGE_API_URL are always required. KBC_WORKSPACE_SCHEMA is required only when you use a custom storage token rather than the master token. KBC_BRANCH_ID is optional.

Credential Provisioning Methods:

  • Personal Use (Primarily stdio): Set environment variables prior to launching the server. All subsequent requests inherit these static credentials.
  • Multi-User Context: Embed the required variables within the request headers, ensuring each transaction carries its distinct authorization context.
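
For the environment-variable method, a small pre-flight check can catch a missing value before the server starts. The sketch below is not part of the server; the helper name missing_credentials is hypothetical, and it treats KBC_WORKSPACE_SCHEMA and KBC_BRANCH_ID as optional, which assumes master-token usage.

```python
import os
from typing import Mapping

# Variables the server always needs, per the credentials section.
REQUIRED = ("KBC_STORAGE_TOKEN", "KBC_STORAGE_API_URL")
# Needed only for custom tokens (schema) or branch scoping (branch ID).
OPTIONAL = ("KBC_WORKSPACE_SCHEMA", "KBC_BRANCH_ID")


def missing_credentials(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(missing_credentials({"KBC_STORAGE_TOKEN": "your_keboola_storage_token"}))
```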

KBC_STORAGE_TOKEN

This token authenticates your access to Keboola services.

Refer to the official Keboola documentation for generating and managing Storage API tokens.

Guidance: For restricted operational scope, employ a custom storage token; for comprehensive project access, utilize the master token.

KBC_WORKSPACE_SCHEMA

This identifier pinpoints your data processing workspace, essential for SQL execution. This is mandatory ONLY when utilizing a custom access token rather than the Master Token:

  • Master Token Usage: The workspace is automatically provisioned in the background.
  • Custom Token Usage: Follow this Keboola guide to obtain the required KBC_WORKSPACE_SCHEMA.

Important: If creating the workspace manually, ensure the "Grant read-only access to all Project data" option is selected.

Note: In BigQuery workspaces, KBC_WORKSPACE_SCHEMA corresponds to the Dataset Name; simply initiate the connection and retrieve the Dataset Name.

KBC_STORAGE_API_URL (Region Specification)

Your Keboola deployment region dictates the API endpoint URL. Determine your region by observing the URL in your browser when accessing your Keboola project interface:

Region | API Endpoint
AWS North America | https://connection.keboola.com
AWS Europe | https://connection.eu-central-1.keboola.com
Google Cloud EU | https://connection.europe-west3.gcp.keboola.com
Google Cloud US | https://connection.us-east4.gcp.keboola.com
Azure EU | https://connection.north-europe.azure.keboola.com
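
If you script your setup, the region table can be kept as a lookup. The sketch below copies the five rows above verbatim; the function name storage_api_url and the use of the region labels as keys are assumptions made for illustration.

```python
# Region label -> Storage API URL, copied from the table above.
STORAGE_API_URLS = {
    "AWS North America": "https://connection.keboola.com",
    "AWS Europe": "https://connection.eu-central-1.keboola.com",
    "Google Cloud EU": "https://connection.europe-west3.gcp.keboola.com",
    "Google Cloud US": "https://connection.us-east4.gcp.keboola.com",
    "Azure EU": "https://connection.north-europe.azure.keboola.com",
}


def storage_api_url(region: str) -> str:
    """Look up the value for KBC_STORAGE_API_URL by region label."""
    try:
        return STORAGE_API_URLS[region]
    except KeyError:
        known = ", ".join(sorted(STORAGE_API_URLS))
        raise ValueError(f"Unknown region {region!r}; expected one of: {known}") from None


print(storage_api_url("AWS Europe"))
```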

KBC_BRANCH_ID (Optional Scoping Parameter)

To target a specific Keboola development branch, set the ID via the KBC_BRANCH_ID parameter. The MCP server isolates its operations to this branch, guaranteeing changes do not propagate to the production environment.

  • Default behavior: Production branch is used if this parameter is omitted.
  • Development Scope: Set KBC_BRANCH_ID to the branch's numeric identifier (e.g., 123456). The ID is visible in the UI URL during branch navigation (e.g., .../admin/projects/PROJECT_ID/branch/BRANCH_ID/dashboard).
  • Remote Transport Override: For request-by-request modification, employ the HTTP header X-Branch-Id: <branchId> or KBC_BRANCH_ID: <branchId>.
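
Since the branch ID is read from the UI URL, it can also be extracted programmatically. This is a sketch under two assumptions: the helper name branch_id_from_url is hypothetical, and both the project and branch IDs are taken to be numeric (the text above confirms this only for the branch ID).

```python
import re
from typing import Optional

# Matches .../admin/projects/PROJECT_ID/branch/BRANCH_ID/... in a UI URL.
_BRANCH_RE = re.compile(r"/admin/projects/\d+/branch/(?P<branch>\d+)(?:/|$)")


def branch_id_from_url(url: str) -> Optional[str]:
    """Return the branch ID embedded in a Keboola UI URL, or None."""
    match = _BRANCH_RE.search(url)
    return match.group("branch") if match else None


# The host and project ID below are placeholders.
print(branch_id_from_url(
    "https://connection.keboola.com/admin/projects/42/branch/123456/dashboard"
))
```

A result of None corresponds to a production URL, in which case KBC_BRANCH_ID should simply be left unset.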

Software Acquisition

Prerequisites verification:

  • [ ] Python version 3.10 or newer installed
  • [ ] Authorized access to a Keboola project with administrative permissions
  • [ ] Installation of your chosen MCP client (e.g., Claude, Cursor)

Crucial: Ensure the uv package manager is present, as the MCP client will leverage it for automated server download and execution. Installing uv:

macOS/Linux:

# If Homebrew is absent, execute:
# /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Installation via Homebrew
brew install uv

Windows:

# Utilizing the official installer script
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Alternatively, via pip
pip install uv

# Or using the Winget utility
winget install --id=astral-sh.uv -e

Further installation methodologies are detailed in the official uv documentation.

Executing the Keboola MCP Server

Four distinct operational modes are available, contingent upon your specific requirements:

Option A: Integrated Mode

In this mode, the MCP server lifecycle is managed automatically by Claude or Cursor. No manual terminal command execution is necessary.

  1. Configure the appropriate settings within your MCP client application.
  2. The client transparently initiates the MCP server process upon requirement.

Claude Desktop Integration Parameters

  1. Access Claude (top-left menu) → Settings → Developer → Edit Config (create claude_desktop_config.json if absent).
  2. Incorporate the following JSON block:
{
  "mcpServers": {
    "keboola": {
      "command": "uvx",
      "args": ["keboola_mcp_server", "--transport", "<transport>"],
      "env": {
        "KBC_STORAGE_API_URL": "https://connection.YOUR_REGION.keboola.com",
        "KBC_STORAGE_TOKEN": "your_keboola_storage_token",
        "KBC_WORKSPACE_SCHEMA": "your_workspace_schema",
        "KBC_BRANCH_ID": "your_branch_id_optional"
      }
    }
  }
}
  3. Restart Claude Desktop to apply the modifications.

Configuration File Locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
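
To avoid hand-editing JSON, the configuration can be generated. The sketch below is one possible approach, not an official tool: the function name claude_config is hypothetical, optional variables are omitted when not supplied, and each command-line token is passed as its own element of args so that the client hands them to uvx as separate arguments.

```python
import json
from typing import Optional


def claude_config(
    api_url: str,
    token: str,
    transport: str = "stdio",
    workspace_schema: Optional[str] = None,
    branch_id: Optional[str] = None,
) -> dict:
    """Build the mcpServers entry for the Keboola MCP server."""
    env = {"KBC_STORAGE_API_URL": api_url, "KBC_STORAGE_TOKEN": token}
    if workspace_schema:
        env["KBC_WORKSPACE_SCHEMA"] = workspace_schema
    if branch_id:
        env["KBC_BRANCH_ID"] = branch_id
    return {
        "mcpServers": {
            "keboola": {
                "command": "uvx",
                # One list element per command-line token.
                "args": ["keboola_mcp_server", "--transport", transport],
                "env": env,
            }
        }
    }


# Placeholder values; paste the output into claude_desktop_config.json.
print(json.dumps(
    claude_config("https://connection.keboola.com", "your_keboola_storage_token"),
    indent=2,
))
```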

Cursor Integration Parameters

  1. Navigate to Settings → MCP.
  2. Select the option to "+ Add new global MCP Server".
  3. Apply the following configuration:
{
  "mcpServers": {
    "keboola": {
      "command": "uvx",
      "args": ["keboola_mcp_server", "--transport", "<transport>"],
      "env": {
        "KBC_STORAGE_API_URL": "https://connection.YOUR_REGION.keboola.com",
        "KBC_STORAGE_TOKEN": "your_keboola_storage_token",
        "KBC_WORKSPACE_SCHEMA": "your_workspace_schema",
        "KBC_BRANCH_ID": "your_branch_id_optional"
      }
    }
  }
}

Naming Convention: MCP server identifiers should be concise. Due to a combined length constraint (tool name + server name) typically around 60 characters, overly verbose names may be truncated or omitted by the Agent interface.
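
A quick way to check a proposed server name against this limit is sketched below. The 60-character budget is the approximate figure quoted above, not an exact specification, and the helper name fits_name_budget is hypothetical. Whether clients also count a separator character is not stated here, so the limit is left as a parameter.

```python
# Approximate combined budget quoted in the naming note; treat as a guideline.
MAX_COMBINED = 60


def fits_name_budget(server_name: str, tool_name: str, limit: int = MAX_COMBINED) -> bool:
    """Return True if server name plus tool name stay within the budget."""
    return len(server_name) + len(tool_name) <= limit


# "create_sql_transformation" is one of the longer tool names in this server.
print(fits_name_budget("keboola", "create_sql_transformation"))
```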

Cursor Configuration for Windows WSL Environments

If the MCP server is being executed via Windows Subsystem for Linux (WSL) concurrently with Cursor, use this execution wrapper:

{
  "mcpServers": {
    "keboola":{
      "command": "wsl.exe",
      "args": [
          "bash",
          "-c '",
          "export KBC_STORAGE_API_URL=https://connection.YOUR_REGION.keboola.com &&",
          "export KBC_STORAGE_TOKEN=your_keboola_storage_token &&",
          "export KBC_WORKSPACE_SCHEMA=your_workspace_schema &&",
          "export KBC_BRANCH_ID=your_branch_id_optional &&",
          "/snap/bin/uvx keboola_mcp_server --transport <transport>",
          "'"
      ]
    }
  }
}

Option B: Local Source Code Execution Mode

Intended for developers actively modifying the MCP server source code:

  1. Clone the repository and establish the local Python environment.
  2. Direct Claude/Cursor to utilize your local Python interpreter path:
{
  "mcpServers": {
    "keboola": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": [
        "-m",
        "keboola_mcp_server",
        "--transport",
        "<transport>"
      ],
      "env": {
        "KBC_STORAGE_API_URL": "https://connection.YOUR_REGION.keboola.com",
        "KBC_STORAGE_TOKEN": "your_keboola_storage_token",
        "KBC_WORKSPACE_SCHEMA": "your_workspace_schema",
        "KBC_BRANCH_ID": "your_branch_id_optional"
      }
    }
  }
}

Option C: Manual Command Line Interface (Testing Purposes Only)

For expedient testing or debugging validation, execution can occur directly within a terminal session:

# Define environmental parameters
export KBC_STORAGE_API_URL=https://connection.YOUR_REGION.keboola.com
export KBC_STORAGE_TOKEN=your_keboola_storage_token
export KBC_WORKSPACE_SCHEMA=your_workspace_schema
export KBC_BRANCH_ID=your_branch_id_optional

uvx keboola_mcp_server --transport sse

Caveat: The command above selects the SSE transport explicitly (--transport sse) and listens for incoming SSE connections on localhost:8000. Use --host and --port to change the bind address. Because SSE is deprecated, pass --transport streamable-http instead when your client supports it.

Note: Manual execution is strictly for diagnostics. Normal operation with Claude or Cursor mandates using the configuration methods outlined above.
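
When testing a manually started server, it helps to confirm that something is accepting connections on the expected port before pointing a client at it. The sketch below only checks TCP reachability and does not speak MCP; the helper name is_listening is hypothetical, and the defaults mirror the localhost:8000 address mentioned above.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False


print("server reachable:", is_listening())
```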

Option D: Containerized Deployment via Docker

docker pull keboola/mcp-server:latest

docker run \
  --name keboola_mcp_server \
  --rm \
  -it \
  -p 127.0.0.1:8000:8000 \
  -e KBC_STORAGE_API_URL="https://connection.YOUR_REGION.keboola.com" \
  -e KBC_STORAGE_TOKEN="YOUR_KEBOOLA_STORAGE_TOKEN" \
  -e KBC_WORKSPACE_SCHEMA="YOUR_WORKSPACE_SCHEMA" \
  -e KBC_BRANCH_ID="YOUR_BRANCH_ID_OPTIONAL" \
  keboola/mcp-server:latest \
  --transport sse \
  --host 0.0.0.0

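Note: With the options above, the server inside the container uses the SSE transport and binds to 0.0.0.0:8000, and the -p mapping publishes it on localhost:8000 of the host only. Adjust the host side of the port mapping (-p) to expose it on a different address or port.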

Manual Server Initiation Required?

Usage Scenario | Manual Start Necessary? | Recommended Configuration Path
Using Claude/Cursor | No | Configure within the application settings
Local MCP Development | No (Client manages start) | Point configuration to local Python executable
Ad-hoc CLI Testing | Yes | Execute commands in a terminal session
Docker Container Usage | Yes | Run the specified Docker container

Interacting with the MCP Server

Once your chosen MCP client (e.g., Claude/Cursor) is correctly configured and actively running, initiate data requests against your Keboola assets:

Validation Check

Begin with a fundamental query to confirm end-to-end connectivity:

Provide a catalog of all buckets and tables present within my Keboola workspace.

Illustrative Use Cases

Data Discovery:

  • "Identify all stored datasets pertaining to customer records."
  • "Execute a selection query to rank the top ten revenue generators."

Analytical Tasks:

  • "Perform an analysis on quarterly sales metrics broken down by geographical segment."
  • "Determine the statistical correlation between client age cohort and average transaction value."

Data Workflow Automation:

  • "Generate a SQL transformation script that merges customer profiles with transaction logs."
  • "Trigger the data loading job associated with my external Salesforce extractor component."

System Compatibility

Agent Platform Support Matrix

MCP Agent | Operational Status | Communication Protocol
Claude (Desktop & Web) | ✅ Confirmed | stdio
Cursor | ✅ Confirmed | stdio
Windsurf, Zed, Replit | ✅ Confirmed | stdio
Codeium, Sourcegraph | ✅ Confirmed | HTTP+SSE
Custom MCP Implementations | ✅ Confirmed | HTTP+SSE or stdio

Exposed Operational Tools

Agents automatically adapt to the available toolset.

Domain | Tool Name | Function Description
Project Core | get_project_info | Outputs structural metadata concerning the Keboola project environment
Storage Layer | get_bucket | Fetches comprehensive details for a designated storage bucket
Storage Layer | get_table | Retrieves specifics of a table, including database mapping and schema definition
Storage Layer | list_buckets | Enumerate all storage buckets within the project scope
Storage Layer | list_tables | Enumerate all tables housed within a specified bucket
Storage Layer | update_description | Modify descriptive metadata for buckets, tables, or individual column attributes
SQL Engine | query_data | Executes arbitrary SQL SELECT statements against the underlying data warehouse
Configuration Mgmt | add_config_row | Adds a new row entry to an existing component configuration structure
Configuration Mgmt | create_config | Generates a top-level configuration object for a component
Configuration Mgmt | create_sql_transformation | Constructs a new SQL transformation based on input SQL code blocks
Configuration Mgmt | find_component_id | Locates component identifiers matching a descriptive text query
Configuration Mgmt | get_component | Fetches detailed configuration metadata for a component via its ID
Configuration Mgmt | get_config | Retrieves the full configuration details for a specified component/transformation
Configuration Mgmt | get_config_examples | Retrieves sample configuration templates applicable to a component
Configuration Mgmt | list_configs | Lists all configuration objects, with optional filtering capabilities
Configuration Mgmt | list_transformations | Lists all defined transformation configurations within the project
Configuration Mgmt | update_config | Modifies the root definition of a component configuration
Configuration Mgmt | update_config_row | Modifies a specific configuration row within a component definition
Configuration Mgmt | update_sql_transformation | Updates the definition of an existing SQL transformation artifact
Flow Orchestration | create_conditional_flow | Provisions a workflow utilizing the keboola.flow definition
Flow Orchestration | create_flow | Provisions a workflow utilizing the legacy keboola.orchestrator definition
Flow Orchestration | get_flow | Retrieves the configuration details for a specific workflow
Flow Orchestration | get_flow_examples | Retrieves sample definitions for valid workflow structures
Flow Orchestration | get_flow_schema | Returns the JSON schema structure applicable to the requested flow type
Flow Orchestration | list_flows | Enumerates all configured workflow definitions in the project
Flow Orchestration | update_flow | Modifies an extant workflow definition
Job Execution | get_job | Fetches granular status information for a specific job ID
Job Execution | list_jobs | Lists recent jobs, supporting filtering, sorting, and pagination
Job Execution | run_job | Initiates an asynchronous execution task for a component or transformation
Data Applications | get_data_apps | Retrieves details for a specific Data App or lists all Apps in the project
Data Applications | modify_data_app | Creates a new Data App or updates an existing one
Data Applications | deploy_data_app | Manages the deployment status (active/suspended) of Streamlit Data Applications
Documentation Access | docs_query | Provides semantic answers derived exclusively from Keboola platform documentation
Utility | create_oauth_url | Generates a secure OAuth authorization URI for component setup
Utility | search | Performs a broad search across project artifacts based on name substrings
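
Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 messages using the tools/call method. The sketch below shows the shape of such a message for illustration; the helper name tool_call is hypothetical, and the argument name sql_query is an assumption, since the tool parameter names are not listed in this document.

```python
import itertools
import json

# Simple incrementing request IDs, as JSON-RPC requires a unique id per request.
_ids = itertools.count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "sql_query" is a hypothetical argument name; check the tool schema for the real one.
request = tool_call("query_data", {"sql_query": "SELECT 1"})
print(json.dumps(request, indent=2))
```

In practice the MCP client library builds and sends these messages for you; the sketch is only meant to make the tool table concrete.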

Troubleshooting Guide

Frequent Error Resolution

Symptom | Remediation Strategy
Authorization Failure | Validate the integrity and permissions associated with KBC_STORAGE_TOKEN
Workspace Reference Error | Confirm the accuracy of the KBC_WORKSPACE_SCHEMA setting
Connection Interruption | Inspect local network connectivity and firewall rules
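
For authorization failures, one way to test a token outside the MCP server is to call the Storage API directly. The sketch below only constructs the request and does not send it. It assumes a token-verification endpoint at /v2/storage/tokens/verify authenticated with the X-StorageApi-Token header; neither detail appears in this document, so confirm both against the Keboola Storage API documentation. The helper name build_verify_request is hypothetical.

```python
import urllib.request


def build_verify_request(api_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a token-verification request.

    Assumption: the endpoint path and header name below match the
    Keboola Storage API; verify them in the official API reference.
    """
    url = api_url.rstrip("/") + "/v2/storage/tokens/verify"
    return urllib.request.Request(
        url, headers={"X-StorageApi-Token": token}, method="GET"
    )


req = build_verify_request("https://connection.keboola.com", "your_keboola_storage_token")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

An HTTP 401 response from such a call would point to an invalid or expired token rather than a problem in the MCP server itself.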

Development Environment Setup

Initial Dependency Installation

Standard setup:

uv sync --extra dev

Using this baseline, execute uv run tox to run automated tests and conformity checks.

Optimized setup (recommended for full development lifecycle):

uv sync --extra dev --extra tests --extra integtests --extra codestyle

This optimized command installs the packages required for testing and style enforcement, enabling IDEs (like VS Code or Cursor) to lint the code accurately and run tests locally.

Integration Testing Execution

To execute local integration validation suites, use the command: uv run tox -e integtests. NOTE: This process necessitates the presence of the following environmental variables:

  • INTEGTEST_STORAGE_API_URL
  • INTEGTEST_STORAGE_TOKEN
  • INTEGTEST_WORKSPACE_SCHEMA

These required values must be sourced from a dedicated Keboola project reserved exclusively for integration testing purposes.

Lock File Management

When dependencies are added or removed, the uv.lock manifest must be regenerated. For release preparation, consider updating existing package versions using uv lock --upgrade.

Documentation Synchronization

If modifications are made to any tool descriptions (i.e., docstrings within the tool functions), the TOOLS.md artifact must be regenerated to mirror these functional updates:

uv run python -m src.keboola_mcp_server.generate_tool_docs

Support Channels and Feedback Submission

For reporting defects, proposing enhancements, or seeking assistance, the designated primary pathway is by submitting a new issue on GitHub.

The development contributors actively monitor the issue tracker and aim to respond promptly. For general questions about the broader Keboola platform, consult the official Keboola documentation.
