mlc-data-fabricator

Facilitates interaction with the MLC Bakery service suite via an MCP-compliant abstraction layer. Enables programmatic discovery of data assets, retrieval of sample records, and verification of descriptive attributes. Supports robust data exploration workflows interfacing with the core MLC Bakery API infrastructure.

Author


jettyio

MIT License

Quick Info

GitHub Stars: 5
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

apis, bakery, api, bakery api, mlc bakery, mlcbakery, provides

MLC Data Fabricator Service

A core service, written in Python with FastAPI and SQLAlchemy, that manages the lifecycle and lineage of machine learning artifacts. It has built-in support for validating metadata against the Croissant standard.

Key Capabilities

  • Comprehensive asset organization, including grouping mechanisms (collections)
  • Tracking and auditing of data entities
  • Chronological activity ledger maintenance
  • Mapping of dependency and ancestry relationships
  • Exposure via standardized Representational State Transfer (REST) interfaces

Deployment via Containerization

  1. Configuration Setup: Duplicate the template file to establish runtime environment variables:

    cp env.example .env

  2. Orchestration Startup: The fabrication unit requires PostgreSQL for persistent storage and Typesense for indexed searching. The MCP intermediary layer communicates with the primary API service, which in turn interacts with the underlying data persistence layer.

    docker compose up -d

  3. Schema Initialization: Execute necessary database schema modifications using Alembic. The uv run utility executes required operational commands within the controlled project environment.

    docker compose exec db psql -U postgres -c "CREATE DATABASE mlcbakery;"
    docker compose exec api alembic upgrade head

Service Access Points

The primary API endpoint is accessible by default on the local loopback interface.

  • Interactive Documentation (Swagger UI): http://bakery.localhost/docs
  • Alternative Documentation (ReDoc): http://bakery.localhost/redoc
  • MCP Stream Interface: http://mcp.localhost/mcp

Modifying the local hosts file may be necessary for these hostnames to resolve during local build validation.
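If bakery.localhost or mcp.localhost does not resolve on your machine, one possible remedy is a hosts-file entry that points both names at the loopback address. The Python sketch below only builds that line. The helper name `hosts_entry` and the address 127.0.0.1 are assumptions made for illustration and are not part of the project.

```python
def hosts_entry(hostnames, address="127.0.0.1"):
    """Build one hosts-file line mapping every hostname to the given address."""
    return f"{address} " + " ".join(hostnames)


# Hostnames taken from the access points listed in this section.
line = hosts_entry(["bakery.localhost", "mcp.localhost"])
print(line)
```

On Linux and macOS the resulting line would go into /etc/hosts, which requires administrator rights to edit.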

Local Service Execution

Prerequisites for Local Build

  • Runtime environment must possess Python version 3.12 or newer
  • The uv package manager utility must be installed

Setup Procedure

  1. Source Code Acquisition:

    git clone git@github.com:jettyio/mlcbakery.git
    cd mlcbakery

  2. Dependency Installation: uv utilizes the declarative dependency definitions found in pyproject.toml. It automatically provisions an isolated execution context if one is not present.

    curl -LsSf https://astral.sh/uv/install.sh | sh
    pip install poetry uvicorn
    uv run poetry install --no-interaction --no-ansi --no-root --with mcp

Invoke the FastAPI application server using uvicorn:

    # Verify that the DATABASE_URL configuration is present in your .env file
    uv run uvicorn mlcbakery.main:app --reload --host 0.0.0.0 --port 8000
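Before starting the server, the presence of DATABASE_URL in .env can be checked with a few lines of Python. This is a minimal sketch and not part of the project: `parse_env` and `missing_keys` are hypothetical helpers, and the parser handles only plain KEY=VALUE lines (no `export` prefixes, no multi-line values).

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_keys(text, required=("DATABASE_URL",)):
    """Return the required keys that are absent or empty."""
    settings = parse_env(text)
    return [key for key in required if not settings.get(key)]


env_path = Path(".env")
if env_path.exists():
    print("Missing settings:", missing_keys(env_path.read_text()))
```

An empty list means the variable is defined. The `required` tuple can be extended with other settings you rely on.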

Security Mechanisms

The Bakery enforces request authentication via two distinct methods: standard JSON Web Tokens (JWT) and a designated "Master Administrator Credential." Both are configured via environment settings in the .env file. Either credential must be presented in the HTTP request's Authorization header, prefixed with the scheme "Bearer".

  • ADMIN_AUTH_TOKEN: A static secret that grants elevated, unrestricted access to all service resources.
  • JWT_VERIFICATION_STRATEGY: The uniform resource locator (URL) pointing to a trusted authority for validating JWT signatures (e.g., Clerk). A development instance of Clerk is accessible; users can self-register via flows.jetty.io (experimental access) or contact dev@jetty.io for organizational cloud access.
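As an illustration of the Bearer scheme described above, the following sketch builds a request that carries a token in the Authorization header using only the Python standard library. The helper name `build_authenticated_request` and the placeholder token are assumptions, and the URL is just one of the documented access points standing in for whichever endpoint you intend to call.

```python
import os
import urllib.request


def build_authenticated_request(url, token):
    """Create a GET request that presents the token with the Bearer scheme."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Fall back to a placeholder so the sketch runs without a configured environment.
token = os.environ.get("ADMIN_AUTH_TOKEN", "example-admin-token")
request = build_authenticated_request("http://bakery.localhost/docs", token)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it, which requires the service to be running. A JWT would be supplied in exactly the same way in place of the admin token.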

Running Validation Suites

The internal test suite is configured to interact with a PostgreSQL instance specified by the DATABASE_URL environment variable. You may reuse the connection parameters from your development setup or define a distinct database endpoint within .env for isolated testing (adjust connection string accordingly).

    # Confirm DATABASE_URL variable is established in the execution shell or .env file
    uv run pytest
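When a separate database is used for testing, it can help to confirm that the test connection string does not point at the development database. The sketch below compares two URLs with the standard library. Both connection strings and the helper names are hypothetical; only the database name mlcbakery comes from the setup steps.

```python
from urllib.parse import urlsplit


def database_name(url):
    """Extract the database name (the URL path without its leading slash)."""
    return urlsplit(url).path.lstrip("/")


def targets_same_database(test_url, dev_url):
    """True when both URLs point at the same host, port, and database."""
    a, b = urlsplit(test_url), urlsplit(dev_url)
    return (a.hostname, a.port, a.path) == (b.hostname, b.port, b.path)


# Hypothetical connection strings for illustration only.
dev = "postgresql://postgres:secret@localhost:5432/mlcbakery"
test = "postgresql://postgres:secret@localhost:5432/mlcbakery_test"
print(database_name(test), targets_same_database(test, dev))
```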
