Cryo Ethereum Ledger Data Access Service 🧊 (MCP)

This service acts as an MCP gateway for the high-performance Cryo data retrieval utility. By exposing Cryo's capabilities through a standardized MCP server interface, querying complex blockchain states becomes straightforward for any compatible client.

LLM Workflow: Executing Structured Query Language (SQL) Operations

To leverage SQL capabilities against the extracted blockchain artifacts, adhere to this three-phase procedure:

Artifact Retrieval using query_dataset: python retrieval_result = query_dataset( dataset="blocks", # Specify ledger component: 'transactions', 'logs', etc. blocks="15000000:15001000", # Define block span or use blocks_from_latest=100 output_format="parquet" # Essential: Parquet format required for efficient SQL processing ) artifact_paths = retrieval_result.get("files", []) # Secure the list of generated file references
Schema Inspection via get_sql_table_schema: python # Inspect the structure of the retrieved data files data_schema = get_sql_table_schema(artifact_paths[0]) # Review available fields, types, and sample data points
SQL Execution using query_sql: python # Method A: Rely on DuckDB's automatic file association via table name sql_output = query_sql( query="SELECT block_number, timestamp, gas_used FROM blocks", files=artifact_paths # Supply all relevant file locations )

# Method B: Explicitly invoke Parquet reading within the query string sql_output = query_sql( query=f"SELECT block_number, timestamp, gas_used FROM read_parquet('{artifact_paths[0]}')", files=artifact_paths )

Alternatively, utilize the monolithic function query_blockchain_sql which bundles extraction and querying: python

Integrated approach example

integrated_result = query_blockchain_sql( sql_query="SELECT * FROM transactions WHERE value > 0", dataset="transactions", blocks_from_latest=500 )

Consult examples/sql_workflow_example.py for a comprehensive demonstration.

Core Capabilities

Complete Cryo Coverage: Access to the full spectrum of Cryo's extracted ledger components.
MCP Native: Full compatibility across the Model Completion Protocol ecosystem.
Granular Control: Supports all established Cryo input parameters for filtering and range definition.
Block Scope Definition: Utilize absolute numbers, latest block references, or relative offsets.
Entity Scoping: Precise data isolation based on contract identifiers.
Immediacy: Direct retrieval pointers for the most recently finalized block.
Format Versatility: Data output supported in JSON, CSV, and optimized Parquet.
Metadata Provision: Retrieval of detailed component schemas and representative samples.
Direct SQL: Ability to execute arbitrary SQL commands against materialized blockchain records.

Setup Prerequisites (Optional)

If executing locally via uvx, this step is bypassed.

bash

Installation via UV package manager (recommended)

uv tool install cryo-mcp

System Dependencies

Python Interpreter version 3.8 or newer.
The uv package manager.
A functional installation of the Cryo backend system.
Credentials for an accessible Ethereum RPC endpoint.
DuckDB runtime environment (mandatory for SQL operations).

Initial Operation (Cloud Context)

Initiate interactive setup via claude mcp add.
Specify uvx as the execution binary.
Input arguments: cryo-mcp --rpc-url <ETH_RPC_URL> [--data-dir <DATA_DIR>]
Configuration can alternatively be supplied via environment variables (ETH_RPC_URL, CRYO_DATA_DIR).

This establishes a persistent connection profile for claude to interact with your configured Cryo instance.

Exposed MCP Interfaces

Cryo MCP makes the following functions available:

`list_datasets()`

Enumerates all discoverable data sets managed by Cryo.

Example: python client.list_datasets()

`query_dataset()`

Initiates the extraction of a specified ledger component based on constraints.

Parameters: - dataset (str): The target component name ('blocks', 'transactions', 'logs'). - blocks (str, optional): String defining the block interval (e.g., '1000:1010'). - start_block (int, optional): Lower bound block index (alternative to blocks). - end_block (int, optional): Upper bound block index (alternative to blocks). - use_latest (bool, optional): If true, targets the current tip of the chain. - blocks_from_latest (int, optional): Count of recent blocks to process. - contract (str, optional): Filter results strictly to events or data associated with this address. - output_format (str, optional): Serialization type ('json', 'csv', 'parquet'). - include_columns (list, optional): Explicitly request certain fields. - exclude_columns (list, optional): Explicitly omit default fields.

Example: python

Retrieve transactions spanning two specific block milestones

client.query_dataset('transactions', blocks='15M:15.01M')

client.query_dataset('logs', blocks_from_latest=100, contract='0x1234...')

Obtain metadata for the very latest block

client.query_dataset('blocks', use_latest=True)

`lookup_dataset()`

Retrieves metadata, including schema definition and sample data, for a component.

Parameters: - name (str): Identifier of the data component to investigate. - sample_start_block (int, optional): Block index to begin sample capture. - sample_end_block (int, optional): Block index to conclude sample capture. - use_latest_sample (bool, optional): Base sample generation on the latest block. - sample_blocks_from_latest (int, optional): Number of recent blocks for sample extraction.

Example: python client.lookup_dataset('logs')

`get_latest_ethereum_block()`

Returns the current finalized block header details.

Example: python client.get_latest_ethereum_block()

SQL Querying Interfaces

Cryo MCP furnishes specialized interfaces for executing analytical queries over retrieved data:

`query_sql()`

Executes a user-defined SQL statement against locally provisioned data artifacts.

Parameters: - query (str): The complete SQL expression to be evaluated. - files (list, optional): List of file paths containing the data sources. If omitted, the tool scans the configured data directory. - include_schema (bool, optional): Flag to embed schema details within the output response.

Example: python

Query executed against all known parquet segments

client.query_sql("SELECT * FROM read_parquet('/data/segment_001.parquet') LIMIT 10")

Targeted execution against a specific artifact file

client.query_sql( "SELECT block_number FROM read_parquet('/data/blocks.parquet') WHERE difficulty > 1000000000000000", files=['/data/blocks.parquet'] )

`query_blockchain_sql()`

Streamlines the process: it handles data acquisition based on parameters, then executes the supplied SQL query.

Parameters: - sql_query (str): The analytical SQL command. - dataset (str, optional): The source ledger component ('blocks', 'transactions', etc.). - blocks (str, optional): Block range specification string. - start_block (int, optional): Starting block index. - end_block (int, optional): Ending block index. - use_latest (bool, optional): Target the latest block. - blocks_from_latest (int, optional): Number of trailing blocks to process. - contract (str, optional): Filter data extraction by contract address. - force_refresh (bool, optional): Bypass cache checks and re-download data. - include_schema (bool, optional): Request schema inclusion in the final output.

Example: python

Automatically pulls the last 100 blocks of transaction data, then runs the analysis

client.query_blockchain_sql( sql_query="SELECT to_address, SUM(gas_used) FROM transactions GROUP BY to_address ORDER BY 2 DESC LIMIT 5", dataset="transactions", blocks_from_latest=100 )

`list_available_sql_tables()`

Reports all recognized entities ready for direct SQL access within the current data store.

Example: python client.list_available_sql_tables()

`get_sql_table_schema()`

Retrieves the structural blueprint for a specified materialized data file.

Parameters: - file_path (str): Absolute or relative path to the target parquet file.

Example: python client.get_sql_table_schema("/persisted/logs_007.parquet")

`get_sql_examples()`

Provides illustrative SQL snippets tailored for common blockchain data analysis patterns.

Example: python client.get_sql_examples()

Server Configuration Parameters

When launching the Cryo MCP node, these command-line arguments modify operational settings:

--rpc-url URL: Specifies the Ethereum network access point (overrides ETH_RPC_URL env var).
--data-dir PATH: Defines the persistence location for downloaded ledger artifacts (overrides CRYO_DATA_DIR, defaults to a local hidden directory).

Environment Variable Mapping

ETH_RPC_URL: Default endpoint for RPC communication, used if --rpc-url is absent.
CRYO_DATA_DIR: Default repository for data caching, used if --data-dir is absent.

Advanced Operation Scenarios

Decomposed SQL Analysis on Ledger Data

This method decouples data acquisition from computation, offering granular control:

python

Phase 1: Acquisition and serialization to Parquet

download_manifest = client.query_dataset( dataset="transactions", blocks_from_latest=1000, output_format="parquet" )

Phase 2: SQL evaluation using returned file references

reference_list = download_manifest.get("files", []) client.query_sql( query=f""" SELECT to_address AS recipient_id, COUNT(*) AS transaction_count, SUM(gas_used) AS accumulated_gas, AVG(gas_used) AS mean_gas FROM read_parquet('{reference_list[0]}') WHERE to_address IS NOT NULL GROUP BY recipient_id ORDER BY accumulated_gas DESC LIMIT 20 """, files=reference_list )

Unified SQL Execution Pathway

For rapid analysis, use the combined tool to manage internal data fetching:

python

Identify high-throughput contract interactions

client.query_blockchain_sql( sql_query=""" SELECT to_address, COUNT(*) as interaction_volume, SUM(value) as total_value_transferred FROM read_parquet('/data/txns.parquet') WHERE to_address IS NOT NULL AND value > 0 GROUP BY to_address ORDER BY interaction_volume DESC LIMIT 20 """, dataset="transactions", blocks_from_latest=1000 )

Locate blocks that processed the maximum volume of transactions

client.query_blockchain_sql( sql_query=""" SELECT block_number, COUNT(*) as contained_tx_count FROM read_parquet('/data/txns.parquet') GROUP BY block_number ORDER BY contained_tx_count DESC LIMIT 10 """, dataset="transactions", blocks="15M:16M" )

Crucial Instruction: For optimal performance in SQL evaluation via DuckDB, ensure output_format="parquet" is specified during the data retrieval phase. When utilizing query_blockchain_sql, reference the persisted files within your query string explicitly using the read_parquet() function.

Specifying Block Ranges

This service fully recognizes Cryo's block addressing syntax:

python

Use explicit numerical indices

client.query_dataset('transactions', blocks='15000000:15001000')

Utilize 'M' (Million) notation for large spans

client.query_dataset('logs', blocks='15M:15.01M')

Fetch data relative to the current chain tip

client.query_dataset('blocks', blocks_from_latest=100)

Contract-Specific Data Isolation

Filter output streams to data relevant only to a specific contract address:

python

Extract event logs for a known stablecoin contract

client.query_dataset('logs', blocks='16M:16.1M', contract='0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48')

Field Projection Control

Minimize data transfer overhead by requesting only necessary data fields:

python

Secure only the block ID and timestamp fields for a range

client.query_dataset('blocks', blocks='16M:16.1M', include_columns=['number', 'timestamp'])

Development Repository Structure

cryo-mcp/ ├── cryo_mcp/ # Core Python modules │ ├── init.py # Package initialization module │ ├── server.py # Primary MCP server logic │ ├── sql.py # Module handling DuckDB interactions ├── tests/ # Unit and integration test suite │ ├── test_*.py # Individual test scripts ├── pyproject.toml # Project metadata and dependency definitions ├── README.md # Current documentation file

Executing Validation Suite

Run comprehensive tests using: uv run pytest

Licensing

This utility is released under the MIT License.

Acknowledgment

Service foundation built upon the exceptional Cryo data engine from Paradigm.
Adheres to the standardized MCP protocol for interface definitions.

cryo-mcp-data-extractor

Author

z80dev

Quick Info

Actions

Tags

Cryo Ethereum Ledger Data Access Service 🧊 (MCP)

LLM Workflow: Executing Structured Query Language (SQL) Operations

Integrated approach example

Core Capabilities

Setup Prerequisites (Optional)

Installation via UV package manager (recommended)

System Dependencies

Initial Operation (Cloud Context)

Exposed MCP Interfaces

list_datasets()

query_dataset()

Retrieve transactions spanning two specific block milestones

Fetch logs related to a specific address over the last 100 finalized blocks

Obtain metadata for the very latest block

lookup_dataset()

get_latest_ethereum_block()

SQL Querying Interfaces

query_sql()

Query executed against all known parquet segments

Targeted execution against a specific artifact file

query_blockchain_sql()

Automatically pulls the last 100 blocks of transaction data, then runs the analysis

list_available_sql_tables()

get_sql_table_schema()

get_sql_examples()

Server Configuration Parameters

Environment Variable Mapping

Advanced Operation Scenarios

Decomposed SQL Analysis on Ledger Data

Phase 1: Acquisition and serialization to Parquet

Phase 2: SQL evaluation using returned file references

Unified SQL Execution Pathway

Identify high-throughput contract interactions

Locate blocks that processed the maximum volume of transactions

Specifying Block Ranges

Use explicit numerical indices

Utilize 'M' (Million) notation for large spans

Fetch data relative to the current chain tip

Contract-Specific Data Isolation

Extract event logs for a known stablecoin contract

Field Projection Control

Secure only the block ID and timestamp fields for a range

Development Repository Structure

Executing Validation Suite

Licensing

Acknowledgment

See Also

`list_datasets()`

`query_dataset()`

`lookup_dataset()`

`get_latest_ethereum_block()`

`query_sql()`

`query_blockchain_sql()`

`list_available_sql_tables()`

`get_sql_table_schema()`

`get_sql_examples()`