PubChemDataAccessor-MCP
Interface to query and fetch comprehensive chemical entity data from the PubChem database. This service provides programmatic access to molecular attributes, structural representations, and physicochemical metrics via a streamlined Model Context Protocol endpoint.
Author

JackKuo666
PubChem Data Access Service (MCP)
🧪 Empower intelligent agents with direct, structured access to PubChem's vast repository of chemical structures and properties through a standardized MCP interface.
This PubChem MCP Service acts as a robust intermediary, allowing AI models to execute complex chemical database lookups and retrieve detailed compound specifications programmatically.
✨ Primary Capabilities
- 🔍 Chemical Entity Lookup: Query PubChem records using IUPAC names, SMILES strings, or Compound Identifiers (CID) ✅
- ⚛️ Structural Data Retrieval: Obtain molecular geometries and canonical identifiers ✅
- 📈 Property Extraction: Access aggregated chemical, physical, and thermodynamic data ✅
- ⚙️ Compound Synthesis Queries: Formulate multi-criteria searches for precise data fetching ✅
- 🖼️ Molecular Visualization Hooks: Functionality to generate visual representations of structures 📝
- 🧮 Comparative Analysis: Tools for cross-compound property benchmarking 📝
- 💾 Local Caching: Mechanism to persist frequently accessed compound data for low-latency retrieval 📝
- 🧠 Specialized Chemistry Workflows: Pre-defined analytical routines for complex chemical assessments 📝
🚀 Deployment Guide
Installation via Smithery Utility
Install the PubChem Server for your AI client platform automatically using Smithery:
Claude Client
```bash
npx -y @smithery/cli@latest install @JackKuo666/pubchem-mcp-server --client claude --config "{}"
```
Cursor Client
Paste the following command into Cursor Settings under MCP → Add new server:
- Mac/Linux:
```sh
npx -y @smithery/cli@latest run @JackKuo666/pubchem-mcp-server --client cursor --config "{}"
```
Windsurf
```sh
npx -y @smithery/cli@latest install @JackKuo666/pubchem-mcp-server --client windsurf --config "{}"
```
CLine Integration
```sh
npx -y @smithery/cli@latest install @JackKuo666/pubchem-mcp-server --client cline --config "{}"
```
Manual Setup (Local Environment)
Install the server as a tool using uv:
```bash
uv tool install pubchem-mcp-server
```
For development environment setup:
```bash
# Clone repository
git clone https://github.com/JackKuo666/PubChem-MCP-Server.git
cd PubChem-MCP-Server

# Initialize and activate virtual environment
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```
📊 Operational Examples
Execute the MCP interface server:
```bash
python pubchem_server.py
```
Once operational, interact with the service via your AI assistant. Below are functional Python examples demonstrating tool invocation:
Example 1: Name-based Compound Search
```python
result = await mcp.use_tool("search_pubchem_by_name", {
    "name": "aspirin",
    "max_results": 3
})
print(result)
```
Example 2: SMILES String Query
```python
result = await mcp.use_tool("search_pubchem_by_smiles", {
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",  # Acetylsalicylic Acid SMILES
    "max_results": 2
})
print(result)
```
Example 3: Retrieval by Compound Identifier
```python
result = await mcp.use_tool("get_pubchem_compound_by_cid", {
    "cid": 2244  # CID for Aspirin
})
print(result)
```
Example 4: Complex Parameterized Search
```python
result = await mcp.use_tool("search_pubchem_advanced", {
    "name": "caffeine",
    "formula": "C8H10N4O2",
    "max_results": 2
})
print(result)
```
These examples illustrate the primary interaction patterns for the four core functionalities offered by the PubChem MCP Service.
🛠️ Available MCP Functions
The PubChem MCP Service exposes the following callable functions:
search_pubchem_by_name
Locates chemical entities in PubChem based on their common or systematic name.
Arguments:
- name (string): The chemical name string to search for.
- max_results (integer, optional): Upper bound on the number of matching records returned (Default: 5).
Output: A list of record summaries.
search_pubchem_by_smiles
Performs a substructure or exact match search using SMILES notation.
Arguments:
- smiles (string): Canonical or isomeric SMILES representation.
- max_results (integer, optional): Maximum records to return (Default: 5).
Output: A list of record summaries.
get_pubchem_compound_by_cid
Fetches comprehensive metadata for a compound given its unique PubChem CID.
Arguments:
- cid (integer): The specific PubChem Compound ID.
Output: A single, detailed compound metadata object.
search_pubchem_advanced
Executes a flexible search combining multiple attribute filters.
Arguments:
- name (string, optional): Filter by name.
- smiles (string, optional): Filter by SMILES.
- formula (string, optional): Filter by molecular formula.
- cid (integer, optional): Filter by Compound ID.
- max_results (integer, optional): Limit the result set size (Default: 5).
Output: A list of matching record summaries.
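Because every filter of `search_pubchem_advanced` is optional, it can help to assemble and check the argument dictionary on the client side before sending it. This is a minimal sketch: `build_advanced_query` is a hypothetical helper, and the rule that at least one filter must be present is an assumption, not documented server behavior.

```python
from typing import Any, Optional


def build_advanced_query(
    name: Optional[str] = None,
    smiles: Optional[str] = None,
    formula: Optional[str] = None,
    cid: Optional[int] = None,
    max_results: int = 5,
) -> dict[str, Any]:
    """Assemble arguments for search_pubchem_advanced, omitting unset filters.

    Hypothetical client-side helper; the server does not ship this function.
    """
    if max_results < 1:
        raise ValueError("max_results must be a positive integer")
    filters = {"name": name, "smiles": smiles, "formula": formula, "cid": cid}
    # Keep only the filters the caller actually supplied.
    args: dict[str, Any] = {key: value for key, value in filters.items() if value is not None}
    if not args:
        # Assumed rule: a search with no filter at all is rejected early.
        raise ValueError("provide at least one of: name, smiles, formula, cid")
    args["max_results"] = max_results
    return args


query = build_advanced_query(name="caffeine", formula="C8H10N4O2", max_results=2)
print(query)
```

The resulting dictionary can be passed as the second argument of a tool call, in the same way as in Example 4.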
💻 Client Configuration Snippets
To enable this service within desktop AI clients, add the matching configuration block to the client's MCP settings, then restart the application:
Claude Desktop Integration (Mac OS)
```json
{
  "mcpServers": {
    "pubchem": {
      "command": "python",
      "args": ["-m", "pubchem-mcp-server"]
    }
  }
}
```
Claude Desktop Integration (Windows)
```json
{
  "mcpServers": {
    "pubchem": {
      "command": "C:\\Users\\YOUR_USERNAME\\AppData\\Local\\Programs\\Python\\Python311\\python.exe",
      "args": ["-m", "pubchem-mcp-server"]
    }
  }
}
```
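JSON requires every backslash inside a string to be escaped, which is easy to get wrong when typing Windows paths by hand. One way to avoid this is to generate the block with Python's `json` module. The sketch below assumes a hypothetical helper `claude_desktop_config`; the path is the same placeholder used above and must be adapted to your machine.

```python
import json


def claude_desktop_config(python_path: str = "python") -> str:
    """Return the Claude Desktop config block as JSON text.

    Hypothetical helper for illustration; json.dumps handles the escaping.
    """
    config = {
        "mcpServers": {
            "pubchem": {
                "command": python_path,
                "args": ["-m", "pubchem-mcp-server"],
            }
        }
    }
    return json.dumps(config, indent=2)


# Raw string: backslashes are written once here and escaped on output.
windows_python = r"C:\Users\YOUR_USERNAME\AppData\Local\Programs\Python\Python311\python.exe"
text = claude_desktop_config(windows_python)
print(text)
```

The printed text can be pasted into the client's configuration file as is.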
Cline Configuration Example
```json
{
  "mcpServers": {
    "pubchem": {
      "command": "bash",
      "args": [
        "-c",
        "source /home/YOUR/PATH/mcp-hub/PubChem-MCP-Server/.venv/bin/activate && python /home/YOUR/PATH/mcp-hub/PubChem-MCP-Server/pubchem_server.py"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    }
  }
}
```
Once configured and running, AI interactions will gain chemical data access capabilities, such as:
- Compound Discovery: "Query PubChem for all compounds matching the name 'aspirin'."
- Detail Lookup: "Retrieve the full properties, including InChIKey, for CID 2244."
📝 Future Enhancements (Roadmap)
visualize_compound
Implement functionality to generate and render 2D or 3D visual models of chemical structures.
compare_compounds
Develop routines for side-by-side comparison of physicochemical profiles across multiple specified chemical entities.
save_compound
Introduce persistence mechanisms to store query results locally.
list_saved_compounds
Utility function to enumerate locally stored compound records.
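Since `save_compound` and `list_saved_compounds` are not implemented yet, the following is only one possible shape for them, not the project's design. It assumes each record is a dictionary with a `cid` key and stores one JSON file per compound; the signatures and the `cache_dir` parameter are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_compound(record: dict, cache_dir: Path) -> Path:
    """Write one compound record to <cache_dir>/<cid>.json and return the path."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{int(record['cid'])}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def list_saved_compounds(cache_dir: Path) -> list[int]:
    """Return the CIDs of all cached records in ascending order."""
    return sorted(int(p.stem) for p in cache_dir.glob("*.json"))


# Demonstration in a throwaway directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "compounds"
    save_compound({"cid": 2244, "name": "aspirin"}, cache)
    save_compound({"cid": 2519, "name": "caffeine"}, cache)
    saved = list_saved_compounds(cache)
    print(saved)
```

Keying files by CID means saving the same compound twice overwrites the earlier record instead of creating a duplicate.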
🧠 Advanced Chemical Analysis Workflows
Integrate specialized, multi-step analysis prompts leveraging the core search capabilities:
Comprehensive Compound Profiling Prompt
This automated workflow, triggered via a specific prompt call, standardizes deep analysis using only a compound identifier:
```python
result = await call_prompt("deep-compound-analysis", {
    "compound_id": "2244"
})
```
This internal execution sequence will cover:
- Structural identity and core properties assessment
- Known pharmacological profiles
- Reported biological interactions
- Industrial or research applications
- Toxicity and safety data summaries
- Identification of structurally similar analogs
📁 Repository Structure Overview
- pubchem_server.py: Core execution script handling the FastMCP server logic.
- pubchem_search.py: Module containing abstracted functions for PubChem API interaction.
🔧 Required Components
Ensure the following prerequisites are satisfied:
- Python Interpreter version 3.10 or newer
- FastMCP Framework
- asyncio Library
- Logging Utilities
- pubchempy (for direct PubChem API interfacing)
- pandas (for structured data manipulation)
Installation command for dependencies:
```bash
pip install mcp pubchempy pandas
```
🤝 Community Engagement
We welcome community contributions! Please feel free to open pull requests for features or fixes.
📄 Licensing
This software is distributed under the terms of the MIT License.
🛡️ Cautionary Note
This tool is provided strictly for research and development purposes. Users must adhere to PubChem's usage policies and exercise responsible data access.
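PubChem's programmatic access guidelines ask clients to stay at or below 5 requests per second. A small client-side throttle is one way to respect that when issuing many lookups in a loop. This is a sketch only: the `Throttle` class is hypothetical and not part of this server, and the injectable `clock` and `sleep` parameters exist just to make it easy to test.

```python
import time
from typing import Callable


class Throttle:
    """Space out calls so that at most `max_per_second` start per second."""

    def __init__(
        self,
        max_per_second: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's start time.
        self._next_allowed = max(now, self._next_allowed) + self._min_interval
        return delay


throttle = Throttle(max_per_second=5)
for cid in (2244, 2519):
    throttle.wait()
    print(f"would request CID {cid} now")
```

Calling `throttle.wait()` immediately before each tool call or HTTP request keeps a batch job within the limit without changing the rest of the code.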
