dataset-viewer

Browse and analyze Hugging Face datasets with features like search, filtering, statistics, and data export

Author

privetin

MIT License

Quick Info

GitHub Stars: 28
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

datasets, dataset, face, face datasets, privetin dataset, dataset viewer

Dataset Viewer MCP Server

An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.

Features

Resources

  • Uses dataset:// URI scheme for accessing Hugging Face datasets
  • Supports dataset configurations and splits
  • Provides paginated access to dataset contents
  • Handles authentication for private datasets
  • Supports searching and filtering dataset contents
  • Provides dataset statistics and analysis

Tools

The server provides the following tools:

  1. validate
    • Check if a dataset exists and is accessible
    • Parameters:
      • dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
      • auth_token (optional): For private datasets

  2. get_info
    • Get detailed information about a dataset
    • Parameters:
      • dataset: Dataset identifier
      • auth_token (optional): For private datasets

  3. get_rows
    • Get paginated contents of a dataset
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • page (optional): Page number (0-based)
      • auth_token (optional): For private datasets

  4. get_first_rows
    • Get first rows from a dataset split
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • auth_token (optional): For private datasets

  5. get_statistics
    • Get statistics about a dataset split
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • auth_token (optional): For private datasets

  6. search_dataset
    • Search for text within a dataset
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • query: Text to search for
      • auth_token (optional): For private datasets

  7. filter
    • Filter rows using SQL-like conditions
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • where: SQL WHERE clause (e.g. "score > 0.5")
      • orderby (optional): SQL ORDER BY clause
      • page (optional): Page number (0-based)
      • auth_token (optional): For private datasets

  8. get_parquet
    • Download entire dataset in Parquet format
    • Parameters:
      • dataset: Dataset identifier
      • auth_token (optional): For private datasets

Installation

Prerequisites

  • Python 3.12 or higher
  • uv - Fast Python package installer and resolver

Setup

  1. Clone the repository:
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
  2. Create a virtual environment and install:
# Create virtual environment
uv venv

# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install in development mode
uv add -e .

Configuration

Environment Variables

  • HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets

Claude Desktop Integration

Add the following to your Claude Desktop config file:

On Windows: %APPDATA%\Claude\claude_desktop_config.json

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "parent_to_repo/dataset-viewer",
        "run",
        "dataset-viewer"
      ]
    }
  }
}

License

MIT License - see LICENSE for details
