youtube-operator-suite
A toolkit, implemented as an MCP server and command-line interface, for programmatically managing and automating nearly all YouTube resources.
Author
eat-pray-ai
youtube-operator-suite (Formerly yutu)
youtube-operator-suite (formerly known by its codename yutu) is a Model Context Protocol (MCP) server and a command-line interface (CLI) for operating on the YouTube platform. It provides granular control over nearly all manipulable YouTube resources, including videos, playlists, channels, comments, and caption tracks. Documentation is also available in Chinese.
Essential Prerequisites
Prior to deployment, access to a Google Cloud Platform account is mandatory for provisioning a Project and activating the necessary service endpoints via the APIs & Services -> Enable APIs and services -> + ENABLE APIS AND SERVICES interface.
Following API activation, configure an OAuth consent screen, designating yourself as a test user. Subsequently, generate an OAuth Client ID of the type Web Application, ensuring the specified redirect URI is set to http://localhost:8216.
Secure these credentials by saving them locally as client_secret.json. The expected structure mirrors this example:
{
  "web": {
    "client_id": "11181119.apps.googleusercontent.com",
    "project_id": "yutu-11181119",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_secret": "XXXXXXXXXXXXXXXX",
    "redirect_uris": [
      "http://localhost:8216"
    ]
  }
}
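Before running the authentication step, it can help to confirm that the downloaded file has the shape shown in the sample. The following is a minimal Python sketch and not part of the utility itself: the function name `check_client_secret` and the list of required keys are assumptions drawn from the sample, and the check inspects structure only, not whether Google accepts the credentials.

```python
import json

# Assumption: these keys and this redirect URI are taken from the sample
# client_secret.json in this document, not from the tool's source code.
EXPECTED_REDIRECT = "http://localhost:8216"
REQUIRED_KEYS = ("client_id", "client_secret", "auth_uri", "token_uri", "redirect_uris")


def check_client_secret(data: dict) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    web = data.get("web")
    if not isinstance(web, dict):
        return ['missing top-level "web" object']
    problems = [f'missing "web.{key}"' for key in REQUIRED_KEYS if key not in web]
    uris = web.get("redirect_uris")
    if isinstance(uris, list) and EXPECTED_REDIRECT not in uris:
        problems.append(f"redirect_uris does not include {EXPECTED_REDIRECT}")
    return problems


def check_client_secret_file(path: str) -> list:
    """Load a JSON file from disk and run the structural check on it."""
    with open(path, encoding="utf-8") as handle:
        return check_client_secret(json.load(handle))
```

Calling `check_client_secret_file("client_secret.json")` returns an empty list when the file matches the sample's shape. A missing `web` object is reported first, since it suggests the OAuth client was not created with the Web Application type described above.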
To validate the configuration, execute the authentication sequence:
❯ youtube-operator-suite auth --credential client_secret.json
A browser interface will prompt for YouTube account authorization. Upon approval, an access token will be generated and persisted in youtube.token.json:
{
  "access_token": "ya29.XXXXXXXXX",
  "token_type": "Bearer",
  "refresh_token": "1//XXXXXXXXXX",
  "expiry": "2024-05-26T18:49:56.1911165+08:00"
}
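The `expiry` field in the token file is a timestamp with a UTC offset and, in the sample, seven fractional digits. As a hedged illustration of how a script might read it, the Python sketch below parses that value and compares it with the current time. The helper names `parse_expiry` and `is_expired` are hypothetical, and the sketch assumes the timestamp always carries a UTC offset, as the sample does.

```python
import re
from datetime import datetime, timezone
from typing import Optional


def parse_expiry(value: str) -> datetime:
    """Parse the token's expiry timestamp into a timezone-aware datetime."""
    # Older Python versions accept only 3 or 6 fractional digits, so the
    # fraction is truncated or zero-padded to exactly 6 before parsing.
    def _six_digits(match):
        return "." + match.group(1)[:6].ljust(6, "0")

    normalized = re.sub(r"\.(\d+)", _six_digits, value, count=1)
    if normalized.endswith("Z"):
        normalized = normalized[:-1] + "+00:00"
    return datetime.fromisoformat(normalized)


def is_expired(token: dict, now: Optional[datetime] = None) -> bool:
    """Return True when the access token's expiry is not in the future."""
    now = now or datetime.now(timezone.utc)
    return parse_expiry(token["expiry"]) <= now
```

An expired access token is not necessarily a problem when a `refresh_token` is present, since OAuth 2.0 clients commonly use it to obtain a new access token. Whether and when this utility refreshes the token is not stated in this document.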
By default, the utility reads client_secret.json and youtube.token.json from the current working directory. To override these locations for all subcommands, export the following environment variables (the same names used in the MCP configuration later in this document):
❯ export YUTU_CREDENTIAL=client_secret.json
❯ export YUTU_CACHE_TOKEN=youtube.token.json
Alternatively, set them for a single invocation:
❯ YUTU_CREDENTIAL=client_secret.json YUTU_CACHE_TOKEN=youtube.token.json youtube-operator-suite subcommand --flag value
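The lookup order described here (an environment variable if set, otherwise a default file in the working directory) can be modeled in a few lines of Python. This is a sketch of the described behavior and not the utility's actual implementation: the function `resolve_path` is hypothetical, the variable name is passed in as an argument, and treating an empty variable as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_path(env_name: str, default_name: str,
                 environ: Optional[Mapping[str, str]] = None,
                 cwd: Optional[Path] = None) -> Path:
    """Return the override from the environment, else the default in cwd."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else cwd
    override = environ.get(env_name, "").strip()
    if override:
        # Joining with cwd leaves absolute overrides unchanged and anchors
        # relative ones to the working directory.
        return cwd / Path(override).expanduser()
    return cwd / default_name
```

A wrapper script could call this once for the credential file and once for the token file, then check that both paths exist before invoking a subcommand.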
Deployment
You may retrieve the binary from the official releases repository or utilize one of the following automated installation mechanisms.
CI/CD Integration (GitHub Actions)
Two specialized Actions are provided: one for general system control and another dedicated to video asset uploading. Consult the documentation for youtube-action and youtube-uploader for deployment specifics.
Containerization (Docker)
❯ docker pull ghcr.io/eat-pray-ai/yutu:latest
❯ docker run --rm ghcr.io/eat-pray-ai/yutu:latest
# Ensure client_secret.json is available in the working directory
❯ docker run --rm -it -u $(id -u):$(id -g) -v $(pwd):/app ghcr.io/eat-pray-ai/yutu:latest auth
Go Ecosystem Installation (Gopher)
❯ go install github.com/eat-pray-ai/yutu@latest
Operating System Specific Installation
Linux
❯ curl -sSfL https://raw.githubusercontent.com/eat-pray-ai/yutu/main/scripts/install.sh | bash
macOS
Installation via Homebrew🍺 is the preferred method, or alternatively use the general shell script.
❯ brew install yutu
# Alternative shell script method
❯ curl -sSfL https://raw.githubusercontent.com/eat-pray-ai/yutu/main/scripts/install.sh | bash
Windows
❯ winget install yutu
Integrity Verification
Confirm the authenticity and source of the installed utility via its cryptographic signing attestations.
# For Docker deployments
❯ gh attestation verify oci://ghcr.io/eat-pray-ai/yutu:latest --repo eat-pray-ai/yutu
# For Linux/macOS installations via script
❯ gh attestation verify $(which youtube-operator-suite) --repo eat-pray-ai/yutu
# For Windows installations
❯ gh attestation verify $(where.exe youtube-operator-suite.exe) --repo eat-pray-ai/yutu
MCP Service Endpoint
As an officially recognized MCP server, this utility facilitates interaction with YouTube resources through natural language prompts within compatible MCP clients such as Claude Desktop, Visual Studio Code, or Cursor.
Ensure prior installation (refer to Deployment) and valid authentication credentials (client_secret.json and youtube.token.json from Essential Prerequisites) are present.
Configuration can be automated via the badges above, or manually inserted into your client's configuration structure. Remember to substitute the placeholder paths for YUTU_CREDENTIAL and YUTU_CACHE_TOKEN with the absolute locations on your host system.
{
  "yutu": {
    "type": "stdio",
    "command": "yutu",
    "args": [
      "mcp"
    ],
    "env": {
      "YUTU_CREDENTIAL": "/absolute/path/to/client_secret.json",
      "YUTU_CACHE_TOKEN": "/absolute/path/to/youtube.token.json"
    }
  }
}
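Because the configuration requires absolute paths, a small script can generate the entry instead of editing it by hand. The Python sketch below mirrors the structure shown above; the function name `build_mcp_entry` is hypothetical, and where the resulting JSON belongs inside a given client's configuration file depends on that client and is not covered here.

```python
import json
from pathlib import Path


def build_mcp_entry(credential: str, token: str, command: str = "yutu") -> dict:
    """Build the MCP server entry with both file paths made absolute."""
    return {
        "yutu": {
            "type": "stdio",
            "command": command,
            "args": ["mcp"],
            "env": {
                # resolve() turns relative paths into absolute ones based on
                # the current working directory; the files need not exist yet.
                "YUTU_CREDENTIAL": str(Path(credential).expanduser().resolve()),
                "YUTU_CACHE_TOKEN": str(Path(token).expanduser().resolve()),
            },
        }
    }


def render_mcp_entry(credential: str, token: str) -> str:
    """Return the entry as indented JSON text, ready to paste."""
    return json.dumps(build_mcp_entry(credential, token), indent=2)
```

Running `print(render_mcp_entry("client_secret.json", "youtube.token.json"))` from the directory that holds both files prints an entry whose paths point at them.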
Operational Commands
❯ youtube-operator-suite
youtube-operator-suite is a fully functional MCP server and CLI for YouTube, which can manipulate almost all YouTube resources

Usage:
  youtube-operator-suite [flags]
  youtube-operator-suite [command]

Available Commands:
  activity                Query YouTube activity feeds
  auth                    Establish credentials with the YouTube API
  caption                 Manage subtitle and caption tracks
  channel                 Retrieve and modify channel metadata
  channelBanner           Upload and set the primary channel graphic asset
  channelSection          Configure layout sections on a channel page
  comment                 Manage user comments
  commentThread           Manage comment threads associated with content
  completion              Generate shell autocompletion scripts
  help                    Display context-specific help information
  i18nLanguage            List supported internationalization languages
  i18nRegion              List supported internationalization regions
  mcp                     Initiate the Model Context Protocol service endpoint
  member                  Fetch details of channel members
  membershipsLevel        Query details on membership tiers/levels
  playlist                Manage video collections (playlists)
  playlistImage           Update playlist cover images
  playlistItem            Manage individual entries within playlists
  search                  Execute searches across YouTube resources
  subscription            Manage current user subscriptions
  superChatEvent          Log Super Chat contributions received by a channel
  thumbnail               Apply custom preview images to videos
  version                 Display the current software release version
  video                   Perform operations on uploaded videos
  videoAbuseReportReason  List standardized reasons for content flagging
  videoCategory           List standardized video content categories
  watermark               Configure channel branding watermarks

Flags:
  -h, --help   Display help message

Execute "youtube-operator-suite [command] --help" for more granular command documentation.
Key Capabilities
See FEATURES.md for an exhaustive enumeration of supported functionalities.
Development Guidelines
Please consult CONTRIBUTING.md for contribution protocols.
Wikipedia Insight on Browser Automation Context:
A headless browser is a web browser without a graphical user interface. Headless browsers allow web pages to be controlled automatically, behaving like an ordinary browser but driven through a command-line interface or network protocol. They are especially valuable for testing web applications because they process and render pages as a real browser would, including layout, colors, fonts, and the execution of dynamic scripting such as JavaScript and Ajax, which is often not possible with other testing methods. Since the introduction of native remote control capabilities in Google Chrome (v59+) and Mozilla Firefox (v56+), older automation tools, like PhantomJS, have largely been superseded.
Primary Application Scenarios
The core applications for headless browser technology revolve around:
- Automated testing cycles for contemporary web architectures (web validation).
- Programmatic capture of full-page screenshots.
- Execution of automated test suites for JavaScript frameworks.
- Scripted interaction with remote web interfaces.
Auxiliary Applications
Headless agents are also leveraged for sophisticated web data extraction (scraping). Google, for instance, acknowledged in 2009 that utilizing a headless agent aided their search indexation of sites heavily reliant on Ajax. Conversely, these tools have attracted misuse, including launching Distributed Denial of Service (DDoS) campaigns, artificially inflating advertisement metrics, and automating unintended site interactions (e.g., credential stuffing). Despite these negative uses, a 2018 traffic analysis suggested that malicious actors do not disproportionately favor headless environments over traditional ones for executing attacks like DDoS, SQL injection, or Cross-Site Scripting (XSS).
Implementation Modalities
Given that several premier browser engines now natively expose headless functionality via dedicated APIs, various software wrappers exist to standardize interaction. Notable examples include:
- Selenium WebDriver: Adheres to W3C WebDriver specifications.
- Playwright: A library supporting automation across Chromium, Firefox, and WebKit engines.
- Puppeteer: A library focused on controlling Chrome or Firefox instances.
Validation Engineering
Certain software suites and testing frameworks integrate headless browsing as a foundational component of their validation apparatus.
- Capybara: Employs headless browsing (via WebKit or Headless Chrome) to simulate end-user actions within its testing protocols.
- Jasmine: Defaults to Selenium but can interface with WebKit or Headless Chrome for browser tests.
- Cypress: A specialized framework for frontend validation.
- QF-Test: A commercial tool for GUI-based testing that supports headless browser execution.
Alternative Methodologies
An alternative approach involves employing software that exposes direct browser-like APIs. For example, Deno integrates browser APIs into its core design. For Node.js environments, jsdom is the most comprehensive emulation provider. While these alternatives generally support fundamental browser features (HTML parsing, cookie management, XHR calls, basic JavaScript), they typically lack full DOM rendering capabilities and possess limited event handling support, often resulting in faster execution times than full headless rendering.
