android-device-commander

Interface with an Android handset via ADB utilities to execute actions such as initiating calls, dispatching text messages, managing contact directories, and scripting various phone functionalities remotely from a host machine.

Author

hao-cyber

Apache License 2.0

Quick Info

GitHub Stars: 167
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

apis, android, http, phone mcp, control android, adb commands

🤖 Android Device Control Hub (ADB-based)

🌟 An advanced Model Context Protocol (MCP) plugin for remotely controlling an Android handset through the Android Debug Bridge (ADB).

Illustrative Scenarios

  • Based on current web-retrieved meteorological data, automatically initiate playback of Netease Music tracks without requiring manual confirmation.

  • Attempt to place an outgoing telephone call to the contact designated "Hao." Should the connection fail to establish, automatically compose and transmit an SMS directing him to convene at Conference Suite 101.

Chinese-language guide

⚡ Rapid Deployment Guide

📥 Setup Procedure

# Direct execution via uvx (recommended, integrated with uv, no standalone installation needed)
uvx phone-mcp

# Or install using uv package manager
uv pip install phone-mcp

# Or install via standard pip
pip install phone-mcp

🔧 Configuration for Orchestration Tools

AI Agent Integration Setup

Configure within your chosen AI orchestration platform (e.g., Cursor, Trae, Claude):

{
    "mcpServers": {
        "android-device-commander": {
            "command": "uvx",
            "args": [
                "phone-mcp"
            ]
        }
    }
}

Alternatively, if installed via pip:

{
    "mcpServers": {
        "android-device-commander": {
            "command": "/usr/local/bin/python",
            "args": [
                "-m",
                "phone_mcp"
            ]
        }
    }
}

Crucial Note: The path /usr/local/bin/python in the alternative configuration denotes the Python executable location. This must be adjusted to match your system's actual interpreter location. Reference the following commands to locate it:

Linux/macOS Terminal: which python3 or which python

Windows Command Prompt (CMD): where python

Windows PowerShell: (Get-Command python).Path

Replace /usr/local/bin/python with the discovered full path, e.g., C:\Python39\python.exe on Windows.
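
As a convenience, the pip-style entry can also be generated rather than edited by hand. The following is a minimal sketch, not part of the project: `build_mcp_config` is a hypothetical helper name, and it assumes the script is run with the same interpreter into which phone-mcp was installed, so that `sys.executable` is the path you want.

```python
import json
import sys


def build_mcp_config(python_path: str = sys.executable) -> dict:
    """Return the pip-style MCP server entry for the given interpreter path.

    Hypothetical helper: the key names mirror the JSON shown in this README.
    """
    return {
        "mcpServers": {
            "android-device-commander": {
                "command": python_path,
                "args": ["-m", "phone_mcp"],
            }
        }
    }


# Print a snippet that can be pasted into the orchestration tool's config file.
print(json.dumps(build_mcp_config(), indent=4))
```

Because `json.dumps` escapes backslashes, a Windows path such as C:\Python39\python.exe comes out in valid JSON form.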

Hint: For the Cursor IDE, integrate this configuration into ~/.cursor/mcp.json.

Interaction Example (via Claude chat interface): Initiate a voice connection to contact 'hao'

⚠️ Pre-operation Checklist:

  • Ensure ADB utilities are fully installed and correctly path-configured.
  • Verify that USB Debugging is activated on the target Android unit.
  • Confirm the device is physically linked to the computer via USB.
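
The checklist can be approximated in a short script before starting the plugin. This is a sketch under stated assumptions: the helper names (`parse_adb_devices`, `summarize`, `preflight`) are hypothetical and not part of phone-mcp. It relies only on the standard `adb devices` command, which prints one `serial<TAB>state` line per attached device.

```python
import shutil
import subprocess
from typing import Dict


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each device serial to its state, given the text printed by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the "List of devices attached" header, and daemon notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def summarize(devices: Dict[str, str]) -> str:
    """Turn the parsed device table into a one-line checklist verdict."""
    if not devices:
        return "no device detected: check the USB cable and USB Debugging"
    if "device" in devices.values():
        return "ready"
    return "device found but not usable: " + ", ".join(sorted(set(devices.values())))


def preflight() -> str:
    """Run the checks against the real adb binary (hypothetical helper)."""
    if shutil.which("adb") is None:
        return "adb not found on PATH"
    completed = subprocess.run(["adb", "devices"], capture_output=True, text=True)
    return summarize(parse_adb_devices(completed.stdout))
```

Calling `preflight()` returns "ready" once at least one attached device reports the `device` state. A state of `unauthorized` usually means the debugging prompt on the phone has not been accepted yet.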

🎯 Core Capabilities

  • 📞 Telephony Operations: Initiate outgoing calls, terminate active sessions, handle incoming call alerts.
  • 💬 Text Messaging: Dispatch and retrieve Short Message Service (SMS) messages, access raw message stores.
  • 👥 Directory Access: Query stored phone contacts, programmatically generate new entries via automated interface manipulation.
  • 📸 Multimedia: Capture screen images, record screen sessions, manage media playback controls.
  • 📱 Application Management: Start applications, invoke specific activities using Intents, list all installed packages, force-close running processes.
  • 🔧 System Utilities: Retrieve active window metadata, access application launch shortcuts.
  • 🗺️ Geospatial: Query points of interest (POIs) associated with specific geographic coordinates or phone numbers.
  • 🖱️ User Interface (UI) Simulation: Execute virtual taps, execute swipe gestures, inject typed input, simulate hardware key presses.
  • 🔍 UI Element Discovery: Locate interactive components using textual labels, unique identifiers (ID), class names, or accessibility descriptions.
  • 🤖 Advanced UI Control: Implement pauses waiting for element presence, perform iterative scrolling to locate off-screen components.
  • 🧠 Screen State Interpretation: Obtain serialized screen structure and provide standardized interaction methods.
  • 🌐 Web Navigation: Instruct the device's default browser to load specified Uniform Resource Locators (URLs).
  • 🔄 Dynamic UI Polling: Continuously monitor screen modifications and wait for designated visual elements to appear or disappear.

🛠️ System Prerequisites

  • Python Interpreter version 3.7 or newer.
  • An Android handset with USB Debugging mode enabled.
  • ADB toolchain installed.

📋 Fundamental Command Set

Device Status & Connectivity

# Verify device linkage status
phone-cli check

# Find clickable elements on the current screen
phone-cli screen-interact find method=clickable

Communication Functions

# Initiate a voice connection
phone-cli call 1234567890

# Terminate current active call
phone-cli hangup

# Dispatch an SMS
phone-cli send-sms 1234567890 "Greetings"

# Retrieve recent incoming messages (supports paging)
phone-cli messages --limit 10

# Retrieve sent messages (supports paging)
phone-cli sent-messages --limit 10

# List stored phonebook entries (supports paging)
phone-cli contacts --limit 20

# Automate new contact creation via UI traversal
phone-cli create-contact "Jane Smith" "9876543210"

Media & Application Control

# Capture a static image of the current display
phone-cli screenshot

# Initiate screen capture recording
phone-cli record --duration 30

# Launch the default camera application
phone-cli app camera

# Alternate launch method if 'app' command fails
phone-cli open_app camera

# Force process termination for the camera application
phone-cli close-app com.android.camera

# Retrieve a list of installed packages (basic summary)
phone-cli list-apps

# Paginated, detailed package listing
phone-cli list-apps --page 1 --page-size 10

# Retrieve comprehensive package metadata
phone-cli list-apps --detailed

# Launch a specific component/activity (highly dependable method)
phone-cli launch com.android.settings/.Settings

# Start application using its package identifier
phone-cli app com.android.contacts

# Secondary launch attempt by package identifier
phone-cli open_app com.android.contacts

# Most robust launch method using full component path
phone-cli launch com.android.dialer/com.android.dialer.DialtactsActivity

# Instruct the default web browser to navigate to a URL
phone-cli open-url google.com

UI Interpretation & Interaction

# Obtain a structured data representation of the current screen state
phone-cli analyze-screen

# Unified interface for touch/gesture execution
phone-cli screen-interact <operation> [arguments]

# Simulate a tap event at specific screen coordinates
phone-cli screen-interact tap x=500 y=800

# Simulate a tap on an element identified by its displayed text label
phone-cli screen-interact tap element_text="Authenticate"

# Simulate a tap on an element identified by its content description attribute
phone-cli screen-interact tap element_content_desc="Navigation Drawer Toggle"

# Execute a swipe gesture (finger moves upward, scrolling the content down)
phone-cli screen-interact swipe x1=500 y1=1000 x2=500 y2=200 duration=300

# Simulate a hardware key press (e.g., Back button)
phone-cli screen-interact key keycode=back

# Inject text input into the focused field
phone-cli screen-interact text content="Secure Password"

# Search for an element based on properties
phone-cli screen-interact find method=text value="Submit" partial=true

# Block execution until an element is visible within a timeout
phone-cli screen-interact wait method=text value="Operation Complete" timeout=10

# Scroll the view until a target element is brought into visibility
phone-cli screen-interact scroll method=text value="Terms and Conditions" direction=down max_swipes=5

# Activate continuous UI state observation
phone-cli monitor-ui --interval 0.5 --duration 30

# Observe UI until specific text materializes
phone-cli monitor-ui --watch-for text_appears --text "Welcome Aboard"

# Observe UI until an element with a specific ID becomes present
phone-cli monitor-ui --watch-for id_appears --id "checkout_button"

# Observe UI until an element belonging to a specific class becomes present
phone-cli monitor-ui --watch-for class_appears --class-name "android.widget.Button"

# Configure UI observation to output raw JSON data on change
phone-cli monitor-ui --raw

Location & Geospatial Queries

# Query for nearby Points of Interest (e.g., restaurants) within a radius
phone-cli get-poi 116.480053,39.987005 --keywords restaurant --radius 1000
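
The location argument in the example above appears to be written as longitude,latitude: 116.480053 cannot be a latitude, since latitudes are limited to ±90. The sketch below builds that string with a range check. `format_location` is a hypothetical helper, and the ordering is inferred from the example rather than confirmed by the project.

```python
def format_location(longitude: float, latitude: float) -> str:
    """Format coordinates as "lon,lat" with six decimals (ordering is an assumption)."""
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    return f"{longitude:.6f},{latitude:.6f}"
```

Swapping the two values by mistake raises a `ValueError` whenever the longitude exceeds 90 in magnitude, which catches the most common mix-up for locations like the one in the example.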

📚 Advanced Operational Details

Application and Activity Initiation

The plugin offers several distinct mechanisms for starting applications and their components:

  1. By Human-Readable App Name (Two Variants):

```bash
# Variant 1: Direct command (might fail on restricted systems)
phone-cli app camera

# Variant 2: Alternative fallback command
phone-cli open_app camera
```

  2. By System Package Identifier (Two Variants):

```bash
# Variant 1: Direct command (might fail on restricted systems)
phone-cli app com.android.contacts

# Variant 2: Alternative fallback command
phone-cli open_app com.android.contacts
```

  3. By Full Component Path (Most Reliable):

```bash
# Guaranteed compatibility across most devices
phone-cli launch com.android.dialer/com.android.dialer.DialtactsActivity
```

Guidance: If you encounter operational failures with the simpler app or open_app invocations, resort immediately to the launch command supplying the complete package/activity structure for maximal execution reliability.
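
The fallback order described in the guidance can be scripted. The sketch below tries `app`, then `open_app`, then `launch`, using only the subcommands shown in this README. It assumes, without confirmation from the project, that phone-cli exits with a non-zero status when a launch fails. `run_cli` and `launch_with_fallback` are hypothetical names.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def run_cli(args: Sequence[str]) -> bool:
    """Run phone-cli; treat exit status 0 as success (an assumption)."""
    return subprocess.run(["phone-cli", *args]).returncode == 0


def launch_with_fallback(
    name: str,
    component: Optional[str] = None,
    run: Callable[[Sequence[str]], bool] = run_cli,
) -> Optional[str]:
    """Try each launch method in turn; return the subcommand that worked, or None."""
    attempts: List[List[str]] = [["app", name], ["open_app", name]]
    if component:
        # Full package/activity path, described above as the most reliable method.
        attempts.append(["launch", component])
    for args in attempts:
        if run(args):
            return args[0]
    return None
```

For example, `launch_with_fallback("camera")` stops at the first variant that succeeds. Passing `component="com.android.dialer/com.android.dialer.DialtactsActivity"` adds the `launch` step as a last resort.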

Contact Provisioning via UI Automation

The utility supports automated creation of new contacts leveraging UI scripting:

# Construct a new contact record automatically
phone-cli create-contact "Alice Johnson" "555-1234"

This sequence automatically performs:

  1. Opening the Contacts application interface.
  2. Navigating to the data entry screen.
  3. Populating the Name and Telephone number fields.
  4. Submitting/Saving the new entry.

Screen-Contextual Automation

The standardized screen interaction module empowers intelligent agents to:

  1. Analyze Displays: Receive structured data describing visible UI components and associated text content.
  2. Formulate Strategy: Make execution choices based on observed UI patterns and available action sets.
  3. Execute Interactions: Perform actions using a uniform parameter set syntax.
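
The "Formulate Strategy" step might look like the sketch below. It is illustrative only: `choose_action` is a hypothetical function, and it takes a plain list of visible text strings because the exact JSON layout returned by the screen analysis is not described here. The parameter names it emits (`element_text`, `method`, `value`, `direction`, `max_swipes`) are the ones used in the screen-interact examples in this document.

```python
from typing import Any, Dict, List, Tuple

Action = Tuple[str, Dict[str, Any]]


def choose_action(visible_texts: List[str], target: str) -> Action:
    """Pick the next screen interaction for reaching an element labelled `target`."""
    if target in visible_texts:
        # The element is already on screen, so tap it by its text label.
        return "tap", {"element_text": target}
    # Otherwise scroll down in search of it, bounded to a few swipes.
    return "scroll", {
        "method": "text",
        "value": target,
        "direction": "down",
        "max_swipes": 5,
    }
```

The returned pair maps directly onto the action name and parameter set of the unified interaction interface, so an agent can analyze, choose, and execute in a loop.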

UI State Monitoring and Response

The plugin furnishes robust mechanisms for tracking interface evolution:

  1. Basic State Tracking:

```bash
# Observe all UI transitions at a specified frequency (in seconds)
phone-cli monitor-ui --interval 0.5 --duration 30
```

  2. Waiting for Specific Element Manifestation:

```bash
# Halt execution until target text becomes visible (useful for validation/testing)
phone-cli monitor-ui --watch-for text_appears --text "Login successful"

# Halt execution until a known element identifier loads
phone-cli monitor-ui --watch-for id_appears --id "user_profile_icon"
```

  3. Monitoring Element Disappearance:

```bash
# Wait until transient text (like a spinner) vanishes
phone-cli monitor-ui --watch-for text_disappears --text "Loading in progress..."
```

  4. Obtaining Detailed Change Logs:

```bash
# Capture comprehensive change data in JSON format
phone-cli monitor-ui --raw
```

Pro Tip: UI monitoring is indispensable for automated scripts needing to confirm that asynchronous processes (like loading screens) have successfully completed their work on the graphical interface.
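
The idea behind interval-based monitoring is a polling loop with a deadline. The sketch below shows that general pattern in isolation. It is not the plugin's implementation, and `wait_until` is a hypothetical name. The `interval` and `timeout` arguments play the same role as the `--interval` and `--duration` options above.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> bool:
    """Poll `condition` every `interval` seconds until it is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A script would pass a `condition` that inspects the current screen, for instance one that checks whether a "Loading" label is still present. The function returns `False` on timeout instead of raising, so the caller can decide how to recover.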

📚 Comprehensive API Reference

For the full scope of documentation and configuration parameters, consult the project's main GitHub Repository.

🧰 Internal Tool Documentation

Screen Interface API

The plugin exposes a comprehensive API layer for screen manipulation and data extraction.

interact_with_screen

async def interact_with_screen(action: str, params: Dict[str, Any] = None) -> str:
    """Dispatches screen manipulation routines"""
  • Arguments:
      • action: Routine type ("tap", "swipe", "key", "text", "find", "wait", "scroll")
      • params: Specific parameters dictionary contingent on the selected action.
  • Output: A JSON-formatted string detailing the outcome of the operation.

Operational Examples:

# Coordinate-based screen tap
result = await interact_with_screen("tap", {"x": 100, "y": 200})

# Tap based on visible label
result = await interact_with_screen("tap", {"element_text": "Proceed to Checkout"})

# Vertical swipe from top to bottom (finger moves down, scrolling the content up)
result = await interact_with_screen("swipe", {"x1": 500, "y1": 300, "x2": 500, "y2": 1200, "duration": 300})

# Text insertion
result = await interact_with_screen("text", {"content": "Verified Input Data"})

# System key simulation
result = await interact_with_screen("key", {"keycode": "home"})

# Locate element by a partial match on its visible text
result = await interact_with_screen("find", {"method": "text", "value": "Settings", "partial": True})

# Wait for element confirmation
result = await interact_with_screen("wait", {"method": "id", "value": "error_dialog", "timeout": 15, "interval": 0.25})

# Iterative scroll operation to uncover content
result = await interact_with_screen("scroll", {"method": "description", "value": "Help Button", "direction": "up", "max_swipes": 10})
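
Individual calls like these are often chained. The sketch below taps an element and then waits for a confirmation text, decoding each JSON result along the way. `tap_then_wait` is a hypothetical helper. It receives the coroutine as a parameter because the import path of `interact_with_screen` is not given here, and it makes no assumption about which fields the decoded result contains.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, List

# Any coroutine with the same shape as interact_with_screen(action, params) -> str.
Interact = Callable[[str, Dict[str, Any]], Awaitable[str]]


async def tap_then_wait(
    interact: Interact, tap_text: str, wait_text: str, timeout: int = 10
) -> List[Any]:
    """Tap an element by its text, then wait for another text to appear."""
    steps = (
        ("tap", {"element_text": tap_text}),
        ("wait", {"method": "text", "value": wait_text, "timeout": timeout}),
    )
    results = []
    for action, params in steps:
        raw = await interact(action, params)
        # The tool is documented to return a JSON-formatted string.
        results.append(json.loads(raw))
    return results
```

With the real function in scope, a call would be `asyncio.run(tap_then_wait(interact_with_screen, "Submit", "Operation Complete"))`. Injecting the coroutine also makes the sequence easy to exercise with a stub and no device attached.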

analyze_screen

async def analyze_screen(include_screenshot: bool = False, max_elements: int = 50) -> str:
    """Generates a structured data model of current screen constituents"""
  • Arguments:
      • include_screenshot: Boolean flag to embed a base64 image of the screen in the output.
      • max_elements: Constraint on the count of UI nodes to parse.
  • Output: JSON string containing the detailed screen layout analysis.

create_contact

async def create_contact(name: str, phone: str) -> str:
    """Programmatically establishes a new contact entry"""
  • Arguments:
      • name: Full designation for the new contact.
      • phone: The associated telephone number.
  • Output: JSON string confirming operation status.
  • Source Module: contacts.py handles the underlying UI automation for persistence.

launch_app_activity

async def launch_app_activity(package_name: str, activity_name: Optional[str] = None) -> str:
    """Initiates an application instance via package and optional activity path"""
  • Arguments:
      • package_name: Unique identifier for the application package.
      • activity_name: Specific entry point within the package (optional).
  • Output: JSON string confirming operation status.
  • Source Module: apps.py manages this capability.

launch_intent

async def launch_intent(intent_action: str, intent_type: Optional[str] = None, extras: Optional[Dict[str, str]] = None) -> str:
    """Triggers an Android component using the Intent mechanism"""
  • Arguments:
      • intent_action: The required Android action identifier.
      • intent_type: The MIME type specification (optional).
      • extras: Supplemental data payload for the intent (optional).
  • Output: JSON string confirming operation status.
  • Source Module: apps.py implements this low-level intent firing.
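
For orientation, the three arguments correspond naturally to options of Android's standard `am start` shell command: `-a` for the action, `-t` for the MIME type, and `--es` for each string extra. The sketch below builds such an argument list. Whether apps.py translates intents this way is an assumption, and `build_am_start_args` is a hypothetical name.

```python
import shlex
from typing import Dict, List, Optional


def build_am_start_args(
    intent_action: str,
    intent_type: Optional[str] = None,
    extras: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build the arguments that follow `adb shell` for an `am start` invocation."""
    args = ["am", "start", "-a", shlex.quote(intent_action)]
    if intent_type:
        args += ["-t", shlex.quote(intent_type)]
    for key, value in (extras or {}).items():
        # --es attaches a string extra; quoting protects spaces in the device shell.
        args += ["--es", shlex.quote(key), shlex.quote(value)]
    return args
```

The list can be passed to `subprocess.run(["adb", "shell", *args])`. Values are quoted because `adb shell` hands the joined arguments to a shell on the device, which would otherwise split an extra such as "hello world" into two words.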

📄 Licensing

Distributed under the terms of the Apache License, Version 2.0.

Contact Record Creation Utility

This utility simplifies the process of provisioning new entries into the Android device's address book using ADB.

Prerequisites

  • Python environment (version 3.x).
  • The Android Debug Bridge (ADB) executable must be installed and accessible in the system PATH.
  • The target Android apparatus must be physically linked and authorized for debugging sessions.

Execution

Standard Invocation

Execute the dedicated Python script:

python create_contact.py

This command defaults to creating a contact using pre-set values:

  • Account Identifier: "Your Account Name"
  • Account Type: "com.google"

Advanced Parameterization

You can override the default account details by supplying a JSON string argument:

python create_contact.py '{"account_name": "work_profile", "account_type": "com.work.account"}'

Operational Feedback

The script yields a JSON object detailing the outcome:

  • success: Boolean indicating successful execution.
  • message: Any supplementary information or captured error diagnostics.

Example of a successful return payload:

{"success": true, "message": "Contact added successfully"}

Diagnostic Considerations

  • Failure to locate or communicate with ADB, or an unauthorized device state, will result in an operational error report.
  • Malformed JSON input will trigger an input validation error.
  • Any errors encountered during the execution of the underlying ADB commands will be logged and returned within the message field.
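
When the script is driven from another Python program, building the JSON argument with `json.dumps` avoids quoting mistakes, and parsing the reply defensively covers the failure cases listed above. This is a minimal sketch: `build_command` and `parse_result` are hypothetical helpers, and it assumes the script prints its JSON result to standard output.

```python
import json
import sys
from typing import Any, Dict, List


def build_command(account_name: str, account_type: str) -> List[str]:
    """Build the argv for create_contact.py with a correctly encoded JSON argument."""
    payload = json.dumps({"account_name": account_name, "account_type": account_type})
    return [sys.executable, "create_contact.py", payload]


def parse_result(stdout: str) -> Dict[str, Any]:
    """Decode the script's reply, falling back to a failure record if it is not JSON."""
    try:
        result = json.loads(stdout)
    except json.JSONDecodeError as exc:
        return {"success": False, "message": f"unparseable output: {exc}"}
    if not isinstance(result, dict):
        return {"success": False, "message": "unexpected output shape"}
    return {
        "success": bool(result.get("success")),
        "message": str(result.get("message", "")),
    }
```

A caller would run `subprocess.run(build_command("work_profile", "com.work.account"), capture_output=True, text=True)` and pass the captured `stdout` to `parse_result`.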

Observations

  • Ensure the Android unit is unlocked and ready to receive commands.
  • Certain device security policies may necessitate elevated permissions for successful contact modification.

Application Shortcuts

# Retrieve declared deep links or shortcuts for a specific package
phone-cli shortcuts --package "com.example.app"
