android-device-commander
Interface with an Android handset via ADB utilities to execute actions such as initiating calls, dispatching text messages, managing contact directories, and scripting various phone functionalities remotely from a host machine.
Author

hao-cyber
🤖 Android Device Control Hub (ADB-based)
🌟 A Model Context Protocol (MCP) plugin for remotely controlling an Android handset through the Android Debug Bridge (ADB).
Illustrative Scenarios
- Based on current web-retrieved weather data, automatically start playback of Netease Music tracks without requiring manual confirmation.
- Attempt to place an outgoing call to the contact "Hao." If the call does not connect, automatically compose and send an SMS asking him to come to Conference Suite 101.
Chinese Guide (中文指南)
⚡ Rapid Deployment Guide
📥 Setup Procedure
# Direct execution via uvx (recommended, integrated with uv, no standalone installation needed)
uvx phone-mcp
# Or install using uv package manager
uv pip install phone-mcp
# Or install via standard pip
pip install phone-mcp
🔧 Configuration for Orchestration Tools
AI Agent Integration Setup
Configure within your chosen AI orchestration platform (e.g., Cursor, Trae, Claude):
{
"mcpServers": {
"android-device-commander": {
"command": "uvx",
"args": [
"phone-mcp"
]
}
}
}
Alternatively, if installed via pip:
{
"mcpServers": {
"android-device-commander": {
"command": "/usr/local/bin/python",
"args": [
"-m",
"phone_mcp"
]
}
}
}
Crucial Note: The path `/usr/local/bin/python` in the alternative configuration denotes the Python executable location. This must be adjusted to match your system's actual interpreter location. Use the following commands to locate it:
- Linux/macOS Terminal: `which python3` or `which python`
- Windows Command Prompt (CMD): `where python`
- Windows PowerShell: `(Get-Command python).Path`
Replace `/usr/local/bin/python` with the discovered full path, e.g., `C:\Python39\python.exe` on Windows.
Hint: For the Cursor IDE, integrate this configuration into `~/.cursor/mcp.json`.
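The interpreter lookup can also be done from Python itself. The following is a minimal sketch, not part of the plugin: `build_mcp_config` is a hypothetical helper that fills the pip-style configuration with the path of whichever interpreter runs the script (`sys.executable`), so the printed JSON already points at an interpreter that exists on your machine.

```python
import json
import sys
from typing import Optional


def build_mcp_config(python_path: Optional[str] = None) -> dict:
    """Return the pip-style MCP server entry for the given interpreter.

    Hypothetical helper: defaults to the interpreter running this script.
    """
    interpreter = python_path or sys.executable
    return {
        "mcpServers": {
            "android-device-commander": {
                "command": interpreter,
                "args": ["-m", "phone_mcp"],
            }
        }
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the orchestration tool's config file
    print(json.dumps(build_mcp_config(), indent=2))
```

Run it with the same interpreter into which `phone-mcp` was installed; otherwise the generated path will point at an environment that lacks the package.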
Interaction Example (via Claude chat interface):
Initiate a voice connection to contact 'hao'
⚠️ Pre-operation Checklist:
- Ensure ADB utilities are fully installed and correctly path-configured.
- Verify that USB Debugging is activated on the target Android unit.
- Confirm the device is physically linked to the computer via USB.
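The checklist above can be verified from a script before any command is sent. This is a sketch under stated assumptions rather than anything the plugin ships: `parse_adb_devices` and `preflight` are hypothetical names, and the only external call is the standard `adb devices` command, whose output lists one `serial<TAB>state` pair per line after a header.

```python
import shutil
import subprocess
from typing import Dict


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each serial in `adb devices` output to its state."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the "List of devices attached" header and daemon notices
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def preflight() -> str:
    """Return a short, human-readable readiness verdict (hypothetical helper)."""
    if shutil.which("adb") is None:
        return "adb not found on PATH"
    result = subprocess.run(["adb", "devices"], capture_output=True,
                            text=True, timeout=10)
    devices = parse_adb_devices(result.stdout)
    if not devices:
        return "no device connected"
    if "device" not in devices.values():
        # e.g. "unauthorized" until the USB debugging prompt is accepted
        return "device connected but not ready: " + ", ".join(sorted(set(devices.values())))
    return "ready"
```

A state of `unauthorized` usually means the debugging authorization prompt on the handset has not been accepted yet.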
🎯 Core Capabilities
- 📞 Telephony Operations: Initiate outgoing calls, terminate active sessions, handle incoming call alerts.
- 💬 Text Messaging: Dispatch and retrieve Short Message Service (SMS) messages, access raw message stores.
- 👥 Directory Access: Query stored phone contacts, programmatically generate new entries via automated interface manipulation.
- 📸 Multimedia: Capture screen images, record screen sessions, manage media playback controls.
- 📱 Application Management: Start applications, invoke specific activities using Intents, list all installed packages, force-close running processes.
- 🔧 System Utilities: Retrieve active window metadata, access application launch shortcuts.
- 🗺️ Geospatial: Query points of interest (POIs) associated with specific geographic coordinates or phone numbers.
- 🖱️ User Interface (UI) Simulation: Execute virtual taps, execute swipe gestures, inject typed input, simulate hardware key presses.
- 🔍 UI Element Discovery: Locate interactive components using textual labels, unique identifiers (ID), class names, or accessibility descriptions.
- 🤖 Advanced UI Control: Implement pauses waiting for element presence, perform iterative scrolling to locate off-screen components.
- 🧠 Screen State Interpretation: Obtain serialized screen structure and provide standardized interaction methods.
- 🌐 Web Navigation: Instruct the device's default browser to load specified Uniform Resource Locators (URLs).
- 🔄 Dynamic UI Polling: Continuously monitor screen changes and wait for designated visual elements to appear or disappear.
🛠️ System Prerequisites
- Python Interpreter version 3.7 or newer.
- An Android handset with USB Debugging mode enabled.
- ADB toolchain installed.
📋 Fundamental Command Set
Device Status & Connectivity
# Verify device linkage status
phone-cli check
# List the clickable elements on the current screen
phone-cli screen-interact find method=clickable
Communication Functions
# Initiate a voice connection
phone-cli call 1234567890
# Terminate current active call
phone-cli hangup
# Dispatch an SMS
phone-cli send-sms 1234567890 "Greetings"
# Retrieve recent incoming messages (supports paging)
phone-cli messages --limit 10
# Retrieve sent messages (supports paging)
phone-cli sent-messages --limit 10
# List stored phonebook entries (supports paging)
phone-cli contacts --limit 20
# Automate new contact creation via UI traversal
phone-cli create-contact "Jane Smith" "9876543210"
Media & Application Control
# Capture a static image of the current display
phone-cli screenshot
# Initiate screen capture recording
phone-cli record --duration 30
# Launch the default camera application
phone-cli app camera
# Alternate launch method if 'app' command fails
phone-cli open_app camera
# Force process termination for the camera application
phone-cli close-app com.android.camera
# Retrieve a list of installed packages (basic summary)
phone-cli list-apps
# Paginated, detailed package listing
phone-cli list-apps --page 1 --page-size 10
# Retrieve comprehensive package metadata
phone-cli list-apps --detailed
# Launch a specific component/activity (highly dependable method)
phone-cli launch com.android.settings/.Settings
# Start application using its package identifier
phone-cli app com.android.contacts
# Secondary launch attempt by package identifier
phone-cli open_app com.android.contacts
# Most robust launch method using full component path
phone-cli launch com.android.dialer/com.android.dialer.DialtactsActivity
# Instruct the default web browser to navigate to a URL
phone-cli open-url google.com
UI Interpretation & Interaction
# Obtain a structured data representation of the current screen state
phone-cli analyze-screen
# Unified interface for touch/gesture execution
phone-cli screen-interact <operation> [arguments]
# Simulate a tap event at specific screen coordinates
phone-cli screen-interact tap x=500 y=800
# Simulate a tap on an element identified by its displayed text label
phone-cli screen-interact tap element_text="Authenticate"
# Simulate a tap on an element identified by its content description attribute
phone-cli screen-interact tap element_content_desc="Navigation Drawer Toggle"
# Execute an upward swipe gesture (scrolls the content down)
phone-cli screen-interact swipe x1=500 y1=1000 x2=500 y2=200 duration=300
# Simulate a hardware key press (e.g., Back button)
phone-cli screen-interact key keycode=back
# Inject text input into the focused field
phone-cli screen-interact text content="Secure Password"
# Search for an element based on properties
phone-cli screen-interact find method=text value="Submit" partial=true
# Block execution until an element is visible within a timeout
phone-cli screen-interact wait method=text value="Operation Complete" timeout=10
# Scroll the view until a target element is brought into visibility
phone-cli screen-interact scroll method=text value="Terms and Conditions" direction=down max_swipes=5
# Activate continuous UI state observation
phone-cli monitor-ui --interval 0.5 --duration 30
# Observe UI until specific text materializes
phone-cli monitor-ui --watch-for text_appears --text "Welcome Aboard"
# Observe UI until an element with a specific ID becomes present
phone-cli monitor-ui --watch-for id_appears --id "checkout_button"
# Observe UI until an element belonging to a specific class becomes present
phone-cli monitor-ui --watch-for class_appears --class-name "android.widget.Button"
# Configure UI observation to output raw JSON data on change
phone-cli monitor-ui --raw
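When these commands are driven from a script, the `key=value` arguments can be assembled programmatically. The sketch below is illustrative only: `screen_interact_args` is a hypothetical helper that builds the argument list for the `screen-interact` syntax shown above, and it is assumed that `phone-cli` is on the PATH when the list is eventually passed to `subprocess.run`.

```python
import shlex
from typing import List


def screen_interact_args(operation: str, **params: object) -> List[str]:
    """Build the argument list for `phone-cli screen-interact` (hypothetical helper)."""
    args = ["phone-cli", "screen-interact", operation]
    for key, value in params.items():
        # Booleans are written lowercase to match the `partial=true` form above
        if isinstance(value, bool):
            value = str(value).lower()
        args.append(f"{key}={value}")
    return args


if __name__ == "__main__":
    cmd = screen_interact_args("swipe", x1=500, y1=1000, x2=500, y2=200, duration=300)
    # Show the equivalent shell command line, quoting each argument safely
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list directly to `subprocess.run(cmd)` avoids shell quoting issues for values that contain spaces, such as `value="Terms and Conditions"`.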
Location & Geospatial Queries
# Query for nearby Points of Interest (e.g., restaurants) within a radius
phone-cli get-poi 116.480053,39.987005 --keywords restaurant --radius 1000
📚 Advanced Operational Details
Application and Activity Initiation
The plugin offers several distinct mechanisms for starting applications and their components:
- By Human-Readable App Name (Two Variants):
```bash
# Variant 1: Direct command (might fail on restricted systems)
phone-cli app camera
# Variant 2: Alternative fallback command
phone-cli open_app camera
```
- By System Package Identifier (Two Variants):
```bash
# Variant 1: Direct command (might fail on restricted systems)
phone-cli app com.android.contacts
# Variant 2: Alternative fallback command
phone-cli open_app com.android.contacts
```
- By Full Component Path (Most Reliable):
```bash
# Works across most devices
phone-cli launch com.android.dialer/com.android.dialer.DialtactsActivity
```
Guidance: If the simpler `app` or `open_app` invocations fail, fall back to the `launch` command, supplying the complete package/activity component for maximal execution reliability.
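The fallback order described above can be scripted. This is a minimal sketch, assuming that `phone-cli` reports failure through a non-zero exit code (not confirmed by this document); `launch_with_fallback` and `run_cli` are hypothetical names, and the runner is injectable so the logic can be exercised without a device.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

Runner = Callable[[Sequence[str]], int]


def run_cli(args: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(args)).returncode


def launch_with_fallback(name: str, component: Optional[str] = None,
                         runner: Runner = run_cli) -> Optional[List[str]]:
    """Try `app`, then `open_app`, then `launch`; return the command that worked."""
    attempts = [["phone-cli", "app", name], ["phone-cli", "open_app", name]]
    if component:
        # Full package/activity path, tried last as the most reliable option
        attempts.append(["phone-cli", "launch", component])
    for cmd in attempts:
        # Assumption: exit code 0 means the launch succeeded
        if runner(cmd) == 0:
            return cmd
    return None
```

For example, `launch_with_fallback("settings", "com.android.settings/.Settings")` would only reach the `launch` command after both simpler invocations have failed.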
Contact Provisioning via UI Automation
The utility supports automated creation of new contacts leveraging UI scripting:
# Construct a new contact record automatically
phone-cli create-contact "Alice Johnson" "555-1234"
This sequence automatically performs:
1. Opening the Contacts application interface.
2. Navigating to the data entry screen.
3. Populating the Name and Telephone number fields.
4. Submitting/Saving the new entry.
Screen-Contextual Automation
The standardized screen interaction module empowers intelligent agents to:
- Analyze Displays: Receive structured data describing visible UI components and associated text content.
- Formulate Strategy: Make execution choices based on observed UI patterns and available action sets.
- Execute Interactions: Perform actions using a uniform parameter set syntax.
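The analyze, decide, act cycle listed above might be wired together as follows. This is only a sketch: `run_agent_loop` is a hypothetical driver, the three callables are supplied by the caller (for instance the plugin's `analyze_screen` and `interact_with_screen` coroutines), and the shape of the decoded screen data is left to the `decide` callback because its schema is not specified here.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, Optional, Tuple

# A decision is either None (stop) or an (action, params) pair
Step = Optional[Tuple[str, Dict[str, Any]]]


async def run_agent_loop(
    analyze: Callable[[], Awaitable[str]],
    decide: Callable[[Dict[str, Any]], Step],
    interact: Callable[[str, Dict[str, Any]], Awaitable[str]],
    max_steps: int = 10,
) -> int:
    """Repeat analyze/decide/act until `decide` returns None; return steps taken."""
    for step in range(max_steps):
        screen = json.loads(await analyze())   # 1. analyze the display
        choice = decide(screen)                # 2. formulate strategy
        if choice is None:
            return step
        action, params = choice
        await interact(action, params)         # 3. execute the interaction
    return max_steps
```

The `max_steps` bound is a design choice of this sketch; it keeps a confused decision function from tapping indefinitely.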
UI State Monitoring and Response
The plugin furnishes robust mechanisms for tracking interface evolution:
- Basic State Tracking:
```bash
# Observe all UI transitions at a specified frequency (in seconds)
phone-cli monitor-ui --interval 0.5 --duration 30
```
- Waiting for Specific Element Manifestation:
```bash
# Halt execution until target text becomes visible (useful for validation/testing)
phone-cli monitor-ui --watch-for text_appears --text "Login successful"
# Halt execution until a known element identifier loads
phone-cli monitor-ui --watch-for id_appears --id "user_profile_icon"
```
- Monitoring Element Disappearance:
```bash
# Wait until transient text (like a loading message) vanishes
phone-cli monitor-ui --watch-for text_disappears --text "Loading in progress..."
```
- Obtaining Detailed Change Logs:
```bash
# Capture comprehensive change data in JSON format
phone-cli monitor-ui --raw
```
Pro Tip: UI monitoring is indispensable for automated scripts needing to confirm that asynchronous processes (like loading screens) have successfully completed their work on the graphical interface.
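The idea behind this kind of monitoring is a poll-until-condition loop with a timeout. The sketch below shows that general pattern in plain Python; it is not the plugin's implementation, and `wait_until` together with its `clock` and `sleep` parameters are hypothetical names introduced so the loop can be tested without real delays.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.5,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `condition` every `interval` seconds until it holds or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Waiting for an element to disappear uses the same loop with the condition inverted, for example `wait_until(lambda: not is_visible())`, where `is_visible` is whatever check the script already has.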
📚 Comprehensive API Reference
For the full scope of documentation and configuration parameters, consult the project's main GitHub Repository.
🧰 Internal Tool Documentation
Screen Interface API
The plugin exposes a comprehensive API layer for screen manipulation and data extraction.
interact_with_screen
async def interact_with_screen(action: str, params: Dict[str, Any] = None) -> str:
    """Dispatches screen manipulation routines"""
- Arguments:
  - action: Routine type ("tap", "swipe", "key", "text", "find", "wait", "scroll")
  - params: Specific parameters dictionary contingent on the selected action.
- Output: A JSON-formatted string detailing the outcome of the operation.
Operational Examples:
# Coordinate-based screen tap
result = await interact_with_screen("tap", {"x": 100, "y": 200})
# Tap based on visible label
result = await interact_with_screen("tap", {"element_text": "Proceed to Checkout"})
# Vertical swipe action (e.g., scrolling up)
result = await interact_with_screen("swipe", {"x1": 500, "y1": 300, "x2": 500, "y2": 1200, "duration": 300})
# Text insertion
result = await interact_with_screen("text", {"content": "Verified Input Data"})
# System key simulation
result = await interact_with_screen("key", {"keycode": "home"})
# Locate element by its accessible name
result = await interact_with_screen("find", {"method": "text", "value": "Settings", "partial": True})
# Wait for element confirmation
result = await interact_with_screen("wait", {"method": "id", "value": "error_dialog", "timeout": 15, "interval": 0.25})
# Iterative scroll operation to uncover content
result = await interact_with_screen("scroll", {"method": "description", "value": "Help Button", "direction": "up", "max_swipes": 10})
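Because the calls above use `await`, they need to run inside a coroutine, and each returned string has to be decoded before it can be inspected. The following sketch shows one way to do both. The `interact_with_screen` defined here is a stand-in stub that merely echoes its input as JSON so the example runs without a device; the real function's import path and result fields are not given in this document, and `tap_then_type` is a hypothetical helper.

```python
import asyncio
import json
from typing import Any, Dict, List, Optional


async def interact_with_screen(action: str,
                               params: Optional[Dict[str, Any]] = None) -> str:
    """Stand-in stub (not the plugin's function): echoes the request as JSON."""
    return json.dumps({"action": action, "params": params or {}})


async def tap_then_type(label: str, text: str) -> List[Dict[str, Any]]:
    """Tap an element by its label, then type text, decoding each JSON result."""
    results = []
    steps = (("tap", {"element_text": label}), ("text", {"content": text}))
    for action, params in steps:
        raw = await interact_with_screen(action, params)
        results.append(json.loads(raw))
    return results


if __name__ == "__main__":
    # asyncio.run creates the event loop that top-level `await` would otherwise lack
    print(asyncio.run(tap_then_type("Search", "weather")))
```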
analyze_screen
async def analyze_screen(include_screenshot: bool = False, max_elements: int = 50) -> str:
    """Generates a structured data model of current screen constituents"""
- Arguments:
  - include_screenshot: Boolean flag to embed a base64 image of the screen in the output.
  - max_elements: Constraint on the count of UI nodes to parse.
- Output: JSON string containing the detailed screen layout analysis.
create_contact
async def create_contact(name: str, phone: str) -> str:
    """Programmatically establishes a new contact entry"""
- Arguments:
  - name: Full designation for the new contact.
  - phone: The associated telephone number.
- Output: JSON string confirming operation status.
- Source Module: `contacts.py` handles the underlying UI automation for persistence.
launch_app_activity
async def launch_app_activity(package_name: str, activity_name: Optional[str] = None) -> str:
    """Initiates an application instance via package and optional activity path"""
- Arguments:
  - package_name: Unique identifier for the application package.
  - activity_name: Specific entry point within the package (optional).
- Output: JSON string confirming operation status.
- Source Module: `apps.py` manages this capability.
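For orientation, here is how a package and an optional activity could map onto plain ADB commands. This is a sketch of one plausible mapping, not a description of what `apps.py` actually does: `am_start_args` is a hypothetical helper, `am start -n` is the standard way to start an explicit component, and the `monkey` invocation is a common way to fire a package's launcher intent when no activity is known.

```python
from typing import List, Optional


def am_start_args(package_name: str, activity_name: Optional[str] = None) -> List[str]:
    """Build an adb command that launches a package or one of its activities."""
    if activity_name is None:
        # No activity given: send a single launcher-category event to the package
        return ["adb", "shell", "monkey", "-p", package_name,
                "-c", "android.intent.category.LAUNCHER", "1"]
    # `am` takes the component as package/activity; the activity may be
    # relative (".Settings") or a fully qualified class name
    return ["adb", "shell", "am", "start", "-n", f"{package_name}/{activity_name}"]
```

The returned list can be passed to `subprocess.run` once a device is attached.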
launch_intent
async def launch_intent(intent_action: str, intent_type: Optional[str] = None, extras: Optional[Dict[str, str]] = None) -> str:
    """Triggers an Android component using the Intent broadcast mechanism"""
- Arguments:
  - intent_action: The required Android action identifier.
  - intent_type: The MIME type specification (optional).
  - extras: Supplemental data payload for the intent (optional).
- Output: JSON string confirming operation status.
- Source Module: `apps.py` implements this low-level intent firing.
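The three parameters correspond naturally to flags of the Android activity manager: `-a` for the action, `-t` for the MIME type, and `--es` for string extras. The sketch below assumes the intent is delivered with `am start`; whether `apps.py` uses `am start` or another subcommand is not stated here, and `intent_args` is a hypothetical helper.

```python
from typing import Dict, List, Optional


def intent_args(intent_action: str, intent_type: Optional[str] = None,
                extras: Optional[Dict[str, str]] = None) -> List[str]:
    """Build an `adb shell am start` command for an action, type and string extras."""
    args = ["adb", "shell", "am", "start", "-a", intent_action]
    if intent_type:
        args += ["-t", intent_type]
    for key, value in (extras or {}).items():
        # --es adds one string extra; values containing spaces need additional
        # quoting for the device-side shell, which this sketch does not handle
        args += ["--es", key, value]
    return args
```

For instance, `intent_args("android.intent.action.SEND", "text/plain", {"android.intent.extra.TEXT": "hi"})` yields a command that opens the share sheet with a short text payload.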
📄 Licensing
Distributed under the terms of the Apache License, Version 2.0.
Contact Record Creation Utility
This utility simplifies the process of provisioning new entries into the Android device's address book using ADB.
Prerequisites
- Python environment (version 3.x).
- The Android Debug Bridge (ADB) executable must be installed and accessible in the system PATH.
- The target Android apparatus must be physically linked and authorized for debugging sessions.
Execution
Standard Invocation
Execute the dedicated Python script:
python create_contact.py
This command defaults to creating a contact using pre-set values:
- Account Identifier: "Your Account Name"
- Account Type: "com.google"
Advanced Parameterization
You possess the ability to override default account details by supplying a JSON string argument:
python create_contact.py '{"account_name": "work_profile", "account_type": "com.work.account"}'
Operational Feedback
The script yields a JSON object detailing the outcome:
- success: Boolean indicating successful execution.
- message: Any supplementary information or captured error diagnostics.
Example of a successful return payload:
{"success": true, "message": "Contact added successfully"}
Diagnostic Considerations
- Failure to locate or communicate with ADB, or an unauthorized device state, will result in an operational error report.
- Malformed JSON input will trigger an input validation error.
- Any errors encountered during the execution of the underlying ADB commands will be logged and returned within the `message` field.
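The argument handling and result shape described in this section can be sketched as follows. This is an illustration of the documented behavior, not the contents of `create_contact.py`: `parse_account_args` is a hypothetical function, the default values are the ones listed above, and the exact error wording is an assumption.

```python
import json
from typing import List, Optional, Tuple

# Defaults as documented for the standard invocation
DEFAULTS = {"account_name": "Your Account Name", "account_type": "com.google"}


def parse_account_args(argv: List[str]) -> Tuple[Optional[dict], Optional[dict]]:
    """Return (account, error); overrides from argv[1] are merged onto the defaults."""
    account = dict(DEFAULTS)
    if len(argv) > 1:
        try:
            overrides = json.loads(argv[1])
        except json.JSONDecodeError as exc:
            # Malformed JSON becomes an error payload in the documented shape
            return None, {"success": False, "message": f"Invalid JSON input: {exc.msg}"}
        if not isinstance(overrides, dict):
            return None, {"success": False, "message": "JSON input must be an object"}
        account.update(overrides)
    return account, None
```

Supplying only `{"account_name": "work_profile"}` therefore keeps the default account type, while a malformed string produces a `success: false` payload instead of a traceback.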
Observations
- Ensure the Android unit is unlocked and ready to receive commands.
- Certain device security policies may necessitate elevated permissions for successful contact modification.
Application Shortcuts
# Retrieve declared deep links or shortcuts for a specific package
phone-cli shortcuts --package "com.example.app"
