
google-parallel-query-engine

Runs Google searches for multiple keywords in parallel, handles verification checks such as CAPTCHAs automatically, and returns the results as structured JSON for downstream use.

Author

jae-jae

MIT License

Quick Info

GitHub GitHub Stars 210
NPM Weekly Downloads 12269
Tools 1
Last Updated 2026-02-19

Tags

automation, scraping, google, browser automation, automation web, searching scraping

Google Parallel Query Engine (GPQE)

A Model Context Protocol (MCP) server for high-throughput Google searches that runs multiple distinct search queries simultaneously.

This utility is derived from the original google-search project.

🌟 Highly Recommended Companion: OllaMan - For robust management of Ollama AI models.

Key Capabilities

  • Concurrency: Enables simultaneous execution of searches across a range of input keywords against Google, significantly boosting acquisition speed.
  • Browser Efficiency: Leverages a single browser instance to manage numerous concurrent tabs for streamlined parallel fetching.
  • Challenge Mitigation: Detects verification prompts (such as CAPTCHAs) and switches to a visible browser window only when the user must complete the check manually.
  • User Emulation: Mimics genuine human browsing behavior to reduce the chance of rate-limiting or blocking by the search engine.
  • Standardized Output: Delivers the gathered search data in a machine-readable JSON format, simplifying subsequent analytical pipelines.
  • Adaptable Settings: Allows fine-tuning of operational parameters, including result counts per query, data retrieval timeouts, and search locale preferences.
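
The concurrency idea above can be sketched in a few lines. This is a minimal illustration, not the server's actual implementation: `searchAll`, `Fetcher`, and `fakeFetch` are hypothetical names, and the stub fetcher stands in for the browser tab that the real server would drive.

```typescript
// Hypothetical stand-in for one search; the real server would use a browser tab here.
type Fetcher = (query: string) => Promise<string[]>;

// Start every search at once and return the results in input order.
async function searchAll(
  queries: string[],
  fetchOne: Fetcher,
): Promise<{ query: string; results: string[] }[]> {
  const pending = queries.map(async (query) => ({
    query,
    results: await fetchOne(query),
  }));
  // Promise.all resolves once every search has finished and preserves input order.
  return Promise.all(pending);
}

// Stub fetcher (assumption): waits briefly, then returns one fake result.
const fakeFetch: Fetcher = (query) =>
  new Promise((resolve) =>
    setTimeout(() => resolve([`result for ${query}`]), 10),
  );

searchAll(["machine learning", "artificial intelligence"], fakeFetch).then(
  (out) => console.log(JSON.stringify(out, null, 2)),
);
```

Because all searches start before any of them is awaited, total time is roughly that of the slowest query rather than the sum of all queries.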

Initial Deployment

Execute immediately via npx:

npx -y g-search-mcp

Before first use, install the required browser binaries by running the following in your terminal:

npx playwright install chromium

Diagnostic Mode

Pass the --debug flag to run with a visible browser window:

npx -y g-search-mcp --debug

Configuring the MCP Interface

Integrate this server within your Claude Desktop configuration:

macOS Path: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows Path: %APPDATA%/Claude/claude_desktop_config.json

Configuration Snippet:

{
  "mcpServers": {
    "g-search": {
      "command": "npx",
      "args": ["-y", "g-search-mcp"]
    }
  }
}
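
If the configuration is generated by a script, the snippet above could be produced as follows. This is only a sketch: `buildConfig` and the two interfaces are hypothetical names, and adding `--debug` to `args` assumes the flag behaves the same way it does in the Diagnostic Mode command.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Build the documented config; `debug` appends the --debug flag (assumed to work via args).
function buildConfig(debug: boolean = false): ClaudeDesktopConfig {
  const args = ["-y", "g-search-mcp"];
  if (debug) args.push("--debug");
  return { mcpServers: { "g-search": { command: "npx", args } } };
}

// Print JSON that can be pasted into claude_desktop_config.json.
console.log(JSON.stringify(buildConfig(true), null, 2));
```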

Available Functionality

  • search - Initiates Google lookups based on an array of input strings, returning structured results.
  • Utilizes the Playwright browser environment for execution.
  • Supports the following arguments:
    • queries: Mandatory array defining the search phrases to process.
    • limit: Maximum records retrieved per query; defaults to 10.
    • timeout: Maximum allowable time (in milliseconds) for page loading; defaults to 60000 (1 minute).
    • noSaveState: Boolean flag to prevent saving browser session data; defaults to false.
    • locale: Specifies the geographical/language setting for search results; defaults to en-US.
    • debug: Boolean flag to show the browser window; overrides the --debug command-line setting.
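
The argument list above can be written down as a type with its documented defaults. This is a sketch rather than the server's own code: `SearchArgs` and `withDefaults` are hypothetical names, and rejecting an empty `queries` array is an assumption (the list only says the argument is mandatory).

```typescript
// Argument shape for the search tool, following the list above.
interface SearchArgs {
  queries: string[]; // required
  limit?: number; // results per query
  timeout?: number; // page-load timeout in milliseconds
  noSaveState?: boolean; // skip saving browser session data
  locale?: string; // e.g. "en-US"
  debug?: boolean; // show the browser window
}

// Hypothetical helper: fill in the documented defaults.
function withDefaults(args: SearchArgs): SearchArgs {
  // Assumption: an empty array is treated as invalid input.
  if (!Array.isArray(args.queries) || args.queries.length === 0) {
    throw new Error("queries must be a non-empty array of strings");
  }
  return {
    queries: args.queries,
    limit: args.limit ?? 10,
    timeout: args.timeout ?? 60000,
    noSaveState: args.noSaveState ?? false,
    locale: args.locale ?? "en-US",
    debug: args.debug, // no default is documented, so it is passed through unchanged
  };
}

console.log(withDefaults({ queries: ["machine learning"], limit: 20 }));
```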

Usage Prompt Example:

Utilize the search utility to investigate "machine learning" and "artificial intelligence" on Google

Expected Output Structure:

{
  "searches": [
    {
      "query": "machine learning",
      "results": [
        {
          "title": "What is Machine Learning? | IBM",
          "link": "https://www.ibm.com/topics/machine-learning",
          "snippet": "Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy."
        },
        ...
      ]
    },
    {
      "query": "artificial intelligence",
      "results": [
        {
          "title": "What is Artificial Intelligence (AI)? | IBM",
          "link": "https://www.ibm.com/topics/artificial-intelligence",
          "snippet": "Artificial intelligence leverages computers and machines to mimic the problem-solving and decision-making capabilities of the human mind."
        },
        ...
      ]
    }
  ]
}
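
A consumer of this output might type it and pull out the links per query. The sketch below assumes the response has exactly the fields shown above; `SearchResponse`, `linksByQuery`, and the shortened sample are illustrative names and data, not part of the package.

```typescript
interface SearchResult {
  title: string;
  link: string;
  snippet: string;
}

interface SearchEntry {
  query: string;
  results: SearchResult[];
}

interface SearchResponse {
  searches: SearchEntry[];
}

// Collect the result links for each query from a raw JSON response.
function linksByQuery(raw: string): Record<string, string[]> {
  const parsed = JSON.parse(raw) as SearchResponse;
  const out: Record<string, string[]> = {};
  for (const entry of parsed.searches) {
    out[entry.query] = entry.results.map((r) => r.link);
  }
  return out;
}

// Trimmed sample based on the expected output above (snippet shortened for illustration).
const sample = JSON.stringify({
  searches: [
    {
      query: "machine learning",
      results: [
        {
          title: "What is Machine Learning? | IBM",
          link: "https://www.ibm.com/topics/machine-learning",
          snippet: "Machine learning is a branch of artificial intelligence (AI)...",
        },
      ],
    },
  ],
});

console.log(linksByQuery(sample));
```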

Operational Guidance

Addressing Specific Site Complexities

Modifying Query Parameters

  • Result Volume Control: To obtain a larger set of findings:

Request the top 20 search results for every input term.

This sets the limit parameter to 20.

  • Timeout Extension: For scenarios involving slow network response:

Extend the page loading timeout duration to 120 seconds.

This sets the timeout parameter to 120000 milliseconds.

Regional Search Configuration

  • Geographic Context Shift: To target results from a specific area:

Execute searches using the German locale setting (de-DE).

This sets the locale parameter to "de-DE".

Troubleshooting and Visibility

Activating Debug Mode

  • On-Demand Visualization: To make the browser window appear for a particular search:

Activate visual rendering mode for this specific search execution.

This sets debug to true, regardless of how the server was launched.
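
Taken together, the prompts in this section would plausibly translate into a single tool call like the one sketched below. The JSON-RPC envelope follows the MCP `tools/call` request shape; `toolCall` is a hypothetical helper, and the exact arguments an assistant sends may differ.

```typescript
// Hypothetical helper: wrap search arguments in an MCP tools/call request.
function toolCall(id: number, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "search", arguments: args },
  };
}

const request = toolCall(1, {
  queries: ["machine learning", "artificial intelligence"],
  limit: 20, // top 20 results per query
  timeout: 120 * 1000, // 120 seconds expressed in milliseconds
  locale: "de-DE", // German locale
  debug: true, // show the browser window for this call
});

console.log(JSON.stringify(request, null, 2));
```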

Prerequisites for Installation

  • Runtime environment: Node.js version 18 or newer
  • Package manager: NPM or Yarn

Local Source Compilation

  1. Obtain the source code repository:
git clone https://github.com/jae-jae/g-search-mcp.git
cd g-search-mcp
  2. Install required libraries:
npm install
  3. Provision the necessary browser engine:
npm run install-browser
  4. Compile the executable server assets:
npm run build

Development Flow

Continuous Rebuild (Development State)

npm run watch

Debugging with MCP Inspector

npm run inspector

Associated Tools

  • fetcher-mcp: An efficient MCP service for retrieving web page content using Playwright. It features intelligent content extraction, parallel processing, and resource optimization, making it well suited to automated web data acquisition.

Distributed under the terms of the MIT License

WIKIPEDIA: A headless browser operates without a graphical interface, functioning instead through programmatic control. These environments are invaluable for automated tasks like testing, as they faithfully process HTML, CSS, and JavaScript just like a standard browser, but are managed via command line or network interface. Native headless support has been incorporated into major browsers (Chrome/Firefox) since specific versions, rendering older external emulation tools less necessary. The primary applications for headless browsing include rigorous web application testing, generating page snapshots, executing JS library checks, and automating complex page interactions.

Core Applications

The principal uses for environments lacking a GUI include:

  • Verification workflows for modern web platforms (QA testing)
  • Automated rendering of full-page screenshots
  • Execution environments for front-end JavaScript frameworks
  • Programmatic manipulation of web page elements

Secondary Utilities

Headless utilities are also frequently employed in large-scale data harvesting (web scraping). Google has previously noted their usefulness for indexing sites reliant on dynamic content (Ajax). Conversely, misuse exists, such as generating artificial traffic or automating unauthorized interactions (credential stuffing). However, contemporary traffic analysis does not show a strong correlation between malicious activity and the use of headless agents versus standard browser agents.

Implementation Landscape

Given that current flagship browsers offer built-in headless APIs, several unified automation frameworks have emerged to interface with them:

  • Selenium WebDriver: Follows the W3C WebDriver specification.
  • Playwright: A comprehensive library supporting Chromium, Firefox, and WebKit automation from Node.js.
  • Puppeteer: Focused primarily on automating Chromium/Chrome instances.

Testing Framework Integration

Many testing suites incorporate headless capabilities into their core setup:

  • Capybara: Frequently employs Headless Chrome or WebKit to simulate user paths during protocol validation.
  • Jasmine: Typically defaults to Selenium but can be configured to use headless WebKit or Chrome for its browser tests.
  • Cypress: A dedicated framework for front-end testing.
  • QF-Test: A tool supporting GUI-based automated testing, often utilizing a headless browser instance.

Non-Browser Alternatives

An alternative strategy involves using libraries that emulate browser APIs within a runtime environment. For instance, Deno natively includes browser APIs. In the Node.js ecosystem, jsdom provides the most extensive emulation, covering HTML parsing, cookie management, XHR, and basic JavaScript execution. While these libraries are fast, they generally lack full DOM rendering capabilities and associated event handling compared to true headless browsers.
