apisix-gateway
High-throughput API gateway for managing and directing API traffic, featuring advanced capabilities like dynamic routing, intelligent load distribution, and comprehensive security measures. It offers robust proxy services and granular rate limiting, critical for scaling and maintaining the performance of modern AI workloads.
Author: apache
Apache APISIX API Gateway | Next-Generation AI Gateway Core
Apache APISIX stands as a dynamic, low-latency, and high-throughput API entry point.
APISIX furnishes extensive traffic manipulation capabilities, encompassing load balancing across diverse upstreams, dynamic configuration updates, circuit interruption mechanisms, advanced authentication protocols, observability tools, and much more.
Leveraging its adaptable plugin architecture, APISIX excels as an AI Gateway, facilitating specialized proxying for AI models, optimized load distribution for Large Language Models (LLMs), automated retries and fallback logic, token-based throttling, and stringent security enforcement to guarantee the efficacy and resilience of AI-driven services. Furthermore, the mcp-bridge plugin enables the conversion of standard stdio-based MCP servers into scalable HTTP SSE services.
You can deploy APISIX API Gateway to manage conventional ingress (north-south) traffic, as well as inter-service (east-west) communications. It is also fully capable of operating as a dedicated k8s ingress controller.
The underlying technical structure of Apache APISIX:
Community Engagement
- Please consider providing a Review for APISIX on G2.
- Mailing List: Subscribe by sending an email to dev-subscribe@apisix.apache.org and following the response instructions.
- Join our Slack Workspace via the invite link (if the link fails, please file an issue for assistance). Once joined, find the #apisix channel (Channels -> Browse channels -> search for "apisix").
- Follow us on Twitter and use the hashtag #ApacheAPISIX to engage.
- Consult the official Documentation
- Explore active Discussions
- Read the latest updates on the Blog
Core Capabilities
APISIX API Gateway functions as a unified traffic ingress point capable of processing diverse business data streams, including: dynamic routing rules, dynamic backend management, dynamic certificate loading, A/B testing frameworks, canary deployments, blue-green transitions, traffic limitation enforcement, defense against malicious intrusions, comprehensive metrics collection, monitoring alerts, service introspection, and robust service resiliency mechanisms.
Platform Compatibility
- Cloud-Native Focus: Inherently platform-agnostic, avoiding vendor lock-in. APISIX API Gateway deploys efficiently from bare metal environments up to Kubernetes clusters.
- ARM64 Architecture Support: Ensures compatibility across diverse infrastructure technologies.
Protocol Versatility
- TCP/UDP Proxy: Provides dynamic proxying for stream protocols.
- Dubbo Proxy: Enables dynamic translation from HTTP to Dubbo protocols.
- Dynamic MQTT Proxy: Supports load balancing for MQTT traffic based on client_id, covering both MQTT 3.1.* and 5.0 specifications.
- gRPC Proxy: Direct proxying for gRPC traffic.
- gRPC Web Proxy: Handles proxying for gRPC Web clients to standard gRPC services.
- gRPC Transcoding: Implements protocol conversion, allowing HTTP/JSON clients to interact with gRPC APIs.
- Websocket Proxying
- Proxy Protocol Support
- HTTP(S) Forward Proxy functionality
- SSL Management: Allows dynamic injection of SSL certificates.
- HTTP/3 with QUIC support.
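Balancing MQTT traffic by client_id, as listed above, amounts to mapping each client to a stable backend. The following is a minimal sketch of one common way to do that, a consistent-hash ring with virtual nodes. It is an illustration only: the class name HashRing, the replica count, the MD5 hash, and the broker addresses are all assumptions made for the example, and APISIX's own algorithm may differ.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical, for illustration)."""

    def __init__(self, nodes, replicas=100):
        self._replicas = replicas
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> backend node
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        # Any stable hash works here; MD5 is used for spread, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Place several virtual points per node to even out the distribution.
        for i in range(self._replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def pick(self, client_id):
        # Walk clockwise to the first point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._points, self._hash(client_id))
        return self._owner[self._points[idx % len(self._points)]]


# Hypothetical broker addresses; the same client_id always lands on the same broker.
ring = HashRing(["10.0.0.1:1883", "10.0.0.2:1883", "10.0.0.3:1883"])
broker = ring.pick("sensor-42")
```

When a broker is removed from the ring, only the clients that were mapped to it move; all other clients keep their broker, which is the property that makes this approach attractive for long-lived MQTT sessions.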
Dynamic Configuration Management
- Instant Configuration/Plugin Reloads: Modify configurations and plugins instantly without service interruption.
- Request Rewriting: Modify essential request elements (host, uri, schema, method, headers) before forwarding to the target backend.
- Response Manipulation: Define custom response status codes, body content, and headers returned to the client.
- Weighted Load Balancing: Supports weighted round-robin algorithms.
- Consistent Hashing Load Balancing: Facilitates session stickiness via hashing techniques.
- Automated Health Checks: Monitors backend health and automatically excludes failed nodes to maintain service stability.
- Circuit Breaking: Intelligently tracks and isolates intermittently failing upstream services.
- Request Duplication/Mirroring: Copies client requests for passive analysis or testing.
- Traffic Splitting: Allows incremental routing of traffic percentages across multiple defined upstreams.
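To make the weighted round-robin item above concrete, here is a small sketch of the "smooth" weighted round-robin variant popularized by Nginx. It is a stand-in chosen for illustration, not a description of APISIX's internal balancer; the class name and the node names a, b, and c are hypothetical.

```python
class SmoothWeightedRoundRobin:
    """Smooth weighted round-robin over a mapping of node -> positive integer weight."""

    def __init__(self, weights):
        self._weights = dict(weights)
        self._current = {node: 0 for node in self._weights}
        self._total = sum(self._weights.values())

    def next(self):
        # Every node gains its weight; the current leader is chosen and then
        # penalized by the total weight, which interleaves picks over time.
        for node, weight in self._weights.items():
            self._current[node] += weight
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


balancer = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
sequence = [balancer.next() for _ in range(7)]
# One full cycle of 7 picks: a, a, b, a, c, a, a
```

Over one cycle, each node is picked in proportion to its weight (5, 1, and 1 here), and the heavier node's picks are spread out instead of arriving in a single burst.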
Precise Traffic Steering
- Path Matching: Supports full path and prefix-based matching strategies.
- Route Conditioning: Utilizes all standard Nginx variables (e.g., cookie, request arguments) as criteria for routing decisions, enabling sophisticated patterns like canary releases.
- Operator-Based Conditions: Supports complex filtering using various operators (e.g., {"arg_age", ">", 24}).
- Custom Route Matching Functions: Ability to inject custom logic for route determination.
- IPv6 Support for route matching.
- TTL-based Route Management: Supports Time-To-Live settings for routes.
- Priority-based Routing.
- Batch HTTP Request Handling.
- GraphQL Attribute Filtering.
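The operator-based condition {"arg_age", ">", 24} shown above can be read as a (variable, operator, expected value) triple evaluated against the request. The sketch below models that reading in Python. The function name matches, the operator spellings in the table, and the numeric coercion rule are assumptions made for this example; the operators APISIX actually accepts should be checked in its documentation.

```python
import operator

# Operator table for the sketch; the spellings are illustrative assumptions.
OPERATORS = {
    "==": operator.eq,
    "~=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(conditions, variables):
    """Return True when every (name, op, expected) triple holds for `variables`."""
    for name, op, expected in conditions:
        actual = variables.get(name)
        if actual is None:
            return False  # a missing variable never satisfies a condition
        if isinstance(expected, (int, float)):
            # Request variables arrive as strings, so coerce before comparing numbers.
            try:
                actual = float(actual)
            except (TypeError, ValueError):
                return False
        if not OPERATORS[op](actual, expected):
            return False
    return True


# A canary-style rule: users older than 24 who carry a hypothetical beta cookie.
rule = [("arg_age", ">", 24), ("cookie_beta", "==", "1")]
is_canary = matches(rule, {"arg_age": "30", "cookie_beta": "1"})
```

A route guarded by such a rule would only receive requests for which every triple holds; all other requests fall through to the next candidate route.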
Security Framework
- Comprehensive Authentication & Authorization Mechanisms:
  - API Key Authentication
  - JSON Web Token (JWT)
  - HTTP Basic Auth
  - Role-Based Access Control (RBAC) via Wolf
  - Authorization using Casbin
  - Keycloak Integration
  - Casdoor Integration
- Network Access Control: IP Whitelisting/Blacklisting
- Referer Validation: Whitelist/Blacklist
- Identity Provider (IdP) Support: Works with external identity providers like Auth0, Okta, etc.
- Traffic Throttling Plugins:
  - Rate Limiting by Request
  - Limiting by Count
  - Limiting Concurrent Connections
- Anti-ReDoS Measures: Built-in defenses against Regular Expression Denial of Service attacks.
- Cross-Origin Resource Sharing (CORS) enablement.
- URI Blocking: Mechanism to deny access based on request URI patterns.
- Request Schema Validation
- CSRF Protection: Implemented via the Double Submit Cookie method.
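As an illustration of the "Limiting by Count" throttling item above, the following sketch implements a fixed-window counter keyed by client. It is a simplified, single-process model and not the plugin's implementation; the class name, the parameter names count and time_window, and the injectable clock are choices made for this example.

```python
import time


class FixedWindowLimiter:
    """Allow at most `count` requests per key within each `time_window` seconds."""

    def __init__(self, count, time_window, clock=time.monotonic):
        self.count = count
        self.time_window = time_window
        self._clock = clock      # injectable so the limiter can be tested without sleeping
        self._windows = {}       # key -> (window_start, requests_used)

    def allow(self, key):
        now = self._clock()
        start, used = self._windows.get(key, (now, 0))
        if now - start >= self.time_window:
            start, used = now, 0  # the previous window expired; open a new one
        if used >= self.count:
            self._windows[key] = (start, used)
            return False          # over quota: the caller would reject the request
        self._windows[key] = (start, used + 1)
        return True


# Two requests per minute per client address (hypothetical values).
limiter = FixedWindowLimiter(count=2, time_window=60)
decisions = [limiter.allow("203.0.113.7") for _ in range(3)]
```

In this sketch each key's window opens at its first request instead of being aligned to the wall clock, and the state lives in one process. A gateway cluster would need shared storage for the counters to enforce a global limit.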
Operational Excellence (OPS)
- Distributed Tracing: Supports Zipkin integration.
- Open Source APM: Compatibility with Apache SkyWalking.
- External Service Discovery: Beyond native etcd, it supports Consul, Consul_kv, Nacos, Eureka, and Zookeeper (CP).
- Metrics & Monitoring: Integration with Prometheus.
- Clustering: APISIX instances are designed to be stateless; configuration clustering relies on external coordination (see the etcd Clustering Guide).
- High Availability: Supports specifying multiple etcd endpoints for failover.
- Web-based Control Panel
- Configuration Rollback: Capability to revert to previous operational states.
- Command Line Interface (CLI): Tools for starting, stopping, and reloading the gateway process.
- Standalone Mode: Ability to load routing rules from local YAML files, simplifying deployment on environments like Kubernetes (k8s).
- Global Rules: Apply specific plugins (e.g., rate limiting, IP filtering) universally to all incoming traffic.
- Performance Benchmark: Achieves over 18k QPS on a single core with average latency under 0.2 milliseconds.
- Fault Injection Testing
- REST Admin Interface: Provides a RESTful API for real-time cluster management. By default, access is restricted to 127.0.0.1; the allow_admin setting in conf/config.yaml controls external access. Authentication requires API key verification.
- External Log Forwarding: Export access records to various centralized systems (HTTP Logger, TCP Logger, Kafka Logger, UDP Logger, RocketMQ Logger, SkyWalking Logger, Alibaba Cloud SLS, Google Cloud Logging, Splunk HEC Logging, File Logger, SolarWinds Loggly, TencentCloud CLS).
- ClickHouse Integration: Direct log transmission to ClickHouse.
- Elasticsearch Integration: Direct log transmission to Elasticsearch.
- Datadog Metrics: Sends custom metrics via UDP to a DogStatsD server (often bundled with the Datadog agent).
- Helm Deployment Charts
- HashiCorp Vault Integration: Secure secret retrieval from Vault. Currently supports referencing RS256 keys or secrets within the jwt-auth plugin via the APISIX Secret resource.
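The REST Admin Interface described above is driven with ordinary HTTP requests that carry an API key. The sketch below builds such a request for creating or replacing a route, without sending it. The endpoint path, the port 9180, the X-API-KEY header name, and the route fields reflect a common APISIX 3.x setup as understood here and should be treated as assumptions to verify against the Admin API reference for the version in use. The helper name, upstream address, and key placeholder are hypothetical.

```python
import json
import urllib.request


def build_route_request(admin_url, api_key, route_id, route):
    """Build (but do not send) a PUT request that creates or replaces a route."""
    return urllib.request.Request(
        url=f"{admin_url}/apisix/admin/routes/{route_id}",  # assumed endpoint layout
        data=json.dumps(route).encode("utf-8"),
        method="PUT",
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
    )


# Hypothetical route: forward /hello to one upstream node with weight 1.
route = {
    "uri": "/hello",
    "upstream": {"type": "roundrobin", "nodes": {"127.0.0.1:1980": 1}},
}
request = build_route_request("http://127.0.0.1:9180", "<your-admin-key>", "1", route)

# Against a running gateway, the request would be sent with:
# urllib.request.urlopen(request)
```

Because the Admin API listens on 127.0.0.1 by default, a script like this would normally run on the gateway host itself unless allow_admin has been widened.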
Extensibility and Scale
- Custom Plugin Hooks: Extend functionality across key processing stages: rewrite, access, header filter, body filter, log, and the balancer phase.
- Multi-Language Plugin Support: Plugins can be developed in Java, Go, or Python using external runners via RPC.
- Proxy Wasm SDK Support: Run plugins compiled to Wasm bytecode.
- Custom Load Balancing: Implement proprietary algorithms callable during the balancer stage.
- Custom Routing Logic: Users can define their own traffic steering algorithms.
Multi-Language Gateway Functionality
- Apache APISIX supports polyglot plugin development, primarily utilizing RPC for process isolation or Wasm for in-process execution.
- RPC Approach (Established Method): Developers select their preferred language, launch an isolated process exposing an RPC interface, and APISIX communicates with it locally. Current language runner support includes Java, Golang, Python, and Node.js.
- Wasm Approach (Experimental): APISIX can execute Wasm bytecode through its dedicated wasm plugin, following the specifications of the Proxy Wasm SDK. Code is written against the SDK and compiled to Wasm bytecode, which then runs inside the APISIX runtime.
Serverless Integration
- Lua Functions: Execute custom Lua code hooks at various request processing phases.
- AWS Lambda: Proxy requests to AWS Lambda functions acting as dynamic backends, with support for IAM and API Key authentication.
- Azure Functions: Seamless integration to direct traffic to Azure Serverless Functions.
- Apache OpenWhisk: Direct routing to a self-hosted or managed OpenWhisk cluster.
Deployment Guide
- Installation Procedures
Consult the comprehensive installation documentation.
- Initial Setup
The Getting Started guide provides a rapid onboarding path for grasping APISIX fundamentals.
Explore the extensive catalog of available plugins for advanced configuration.
- Runtime Control via Admin API
Apache APISIX exposes a RESTful Admin API for dynamic configuration adjustments across the cluster.
- Developing Custom Logic
Refer to the plugin development guide and examine the source code of the sample example-plugin.
Understanding the plugin lifecycle is essential for custom development.
For comprehensive reference materials, visit the Apache APISIX Documentation Portal
Performance Validation
On an AWS server instance equipped with eight cores, APISIX consistently delivers a QPS exceeding 140,000 while maintaining an average latency below 0.2 ms.
The source code for the Benchmark script is publicly available for community testing and contribution.
APISIX demonstrates equivalent performance when deployed on AWS Graviton3 C7g instances.
Success Stories
- European eFactory Platform: Establishing API Security with APISIX
- Copernicus Reference System Software Deployment
- Browse Additional Case Studies
Global Adopters
APISIX API Gateway is utilized across numerous organizations for research initiatives, critical production systems, and commercial offerings. Key users include:
- Airwallex
- Bilibili
- CVTE
- European eFactory Platform
- European Copernicus Reference System
- Geely
- HONOR
- Horizon Robotics
- iQIYI
- Lenovo
- NASA JPL
- Nayuki
- OPPO
- QingCloud
- Swisscom
- Tencent Game
- Travelsky
- vivo
- Sina Weibo
- WeCity
- WPS
- XPENG
- Zoom
Visual Identity
Inspirations
This project draws conceptual influence from Kong and Orange Gateways.
Governance and Legal
Released under the Apache 2.0 License
WIKIPEDIA CONTEXT: Cloud Computing Fundamentals (for context)
Cloud computing is defined as "a paradigm for enabling network access to a scalable and elastic pool of shareable physical or virtual resources with self-service provisioning and administration on-demand," according to the ISO standard. It is often colloquially termed "the cloud."
== Essential Characteristics (NIST Definition) ==
In 2011, the U.S. National Institute of Standards and Technology (NIST) outlined five core attributes essential for a true cloud system:
- On-demand Self-Service: A consumer must be able to procure computing capacity (like server cycles or storage) automatically as required, without requiring direct human intervention from the provider for each request.
- Broad Network Access: Services must be available across the network via standard protocols, supporting diverse client devices (mobile, desktop, tablet).
- Resource Pooling: The provider aggregates computing assets to serve multiple distinct consumers (multi-tenancy), dynamically allocating and reallocating resources based on current demand.
- Rapid Elasticity: Capabilities must scale up or down swiftly, often automatically, to match fluctuating demand. To the user, these resources should appear virtually infinite.
- Measured Service: Resource consumption (e.g., processing time, data transfer) is automatically tracked, controlled, and reported, ensuring transparency for both service provider and consumer.
By 2023, the International Organization for Standardization (ISO) had subsequently broadened and refined this foundational list.
== Historical Evolution ==
The foundational concepts of cloud computing trace back to the 1960s with the popularization of time-sharing systems, often utilizing Remote Job Entry (RJE). The dominant model then involved users submitting tasks to dedicated operators who ran them on centralized mainframes. This period was characterized by intense experimentation focused on maximizing the accessibility and efficiency of massive computing power through time-sharing techniques.
The graphical representation of services as "the cloud" originated in 1994, used by General Magic to delineate the abstract space where their mobile agents could operate within the Telescript framework. This metaphor is often attributed to David Hoffman, a General Magic specialist, who adapted it from its pre-existing use in telecommunications and networking contexts. The term "cloud computing" gained widespread recognition in 1996 when Compaq Computer Corporation drafted a business strategy outlining future internet-based computation, signaling an early ambition toward vastly scaled, distributed services.
