TSDB-IoT
A specialized database solution engineered for the rigorous demands of Industrial Internet of Things (IIoT) time-series data management, focusing on high-efficiency ingestion, storage, and analytical capabilities. It features native interoperability with distributed processing frameworks like Apache Hadoop and Spark, leveraging familiar SQL-like query interfaces.
Author: apache
IoTDB Data Repository
System Synopsis
IoTDB (Time Series Database for IoT) is a data management system for time-series data. It provides data collection, storage, and analysis. Its lightweight architecture, high throughput, and straightforward integration with the Hadoop and Spark ecosystems make it suited to industrial IoT workloads that involve massive-scale storage, high-velocity ingestion, and complex analysis.
IoTDB relies heavily on the underlying TsFile format (see its repository: TsFile Link), which is a column-oriented file structure optimized for time-series data records. The specific 'iotdb' branch within the TsFile project is utilized for deploying the SNAPSHOT versions of the IoTDB codebase.
Core Capabilities
The primary functional attributes distinguishing IoTDB include:
- Flexible Deployment Schema: Offers streamlined deployment utilities, supporting both single-click installation across cloud infrastructure and edge devices, alongside dedicated tools for inter-platform data replication.
- Hardware Cost Optimization: Achieves substantial reductions in storage footprint via industry-leading data compression ratios.
- Advanced Hierarchical Organization: Facilitates efficient structuring of complex data schemas originating from diverse networked sensors and devices, supporting powerful pattern-matching and fuzzy retrieval across vast, intricate directory structures.
- High-Velocity I/O Performance: Engineered to sustain concurrent connections from millions of low-power monitoring units, ensuring extremely fast data ingress and retrieval operations for mixed device environments.
- Comprehensive Query Semantics: Supports crucial time-series operations such as cross-device temporal alignment, advanced statistical computation within the time-series domain (e.g., frequency-domain transforms), and a rich suite of time-based aggregation functions.
- Accessibility and Ease of Use: Adopts a familiar SQL-like querying language, adheres to the JDBC standard for programmatic access, and includes built-in data utility tools for data loading and extraction.
- Ecosystem Synergy: Seamlessly interfaces with leading open-source analytical stacks (e.g., Hadoop, Spark) and visualization platforms (e.g., Grafana).
For the most current documentation and updates, consult the IoTDB official website. Should you encounter operational issues or discover software defects, please formally submit a report via the Jira Issue Tracker.
Contents Navigator
- IoTDB Data Repository
- System Synopsis
- Core Capabilities
- Contents Navigator
- Rapid Initialization Guide
- Prerequisites Check
- Deployment Steps
- Service Startup
- Server-Only Compilation
- CLI-Only Compilation
- CSV Data Transfer Utilities Usage
Rapid Initialization Guide
This concise guide outlines the initial procedure for engaging with the IoTDB system. For exhaustive documentation, refer to the User Guide.
Prerequisites Check
Essential dependencies required for operation:
- Java Runtime Environment (JRE) version 1.8 or newer (versions 1.8 and 11 through 17 are verified; make sure the Java environment variables are set correctly).
- Apache Maven version 3.6+ (Necessary only when compiling and installing from source code).
- System configuration adjustment: Increase the maximum number of open file descriptors to 65535 to mitigate potential 'too many open files' errors.
- (Optional) System tuning: Set the somaxconn kernel parameter to 65535 to prevent 'connection reset' issues under heavy load.
Linux
sudo sysctl -w net.core.somaxconn=65535
FreeBSD or Darwin
sudo sysctl -w kern.ipc.somaxconn=65535
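The two tuning items above can be checked before starting the server. The following is a minimal sketch, not part of the IoTDB distribution: the function names `nofile_ok` and `somaxconn_key` are made up for illustration, and the script only prints suggestions instead of changing any setting.

```shell
# Recommended values taken from the prerequisites above.
REQUIRED_NOFILE=65535
REQUIRED_SOMAXCONN=65535

# Succeeds when the given open-file limit satisfies the requirement.
# "unlimited" (printed by `ulimit -n` on some systems) always passes.
nofile_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$REQUIRED_NOFILE" ]
}

# Prints the sysctl key that holds somaxconn for a given `uname -s` value.
# Returns non-zero for operating systems not covered by this guide.
somaxconn_key() {
  case "$1" in
    Linux)          echo "net.core.somaxconn" ;;
    FreeBSD|Darwin) echo "kern.ipc.somaxconn" ;;
    *)              return 1 ;;
  esac
}

current=$(ulimit -n)
if nofile_ok "$current"; then
  echo "open-file limit OK: $current"
else
  echo "open-file limit is $current; try: ulimit -n $REQUIRED_NOFILE"
fi

if key=$(somaxconn_key "$(uname -s)"); then
  echo "to tune somaxconn: sudo sysctl -w $key=$REQUIRED_SOMAXCONN"
fi
```

Note that `ulimit -n` changes the soft limit of the current shell session only and cannot exceed the hard limit, so a persistent change usually needs a system-level configuration as well.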
Linux Environment Setup (Ubuntu 22.04 Base)
Git Version Control
Install Git if absent:
sudo apt install git
Java Development Kit
Install the default JDK if absent:
sudo apt install default-jdk
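Because the default JDK installed by a package manager may not fall in the verified range (1.8 and 11 through 17), it can help to check the version afterwards. This is a sketch under assumptions: `java_major` and `java_verified` are hypothetical helper names, and the parsing expects the usual quoted version string in the output of `java -version`.

```shell
# Extracts the major Java version from a version string such as
# "1.8.0_292" (legacy scheme: the major is the second field) or "17.0.2".
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;
    *)   echo "${1%%.*}" ;;
  esac
}

# Succeeds when the major version is one the prerequisites list as verified.
java_verified() {
  [ "$1" -eq 8 ] || { [ "$1" -ge 11 ] && [ "$1" -le 17 ]; }
}

# Only inspect the local installation if a java command is present.
if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; the version is the first quoted field.
  raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$raw")
  if [ -n "$major" ] && java_verified "$major" 2>/dev/null; then
    echo "Java $major is in the verified range"
  else
    echo "Java version '$raw' is outside the verified range (1.8, 11-17)"
  fi
fi
```

A version outside the verified range may still work; the check only reflects what the prerequisites state has been tested.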
Compilation Dependencies
Required tools for source compilation:
sudo apt install flex
sudo apt install bison
sudo apt install libboost-all-dev
SSL Development Headers
Ensure OpenSSL development libraries are present:
sudo apt install libssl-dev
Mac OS Environment Setup
Git Initialization
Executing git usually prompts the installation of the necessary Mac developer tools upon first use.
Homebrew Package Manager
Install Homebrew if it is not already present, as it will manage subsequent installations:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Java Installation
Install Java via Homebrew:
brew install java
Link the Java installation based on processor architecture:
Intel Macs:
sudo ln -sfn /usr/local/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk
ARM (Apple Silicon) Macs:
sudo ln -sfn /opt/homebrew/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk
C++ Client Prerequisites (If compiling with C++ support)
Required for building the Thrift C++ interface:
brew install boost
brew install bison
brew install openssl
Windows Environment Setup
Chocolatey Package Manager
Install Chocolatey if not yet installed, as it will be used for dependency management:
https://chocolatey.org/install
Core Tools Installation via Choco
choco install git.install
choco install openjdk
choco install visualstudio2022community
choco install visualstudio2022buildtools
choco install visualstudio2022-workload-nativedesktop
choco install winflexbison
choco install boost-msvc-14.2
choco install openssl
Deployment Steps
IoTDB supports three primary installation paths: direct compilation from source (for code modification), downloading pre-built binary releases (recommended for immediate use), or containerization via Docker (source for Dockerfiles: link).
This guide focuses on the source code compilation method.
Building from Source Code
Prepare Thrift Compiler
This step is generally skipped on Windows platforms.
Because our Remote Procedure Call (RPC) module relies on Apache Thrift for communication protocols, the Thrift compiler (version 0.13.0 or newer) must be present for generating necessary Java interface code during compilation. While Windows users can use official binaries, Unix-like systems often require manual compilation or alternative fetching methods.
If system package managers are available, use apt, yum, or brew to install Thrift. If manual compilation is necessary, the Boost library must be installed first. To simplify this for common Unix environments (gcc8+ on Ubuntu/MacOS/CentOS), a pre-compiled Unix binary is hosted on GitHub and automatically retrieved by a Maven plugin during the build process.
If automated downloading fails due to network restrictions, you must manually acquire the compiler and either:
1. Rename the binary to {project_root}\thrift\target\tools\thrift_0.12.0_0.13.0_linux.exe.
2. Or, explicitly pass Maven arguments: -Dthrift.download-url=http://apache.org/licenses/LICENSE-2.0.txt -Dthrift.exec.absolute.path=<YOUR LOCAL THRIFT BINARY FILE>.
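For the second option, the two properties are simply appended to the regular build command. The sketch below only prints the resulting command line so it can be reviewed first. The Thrift path `/opt/thrift/bin/thrift` and the `THRIFT_BIN` variable are hypothetical placeholders, the property values are copied verbatim from the option above, and the remaining Maven flags are the distribution build flags used later in this guide.

```shell
# Hypothetical location of a manually downloaded Thrift compiler;
# replace it with the real path on your machine.
THRIFT_BIN="${THRIFT_BIN:-/opt/thrift/bin/thrift}"

# Prints the Maven command line for a build that uses a local Thrift binary.
# Nothing is executed here.
thrift_build_cmd() {
  echo "mvn clean package -pl distribution -am -DskipTests" \
       "-Dthrift.download-url=http://apache.org/licenses/LICENSE-2.0.txt" \
       "-Dthrift.exec.absolute.path=$1"
}

thrift_build_cmd "$THRIFT_BIN"
```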
Source Code Acquisition and Version Checkout
Clone the repository:
git clone https://github.com/apache/iotdb.git
To target a specific stable release (e.g., x.x.x), check out the corresponding tag:
git checkout vx.x.x
Alternatively, target a major release branch:
git checkout rel/x.x
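The naming scheme above (tags of the form vx.x.x, branches of the form rel/x.x) can be derived from a release number with plain parameter expansion. In this sketch the helper names are invented and the version 1.2.3 is only a placeholder; run `git tag` in the clone to see which releases actually exist.

```shell
# Maps a three-part release number to the ref names described above.
release_tag()    { echo "v$1"; }            # 1.2.3 -> v1.2.3
release_branch() { echo "rel/${1%.*}"; }    # 1.2.3 -> rel/1.2

VERSION="1.2.3"   # placeholder, not a recommendation
echo "tag:    $(release_tag "$VERSION")"
echo "branch: $(release_branch "$VERSION")"

# Inside the cloned repository one would then run, for example:
#   git checkout "$(release_tag "$VERSION")"
```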
Compile IoTDB Distribution Package
Execute the build command from the root directory of the cloned repository:
mvn clean package -pl distribution -am -DskipTests
The final deployable artifact package will reside in: distribution/target.
Only build cli
Navigate to the client module directory:
mvn clean package -pl cli -am -DskipTests
The resulting CLI executable artifact will be in: cli/target.
Build Others
To compile the C++ client component, use the profile flag:
mvn clean package -P with-cpp
(Refer to client-cpp's local Readme for further specific details.)
IDE Configuration Note: After a successful Maven build, the following generated-source directories must be registered as source roots in your IDE to avoid compilation errors:
- thrift/target/generated-sources/thrift
- thrift-sync/target/generated-sources/thrift
- thrift-cluster/target/generated-sources/thrift
- thrift-influxdb/target/generated-sources/thrift
- antlr/target/generated-sources/antlr4
In IntelliJ IDEA, right-clicking the root project and selecting "Maven->Reload Project" after running mvn package usually registers them for you.
Parameter Adjustments
Operational settings are governed by files located in the "conf" directory:
- Startup Environment Definitions: datanode-env.bat, datanode-env.sh
- Core System Properties: iotdb-system.properties
- Logging Configuration: logback.xml
Detailed configuration directives are available in the Configuration Manual.
Service Startup
Follow the steps below to verify the installation. If every step completes without errors, the installation succeeded.
Activating IoTDB Instance
Launch a standalone instance (one ConfigNode and one DataNode) using the startup script in the sbin directory:
Unix/OS X
sbin/start-standalone.sh
Windows
sbin\start-standalone.bat
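The startup script returns before the server is necessarily ready to accept clients. One way to check is to probe the client port, sketched below under assumptions: 6667 is the default port used by the CLI examples in this guide, `port_open` is a hypothetical helper, and the probe relies on bash's /dev/tcp pseudo-device, so it reports failure in shells that lack it.

```shell
# Succeeds if a TCP connection to host $1, port $2 can be opened.
# The subshell keeps the temporary file descriptor from leaking and
# prevents a failed redirection from terminating the calling shell.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open 127.0.0.1 6667; then
  echo "a server is listening on 127.0.0.1:6667"
else
  echo "nothing is listening on 127.0.0.1:6667 yet"
fi
```

An open port only shows that something accepts connections there; logging in with the CLI remains the actual verification.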
Interacting with IoTDB
Using the Command Line Interface (CLI)
IoTDB offers multiple interaction vectors; this section details basic data insertion and retrieval via the CLI. Upon installation, the default administrative credentials are username 'root' and password 'root'. The CLI launcher script is named start-cli in the sbin folder. When invoking it, you must specify connection details (IP, Port, Username, Password). Default connection arguments are -h 127.0.0.1 -p 6667 -u root -pw root.
CLI invocation examples:
Unix/OS X
sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root
Windows
sbin\start-cli.bat -h 127.0.0.1 -p 6667 -u root -pw root
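Since the four connection arguments rarely change on a development machine, a small wrapper can supply the defaults and allow overrides. This is an illustrative sketch: the `IOTDB_*` environment variable names are invented here and are not read by IoTDB itself.

```shell
# Builds the CLI argument string from optional environment variables,
# falling back to the defaults listed above.
cli_args() {
  echo "-h ${IOTDB_HOST:-127.0.0.1} -p ${IOTDB_PORT:-6667}" \
       "-u ${IOTDB_USER:-root} -pw ${IOTDB_PASSWORD:-root}"
}

# Print the full command instead of running it, so it can be inspected.
echo "sbin/start-cli.sh $(cli_args)"
```

Setting, for example, `IOTDB_HOST=192.168.1.10` before calling `cli_args` changes only the host and leaves the other defaults in place.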
The CLI operates interactively. A successful launch will display the welcome banner and confirmation message:
 _____       _________  ______   ______
|_   _|     |  _   _  ||_   _ `.|_   _ \
  | |   .--.|_/ | | \_|  | | `. \ | |_) |
  | | / .'`\ \  | |      | | | | |  __'.
 _| |_| \__. | _| |_    _| |_.' /_| |__) |
|_____|'.__.' |_____|  |______.'|_______/  version x.x.x

IoTDB> login successfully
IoTDB>
Fundamental IoTDB Operations
Data within IoTDB is organized hierarchically into time-series paths, each belonging to a specific database. A time-series definition (schema) requires specifying its data type and the chosen storage encoding scheme. First, define a namespace using CREATE DATABASE, for example:
IoTDB> CREATE DATABASE root.ln
Verify creation with SHOW DATABASES:
IoTDB> SHOW DATABASES
+-------------+
|     Database|
+-------------+
|      root.ln|
+-------------+
Total line number = 1
Next, define specific measurement series within that database, specifying type and encoding. Example definition for a boolean status and a floating-point temperature sensor:
IoTDB> CREATE TIMESERIES root.ln.wf01.wt01.status WITH DATATYPE=BOOLEAN, ENCODING=PLAIN
IoTDB> CREATE TIMESERIES root.ln.wf01.wt01.temperature WITH DATATYPE=FLOAT, ENCODING=RLE
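When many series have to be defined, the statements can be generated from a plain list instead of typed one by one. The sketch below prints statements in the same form as the two shown above; the helper name `create_timeseries_sql` and the three-column list layout (path, data type, encoding) are assumptions made for this example.

```shell
# Prints one CREATE TIMESERIES statement.
# $1 = full series path, $2 = data type, $3 = encoding
create_timeseries_sql() {
  echo "CREATE TIMESERIES $1 WITH DATATYPE=$2, ENCODING=$3"
}

# One series per line: path, data type, encoding.
while read -r path dtype enc; do
  create_timeseries_sql "$path" "$dtype" "$enc"
done <<'EOF'
root.ln.wf01.wt01.status BOOLEAN PLAIN
root.ln.wf01.wt01.temperature FLOAT RLE
EOF
```

The printed statements can be pasted into the interactive CLI.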
To inspect the defined schemas, use SHOW TIMESERIES <Path>. By default, omitting the path shows all series in the system (equivalent to SHOW TIMESERIES root).
- Viewing all series in the system:
IoTDB> SHOW TIMESERIES
+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+
|                   Timeseries|Alias|     Database|DataType|Encoding|Compression|Tags|Attributes|
+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+
|root.ln.wf01.wt01.temperature| null|      root.ln|   FLOAT|     RLE|     SNAPPY|null|      null|
|     root.ln.wf01.wt01.status| null|      root.ln| BOOLEAN|   PLAIN|     SNAPPY|null|      null|
+-----------------------------+-----+-------------+--------+--------+-----------+----+----------+
Total line number = 2
- Querying a specific path (root.ln.wf01.wt01.status):
IoTDB> SHOW TIMESERIES root.ln.wf01.wt01.status
+------------------------+-----+-------------+--------+--------+-----------+----+----------+
|              timeseries|alias|     database|dataType|encoding|compression|tags|attributes|
+------------------------+-----+-------------+--------+--------+-----------+----+----------+
|root.ln.wf01.wt01.status| null|      root.ln| BOOLEAN|   PLAIN|     SNAPPY|null|      null|
+------------------------+-----+-------------+--------+--------+-----------+----+----------+
Total line number = 1
Data insertion uses the INSERT INTO command, requiring a timestamp and the corresponding measurement value(s):
IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status) values(100,true);
IoTDB> INSERT INTO root.ln.wf01.wt01(timestamp,status,temperature) values(200,false,20.71)
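The INSERT statements address the device path (root.ln.wf01.wt01) and name the measurement separately, whereas CREATE TIMESERIES uses the full series path. As the statements above suggest, the measurement is the last node of the path and the device is everything before it. A minimal sketch of that split, using invented helper names and handling a single measurement only:

```shell
# Device path: the full series path without its last node.
device_of()      { echo "${1%.*}"; }
# Measurement name: the last node of the full series path.
measurement_of() { echo "${1##*.}"; }

# Prints a single-measurement INSERT statement.
# $1 = full series path, $2 = timestamp, $3 = value
insert_sql() {
  echo "INSERT INTO $(device_of "$1")(timestamp,$(measurement_of "$1")) values($2,$3)"
}

insert_sql root.ln.wf01.wt01.status 100 true
# prints: INSERT INTO root.ln.wf01.wt01(timestamp,status) values(100,true)
```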
Querying the inserted status data yields:
IoTDB> SELECT status FROM root.ln.wf01.wt01
+------------------------+------------------------+
|                    Time|root.ln.wf01.wt01.status|
+------------------------+------------------------+
|1970-01-01T00:00:00.100Z|                    true|
|1970-01-01T00:00:00.200Z|                   false|
+------------------------+------------------------+
Total line number = 2
You can query multiple series using SELECT *:
IoTDB> SELECT * FROM root.ln.wf01.wt01
+------------------------+-----------------------------+------------------------+
|                    Time|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status|
+------------------------+-----------------------------+------------------------+
|1970-01-01T00:00:00.100Z|                         null|                    true|
|1970-01-01T00:00:00.200Z|                        20.71|                   false|
+------------------------+-----------------------------+------------------------+
Total line number = 2
To modify the display time zone in the CLI, use the SET time_zone command:
IoTDB> SET time_zone=+08:00
Time zone has set to +08:00
IoTDB> SHOW time_zone
Current time zone: Asia/Shanghai
Subsequent queries will reflect the new time zone offset:
IoTDB> SELECT * FROM root.ln.wf01.wt01
+-----------------------------+-----------------------------+------------------------+
|                         Time|root.ln.wf01.wt01.temperature|root.ln.wf01.wt01.status|
+-----------------------------+-----------------------------+------------------------+
|1970-01-01T08:00:00.100+08:00|                         null|                    true|
|1970-01-01T08:00:00.200+08:00|                        20.71|                   false|
+-----------------------------+-----------------------------+------------------------+
Total line number = 2
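The stored timestamps (100 and 200) are milliseconds since the Unix epoch and do not change when the time zone is switched; only their rendering does. The arithmetic behind the displayed clock time can be sketched as follows. The helper `local_time_of_day` is invented for this illustration, handles whole-hour offsets only, expects a non-negative result after the offset is applied, and ignores the date part.

```shell
# Prints the local time of day (HH:MM:SS.mmm) for an epoch timestamp.
# $1 = epoch milliseconds, $2 = UTC offset in whole hours
local_time_of_day() {
  total=$(( $1 + $2 * 3600 * 1000 ))   # shift by the offset, in ms
  ms=$(( total % 1000 ))
  s=$(( total / 1000 ))
  printf '%02d:%02d:%02d.%03d\n' \
    $(( s / 3600 % 24 )) $(( s / 60 % 60 )) $(( s % 60 )) "$ms"
}

local_time_of_day 100 0   # UTC:    00:00:00.100
local_time_of_day 100 8   # +08:00: 08:00:00.100
local_time_of_day 200 8   # +08:00: 08:00:00.200
```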
To exit the interactive CLI session, use:
IoTDB> quit
or
IoTDB> exit
Comprehensive details regarding all supported SQL commands are accessible in the User Guide.
Terminating IoTDB Instance
The running server process can be halted gracefully using the dedicated script or via standard interruption signals:
Unix/OS X
sbin/stop-standalone.sh
Windows
sbin\stop-standalone.bat
Using the Data Import and Export Tools
Refer to the specific documentation for data utilities:
- Data Ingestion: Data Import Tool Documentation
- Data Extraction: Data Export Tool Documentation
Frequent Questions for Compiling
For troubleshooting compilation issues, consult the dedicated guide: Compiling Source Code FAQs
Contact Us
QQ User Group
- Apache IoTDB General Support Channel ID: 659990460
Wechat Community
- Add contact ID apache_iotdb to receive an invitation to the private group.
Real-time Chat (Slack)
For broader community engagement instructions, see How to Join the Community.
WIKIPEDIA ENTRY CONTEXT: Cloud computing represents a service delivery model for on-demand access to a shared pool of configurable computing resources (virtual or physical) that can be rapidly provisioned and managed with minimal administrative overhead, as defined by ISO standards. Colloquially, this is known as "the cloud."
== Essential Characteristics (NIST 2011 Definition) ==
NIST established five indispensable traits for authentic cloud systems:
- On-demand self-service: Consumers possess the autonomy to provision computing capacity (like server time or storage) automatically, without requiring direct intervention from the service provider staff for each request.
- Broad network accessibility: Services must be accessible via standard network protocols, supporting a diverse array of client devices (mobile, desktop, tablet).
- Resource pooling: Provider infrastructure resources are aggregated to serve multiple tenants concurrently, dynamically allocating and reallocating physical/virtual assets based on fluctuating demand.
- Rapid elasticity: Capabilities can be scaled up or down quickly, sometimes autonomously, to match demand fluctuations. From the consumer's perspective, available capacity often appears infinite.
- Measured service: Resource consumption (storage, processing cycles, bandwidth) is automatically tracked, controlled, and reported, ensuring transparency for both the consumer and the provider regarding utilized capacity.
ISO has since refined and expanded these characteristics as of 2023.
== Historical Context ==
The conceptual roots of modern cloud computing trace back to the 1960s, specifically with the rise of time-sharing systems facilitating remote job execution (RJE). In that era, mainframe operators managed job queues submitted by users. This period was characterized by intensive research into optimizing infrastructure, platforms, and applications to maximize computational access efficiency for a broader user base.
The visual representation of 'the cloud' for networked services emerged in 1994, used by General Magic to denote the accessible environment for their Telescript mobile agents. David Hoffman, a specialist at General Magic, is credited with adopting this metaphor based on its established use in telecommunications. The term "cloud computing" gained wider traction in 1996 following a Compaq Computer Corporation business strategy document outlining future internet computing models. The firm's objectives involved significantly...
