Explore essential Prometheus queries for Node Exporter metrics, covering system load, CPU, memory, disk I/O, network throughput, and uptime for effective server monitoring.
Prometheus Node Metrics
Prometheus Node Metrics Examples
This page provides Prometheus query examples for Node Exporter, covering the system-level metrics most often used for server monitoring and performance tuning. Replace the placeholder instance and job label values in each query with your own.
System Load Monitoring
System load helps identify performance bottlenecks. The following query expresses the 1-minute load average as a percentage of the number of CPU cores, so a sustained value above 100 means more tasks are runnable or waiting than there are cores to serve them.
avg(node_load1{instance="my-instance-name",job="node-exporter"}) / count(count(node_cpu_seconds_total{instance="my-instance-name",job="node-exporter"}) by (cpu)) * 100
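The arithmetic behind the query above can be checked by hand. The following is a minimal sketch with hypothetical sample values (the function name and the numbers are illustrative, not part of Node Exporter or Prometheus):

```python
def normalized_load_percent(load1: float, cpu_count: int) -> float:
    """Mirror the query's arithmetic: load1 / number of CPUs * 100."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load1 / cpu_count * 100


# Hypothetical sample: a 1-minute load average of 3.0 on a 4-core host.
print(normalized_load_percent(3.0, 4))  # 75.0
# The same load on a 2-core host exceeds 100%.
print(normalized_load_percent(3.0, 2))  # 150.0
```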
CPU Utilization Analysis
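Monitor CPU usage to ensure your systems are not overloaded. This query reports the percentage of non-idle CPU time per instance. Because it uses irate, the result reflects only the two most recent samples within the 5-minute lookback window; use rate instead if you want an average over the full 5 minutes.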
100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle", instance="my-instance-name"}[5m])) * 100)
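The query above uses irate, which looks only at the last two samples in the range, whereas rate spans the whole window. The sketch below illustrates that difference on hypothetical idle-counter samples. It is a simplification: the helper names are made up, and Prometheus additionally handles counter resets and extrapolates to the window boundaries, which is not modeled here.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase between the first and last sample in the window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def simple_irate(samples: List[Sample]) -> float:
    """Per-second increase between the last two samples only."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


def cpu_busy_percent(idle_rate: float) -> float:
    """Same arithmetic as the query: 100 - idle fraction * 100."""
    return 100 - idle_rate * 100


# Hypothetical idle-seconds counter for one CPU, scraped every 60 s.
# The CPU is mostly idle until the last interval, when it becomes busy.
idle = [(0, 1000.0), (60, 1055.0), (120, 1110.0), (180, 1165.0), (240, 1180.0)]

print(cpu_busy_percent(simple_rate(idle)))   # 25.0 (averaged over the window)
print(cpu_busy_percent(simple_irate(idle)))  # 75.0 (last interval only)
```

The irate-style value reacts faster to spikes, while the rate-style value is smoother and usually a better fit for alerting.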
Memory Usage Insights
Effective memory management is key to system stability. The queries below report the share of memory still available and the rate of major page faults, which rises when the kernel has to read pages back from disk.
Memory Available in Percentage:
node_memory_MemAvailable_bytes{instance="my-instance-name"} / node_memory_MemTotal_bytes{instance="my-instance-name"} * 100
Memory Pressure (Major Page Faults):
rate(node_vmstat_pgmajfault{instance="my-instance-name"}[1m])
Disk Space and Performance
Keep an eye on disk capacity and performance to prevent I/O issues and storage exhaustion.
Disk Space Available in Bytes:
node_filesystem_avail_bytes{instance="my-instance-name",job="node-exporter",mountpoint="/"}
Disk Space Available in Percentage:
(node_filesystem_avail_bytes{mountpoint="/", instance="my-instance-name"} * 100) / node_filesystem_size_bytes{mountpoint="/", instance="my-instance-name"}
Average Disk Latency in Seconds per Operation (Read, then Write):
rate(node_disk_read_time_seconds_total{instance="my-instance-name"}[1m]) / rate(node_disk_reads_completed_total{instance="my-instance-name"}[1m])
rate(node_disk_write_time_seconds_total{instance="my-instance-name"}[1m]) / rate(node_disk_writes_completed_total{instance="my-instance-name"}[1m])
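Each latency query divides the time spent on I/O by the number of operations completed in the same window, which yields the average seconds per operation. A minimal sketch of that arithmetic, using hypothetical counter deltas and a made-up helper name:

```python
import math


def avg_latency_seconds(io_time_delta_s: float, ops_delta: float) -> float:
    """Seconds spent on I/O divided by operations completed in the window."""
    if ops_delta == 0:
        # With no I/O both rates are 0, and 0 / 0 evaluates to NaN in PromQL.
        return math.nan
    return io_time_delta_s / ops_delta


# Hypothetical window: 0.5 s spent completing 250 reads.
print(avg_latency_seconds(0.5, 250))  # 0.002 (2 ms per read)

# An idle disk produces NaN rather than 0.
print(math.isnan(avg_latency_seconds(0.0, 0)))  # True
```

Because idle disks produce NaN, a graph of these queries may show gaps during periods without I/O.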
Network Throughput Monitoring
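Track network traffic to understand bandwidth usage and identify potential network congestion. The queries below report receive and transmit throughput in bits per second; the byte counters are multiplied by 8 to convert bytes to bits.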
irate(node_network_receive_bytes_total{instance="my-instance-name"}[5m]) * 8
irate(node_network_transmit_bytes_total{instance="my-instance-name"}[5m]) * 8
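The multiplication by 8 in the queries above converts a per-second byte rate into bits per second. The following sketch works through that conversion with hypothetical numbers (the function name is illustrative only):

```python
def bits_per_second(bytes_delta: float, seconds: float) -> float:
    """Convert a byte-counter increase over an interval to bits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_delta / seconds * 8


# Hypothetical sample: the receive counter grew by 1,500,000 bytes in 15 s.
bps = bits_per_second(1_500_000, 15)
print(bps)  # 800000.0
```

Dividing the result by 1,000,000 gives megabits per second, which is the unit most link speeds are quoted in.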
Node Uptime Calculation
Calculate how long the node has been running, in seconds, by subtracting its boot timestamp from its current time.
node_time_seconds{instance="my-instance-name",job="node-exporter"} - node_boot_time_seconds{instance="my-instance-name",job="node-exporter"}
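The subtraction above returns uptime as a raw number of seconds. A small sketch of turning that value into a readable duration, using hypothetical timestamps and a made-up helper name:

```python
def format_uptime(seconds: float) -> str:
    """Render a duration in seconds as days, hours, minutes, and seconds."""
    total = int(seconds)
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


# Hypothetical values for node_time_seconds and node_boot_time_seconds.
node_time = 1_700_090_061
boot_time = 1_700_000_000

print(format_uptime(node_time - boot_time))  # 1d 1h 1m 1s
```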
These Prometheus queries, when used with Node Exporter, provide a robust foundation for monitoring your server infrastructure. For more advanced visualizations and alerting, consider integrating these metrics into Grafana dashboards.