Loki Rules
Prometheus Alerting Rules for Loki
This section details example Prometheus alerting rules configured for Loki. These rules help in monitoring and alerting on specific log patterns or conditions within your logging infrastructure. Effective alerting is crucial for maintaining the health and performance of your systems.
HighThroughputLogStreams Alert
The following rule, HighThroughputLogStreams, is designed to detect a high volume of log streams within a specified time frame. This can indicate an issue such as excessive logging, a potential denial-of-service attack, or a misbehaving application.
groups:
  - name: example
    rules:
      - alert: HighThroughputLogStreams
        expr: sum by (container_name) (count_over_time({container_name=~".*"} |regexp`(?P.*)` [1h])>0)
        for: 20s
        labels:
          severity: "2"
        annotations:
          description: '{{ $labels.instance }} {{ $labels.msg }} memory.'
Understanding the Alert Expression
The expr field defines the condition that triggers the alert. Here, count_over_time counts the log entries in each matching stream over the last hour, the > 0 comparison keeps only streams with at least one entry, and sum by (container_name) adds those counts up per container_name. If the expression keeps returning a result for 20 seconds (for: 20s), the alert fires. The regexp part is a placeholder and should be replaced with a specific pattern relevant to your logs.
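As a rough illustration of replacing the placeholder, the sketch below extracts a label with a named capture group and then filters on it. It assumes log lines contain a field such as level=error; the alert name, the level label, the 5-minute window, and the threshold of 100 are all hypothetical and would need to be adapted to your logs.

```yaml
groups:
  - name: example
    rules:
      # Hypothetical rule: name, pattern, window, and threshold are assumptions.
      - alert: HighErrorLogCount
        # The named group (?P<level>...) becomes a label called "level",
        # which the following label filter then matches against.
        expr: |
          sum by (container_name) (
            count_over_time(
              {container_name=~".+"}
                | regexp `level=(?P<level>\w+)`
                | level = "error"
              [5m]
            )
          ) > 100
        for: 5m
```

Placing the comparison outside the sum means the threshold applies to the per-container total rather than to each individual stream.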
Alert Labels and Annotations
labels provide metadata for the alert, such as its severity. annotations offer more detailed information, like a descriptive message that can include dynamic labels from the log data, helping operators quickly understand the context of the alert.
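The fragment below sketches how labels and annotations might look for a rule that aggregates by container_name. The alert name, severity value, and message wording are hypothetical. Note that templates can only reference labels that survive the aggregation, so after sum by (container_name) only container_name is available through $labels.

```yaml
# Fragment of a single rule; names and wording are hypothetical.
- alert: HighErrorLogVolume
  expr: |
    sum by (container_name) (count_over_time({container_name=~".+"} |= "error" [5m])) > 100
  for: 5m
  labels:
    # Static metadata attached to the alert, used for routing and prioritization.
    severity: warning
  annotations:
    # $labels holds the labels of the result series, $value the evaluated result.
    summary: 'High error log volume in {{ $labels.container_name }}'
    description: '{{ $labels.container_name }} logged {{ $value }} error lines in the last 5 minutes.'
```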
Best Practices for Loki Alerting
When defining Loki alerting rules, consider the following:
- Specificity: Make your log pattern matching (regexp) as specific as possible to avoid false positives.
- Thresholds: Carefully tune the thresholds (e.g., the time window and count) to match your system's normal behavior.
- Context: Ensure your annotations provide enough context for quick diagnosis.
- Severity: Assign appropriate severity levels to prioritize responses.
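One possible way to make a threshold track normal behavior is to alert on a ratio instead of an absolute count. The sketch below compares the rate of lines containing "error" with the rate of all lines per container. The alert name, the "error" filter, the 5% threshold, and the 10-minute duration are assumptions chosen for illustration only.

```yaml
# Fragment of a single rule; all names and values are hypothetical.
- alert: HighErrorLogRatio
  # Share of log lines containing "error", per container, over 5 minutes.
  expr: |
    sum by (container_name) (rate({container_name=~".+"} |= "error" [5m]))
      /
    sum by (container_name) (rate({container_name=~".+"}[5m]))
      > 0.05
  for: 10m
  labels:
    severity: critical
  annotations:
    description: 'More than 5% of log lines from {{ $labels.container_name }} contain "error".'
```

A ratio scales with traffic, so a busy container does not trip the alert simply because it logs more overall.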
For more advanced configurations and best practices regarding Prometheus alerting rules, refer to the Prometheus Alerting Rules documentation and Loki LogQL documentation.