Prometheus + PromQL — observability metrics that matter

Prometheus collects time-series metrics, and PromQL queries them. The query patterns below cover most day-to-day operational questions.

## Basic queries

```promql
http_requests_total
http_requests_total{job="api"}                      # filter by label
http_requests_total{status=~"5.."}                  # regex (5xx)
```
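A point that is easy to miss with `=~`: regex matchers are fully anchored, so `5..` matches `503` but not `5000`. The sketch below mimics selector matching in Python to make that visible. The `matches` helper and the sample series are hypothetical, and Python's `re` stands in for the RE2 syntax Prometheus uses, which agrees for simple patterns like this one.

```python
import re

def matches(labels, matchers):
    """Return True if a series' labels satisfy every (name, op, value) matcher."""
    for name, op, value in matchers:
        actual = labels.get(name, "")  # a missing label behaves like an empty string
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "=~":
            ok = re.fullmatch(value, actual) is not None  # anchored at both ends
        elif op == "!~":
            ok = re.fullmatch(value, actual) is None
        else:
            raise ValueError(f"unknown matcher operator: {op}")
        if not ok:
            return False
    return True

# Hypothetical series labels, for illustration only.
series = [
    {"job": "api", "status": "200"},
    {"job": "api", "status": "503"},
    {"job": "web", "status": "500"},
]
selected = [s for s in series if matches(s, [("status", "=~", "5..")])]
print(selected)
```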

## Rate (most common pattern)

```promql
rate(http_requests_total[5m])                       # rps over 5m
sum by (job) (rate(http_requests_total[5m]))        # rps by service
sum(rate(http_requests_total{status=~"5.."}[5m]))
  / sum(rate(http_requests_total[5m]))              # 5xx error ratio (0 to 1)
```
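To see roughly what `rate()` does with a counter, here is a simplified Python sketch. It sums the increases between consecutive samples, treats any drop as a counter reset, and divides by the elapsed time. The real function also extrapolates to the edges of the range window, which this sketch leaves out. `simple_rate` and the sample data are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase over (timestamp, value) samples, handling counter resets."""
    if len(samples) < 2:
        return None  # at least two samples are needed to compute a rate
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted; count the new value as growth from zero.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# One scrape per minute; the counter resets between t=120 and t=180.
samples = [(0, 100), (60, 160), (120, 220), (180, 20), (240, 80)]
print(simple_rate(samples))
```

Because resets are absorbed this way, `rate()` belongs on counters only; applying it to a gauge would misread every decrease as a restart.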

## Aggregation

```promql
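sum by (instance) (node_memory_MemAvailable_bytes)
sum by (job) (rate(http_request_duration_seconds_sum[5m]))
  / sum by (job) (rate(http_request_duration_seconds_count[5m]))  # mean latency per job
quantile(0.99, rate(http_requests_total[5m]))       # p99 across series, not a latency percentile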
```
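Aggregation with `by` groups series on the listed labels, drops every other label, and combines the values in each group. A minimal Python sketch of `sum by (job)`, using a hypothetical `sum_by` helper and made-up values:

```python
from collections import defaultdict

def sum_by(series, keep):
    """Group series by the labels in `keep` and sum the values in each group."""
    groups = defaultdict(float)
    for labels, value in series:
        # Only the kept labels form the group key; all others are dropped.
        key = tuple((name, labels.get(name, "")) for name in keep)
        groups[key] += value
    return dict(groups)

# Hypothetical per-instance request rates.
series = [
    ({"job": "api", "instance": "a:9090"}, 12.0),
    ({"job": "api", "instance": "b:9090"}, 8.0),
    ({"job": "web", "instance": "c:9090"}, 5.0),
]
totals = sum_by(series, ["job"])
print(totals)
```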

## Histogram percentiles

```promql
histogram_quantile(0.95,
  sum by (le) (rate(http_request_duration_seconds_bucket[5m])))  # p95 latency
```
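The result of `histogram_quantile` is an estimate, not an exact percentile: it finds the bucket containing the target rank and interpolates linearly inside it. The Python sketch below follows that idea for classic cumulative buckets. It assumes at least one finite bucket plus `+Inf` and a quantile strictly between 0 and 1, and the bucket counts are invented for illustration.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets."""
    if not 0 < q < 1:
        raise ValueError("this sketch only handles 0 < q < 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket counts every observation
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return buckets[-2][0]  # clamp to the highest finite bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count

# Hypothetical cumulative counts for le = 0.1, 0.5, 1.0, +Inf.
buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
p95 = histogram_quantile(0.95, buckets)
print(p95)
```

Accuracy depends on bucket boundaries: a quantile that lands in a wide bucket can be off by up to that bucket's width.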

## Alerting expressions

```promql
1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8  # CPU over 80% busy
increase(kube_pod_container_status_restarts_total[1h]) > 0                   # container restarted in the last hour
predict_linear(node_filesystem_avail_bytes[1h], 4 * 3600) < 0                # disk predicted to fill within 4h
```
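`predict_linear` fits a straight line through the samples in the range and extrapolates it forward. The Python sketch below does the same with ordinary least squares. One simplification to note: it extrapolates from the last sample's timestamp, whereas Prometheus extrapolates from the query evaluation time. The function body and the disk-usage figures are illustrative assumptions.

```python
def predict_linear(samples, seconds_ahead):
    """Fit value = slope * t + intercept by least squares and extrapolate."""
    n = len(samples)
    if n < 2:
        return None  # a line needs at least two points
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope * (samples[-1][0] + seconds_ahead) + intercept

# Hypothetical free space in GB, sampled every 10 minutes for an hour,
# shrinking by 10 GB per sample.
samples = [(t, 100 - t / 60) for t in range(0, 3601, 600)]
predicted = predict_linear(samples, 4 * 3600)
print(predicted)  # negative means the disk is predicted to be full within 4 hours
```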

## Tip

If PromQL feels foreign, give an AI assistant the metric name and your question in plain English, and ask for the PromQL. Then ask it to explain what the query computes.