Alertmanager - Configure Prometheus Alerting

Configure Prometheus Alertmanager to manage and route alerts effectively. This guide provides a sample Alertmanager configuration that routes alerts by severity and environment and sends notifications to Slack and Pushover.

Alertmanager

Alertmanager is the component of the Prometheus monitoring system that receives alerts from Prometheus and routes them to notification channels. The sample configuration in this document routes alerts by severity and environment and sends notifications to Slack and Pushover.

Configuration Details

The configuration has three top-level sections: global settings, routing rules, and receivers. Each section is described here, followed by the full configuration file.

Global Settings

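The global section defines settings shared by the whole configuration. resolve_timeout (5m here) is how long Alertmanager waits without an update before declaring an alert resolved when the alert carries no end time of its own; alerts sent by Prometheus always carry an end time, so this mainly affects other clients. slack_api_url is the Slack incoming-webhook URL used by every Slack receiver that does not set its own api_url.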
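
As a hedged variation on the global section, the webhook URL can be kept out of the configuration file. The sketch below assumes an Alertmanager release that supports slack_api_url_file; the file path is a hypothetical placeholder and is not part of the original setup.

```yaml
global:
  # Alerts without an end time are declared resolved after 5 minutes without updates.
  resolve_timeout: 5m
  # Assumption: read the Slack webhook URL from a file instead of inlining it.
  # The path is hypothetical. Set either slack_api_url or slack_api_url_file, not both.
  slack_api_url_file: '/etc/alertmanager/secrets/slack_api_url'
```

Keeping the URL in a separate file lets the main configuration be committed to version control without exposing the webhook secret.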

Routing Rules

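The route section defines how alerts are grouped and routed. Alerts are grouped by alertname, cluster, job, environment, and severity, so alerts that share all five label values are batched into a single notification. group_interval (5m) is how long Alertmanager waits before sending a notification about new alerts added to a group that has already been notified. repeat_interval (24h) is how long it waits before re-sending a notification for a group that is still firing. The top-level receiver, warning, is the fallback for alerts that match no child route. Alerts with severity: warning go to the warning receiver unless their environment label matches .*(-prod).*, in which case the nested route sends them to the critical receiver.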
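
The same routing tree can also be written with the matchers syntax, which Alertmanager 0.22 and later accept in place of match and match_re. The following is a sketch of an equivalent tree, not the original author's file; group_wait is shown at its documented default of 30s only to make the third timing parameter visible.

```yaml
route:
  group_by: ['alertname', 'cluster', 'job', 'environment', 'severity']
  group_wait: 30s       # wait before the first notification for a new group (default value)
  group_interval: 5m    # wait before notifying about new alerts in an existing group
  repeat_interval: 24h  # wait before re-sending for a group that is still firing
  receiver: 'warning'   # fallback when no child route matches
  routes:
    - matchers:
        - severity = "warning"
      receiver: 'warning'
      routes:
        # Regex matchers are anchored, so the pattern must cover the whole label value.
        - matchers:
            - environment =~ ".*(-prod).*"
          receiver: 'critical'
```

With this tree, a warning alert from an environment such as staging stays on the warning receiver, while one whose environment contains -prod is delivered through critical.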

Receivers

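The receivers section defines the notification channels. This example defines three receivers: warning (Slack), critical (Slack and Pushover), and detailed-slack (Slack, with a per-alert breakdown of labels plus graph and runbook links). Each receiver sets its own channel, message templates, and whether to send resolved notifications. No route in this configuration references detailed-slack, so it stays unused until a route points to it.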
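
The receiver templates read specific fields from each alert: the severity, environment, and instance labels, and the summary, description, title, and runbook annotations. The Prometheus alerting rule below is a hypothetical example of an alert that fills those fields. The group name, alert name, expression, environment value, and runbook URL are illustrative assumptions, and the rule belongs in a Prometheus rule file, not in the Alertmanager configuration.

```yaml
groups:
  - name: example-node-alerts        # hypothetical group name
    rules:
      - alert: InstanceDown          # hypothetical alert name
        expr: up == 0
        for: 5m
        labels:
          severity: warning          # matched by the severity route
          environment: eu-prod       # hypothetical value; matches .*(-prod).*
        annotations:
          summary: 'Instance {{ $labels.instance }} is down'
          title: 'Instance down'     # read by the detailed-slack title template
          description: '{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for 5 minutes.'
          runbook: 'https://example.com/runbooks/instance-down'  # placeholder URL
```

An alert produced by this rule carries severity: warning and an environment containing -prod, so the routing tree would deliver it through the critical receiver.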

# https://prometheus.io/docs/alerting/latest/notification_examples/
# https://rtfm.co.ua/en/prometheus-alertmanagers-alerts-receivers-and-routing-based-on-severity-level-and-tags/
global:
  resolve_timeout: 5m
  slack_api_url: 'https://hooks.slack.com/services/x/x/x'

route:
  # https://www.robustperception.io/whats-the-difference-between-group_interval-group_wait-and-repeat_interval
  group_by: ['alertname', 'cluster', 'job', 'environment', 'severity']
  repeat_interval: 24h
  group_interval: 5m
  receiver: 'warning'
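  routes:
    # match and match_re still work but are deprecated in favor of matchers
    # since Alertmanager 0.22.
    # Alerts labelled severity=warning go to the 'warning' receiver...
    - match:
        severity: warning
      receiver: warning
      routes:
      # ...unless the environment label contains "-prod", in which case
      # the nested route sends them to 'critical' instead.
      - match_re:
          environment: '.*(-prod).*'
        receiver: critical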

receivers:
  - name: 'warning'
    slack_configs:
      - send_resolved: true
        channel: "#notifications"
        title_link: 'http://alertmanager.localdns.xyz/prometheus/alerts'
        title: '{{ if eq .Status "firing" }}:flushed:{{ else }}:sunglasses:{{ end }} [{{ .Status | toUpper }}] {{ .CommonAnnotations.summary }} (warning)'
        text: "{{ range .Alerts }}*Priority*: `{{ .Labels.severity | toUpper }}`\nTarget: {{ .Labels.instance }}\n{{ .Annotations.description }}\n{{ end }}"

  - name: 'critical'
    slack_configs:
      - send_resolved: true
        channel: "#notifications"
        title_link: 'http://alertmanager.localdns.xyz/prometheus/alerts'
        title: '{{ if eq .Status "firing" }}:scream:{{ else }}:sunglasses:{{ end }} [{{ .Status | toUpper }}] {{ .CommonAnnotations.summary }} (critical)'
        text: "{{ range .Alerts }}*Priority*: `{{ .Labels.severity | toUpper }}`\n*Description*: {{ .Annotations.description }}\n{{ end }}"
    pushover_configs:
      - token: example-token
        user_key: example-key
        title: '{{ if eq .Status "firing" }}ALARM{{ else }}OK{{ end }} [{{ .Status | toUpper }}] {{ .CommonAnnotations.summary }}'
        message: '{{ template "pushover.default.message" . }}'
        url: '{{ template "pushover.default.url" . }}'
        priority: '{{ if eq .Status "firing" }}2{{ else }}0{{ end }}'

  - name: 'detailed-slack'
    slack_configs:
      - send_resolved: true
        title_link: 'http://alertmanager.localdns.xyz/#/alerts'
        title: '{{ if eq .Status "firing" }}:sadparrot:{{ else }}:fastparrot:{{ end }} [{{ .Status | toUpper }}] {{ .CommonAnnotations.title }} (instances)'
        channel: '#notifications'
        text: >-
          {{ range .Alerts }}
            *Alert:* {{ .Annotations.summary }} - `{{ .Labels.severity }}`
            *Description:* {{ .Annotations.description }}
            *Graph:* <{{ .GeneratorURL }}|:chart_with_upwards_trend:>
            *Runbook:* <{{ .Annotations.runbook }}|:spiral_note_pad:>
            *Details:*
            {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
            {{ end }}
          {{ end }}
        username: 'AlertManager'
        icon_emoji: ':prometheus:'

Further Reading

For more detailed information on configuring Alertmanager, refer to the official Prometheus Alertmanager documentation.