Alerting with the Observability Platform

Documentation on the observability-platform alerting concept and architecture deployed and maintained by Giant Swarm.

Alerting is a core part of any observability solution, so it is available as part of the Giant Swarm Observability Platform. For more details on alerting, please visit the official Grafana documentation page.

How alerting works

Alerting is usually divided into two distinct concepts: the alerting pipeline (how to send alerts, to whom and what to send) and alerting/recording rules (what to alert on). This documentation will cover how those two topics work in the Observability Platform.

As our alerting pipeline supports multi-tenancy, we strongly advocate that you first familiarize yourself with our multi-tenancy concept.

The alerting pipeline

[Figure: alerting pipeline]

As you can see in the image above, the alerting pipeline is quite straightforward. The Loki and Mimir rulers evaluate alerting rules and send alerts to the Mimir Alertmanager. The Mimir Alertmanager (a multi-tenant aware Alertmanager) routes those alerts to configured receivers.
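The flow described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the platform's implementation: the `Alert` and `MultiTenantAlertmanager` classes, the tenant names, and the receiver names are all hypothetical, and the real Mimir Alertmanager routes on a full per-tenant configuration rather than a flat receiver list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Alert:
    tenant: str  # tenant that owns the rule which produced the alert
    source: str  # "mimir" (metric-based) or "loki" (log-based) ruler
    name: str


@dataclass
class MultiTenantAlertmanager:
    # Hypothetical stand-in: each tenant has its own list of receivers.
    receivers: dict = field(default_factory=dict)

    def route(self, alert: Alert) -> list:
        # An alert is only delivered to receivers of its own tenant.
        return self.receivers.get(alert.tenant, [])


def dispatch(alerts, alertmanager):
    """Group alert names by the receiver they would be delivered to."""
    deliveries = defaultdict(list)
    for alert in alerts:
        for receiver in alertmanager.route(alert):
            deliveries[receiver].append(alert.name)
    return dict(deliveries)


# Illustrative data only: tenants, receivers, and alert names are made up.
alertmanager = MultiTenantAlertmanager(
    receivers={"team-a": ["slack-team-a"], "team-b": ["opsgenie-team-b"]}
)
alerts = [
    Alert("team-a", "mimir", "HighCPU"),
    Alert("team-a", "loki", "ErrorLogSpike"),
    Alert("team-b", "mimir", "DiskFull"),
    Alert("team-c", "mimir", "Unrouted"),  # tenant without any receiver
]
deliveries = dispatch(alerts, alertmanager)
```

In this sketch, alerts from both rulers end up in the same Alertmanager, and the tenant attached to each alert decides which receivers are eligible. An alert from a tenant without configured receivers is simply not delivered anywhere.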

If you want to learn how to configure Alertmanager for your tenants, please refer to our dedicated documentation.

Loading alerting and recording rules

[Figure: loading recording and alerting rules]

The Observability Platform allows you to create and load both alerting and recording rules into:

  • the Mimir ruler (metric-based alerts)
  • the Loki ruler (log-based alerts)

Alerting and recording rules can be loaded from both management clusters and workload clusters via our Grafana Alloy agents.

If you want to learn how to configure your own rules, please refer to our dedicated documentation.

Alerting Overview in Grafana

If you want to learn more about the configuration of the alerting pipeline and the alerting rules for a tenant, you can find this information in the Alerting section of your installation’s Grafana.

[Figure: Grafana alerting section]

In this section, you have access to various features such as:

  • Alert rules: all rules (alerting and recording) currently available, which can be filtered by state, such as firing or pending. When unfolding an alert rule, you can use the "see graph" link to jump to an explore page with the alert’s expression pre-filled.
  • Contact points: configured integrations (for example Opsgenie or Slack) to send alerts to, along with the notification templates used to format alerts when they are sent out.
  • Notification policies: alert routing, which defines how alerts are sent to contact points based on matching criteria.
  • Silences: silences currently loaded and their state, along with the affected alerts.
  • Active notifications: currently firing alerts. This page might be confused with the Alert rules page at first, but it differs in that it only shows alerts that are currently firing, along with their notification state.
  • Settings: general settings for the Alertmanager instance. This page also shows the currently loaded configuration.
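To make the idea of notification policies and their matching criteria more concrete, here is a minimal sketch of label-based routing. It assumes a simplified model: a flat list of policies, equality-only matchers, and a fallback contact point. The `Policy` class, the `pick_contact_point` function, and all label and contact point names are hypothetical; real notification policies form a tree and support more matcher types.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    matchers: dict      # label name -> required value (equality only)
    contact_point: str  # where matching alerts are sent


def pick_contact_point(labels, policies, default):
    """Return the contact point of the first policy whose matchers all match."""
    for policy in policies:
        if all(labels.get(key) == value for key, value in policy.matchers.items()):
            return policy.contact_point
    # No policy matched: fall back to the default contact point.
    return default


# Illustrative policies only; the names are made up.
policies = [
    Policy({"severity": "page"}, "opsgenie"),
    Policy({"team": "platform"}, "slack-platform"),
]

paging = pick_contact_point(
    {"alertname": "DiskFull", "severity": "page", "team": "platform"},
    policies,
    default="default-email",
)
notify = pick_contact_point(
    {"alertname": "HighLatency", "severity": "notify", "team": "platform"},
    policies,
    default="default-email",
)
unmatched = pick_contact_point({"team": "other"}, policies, default="default-email")
```

In this simplified model the order of the policies matters: the first alert carries both `severity: page` and `team: platform`, and it goes to the paging contact point because that policy is listed first.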
