Observability

  • Added

    • Add labels to Deployment, DaemonSet, StatefulSet metrics: app.kubernetes.io/version, helm.toolkit.fluxcd.io/name, helm.toolkit.fluxcd.io/namespace
  • Changed

    • Upgrade Tempo chart from 1.61.3 to 2.4.2
      • Upgrades Tempo from 2.9.0 to 2.10.1
    • Upgrade Tempo Vulture chart from 0.12.5 to 0.12.6
  • Changed

    • Update kube-prometheus-stack to 20.1.0
  • Added

    • Add Gateway API HTTPRoute resources for Loki, Mimir, and Tempo (read and write), replacing the previous NGINX ingress setup.
    • Add native JWT authentication via Envoy Gateway SecurityPolicy.jwt, supporting multiple OIDC providers (e.g. Dex, Azure AD). Configurable via auth.jwt.providers.
    • Add /loki/api/v1/rules to the Loki read routes.
    • Add GRPCRoute for Tempo gRPC traffic (port 9095), routing all tempopb.* services to tempo-query-frontend with JWT enforcement via SecurityPolicy.

    Changed

    • Replace NGINX ingress-based auth (nginx.ingress.kubernetes.io/auth-url) with Envoy Gateway SecurityPolicy JWT validation. No external auth service (oauth2-proxy or Dex extAuth) is required.
    • Change missing X-Scope-OrgID response code from 400 to 401 across all routes.
    • When auth.jwt.providers is empty and a service is enabled, the routes are now skipped silently instead of being rendered (no chart error). Previously the chart failed with an error.
    • Fix Tempo gRPC route service regex from tempopb to tempopb\.[^/]+ to correctly match package-qualified service names (e.g. tempopb.StreamingQuerier).
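
The regex fix above can be illustrated with a minimal sketch. It assumes the gateway requires the whole gRPC service name to match the expression (regex semantics are implementation-specific in the Gateway API, so this is an assumption, modelled here with Python's re.fullmatch). The helper service_matches and the name otherpkg.Service are hypothetical and used only for illustration.

```python
import re

# Patterns taken from the changelog entry above.
OLD_PATTERN = r"tempopb"
NEW_PATTERN = r"tempopb\.[^/]+"


def service_matches(pattern: str, service: str) -> bool:
    """Return True when the entire service name matches the pattern.

    Assumption: the gateway applies full-match semantics to the service name.
    """
    return re.fullmatch(pattern, service) is not None


# "tempopb.StreamingQuerier" comes from the changelog;
# "otherpkg.Service" is a hypothetical non-Tempo service.
for service in ["tempopb.StreamingQuerier", "tempopb", "otherpkg.Service"]:
    print(
        f"{service}: old={service_matches(OLD_PATTERN, service)} "
        f"new={service_matches(NEW_PATTERN, service)}"
    )
```

Under that assumption, the old pattern only matches the bare package name, while the new one matches package-qualified services and still rejects services from other packages.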

    Removed

    • Remove dependency on oauth2-proxy for write route authentication.
    • Remove Envoy Gateway Backend CRD and extAuth configuration in favour of inline JWT validation.
  • Changed

    • Upgraded chart dependency to kube-prometheus-stack-82.8.0
      • Added VPA support for Alertmanager
      • Added VPA support for Prometheus
      • Upgrade Grafana from 11.2.2 to 11.2.3
  • Added

    • Add KSM metrics for Envoy Gateway resources.
    • Add application.giantswarm.io/team annotation from HelmReleases as label to KSM emitted metrics.

    Changed

    • Change team annotation in Chart.yaml to OpenContainers format (io.giantswarm.application.team).
    • Update alloy-app to 0.17.1
    • Update kube-prometheus-stack to 20.0.0
    • Update prometheus-operator-crd to 20.0.0
  • Changed

    • Upgrade Alloy upstream chart from 1.6.0 to 1.6.1 (CHANGELOG)
      • This bumps the version of Alloy from 1.13.0 to 1.13.2 (CHANGELOG)
  • Fixed

    • Fix memory usage calculation in nodes-overview and cluster-overview dashboards by using node_memory_MemAvailable_bytes instead of node_memory_MemFree_bytes. MemFree excludes cached/buffered memory from free memory, so the dashboards overstated memory usage.
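
To show why the metric choice matters, here is a small sketch of the arithmetic. It assumes usage is computed as 1 - unused / total; the actual dashboard queries are not shown in this changelog, and the byte values below are made up for illustration. The function memory_usage_ratio is a hypothetical helper.

```python
GIB = 1024**3


def memory_usage_ratio(total_bytes: float, unused_bytes: float) -> float:
    """Fraction of memory considered in use, assuming usage = 1 - unused / total."""
    return 1.0 - unused_bytes / total_bytes


# Illustrative (made-up) node values:
mem_total = 16 * GIB      # node_memory_MemTotal_bytes
mem_free = 2 * GIB        # node_memory_MemFree_bytes (ignores page cache and buffers)
mem_available = 10 * GIB  # node_memory_MemAvailable_bytes (includes reclaimable memory)

old_usage = memory_usage_ratio(mem_total, mem_free)
new_usage = memory_usage_ratio(mem_total, mem_available)

print(f"usage based on MemFree:      {old_usage:.1%}")
print(f"usage based on MemAvailable: {new_usage:.1%}")
```

With these sample numbers the MemFree-based figure is much higher than the MemAvailable-based one, because memory held by the page cache is treated as used even though the kernel can reclaim it.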

    Changed

    • Update DNS dashboard
      • Add new node and pod filters
      • Update variables description and query to use coredns_build_info as label source
      • Remove cache prefetch panel since the metric is gone
      • Fix DNS dashboard log panels
    • Kube-Builder Operators dashboard: add a logs datasource selector
  • Changed