Changes and Releases

Updates on Giant Swarm workload cluster releases, apps, UI improvements and documentation changes.

  • Fixed

    • Fixed duplicate entry in ServiceMonitor resources.
  • Fixed

    • Fixed unique user tracking.
    • Removed debug logging related to telemetry.
  • Fixed

    • Fixed the fluent-bit image by adding the missing auditd libraries, so that ausearch can be used.
  • Changed

    • Upgraded to v0.10.5, which includes breaking changes.
    • Updated Kyverno PolicyExceptions to v2, falling back to v2beta1.
  • Highlights for the week ending 2024-10-31

    General

    • security-bundle version 1.9.0 introduces breaking changes. When upgrading to this version with Falco enabled, the Falco App may fail to upgrade due to a breaking change in the upstream chart. To complete the upgrade, disable and then re-enable the Falco App: set apps.falco.enabled to false and then back to true in the security-bundle user values ConfigMap.
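
    As a rough illustration of the Falco workaround above, the dotted path apps.falco.enabled corresponds to a nested structure in the user values. The Python sketch below only builds the two values documents, first disabled and then re-enabled. The helper name expand_dotted is hypothetical, and how the ConfigMap is edited and applied is deliberately left out.

```python
import json


def expand_dotted(path, value):
    """Expand a dotted key such as 'apps.falco.enabled' into nested dicts."""
    nested = value
    for key in reversed(path.split(".")):
        nested = {key: nested}
    return nested


# Step 1: values that disable the Falco App.
disable_values = expand_dotted("apps.falco.enabled", False)
# Step 2: values that re-enable it once the first change has been applied.
enable_values = expand_dotted("apps.falco.enabled", True)

if __name__ == "__main__":
    # JSON output is also valid YAML, so it can be read as a values fragment.
    print(json.dumps(disable_values, indent=2))
    print(json.dumps(enable_values, indent=2))
```

    The two results differ only in the boolean, which is the point of the workaround: the same key is toggled off and then on again.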

    Observability

    • dashboards version 3.26.0

      • Introduced “Loki - Slow Queries” dashboard for enhanced query performance insights.
      • Transferred ownership from BigMac to Shield for better team alignment.
      • Resynced alloy, loki, and mimir mixins from upstream to ensure feature parity.
    • logging-operator version 0.14.0

      • Default logging agent switched to Alloy, replacing Promtail for improved performance.
    • kube-prometheus-stack-app version 12.0.0

      • Updated chart dependency to kube-prometheus-stack-65.1.1.
      • Upgraded prometheus-operator from 0.75.0 to 0.77.1.
      • Prometheus upgraded from 2.53.0 to 2.54.1.
      • Grafana upgraded from 8.2.0 to 8.5.0.
      • Thanos ruler upgraded from 0.35.1 to 0.36.1.
      • Prometheus-node-exporter upgraded from 1.8.1 to 1.8.2.
      • Removed legacy in-house SLO framework to streamline integrations.
    • prometheus-operator-crd version 12.0.0

      • Upgraded CRDs chart from 13.0.2 (prometheus-operator 0.75.2) to 15.0.0 (prometheus-operator 0.77.1). See upstream changelog for more details.
    • prometheus-meta-operator version 4.81.0

      • Created new monitoring-agent inhibitions based on existing prometheus-agent configurations for tool-agnostic monitoring.
      • Added customer label to OpsGenie alerts to enhance alert specificity.
    • prometheus-rules version 4.23.0

      • Renamed all prometheus-agent related inhibitions to monitoring-agent inhibitions for clarity.
      • Standardized inhibition alert naming: InhibitionPrometheusAgentFailing and InhibitionPrometheusAgentShardsMissing.
      • Corrected statefulset.rules naming to avoid overwriting deployment.rules.
      • Adjusted KubeletVolumeSpaceTooLow alert threshold to only trigger when space is critically low, relying on node-problem-detector otherwise.
      • Updated aggregation:giantswarm:cluster_release_version expression to include Cluster API clusters.
      • Updated InhibitionControlPlaneUnhealthy for all Cluster API clusters, not just MCs.
      • Added alert for StatefulsetNotSatisfiedAtlas.
      • Updated alloy-app to 0.6.1, including an upgrade to upstream version 1.4.2 and a CiliumNetworkPolicy fix for clustering.
    • oauth2-proxy-app version 3.0.2

      • Implemented NetworkPolicy to allow traffic to oauth2-proxy.
      • Removed cert-manager ingress annotations to resolve ingress validation issues.
    • observability-bundle version 1.8.0

      • Upgraded prometheus-agent from v0.6.9 to v0.7.0.
      • Added extraArgs to enable features like WAL truncation.
      • Upgraded kube-prometheus-stack from 61.0.0 to 65.1.1.
      • Updated prometheus-operator CRDs from 0.73.0 to 0.75.0.
      • Prometheus-operator upgraded from 0.75.0 to 0.77.1.
      • Prometheus upgraded from 2.53.0 to 2.54.1.
      • Grafana upgraded from 8.2.0 to 8.5.0.
      • Thanos ruler upgraded from 0.35.1 to 0.36.1.
      • Prometheus-node-exporter upgraded from 1.8.1 to 1.8.2.
      • Added missing depends-on annotations for alloy-metrics and alloy-logs to ensure correct deployment order.

    Security

    • kyverno-policies-connectivity version 0.6.1

      • Added /tmp emptyDir volume to workload cluster IP Job.
    • falco-app version 0.9.1

      • Introduced feature gates for enabling/disabling individual Falco components.
    • starboard-exporter version 0.8.0

      • Added Vertical Pod Autoscaler (VPA) configuration, enabled by default for optimized resource usage.
      • Disabled logger development mode to enhance stability.
      • Disabled PodSecurityPolicy by default.
      • Exposed port 8081 for health/liveness probes.
    • trivy-app version 0.13.0

      • Updated Trivy to upstream version v0.56.1 for enhanced security scanning.
      • Disabled PSPs.
    • trivy-operator-app version 0.10.2

      • Aligned Trivy versions between Trivy operator and the upstream project to v0.56.1.
    • security-bundle version 1.9.0

      • Updated kyverno (app) to v0.18.1.
      • Updated kyverno-crds (app) to v1.12.0.
      • Updated kyverno-policies (app) to v0.21.0.
      • Updated starboard-exporter (app) to v0.8.0.
      • Updated trivy-operator (app) to v0.10.2.
      • Updated trivy (app) to v0.13.0.
      • Updated falco (app) to v0.9.1.

    Connectivity

    • dns-operator-route53 version 0.10.0

      • Added optional --role-arn flag to specify the role ARN to assume when interacting with Route53.

    Fleet management

    • app-admission-controller version 0.26.2

      • Extended the /healthz endpoint to verify certificate validity and allow Kubernetes liveness probes to manage restarts if errors occur.
    • app-operator version 6.11.2

      • Updated dependencies to ensure compatibility and security.
  • In this release:

    • Added a cluster details page.
    • Refactored K8s resources management in GS plugins.
    • Added automatic generation of TS types and constant values for K8s resources in GS plugins. See ./docs/releases/v0.41.0-changelog.md for more information.
  • Changed

    • Upgraded prometheus-agent from v0.6.9 to v0.7.0.
      • Added extraArgs to enable features such as WAL truncation.
    • Upgraded kube-prometheus-stack from 61.0.0 to 65.1.1.
      • prometheus-operator CRDs upgraded from 0.73.0 to 0.75.0.
      • prometheus-operator upgraded from 0.75.0 to 0.77.1.
      • Prometheus upgraded from 2.53.0 to 2.54.1.
      • Grafana upgraded from 8.2.0 to 8.5.0.
      • Thanos ruler upgraded from 0.35.1 to 0.36.1.
      • prometheus-node-exporter upgraded from 1.8.1 to 1.8.2.

    Fixed

    • Added the missing depends-on annotation to alloy-metrics and alloy-logs so that they are deployed after prometheus-operator-crds.
  • Changed