Highlights for the week ending 2024-09-26

Observability

  • dashboardsversion 3.24.0

    • Updated Alertmanager dashboard to show related logs.
    • Add Loki mixins dashboards update script.
    • Update Mimir mixins dashboards via script.
    • Fix Alloy mixin tags.
  • alloy-app version 0.5.2 introduces the following changes:

    • Add a helm chart templating test to the ci pipeline.
    • Add tests with ats in the CI pipeline.
    • Push alloy as a gateway component in collections.
  • kyverno-policies-observability version 0.5.0

    • Remove the policy for ServiceMonitor and PodMonitor relabelling schemas as we no longer need the enforcement.
  • fluent-logshipping-app version 5.2.2

    • Fix the Nginx Parser based on the upstream parser.
  • logging-operator version 0.12.1

    • Fix usage of structured metadata for clusters before v20.
    • Move high cardinality values into structured metadata.
    • Add Kubernetes audit log resource label, filename label, and output stream label.
    • Rename the node_name label into node to match the metric label.
  • loki-app version 0.24.0

    • Add “manual e2e” testing procedure.
    • Add PR message template referring to the manual testing procedure.
  • observability-bundle version 1.6.2:

    • Fixed alloyMetrics catalog
  • observability-operator version 0.6.0:

    • Require observability-bundle >= 1.6.2 for Alloy monitoring agent support; this is due to the incorrect alloyMetrics catalogue in observability-bundle
    • Fix invalid Alloy config due to missing comma on external labels
    • Disable logger development mode to avoid panicking; use zap as a logger.
    • Fix CircleCI release pipeline.
    • Add manual e2e testing procedure and script.
  • prometheus-meta-operator version 4.79.0:

    • Remove unused #alert and #alert-test-installation slack integration.
  • prometheus-rules version 4.15.2:

    • Update MimirHPAReachedMaxReplicas operation recipe link
    • Fix aggregation rule of the slo:current_burn_rate:ratio slo.
    • Remove aggregation of slo:period_error_budget_remaining:ratio` as this value can be easily computed and creates a lot of time series in Grafana Cloud
    • Add aggregations for SLO metrics to export them to the Grafana cloud
    • Add MimirHPAReachedMaxReplicas alert to detect when Mimir’s HPAs have reached maximum capacity.
    • Added dashboards to several Mimir alerts
    • Change IRSAACMCertificateExpiringInLessThan60Days to IRSAACMCertificateExpiringInLessThan45Days. The ACM certificate is renewed 60 days before expiration, and the alert can fire prematurely.
  • tekton-dashboard-loki-proxy version 0.4.0:

    • Change app.giantswarm.io/* labels to application.giantswarm.io/
    • Update Golang to v1.23.1

Cluster management

  • aws-pod-identity-webhook version 1.17.0:

    • Fix VPA being ineffective due to referring to a non-existing Deployment name
  • aws-crossplane-cluster-config-operator version 0.3.0

    • Configure the Crossplane ProviderConfig to use the CAPA controller role directly without going through a middleman. For this to work, the CAPA controller must have the correct trust policy granting access to the Crossplane provider’s service account.
    • Write a value oidcDomains to the config map containing all service account issuer domains, as defined by the new aws.giantswarm.io/irsa-trust-domains annotation on the AWSCluster. The primary domain is still written to value oidcDomain.
  • cluster version 1.4.1

    • Remove deprecation message for customNodeLabels and customNodeTaints, because they are not deprecated.
    • Allow configuring kube-controller-manager --node-cidr-mask-size flag.
    • Chart: Support multiple service account issuers.\ Change providerIntegration.controlPlane.kubeadmConfig.clusterConfiguration.apiServer.serviceAccountIssuer to plural providerIntegration.controlPlane.kubeadmConfig.clusterConfiguration.apiServer.serviceAccountIssuers and render them in the specified order as --service-account-issuer parameters for the API server.
    • Only add the customNodeLabels value to the kubelet node-labels argument in the KubeadmConfig when customNodeLabels is defined.

Connectivity

Security

  • kyverno-policies-dx version 0.5.1

    • Use Enforce and Audit validationFailureAction.
  • kyverno-policies-ux version 0.7.3

    • cluster-names now targets Cluster by GVK
    • Use Enforce validationFailureAction.
  • kyverno-app version 0.18.0

    • Update Kyverno to the upstream version v1.12.5.
  • kyverno-crds version 1.12.0

    • Update Kyverno CRDs to Kyverno v1.12.
  • kyverno-policies version 0.21.0

    • Update to upstream Kyverno Policies version 1.12.5.
    • Don’t push to vsphere-app-collection, capz-app-collection, capa-app-collection or cloud-director-app-collection. We started to consume kyverno-policies from security-bundle.