Workload cluster releases for KVM

  • This release upgrades Kubernetes to 1.19. A summary of relevant changes is included in these release notes. The release also includes other minor component updates summarized below the list of Kubernetes changes.

    Change details

    kubernetes 1.19.9

    Expanded CLI support for debugging workloads and nodes

    SIG CLI expanded on debugging with kubectl to support two new debugging workflows: debugging workloads by creating a copy, and debugging nodes by creating a container in host namespaces. These can be convenient to:

    • Insert a debug container in clusters that don’t have ephemeral containers enabled
    • Modify a crashing container for easier debugging by changing its image (for example, to busybox) or its command (for example, to sleep 1d so you have time to kubectl exec).
    • Inspect configuration files on a node’s host filesystem
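    These workflows can be sketched as follows (the pod, container, and node names are placeholders; in Kubernetes 1.19 the subcommand still lives under kubectl alpha, moving to kubectl debug in 1.20):

    ```shell
    # Create a debuggable copy of a pod, swapping the image for busybox and the
    # command for a long sleep so there is time to kubectl exec in:
    kubectl alpha debug mypod -it \
      --copy-to=mypod-debug \
      --container=app \
      --image=busybox \
      -- sh -c 'sleep 1d'

    # Start a container in the host namespaces of a node; the node's root
    # filesystem is mounted at /host for inspection:
    kubectl alpha debug node/my-node -it --image=busybox
    ```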

    EndpointSlices are now enabled by default

    EndpointSlices are an exciting new API that provides a scalable and extensible alternative to the Endpoints API. EndpointSlices track IP addresses, ports, readiness, and topology information for Pods backing a Service.

    In Kubernetes 1.19 this feature is enabled by default, with kube-proxy reading from EndpointSlices instead of Endpoints. Although this is mostly an invisible change, it should result in noticeable scalability improvements in large clusters. It also enables significant new features in future Kubernetes releases, such as Topology Aware Routing.
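    EndpointSlices are created and managed automatically for each Service, and can be inspected like any other resource (the namespace and slice name below are placeholders):

    ```shell
    # List the EndpointSlices backing Services in a namespace:
    kubectl get endpointslices -n my-namespace

    # Show the addresses, ports, readiness, and topology tracked by one slice:
    kubectl describe endpointslice my-service-abc12 -n my-namespace
    ```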

    Ingress graduates to General Availability

    SIG Network has graduated the widely used Ingress API to general availability in Kubernetes 1.19. This change recognises years of hard work by Kubernetes contributors, and paves the way for further work on future networking APIs in Kubernetes.
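    With GA, Ingress resources use the networking.k8s.io/v1 API, which makes pathType an explicit, required field and restructures the backend reference. A minimal sketch (host, service, and port values are placeholders):

    ```shell
    cat <<'EOF' | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: example-ingress
    spec:
      rules:
      - host: example.com
        http:
          paths:
          - path: /
            pathType: Prefix          # now a required field
            backend:
              service:                # replaces the flat serviceName/servicePort
                name: example-service
                port:
                  number: 80
    EOF
    ```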

    seccomp graduates to General Availability

    The seccomp (secure computing mode) support for Kubernetes has graduated to General Availability (GA). This feature can be used to increase workload security by restricting the system calls available to a Pod (applied to all of its containers) or to individual containers.
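    With GA, seccomp is configured through the securityContext.seccompProfile field rather than annotations. A minimal sketch:

    ```shell
    cat <<'EOF' | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: seccomp-demo
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault   # runtime's default profile, applied to all containers
      containers:
      - name: app
        image: busybox
        command: ["sleep", "1d"]
    EOF
    ```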

    KubeSchedulerConfiguration graduates to Beta

    SIG Scheduling graduates KubeSchedulerConfiguration to Beta. The KubeSchedulerConfiguration feature allows you to tune the algorithms and other settings of the kube-scheduler. You can easily enable or disable specific functionality (contained in plugins) in selected scheduling phases without having to rewrite the rest of the configuration. Furthermore, a single kube-scheduler instance can serve different configurations, called profiles. Pods can select the profile they want to be scheduled under via the .spec.schedulerName field.
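    As a sketch, a configuration file passed to kube-scheduler via --config could define two profiles, one with a scoring plugin disabled (the second profile name is an example):

    ```shell
    cat <<'EOF' > scheduler-config.yaml
    apiVersion: kubescheduler.config.k8s.io/v1beta1
    kind: KubeSchedulerConfiguration
    profiles:
    - schedulerName: default-scheduler
    - schedulerName: no-spreading-scheduler
      plugins:
        score:
          disabled:
          - name: PodTopologySpread   # skip topology-spread scoring in this profile
    EOF
    # A Pod opts into the second profile by setting
    # spec.schedulerName: no-spreading-scheduler
    ```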

    General ephemeral volumes

    Kubernetes provides volume plugins whose lifecycle is tied to a pod and which can be used as scratch space (e.g. the builtin “empty dir” volume type) or to load some data into a pod (e.g. the builtin ConfigMap and Secret volume types, or “CSI inline volumes”). The new generic ephemeral volumes alpha feature allows any existing storage driver that supports dynamic provisioning to be used as an ephemeral volume, with the volume’s lifecycle bound to the Pod.

    • It can be used to provide scratch storage that is different from the root disk, for example persistent memory, or a separate local disk on that node.
    • All StorageClass parameters for volume provisioning are supported.
    • All features supported with PersistentVolumeClaims are supported, such as storage capacity tracking, snapshots and restore, and volume resizing.
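    A sketch of a Pod requesting such a volume (the storage class name is a placeholder, and since the feature is alpha the GenericEphemeralVolume feature gate must be enabled):

    ```shell
    cat <<'EOF' | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: scratch-demo
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sleep", "1d"]
        volumeMounts:
        - name: scratch
          mountPath: /scratch
      volumes:
      - name: scratch
        ephemeral:                      # PVC is created and deleted with the Pod
          volumeClaimTemplate:
            spec:
              accessModes: ["ReadWriteOnce"]
              storageClassName: my-storage-class
              resources:
                requests:
                  storage: 1Gi
    EOF
    ```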

    Immutable Secrets and ConfigMaps (beta)

    Secret and ConfigMap volumes can be marked as immutable, which significantly reduces load on the API server if there are many Secret and ConfigMap volumes in the cluster. See ConfigMap and Secret for more information.
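    Immutability is a single top-level field; once set, the data can no longer be changed, and the object must be deleted and recreated to update it. A minimal sketch:

    ```shell
    cat <<'EOF' | kubectl apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: app-config
    immutable: true    # kubelets stop watching for changes, reducing API server load
    data:
      key: value
    EOF
    ```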

    Increase the Kubernetes support window to one year

    As of Kubernetes 1.19, bugfix support via patch releases for a Kubernetes minor release has increased from 9 months to 1 year.

    kvm-operator 3.16.0

    Added

    • Add vertical pod autoscaler configuration.
    • Automatically delete WC node pods when NotReady for too long (per-cluster opt-in only).

    Changed

    • Do not drain node pods when a cluster is being deleted, to improve deletion time and avoid deadlocks.
    • Update for Kubernetes 1.19 compatibility.
    • Update k8s-kvm to v0.4.1 with QEMU v5.2.0 and Flatcar DNS fix.
    • Update k8scloudconfig to use calico-crd-installer.

    Fixed

    • Use the managed-by label to check that node deployments are deleted before the cluster namespace.
    • Remove IPs from endpoints when the corresponding workload cluster node is not ready.

    app-operator 3.2.1

    Security

    • Restrict ingress to only expose the status endpoint.

    chart-operator 2.12.0

    Added

    • Pause Chart CR reconciliation when it has chart-operator.giantswarm.io/paused=true annotation.
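    For example (the namespace and Chart CR name are placeholders; this assumes the Chart CRD is installed in the cluster):

    ```shell
    # Pause reconciliation of a Chart CR:
    kubectl -n giantswarm annotate chart my-chart chart-operator.giantswarm.io/paused=true

    # Resume by removing the annotation:
    kubectl -n giantswarm annotate chart my-chart chart-operator.giantswarm.io/paused-
    ```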

    Changed

    • Set docker.io as the default registry.
    • Pass RESTMapper to helmclient to reduce the number of REST API calls.
    • Updated Helm to v3.5.3.
    • Update namespace metadata using namespaceConfig in Chart CRs.
    • Deploy giantswarm-critical PriorityClass when it’s not found.

    coredns 1.4.1

    Changed

    • Set docker.io as the default registry.
    • Update coredns to upstream version 1.8.0.
    • Added monitoring annotations and common labels.

    net-exporter 1.10.0

    Changed

    • Add label selector for pods to help lower memory usage.
  • This release upgrades QEMU to version 5.2.0, which results in scheduling improvements and better CPU limit enforcement.

    Change details

    kvm-operator 3.14.2

    Changed

    • Use k8s-kvm:0.4.1 with QEMU 5.2.0.

    app-operator 3.2.0

    Added

    • Include apiVersion, restrictions.compatibleProviders in appcatalogentry CRs.

    Changed

    • Limit the number of AppCatalogEntry CRs per app.
    • Delete legacy finalizers on app CRs.
    • Reconcile appCatalog CRs only if the pod is unique.

    Fixed

    • Update status as cordoned if the app CR has a cordoned annotation.

    cluster-operator 0.24.2

    Changed

    • Migrate to Go modules.
    • Update certs package to v2.0.0.
    • Refactor to use slightly newer dependency versions.

    cert-operator 1.0.1

    Fixed

    • Add list permission for cluster.x-k8s.io.

    chart-operator 2.9.0

    Added

    • Use diff key when logging differences between the current and desired release.

    Fixed

    • Stop updating Helm release if it has failed the previous 5 attempts.
  • Nodes will be rolled when upgrading to this version.

    This patch release mitigates a DNS issue affecting cluster creation and scaling.

  • Nodes will be rolled when upgrading to this version.

    This patch release mitigates a DNS issue affecting cluster creation and scaling.

  • This release upgrades Kubernetes to 1.18 and Calico to 3.15. It also includes other minor component updates summarized below.

    Change details

    kubernetes 1.18.12

    Kubernetes 1.18 includes a large number of features, deprecations, and bug fixes. We have attempted to summarize the changes relevant to Giant Swarm customers in the following section. The full list of changes can be viewed in the upstream Kubernetes changelog.

    Please discuss with your Giant Swarm solution engineer if you are unsure about a cluster’s Kubernetes 1.18 upgrade readiness.

    kube-apiserver

    • The following features are unconditionally enabled and the corresponding --feature-gates flags have been removed: PodPriority, TaintNodesByCondition, ResourceQuotaScopeSelectors and ScheduleDaemonSetPods.
    • The following deprecated APIs can no longer be served:
      • All resources under apps/v1beta1 and apps/v1beta2 - use apps/v1 instead
      • daemonsets, deployments, replicasets resources under extensions/v1beta1 - use apps/v1 instead
      • networkpolicies resources under extensions/v1beta1 - use networking.k8s.io/v1 instead
      • podsecuritypolicies resources under extensions/v1beta1 - use policy/v1beta1 instead
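    One way to check whether workloads still rely on a removed group/version is to request it explicitly; manifests should then be re-applied under the replacement group (the deployment name is a placeholder):

    ```shell
    # Fails on a 1.18 API server because apps/v1beta1 is no longer served:
    kubectl get deployments.v1beta1.apps --all-namespaces

    # Confirm the replacement group is in use for an object:
    kubectl get deployment my-deployment -o yaml | grep apiVersion   # expect apps/v1
    ```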

    kube-scheduler

    • The scheduling_duration_seconds summary metric is deprecated.
    • The scheduling_algorithm_predicate_evaluation_seconds and scheduling_algorithm_priority_evaluation_seconds metrics are deprecated, replaced by framework_extension_point_duration_seconds[extension_point="Filter"] and framework_extension_point_duration_seconds[extension_point="Score"].
    • AlwaysCheckAllPredicates is deprecated in scheduler Policy API.

    kubelet

    • --enable-cadvisor-json-endpoints is now disabled by default. If you need access to the cAdvisor v1 Json API please enable it explicitly in the kubelet command line. Please note that this flag was deprecated in 1.15 and will be removed in 1.19.
    • The StreamingProxyRedirects feature and --redirect-container-streaming flag are deprecated, and will be removed in a future release. The default behavior (proxy streaming requests through the kubelet) will be the only supported option. If you are setting --redirect-container-streaming=true, then you must migrate off this configuration. The flag will no longer be able to be enabled starting in v1.20. If you are not setting the flag, no action is necessary.
    • The resource metrics endpoint /metrics/resource/v1alpha1, as well as all metrics under this endpoint, has been deprecated. Please convert to the following metrics emitted by the /metrics/resource endpoint:
      • scrape_error -> scrape_error
      • node_cpu_usage_seconds_total -> node_cpu_usage_seconds
      • node_memory_working_set_bytes -> node_memory_working_set_bytes
      • container_cpu_usage_seconds_total -> container_cpu_usage_seconds
      • container_memory_working_set_bytes -> container_memory_working_set_bytes
    • In a future release, kubelet will no longer create the CSI NodePublishVolume target directory, in accordance with the CSI specification. CSI drivers may need to be updated accordingly to properly create and process the target path.

    calico 3.15.3

    Changes

    • Add FelixConfiguration parameters to explicitly allow encapsulated packets from workloads.
    • Respect explicit configuration for drop rules for encapsulated packets originating from workloads.

    Bug fixes

    • Added monitor-addresses option to calico-node to continually monitor IP addresses.
    • Fix issue with service IP advertisement breaking host service connectivity.
    • Felix FV tests now run with Go’s race detector enabled and a couple of low-impact data races have been fixed.
    • Fix config inheritance so that the BPF kernel version check takes precedence over environment variables.
    • In BPF mode, fix spurious “Failed to run bpftool” logs.
    • Fixed capitalization of WireGuard interfaceIPv4Address (was interfaceIpv4Address).
    • Fix race condition during block affinity deletion.

    Other changes

    • Handle panics in the CNI plugin more gracefully.
    • Remove unnecessary packages from docker image to address CVEs.
    • By default, exclude cni.* from node IP auto detection.
    • Added conditional check for the FELIX_HEALTHHOST env variable.
    • The Typha port is now included in the failsafe port lists by default.
    • Felix can now run in active/passive modes.
    • For NetworkPolicy and GlobalNetworkPolicy, the use of floating point values for the spec.Order field is now deprecated, and will be removed entirely in a future release. Please update your policies to use integer values for ordering.
    • Update included CustomResourceDefinitions to use the apiextensions/v1 API group and version, and include schemas for basic validation.
    • Improve scaling characteristics when using host-local IPAM - perform fewer List API calls.
    • Network policy now has the global() namespace selector which selects host endpoints or global network sets.
    • Program blackhole routes for full rejectcidrs to avoid route loops.
    • install-cni.sh now also fails if calico -v doesn’t work after copying the calico binary.
    • Upstream CNI plugins updated to v0.8.6.

    kvm-operator 3.14.0

    Added

    • Roll nodes when the versions of calico, containerlinux, etcd, or kubernetes change in a release while the kvm-operator version is unchanged.

    Changed

    • Update Kubernetes libraries to 1.18 along with all other client-go-dependent libraries.
    • Use InternalIP from the tenant cluster node’s status instead of a label for dead endpoint detection.
    • Shorten calico-node wait timeout in k8s-addons and add retry for faster cluster initialization.
    • Remove unused Kubernetes scheduler configuration fields preventing strict YAML unmarshalling.

    etcd 3.4.13

    Security

    • A log warning is added when etcd uses any existing directory that has permissions other than 700 on Linux or 777 on Windows.

    Breaking Changes

    • Changed behavior on existing directory permissions. Previously, permissions were not checked on an existing data directory or on the directory used for automatically generating self-signed certificates for TLS connections with clients. Now a check is added to make sure those directories, if they already exist, have the desired permissions: 700 on Linux and 777 on Windows.

    kube-state-metrics 1.3.0

    Added

    • Added monitoring annotations and common labels.

    Changed

    • Deploy kube-state-metrics-app on installations as part of app collection.

    app-operator 2.7.0

    Changed

    • Update apiextensions to v3 and replace CAPI with Giant Swarm fork.

    Fixed

    • Use resourceVersion of configmap for comparison instead of listing option.

    Added

    • Secure the webhook with token value from control plane catalog.
    • Add webhook URL as an annotation on chart CRs.
    • Added Status update endpoint.
    • Watch secrets referenced in app CRs to reduce latency when applying config changes.
    • Create appcatalogentry CRs for public app catalogs.
    • Watch configmaps referenced in app CRs to reduce latency when applying config changes.

    chart-operator 2.5.0

    Added

    • Validate the cache in helmclient to avoid stale requests when pulling tarballs.
    • Call status webhook with token values.

    Fixed

    • Update apiextensions to v3 and replace CAPI with Giant Swarm fork.

    cert-exporter 1.3.0

    Added

    • Add Network Policy.

    Changed

    • Remove hostNetwork and hostPID capabilities.

    net-exporter 1.9.2

    Changed

    • Updated backward incompatible Kubernetes dependencies to v1.18.5.

    Fixed

    • Fixed indentation problem with the daemonset template.

    node-exporter 1.7.0

    Changed

    • Disable the btrfs, softnet, rapl, and thermal_zone collectors to reduce memory usage.
    • Increase memory limit to 75Mi.
  • Nodes will be rolled during upgrade to this version.

    This patch release adds registry credentials to prevent an issue with image pulling rate limits recently introduced by Docker Hub.

    Note before upgrade:

    Please contact your Solution Engineer before upgrading. The upgrade is automated; however, it includes a data migration from Helm 2 release configmaps to Helm 3 release secrets. There are some pre-upgrade checks, and we recommend monitoring the upgrade to ensure it completes safely.

    Note for Solution Engineers:

    Before upgrading, please ensure the cluster is on a KVM 12.1.x platform release.

    Please use Upgrading tenant clusters to Helm 3 as a guide on the upgrade process for the checks and monitoring steps.

    Note for future 12.x.x releases:

    Please ensure the cluster is on a KVM 12.1.x platform release before upgrading to 12.2.0 or later. Please persist this note until all customers are on KVM 12.1.x and above.

    Change details

    kvm-operator v3.13.0

    • Update k8scloudconfig to v7.2.0, containing a fix for DockerHub QPS.
  • If you are upgrading from 12.3.0, upgrading to this release will not roll your nodes.

    This patch release fixes a problem causing the accidental deletion and reinstallation of Preinstalled Apps (such as CoreDNS) in 12.x.x tenant clusters.

    Please upgrade all older clusters to this version in order to prevent possible downtime.

    Note before upgrade:

    Please contact your Solution Engineer before upgrading. The upgrade is automated; however, it includes a data migration from Helm 2 release configmaps to Helm 3 release secrets. There are some pre-upgrade checks, and we recommend monitoring the upgrade to ensure it completes safely.

    Note for Solution Engineers:

    Before upgrading, please ensure the cluster is on a KVM 12.1.x platform release.

    Please use Upgrading tenant clusters to Helm 3 as a guide on the upgrade process for the checks and monitoring steps.

    Note for future 12.x.x releases:

    Please ensure the cluster is on a KVM 12.1.x platform release before upgrading to 12.2.0 or later. Please persist this note until all customers are on KVM 12.1.x and above.

    Change details

    cluster-operator 0.23.18

    • Remove all chartconfig migration logic that caused accidental deletion and is no longer needed.

    app-operator 2.3.5

    • Fix YAML comparison for chart configmaps and secrets.
  • This release upgrades all Helm releases managed by Giant Swarm to use Helm v3.3.4.

    This lets us benefit from the improved security model and keep up to date with the community. We also remove the Tiller deployment from the giantswarm namespace, removing its gRPC endpoint, which reduces operational complexity.

    If you are still using Helm 2, these Helm releases will not be affected. However, we encourage you to upgrade to Helm 3, as Helm 2 support ends on November 13th, 2020. See https://helm.sh/blog/helm-v2-deprecation-timeline/ for details.

    The release also upgrades Container Linux to address security vulnerabilities.

    Below, you can find more details on components that were changed with this release.

    Note before upgrade:

    Please contact your Solution Engineer before upgrading. The upgrade is automated; however, it includes a data migration from Helm 2 release configmaps to Helm 3 release secrets. There are some pre-upgrade checks, and we recommend monitoring the upgrade to ensure it completes safely.

    Note for Solution Engineers:

    Before upgrading, please ensure the cluster is on a KVM 12.1.x platform release.

    Please use Upgrading tenant clusters to Helm 3 as a guide on the upgrade process for the checks and monitoring steps.

    Note for future 12.x.x releases:

    Please ensure the cluster is on a KVM 12.1.x platform release before upgrading to 12.2.0 or later. Please persist this note until all customers are on KVM 12.1.x and above.

    Change details

    app-operator v2.3.4

    chart-operator v2.3.5

    kvm-operator v3.12.2

    • Add monitoring labels to Prometheus metrics.

    containerlinux 2512.5.0

    Changes:

    • Update public key to include a new subkey
    • Vultr support in Ignition (flatcar-linux/ignition#13)
    • VMware OVF settings default to ESXi 6.5 and Linux 3.x

  • If you are upgrading from 12.1.0, upgrading to this release will not roll your nodes.

    As of this release, NGINX Ingress Controller App is now an optional (and not pre-installed) component on KVM.

    This enables use of alternative ingress controllers without wasting resources where NGINX is not the preferred option.

    Now NGINX App installations can be managed and updated independent of the cluster, which is both a benefit and a new responsibility 😅

    Upgrading tenant clusters with pre-installed NGINX will leave NGINX unchanged. Existing NGINX App custom resources will still have the giantswarm.io/managed-by: cluster-operator label, but it will be ignored. The label will be cleaned up later, after all tenant clusters have been upgraded and KVM platform releases older than v12.2.0 have been archived.

    Note for cluster upgrades:

    Please ensure the cluster is on a KVM 12.1.x platform release before upgrading the cluster to 12.2.0 or later.

    Below, you can find more details on components that were changed with this release.

    cluster-operator 0.23.14

    • Support for making NGINX IC App optional and not pre-installed.
  • This release includes two significant improvements to NGINX Ingress Controller. It also includes a fix for Quay being a single point of failure, using Docker’s registry mirroring feature. This ensures availability of all images needed for node bootstrap, so cluster creation and scaling no longer depend on Quay availability.

    The two NGINX Ingress Controller improvements:

    • Multiple NGINX Ingress Controllers per tenant cluster are now supported, enabling separation of internal vs external traffic, dev vs prod, and so on.
    • Management of NGINX IC NodePort Service is moved from kvm-operator to NGINX IC App itself. This enables configurability of external traffic policy and lays the foundation for making NGINX IC App optional and not pre-installed in a future KVM platform release.

    Along with kvm-operator, cluster-operator, and NGINX IC, the release includes several upstream component upgrades.

    Note for cluster upgrades:

    Please manually delete the nginx-ingress-controller NodePort Service in the kube-system namespace. Upgrading the cluster then recreates the NodePort Service and moves its management from kvm-operator to NGINX IC. To minimize downtime, please delegate cluster upgrades to your SE.
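    A sketch of the manual step described above:

    ```shell
    # Delete the kvm-operator-managed Service so the NGINX IC App can recreate
    # it and take over its management during the upgrade:
    kubectl -n kube-system delete service nginx-ingress-controller
    ```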

    Note for future 12.1.x releases:

    To prevent downtime, please persist this and the above note until all customers are on 12.1.0 and above.

    Below, you can find more details on components that were changed with this release.

    cluster-operator 0.23.13

    • Enable NGINX App managed NodePort Service on KVM.

    kube-state-metrics v1.9.7 (Giant Swarm app v1.1.1)

    • Updated kube-state-metrics version from 1.9.5 to 1.9.7. Check the upstream changelog for details on all changes.

    kvm-operator v3.12.1

    • Add registry mirrors support.
    • Stop provisioning NGINX IC NodePort Service.

    metrics-server v0.3.6 (Giant Swarm app v1.1.1)

    • Updated metrics-server version from 0.3.3 to 0.3.6. Check the upstream changelog for details on all changes.

    nginx-ingress-controller v0.34.1 (Giant Swarm app v1.8.1)

    • Support multiple NGINX IC App installations per tenant cluster.
    • Made NGINX NodePort Service external traffic policy configurable.
    • Made NGINX NodePort Service node ports configurable.
    • Drop support for deprecated configuration properties.

    node-exporter v1.0.1 (Giant Swarm app v1.3.0)

    • Updated node-exporter version from 0.18.1 to 1.0.1. Check the upstream changelog for details on all changes.