Workload cluster releases for AWS

  • This release reintroduces tagging private subnets on node pools to enable autodiscovery for internal ELBs. The tagging is enabled by setting the annotation alpha.aws.giantswarm.io/internal-elb: "" on AWSMachineDeployment CRs.

    Change details

    aws-operator 11.10.0

    Added

    • Optionally set the kubernetes.io/role/internal-elb tag on machine deployment subnets.
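
    As an illustration of the annotation described above, the following minimal sketch builds a JSON merge patch that sets (or removes) the annotation on an AWSMachineDeployment CR. Only the annotation key comes from this release note; the function name is hypothetical, and how the patch is submitted to the API is deliberately left out.

```python
import json

# Annotation key taken from the release note above.
INTERNAL_ELB_ANNOTATION = "alpha.aws.giantswarm.io/internal-elb"


def internal_elb_patch(enable=True):
    """Build a JSON merge patch that sets (or removes) the annotation.

    In a JSON merge patch (RFC 7386), a null value deletes the key,
    so enable=False produces a patch that removes the annotation.
    The function name is hypothetical and used only for illustration.
    """
    value = "" if enable else None
    return {"metadata": {"annotations": {INTERNAL_ELB_ANNOTATION: value}}}


if __name__ == "__main__":
    print(json.dumps(internal_elb_patch(), indent=2))
```

    The annotation value is an empty string, so it is the presence of the key that matters, not its content.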
  • This release fixes a bug in AWS CNI when external SNAT is enabled.

    Change details

    aws-operator 11.9.3

    Fixed

    • Set AWS_VPC_K8S_CNI_RANDOMIZESNAT to prng when SNAT is enabled.
  • This release adds support for Node Local DNS Cache. It also provides stability improvements, bug fixes and security fixes for various components.

    Highlights

    • Node Local DNS Cache. It can be enabled by installing the k8s-dns-node-cache-app from the Playground catalog;
    • Tag IAM Roles for Service Accounts AWS resources. See roadmap issue;
    • Enable encryption for the S3 bucket used by IAM Roles for Service Accounts. See roadmap issue;
    • VPA configuration for kube-proxy;
    • Security fixes:
      • 18 SDK: QEMU CVEs;
      • 8 vim CVEs;
      • 4 SDK: edk2-ovmf CVEs;
      • 3 SDK: mantle CVEs;
      • 2 Linux CVEs;
      • 2 containerd CVEs;
      • 2 Ignition CVEs;
      • 2 Go CVEs;
      • 2 libarchive CVEs;
      • 2 torcx CVEs;
      • 1 OpenSSH CVE;
      • 1 openssl CVE;
      • 1 gcc CVE;
      • 1 krb5 CVE;
      • 1 SDK: libxslt CVE;
      • 1 SDK: Rust CVE.

    Change details

    aws-ebs-csi-driver 2.12.0

    Added

    • Allow specifying driverMode for the controller component.
    • Also push to control-plane app catalog.

    Changed

    • Allow specifying nodeSelector and hostNetwork for controller and node.
    • Bump aws-ebs-csi-driver version to v1.5.1.

    aws-operator 11.9.2

    Added

    • Added separate service account flag for IRSA.
    • Add POD_SECURITY_GROUP_ENFORCING_MODE to aws-node Daemonset.
    • Added latest flatcar images.

    Fixed

    • Issuer S3 endpoint for IRSA.
    • AWS Region Endpoint for IRSA.
    • Ignore S3 bucket deletion for audit logs.

    Removed

    • Remove tag kubernetes.io/role/internal-elb from machine deployment subnets.

    Changed

    • Bumped k8scc to 13.4.0 to enable VPA for kube-proxy.

    aws-cni 1.11.0

    Upgraded from version 1.10.2. Please check upstream changelog for details.

    cert-operator 2.0.1

    Fixed

    • Bump go module major version.

    cluster-operator 4.0.2

    Fixed

    • List apps by namespace.

    containerlinux 3139.2.0

    New Stable Release 3139.2.0

    Changes since Stable 3033.2.4

    Security fixes:

    Bug fixes:

    • Excluded the Kubenet cbr0 interface from networkd’s DHCP config and set it to Unmanaged to prevent interference and ensure that it is not part of the network online check (init#55)
    • Fixed the dracut emergency Ignition log printing that had a scripting error causing the cat command to fail (bootengine#33)
    • network: Accept ICMPv6 Router Advertisements to fix IPv6 address assignment in the default DHCP setting (init#51, coreos-cloudinit#12, bootengine#30)
    • flatcar-update: Stopped checking for the USER environment variable which may not be set in all environments, causing the script to fail unless a workaround was used like prepending an additional sudo invocation (init#58)
    • Reverted the Linux kernel commit which broke networking on AWS instances which use Intel 82559 NIC (c4/m4) (Flatcar#665, coreos-overlay#1723)
    • Re-added the brd drbd nbd rbd xen-blkfront zram libarc4 lru_cache zsmalloc kernel modules to the initramfs since they were missing compared to the Flatcar 3033.2.x releases where the 5.10 kernel is used (bootengine#40)

    Changes:

    • Added a new flatcar-update tool to the image to ease manual updates, rollbacks, channel/release jumping, and airgapped updates (init#53)
    • Update-engine now creates the /run/reboot-required flag file for kured (update_engine#15)
    • Excluded special network interface devices like bridge, tunnel, vxlan, and veth devices from the default DHCP configuration to prevent networkd interference (init#56)
    • Added CONFIG_NF_CT_NETLINK_HELPER (for libnetfilter_cthelper), CONFIG_NET_VRF (for virtual routing and forwarding) and CONFIG_KEY_DH_OPERATIONS (for keyutils) to the kernel config (coreos-overlay#1524)
    • Enabled the FIPS support for the Linux kernel, which users can now choose through a kernel parameter in grub.cfg (check it taking effect with cat /proc/sys/crypto/fips_enabled) (coreos-overlay#1602)
    • Enabled FIPS mode for cryptsetup (portage-stable#312)
    • Reworked the way the default Python interpreter is set up in the SDK: it is now selected without specifying a version. This should work fine as long as the SDK keeps a single version of Python.
    • Add a way to remove packages that are hard-blockers for update. A hard-blocker means that the package needs to be removed (for example with emerge -C) before an update can happen.
    • Removed the pre-shipped /etc/flatcar/update.conf file, leaving it entirely to the user to define its contents, as it was unnecessarily overwriting the /usr/share/flatcar/update.conf (scripts#212)
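
    The FIPS item in the list above suggests checking /proc/sys/crypto/fips_enabled to confirm that the kernel parameter took effect. A minimal sketch of that check follows; the function name is hypothetical, and treating a missing file as "not enabled" is an assumption of this sketch.

```python
from pathlib import Path


def fips_enabled(path="/proc/sys/crypto/fips_enabled"):
    """Return True if the kernel reports FIPS mode as enabled.

    The kernel exposes "1" in this file when FIPS mode is active.
    A missing or unreadable file is treated as "not enabled", which
    is an assumption made for this sketch.
    """
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return False


if __name__ == "__main__":
    print("FIPS mode enabled:", fips_enabled())
```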

    Updates:

    Changes since Beta 3139.1.1

    Security fixes:

    Changes:

    Updates:

    app-operator 5.9.0

    Changed

    • Update helmclient to v4.10.0.
    • Update giantswarm/appcatalog to v0.7.0, adding support for internal OCI chart catalogs.

    Added

    • Add support for relative URLs in catalog indexes.
    • Annotate App CRs after bootstrapping chart-operator to trigger reconciliation.

    Fixed

    • Continue processing AppCatalogEntry CRs if an error occurs.
    • Only show AppCatalogEntry CRs that are compatible with the current provider.
    • For internal catalogs, generate tarball URLs instead of checking index.yaml to prevent chicken-and-egg problems in new clusters.
    • Fix label selector in app values watcher so it supports CAPI clusters.
    • Strip cluster name from App CR name to determine Chart CR name in chart/current.go resource to fix WC app updates.
    • Allow usage of chart-operator PSP so it can be bootstrapped.
    • Fix patch to not reset fields.
    • Remove compatible providers validation for AppCatalogEntry as it's overly strict.
    • Push image to Docker Hub to not rely on crsync.
    • Restrict PSP usage to only named resource.

    cert-exporter 2.2.0

    Changed

    • Change priorityClass to system-node-critical for the daemonset.
    • Make exporter’s monitor flags configurable.

    Fixed

    • Allow egress to port 1053 to make in-cluster DNS queries work.
    • Allow egress to port 443 to allow accessing vault.

    cert-manager 2.13.0

    Changed

    • Use retagged container image for HTTP01 AcmeSolver (#212)
    • Pin kubectl to 1.23.3 in crd-install and clusterissuer-install jobs (#216)
    • Add application.giantswarm.io/team to default labels (#224).

    chart-operator 2.21.0

    Changed

    • Update helmclient to v4.10.0.

    cluster-autoscaler 1.22.2-gs6

    Added

    • Support cloud provider alias names (GCP -> GCE).

    Fixed

    • Updated to correct cluster-autoscaler version.
    • Use GS-built 1.22 image to deliver upstream unreleased fix.

    coredns 1.9.0

    Added

    • Add toleration for node.cloudprovider.kubernetes.io/uninitialized.

    Changed

    • Update coredns to upstream version 1.8.7.

    external-dns 2.9.1

    Changed

    • Allow setting the AWS default region (aws.region) independently of any other value.

    kiam-watchdog 0.7.0

    Added

    • Add PriorityClassName.

    kubernetes 1.22.9

    Bug or Regression

    • Fixed a regression that could incorrectly reject pods with OutOfCpu errors if they were rapidly scheduled after other pods were reported as complete in the API. The Kubelet now waits to report the phase of a pod as terminal in the API until all running containers are guaranteed to have stopped and no new containers can be started. Short-lived pods may take slightly longer (~1s) to report Succeeded or Failed after this change. (#108749, @bobbypage) [SIG Apps, Node and Testing]
    • Fixes error handling in a kubectl method used in downstream packages. (#108520, @heybronson) [SIG CLI]

    Dependencies

    Added

    Nothing has changed.

    Changed

    Nothing has changed.

    Removed

    Nothing has changed.

    kube-state-metrics 1.10.0

    Changed

    • Make --metric-labels-allowlist configurable through user values.
    • Add Node Pool labels to the default allowed labels in --metric-labels-allowlist.
    • Allow giantswarm.io/service-type labels from kube_*_labels (Deployment, DaemonSet, StatefulSet).

    net-exporter 1.12.0

    Changed

    • Use parameter for CoreDNS namespace (defaulted to kube-system)

    vertical-pod-autoscaler 2.1.2

    Fixed

    • Fixed default value for admission controller PDB.

    vertical-pod-autoscaler-crd 1.0.1

    Added

    • Add cluster singleton restriction so app can only be installed once.
  • This release improves the performance of etcd by using gp3 volumes with provisioned IOPS. It also provides stability improvements, bug fixes and security fixes for various components.

    Highlights

    Note when upgrading from v16 to v17: Existing Vertical Pod Autoscaler app installations need to be removed from the workload cluster prior to upgrading to v17 because the Vertical Pod Autoscaler is provided as a default application. The two applications have different names, which leads to them fighting each other.

    Change details

    aws-operator 11.1.0

    Added

    • Add annotation to ASG to make cluster-autoscaler work when scaling from zero replicas.

    Changed

    • Update CAPI dependencies.
    • Allow resource limits/requests to be passed as values.
    • Switch gp2 to gp3 volumes.
    • Allow etcd volume IOPS and Throughput to be set.

    cluster-operator 4.0.1

    Changed

    • Update CAPI dependencies.

    Fixed

    • Only list apps from cluster namespace.

    kubernetes 1.22.8

    API Change

    • Fixes a regression in v1beta1 PodDisruptionBudget handling of “strategic merge patch”-type API requests for the selector field. Prior to 1.21, these requests would merge matchLabels content and replace matchExpressions content. In 1.21, patch requests touching the selector field started replacing the entire selector. This is consistent with server-side apply and the v1 PodDisruptionBudget behavior, but should not have been changed for v1beta1. (#108141, @liggitt) [SIG Auth and Testing]
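
    The difference between the two selector behaviours described above can be illustrated with a toy model. This is only a sketch of the semantics as stated in the note, not the API server's strategic merge patch implementation; both function names are hypothetical, and details such as null-deletion and patch directives are ignored.

```python
def merge_selector(current, patch):
    """Model the v1beta1 behaviour described in the note.

    matchLabels entries are merged key by key, while matchExpressions
    is replaced wholesale by the patch content.
    """
    result = dict(current)
    if "matchLabels" in patch:
        merged = dict(current.get("matchLabels", {}))
        merged.update(patch["matchLabels"])
        result["matchLabels"] = merged
    if "matchExpressions" in patch:
        result["matchExpressions"] = list(patch["matchExpressions"])
    return result


def replace_selector(current, patch):
    """Model the v1 / server-side apply behaviour: replace the whole selector."""
    return dict(patch)


if __name__ == "__main__":
    current = {"matchLabels": {"app": "web", "tier": "frontend"}}
    patch = {"matchLabels": {"tier": "backend"}}
    print(merge_selector(current, patch))
    print(replace_selector(current, patch))
```

    With the merging behaviour, the app label survives the patch; with full replacement, only the labels named in the patch remain.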

    Feature

    • Kubernetes is now built with Golang 1.16.15 (#108564, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    Bug or Regression

    • Bump sigs.k8s.io/apiserver-network-proxy/konnectivity-client to v0.0.30, fixing goroutine leaks in kube-apiserver. (#108439, @andrewsykim) [SIG API Machinery, Auth and Cloud Provider]
    • Fix static pod restarts in cases where the container is not present. (#108189, @rphillips) [SIG Node]
    • Fixes a bug where a partial EndpointSlice update could cause node name information to be dropped from endpoints that were not updated. (#108202, @robscott) [SIG Network]
    • Fixes a regression in the kubelet restarting static pods. (#108303, @rphillips) [SIG Node and Testing]
    • Increase Azure ACR credential provider timeout (#108209, @andyzhangx) [SIG Cloud Provider]
    • Fix Azurefile volumeid collision issue in csi migration (#107575, @andyzhangx) [SIG Cloud Provider and Storage]
    • Fix: delete non existing Azure disk issue (#107406, @andyzhangx) [SIG Cloud Provider]
    • Fix: ignore the case when comparing azure tags in service annotation (azure) (#107580, @nilo19) [SIG Cloud Provider]

    Dependencies

    Added

    Nothing has changed.

    Changed

    • sigs.k8s.io/apiserver-network-proxy/konnectivity-client: v0.0.27 → v0.0.30
    • k8s.io/utils: bdf08cb → 6203023

    Removed

    Nothing has changed.

    containerlinux 3033.2.4

    New Stable Release 3033.2.4

    Changes since Stable-3033.2.3

    Security fixes

    Bug fixes

    • Reverted the Linux kernel commit which broke networking on AWS instances which use Intel 82559 NIC (c4/m4) (Flatcar#665, coreos-overlay#1720)
    • Disabled the systemd-networkd settings ManageForeignRoutes and ManageForeignRoutingPolicyRules by default to ensure that CNIs like Cilium don’t get their routes or routing policy rules discarded on network reconfiguration events (Flatcar#620).
    • Prevented hitting races when creating filesystems in Ignition, these races caused boot failures like fsck[1343]: Failed to stat /dev/disk/by-label/ROOT: No such file or directory when creating a btrfs root filesystem (ignition#35)
    • Reverted the Linux kernel change to forbid xfrm id 0 for IPSec state because it broke Cilium (Flatcar#626, coreos-overlay#1682)

    Changes

    • Added support for switching back to CGroupsV1 without requiring a reboot. Create /etc/flatcar-cgroupv1 through ignition. (coreos-overlay#1666)
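
    A minimal sketch of an Ignition config that creates the flag file mentioned above. It assumes the Ignition spec 2.x file layout and uses "2.3.0" as the version string; both are assumptions to verify against the Ignition version shipped with your Flatcar release. The function name is hypothetical.

```python
import json


def cgroupv1_ignition_config(spec_version="2.3.0"):
    """Return an Ignition config (as a dict) that creates an empty flag file.

    The layout follows Ignition spec 2.x, which is an assumption of this
    sketch. Only the file path comes from the release note.
    """
    return {
        "ignition": {"version": spec_version},
        "storage": {
            "files": [
                {
                    "filesystem": "root",
                    "path": "/etc/flatcar-cgroupv1",
                    "mode": 0o644,  # serialised as decimal 420 in JSON
                    "contents": {"source": "data:,"},  # empty file
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(cgroupv1_ignition_config(), indent=2))
```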

    Updates

    kiam 2.3.0

    Added

    • Add VerticalPodAutoscaler CR.

    kiam-watchdog 0.6.0

    Added

    • Add VerticalPodAutoscaler CR.

    aws-ebs-csi-driver 2.9.0

    Added

    • Add VerticalPodAutoscaler CR.

    chart-operator 2.20.1

    Changed

    • Use apptestctl to install CRDs in integration tests to avoid hitting GitHub rate limits.

    Fixed

    • Fix status resource to use Helm release status if it exists.

    cert-operator 2.0.0

    Changed

    • Use v1beta1 CAPI CRDs.
    • Bump giantswarm/apiextensions to v6.0.0.
    • Bump giantswarm/exporterkit to v1.0.0.
    • Bump giantswarm/microendpoint to v1.0.0.
    • Bump giantswarm/microerror to v0.4.0.
    • Bump giantswarm/microkit to v1.0.0.
    • Bump giantswarm/micrologger to v0.6.0.
    • Bump giantswarm/k8sclient to v7.0.1.
    • Bump giantswarm/operatorkit to v7.0.1.
    • Bump k8s dependencies to v0.22.2.
    • Bump controller-runtime to v0.10.3.
    • Use apptestctl to install CRDs in integration tests to avoid hitting GitHub rate limits.
  • This release introduces IAM roles for service accounts (IRSA) as an alternative to Kiam. More details are available in the documentation.

    Warning: IAM roles for service accounts requires the following additional permissions to be granted:

    • iam:CreateOpenIDConnectProvider
    • iam:DeleteOpenIDConnectProvider
    • iam:TagOpenIDConnectProvider
    • iam:UntagOpenIDConnectProvider
    • s3:PutObjectAcl
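
    For illustration, the permissions listed above can be expressed as an IAM policy document. This is a sketch only: the "*" resource scope and the function name are assumptions, and the authoritative policies are the ones in the giantswarm-aws-account-prerequisites repository.

```python
import json

# Actions copied from the list above.
IRSA_ACTIONS = [
    "iam:CreateOpenIDConnectProvider",
    "iam:DeleteOpenIDConnectProvider",
    "iam:TagOpenIDConnectProvider",
    "iam:UntagOpenIDConnectProvider",
    "s3:PutObjectAcl",
]


def irsa_policy_document(resource="*"):
    """Build an IAM policy document granting the additional IRSA permissions.

    The resource scope defaults to "*", which is an assumption of this
    sketch and is likely broader than what a real setup should use.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(IRSA_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(irsa_policy_document(), indent=2))
```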

    All the AWS prerequisites are available in the giantswarm-aws-account-prerequisites repository.

    Note when upgrading from v16 to v17: Existing Vertical Pod Autoscaler app installations need to be removed from the workload cluster prior to upgrading to v17 because the Vertical Pod Autoscaler is provided as a default application. The two applications have different names, which leads to them fighting each other.

    Change details

    cluster-operator 3.14.1

    Added

    • Add IAM Roles for Service Accounts feature support for AWS.

    Changed

    • Update aws-pod-identity-webhook app version.

    aws-operator 10.18.0

    Added

    • Add support for IAM Roles for Service Accounts feature.

    net-exporter 1.11.0

    Added

    • Add networkpolicy to allow egress towards k8s-dns-node-cache-app endpoints.

    kiam 2.2.0

    Changed

    • Updated whiteListRouteRegexp to default to /latest/meta-data/placement/availability-zone

    Fixed

    • Merged two release workflows into one to handle both tags

    Added

    • Build script to generate an IRSA compatible version of each release
  • This release downgrades the version of the Flatcar AMI from 3033.2.2 to 3033.2.0 due to a bug in versions 3033.2.1 through 3033.2.3 that prevents successful boot on some EC2 instance type families (notably the m4 instance types).

    Note when upgrading from v16 to v17: Existing Vertical Pod Autoscaler app installations need to be removed from the workload cluster prior to upgrading to v17 because the Vertical Pod Autoscaler is provided as a default application. The two applications have different names, which leads to them fighting each other.

    Change details

    containerlinux 3033.2.0

  • This release allows one replica of coredns to run on the control plane nodes for clusters without any node pools.

    Change details

    kubernetes 1.21.9

    Feature

    • Kube-apiserver: when merging lists, Server Side Apply now prefers the order of the submitted request instead of the existing persisted object (#107569, @jiahuif) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Storage and Testing]

    Bug or Regression

    • An inefficient lock in EndpointSlice controller metrics cache has been reworked. Network programming latency may be significantly reduced in certain scenarios, especially in clusters with a large number of Services. (#107169, @robscott) [SIG Apps and Network]
    • Client-go: fix that paged list calls with ResourceVersionMatch set would fail once paging kicked in. (#107336, @fasaxc) [SIG API Machinery]
    • Fix a panic when using invalid output format in kubectl create secret command (#107345, @rikatz) [SIG CLI]
    • Fixes a rare race condition handling requests that timeout (#107460, @liggitt) [SIG API Machinery]
    • Mount-utils: Detect potential stale file handle (#107040, @andyzhangx) [SIG Storage]

    Other (Cleanup or Flake)

    • Updates konnectivity-network-proxy to v0.0.27. This includes a memory leak fix for the network proxy (#107188, @rata) [SIG API Machinery and Cloud Provider]

    Dependencies

    Added

    Nothing has changed.

    Changed

    • github.com/google/cadvisor: v0.39.0 → v0.39.3
    • sigs.k8s.io/apiserver-network-proxy/konnectivity-client: v0.0.22 → v0.0.27
    • sigs.k8s.io/structured-merge-diff/v4: v4.1.2 → v4.2.1

    Removed

    Nothing has changed.

    coredns 1.8.0

    Changed

    • Add deployment to run one replica of coredns in master nodes (for clusters with no node pools).
  • This release allows one replica of coredns to run on the control plane nodes for clusters without any node pools.

    Warning: Kubernetes v1.22 removed certain APIs and features. More details are available in the upstream blog post.

    Known Issues

    • Java applications are unable to identify memory limits when using a JRE prior to v15 in a Control Groups v2 environment. Support was added in JRE v15 and later. More details are available in the upstream issue. We recommend using the latest LTS JRE available (currently v17) to ensure continued compatibility with future releases.

    Control Groups v1

    To ensure a smooth transition, in case you need time to modify applications to make them compatible with Control Groups v2, we provide a mechanism that will allow using Control Groups v1 on specific node pools. More details are available in our documentation.

    Note when upgrading from v16 to v17: Existing Vertical Pod Autoscaler app installations need to be removed from the workload cluster prior to upgrading to v17 because the Vertical Pod Autoscaler is provided as a default application. The two applications have different names, which leads to them fighting each other.

    Change details

    coredns 1.8.0

    Changed

    • Add deployment to run one replica of coredns in master nodes (for clusters with no node pools).
  • This release provides a fix for aws-ebs-csi-driver to ensure that ebs-node tolerates all taints and that the correct node selector is used on all nodes.

    Change details

    aws-ebs-csi-driver 2.8.1

    Fixed

    • Use node selector according to control-plane and nodepool labels.
  • This release provides support for Kubernetes 1.22, has Control Groups v2 enabled by default and includes the Vertical Pod Autoscaler.

    Highlights

    • Kubernetes 1.22 support;
    • Control Groups v2 are enabled by default;
    • The Vertical Pod Autoscaler is included by default to help size pods for optimal CPU and memory usage;
    • rpcbind is disabled by default to mitigate security risks. NFS v2 and v3 are not supported anymore;
    • ebs-csi-node tolerates all custom taints;
    • Security fixes:
      • 44 Linux CVEs;
      • 10 expat CVEs;
      • 8 Go CVEs;
      • 5 glibc CVEs;
      • 4 Docker CVEs;
      • 3 curl CVEs;
      • 3 vim CVEs;
      • 2 polkit CVEs;
      • 2 bash CVEs;
      • 2 binutils CVEs;
      • 3 containerd CVEs;
      • 2 nettle CVEs;
      • 2 SDK: bison CVEs;
      • 1 ca-certificates CVE;
      • 1 util-linux CVE;
      • 1 git CVE;
      • 1 gnupg CVE;
      • 1 libgcrypt CVE;
      • 1 sssd CVE;
      • 1 SDK: perl CVE.

    Warning: Kubernetes v1.22 removed certain APIs and features. More details are available in the upstream blog post.

    Warning: rpcbind is disabled by default to mitigate security risks. Any application which requires it will no longer work. NFS v2 and v3 are such applications and are no longer supported. Please check whether you have any applications that depend on rpcbind before you upgrade.

    Known Issues

    • Java applications are unable to identify memory limits when using a JRE prior to v15 in a Control Groups v2 environment. Support was added in JRE v15 and later. More details are available in the upstream issue. We recommend using the latest LTS JRE available (currently v17) to ensure continued compatibility with future releases.
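
    To find out whether a workload is affected, it can help to check which cgroup hierarchy it actually sees. The sketch below relies on the fact that the root of a cgroup v2 (unified) hierarchy contains a cgroup.controllers file; the function name is hypothetical, and the default mount point /sys/fs/cgroup is an assumption that holds on most Linux systems.

```python
from pathlib import Path


def uses_cgroup_v2(root="/sys/fs/cgroup"):
    """Return True when a cgroup v2 (unified) hierarchy is mounted at root.

    The cgroup.controllers file exists at the root of a v2 hierarchy
    and is absent on a pure v1 setup.
    """
    return (Path(root) / "cgroup.controllers").is_file()


if __name__ == "__main__":
    print("cgroup v2:", uses_cgroup_v2())
```

    Running this inside a container shows what the application sees, which is what matters for the JRE memory limit detection.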

    Control Groups v1

    To ensure a smooth transition, in case you need time to modify applications to make them compatible with Control Groups v2, we provide a mechanism that will allow using Control Groups v1 on specific node pools. More details are available in our documentation.

    Note when upgrading from v16 to v17: Existing Vertical Pod Autoscaler app installations need to be removed from the workload cluster prior to upgrading to v17 because the Vertical Pod Autoscaler is provided as a default application. The two applications have different names, which leads to them fighting each other.

    Change details

    kubernetes 1.22.6

    What’s New (Major Themes)

    Removal of several beta Kubernetes APIs

    A number of APIs are no longer serving specific Beta versions in favour of the GA version of those APIs. All existing objects can be interacted with via general availability APIs. This removal includes beta versions of ValidatingWebhookConfiguration, MutatingWebhookConfiguration, CustomResourceDefinition, APIService, TokenReview, SubjectAccessReview, CertificateSigningRequest, Lease, Ingress, and IngressClass APIs. For the full list check out Deprecated API Migration Guide and the blog post Kubernetes API and Feature Removals In 1.22: Here’s What You Need To Know.
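
    As a rough aid for preparing an upgrade, the sketch below scans already-parsed manifests for the beta API versions named above and reports the GA replacement. The mapping covers only the kinds listed in this paragraph and reflects the upstream Deprecated API Migration Guide as best recalled here, so it should be verified against that guide; the function and variable names are hypothetical.

```python
# (apiVersion, kind) -> GA apiVersion. Covers only the kinds named above.
REMOVED_IN_1_22 = {
    ("admissionregistration.k8s.io/v1beta1", "ValidatingWebhookConfiguration"): "admissionregistration.k8s.io/v1",
    ("admissionregistration.k8s.io/v1beta1", "MutatingWebhookConfiguration"): "admissionregistration.k8s.io/v1",
    ("apiextensions.k8s.io/v1beta1", "CustomResourceDefinition"): "apiextensions.k8s.io/v1",
    ("apiregistration.k8s.io/v1beta1", "APIService"): "apiregistration.k8s.io/v1",
    ("authentication.k8s.io/v1beta1", "TokenReview"): "authentication.k8s.io/v1",
    ("authorization.k8s.io/v1beta1", "SubjectAccessReview"): "authorization.k8s.io/v1",
    ("certificates.k8s.io/v1beta1", "CertificateSigningRequest"): "certificates.k8s.io/v1",
    ("coordination.k8s.io/v1beta1", "Lease"): "coordination.k8s.io/v1",
    ("extensions/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "IngressClass"): "networking.k8s.io/v1",
}


def find_removed_apis(manifests):
    """Return (name, old apiVersion, GA apiVersion) for each affected manifest.

    `manifests` is an iterable of dicts, for example loaded from YAML files.
    """
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        if key in REMOVED_IN_1_22:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append((name, key[0], REMOVED_IN_1_22[key]))
    return findings


if __name__ == "__main__":
    sample = [
        {"apiVersion": "extensions/v1beta1", "kind": "Ingress", "metadata": {"name": "web"}},
        {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "api"}},
    ]
    for name, old, new in find_removed_apis(sample):
        print(f"{name}: {old} is no longer served, use {new}")
```

    Changing the apiVersion is not always enough, since some GA versions also changed the object schema; the migration guide describes the required field changes.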

    Kubernetes release cadence change

    We all have to adapt to change in our lives, and especially so in the past year. The Kubernetes release team was also affected by the COVID-19 pandemic and has listened to its user base regarding the number of releases in a calendar year. On April 23, 2021 it was made official that the Kubernetes release cadence is reduced from 4 releases per year to 3 releases per year.

    You can read more in the official blog post Kubernetes Release Cadence Change: Here’s What You Need To Know.

    External credential providers

    Kubernetes client credential plugins have been in beta since 1.11, a few eons ago. With the release of Kubernetes 1.22, this feature set graduates to stable. The GA feature set includes improved support for plugins that provide interactive login flows. This release also contains a number of bug fixes to the feature set. Aspiring plugin authors can look at sample-exec-plugin as a way to get started.

    Related to this topic, the in-tree Azure and GCP authentication plugins have been deprecated in favor of out-of-tree implementations.

    Server-side Apply graduates to GA

    Server-side Apply is a new object merge algorithm, as well as tracking of field ownership, running on the Kubernetes API server. Server-side Apply helps users and controllers manage their resources via declarative configurations. It allows them to create and/or modify their objects declaratively, simply by sending their fully specified intent. After being in beta for a couple releases, Server-side Apply is now generally available.

    Cluster Storage Interface graduations

    CSI support for Windows nodes moves to GA in the 1.22 release. In Kubernetes v1.22, Windows privileged containers are only an alpha feature. To allow using CSI storage on Windows nodes, CSIProxy enables CSI node plugins to be deployed as unprivileged pods, using the proxy to perform privileged storage operations on the node.

    Another feature moving to GA in v1.22 is CSI Service Account Token support. This feature allows CSI drivers to use pods’ bound service account tokens instead of a more privileged identity. It also provides control over re-publishing these volumes, so that short-lived tokens can be refreshed.

    SIG Windows development tools

    To grow the developer community, SIG Windows released multiple tools. The new tools support multiple CNI providers (Antrea, Calico) and can run on multiple platforms (any vagrant compatible provider, such as Hyper-V, VirtualBox, or vSphere). There is also a new way to run bleeding edge Windows features from scratch by compiling the Windows kubelet and kube-proxy, then using them along with daily builds of other Kubernetes components.

    Deploy a more secure control plane with kubeadm

    A new alpha feature allows running the kubeadm control plane components as non-root users. This is a long requested security measure in kubeadm. To try it you must enable the kubeadm-specific RootlessControlPlane feature gate. When you deploy a cluster using this alpha feature, your control plane runs with lower privileges.

    kubeadm also gains a new v1beta3 configuration API. It iterates over v1beta2 by adding some long-requested features and deprecating some existing ones. v1beta3 is now the preferred API version; the v1beta2 API also remains available and is not yet deprecated.

    etcd moves to version 3.5.0

    Kubernetes’ default backend storage, etcd, has a new release 3.5.0 and the community embraced it. The new release comes with improvements to security, performance, monitoring and developer experience. There are numerous bug fixes, including for lease objects causing memory leaks and compact operations causing deadlocks. A couple of new features are also introduced, such as the migration to structured logging and built-in log rotation. The release comes with a detailed future roadmap to implement a solution to traffic overload. A full and detailed list of changes can be read in the 3.5.0 release announcement.

    Kubernetes Node system swap support

    Every system administrator or Kubernetes user has been in the same boat regarding setting up and using Kubernetes: disable swap space. With the release of Kubernetes 1.22, alpha support is available to run nodes with swap memory. This change lets administrators opt in to configuring swap on Linux nodes, treating a portion of block storage as additional virtual memory.

    Cluster-wide seccomp defaults

    A new alpha feature gate SeccompDefault has been added to the kubelet, together with a corresponding command line flag --seccomp-default and kubelet configuration. If both are enabled, then the kubelet’s behavior changes for pods that don’t explicitly set a seccomp profile. With cluster-wide seccomp defaults, the kubelet uses the RuntimeDefault seccomp profile by default, rather than Unconfined. This enhances the default cluster-wide workload security of the Kubernetes deployment. Security administrators will now sleep better knowing there is some security by default for the workloads.
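
    The decision described above can be summarised in a few lines. This is a simplified model of the paragraph, not kubelet code; the function and parameter names are hypothetical.

```python
def effective_seccomp_profile(pod_profile=None, feature_gate=False, flag=False):
    """Return the seccomp profile type that applies to a pod.

    pod_profile  -- profile type set explicitly on the pod, or None
    feature_gate -- whether the SeccompDefault feature gate is enabled
    flag         -- whether --seccomp-default (or the matching kubelet
                    configuration) is enabled
    """
    if pod_profile is not None:
        # An explicit setting on the pod always wins.
        return pod_profile
    if feature_gate and flag:
        return "RuntimeDefault"
    return "Unconfined"


if __name__ == "__main__":
    print(effective_seccomp_profile())
    print(effective_seccomp_profile(feature_gate=True, flag=True))
```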

    To learn more about the feature, please refer to the official seccomp tutorial.

    Quality of Service for memory resources

    Originally, Kubernetes used the v1 cgroups API. With that design, the QoS class for a pod only applied to CPU resources (such as cpu_shares). The Kubernetes cgroup manager uses memory.limit_in_bytes in v1 cgroups to limit the memory capacity for a container, and uses oom_scores to recommend an order for killing container processes if an out-of-memory event occurs. This implementation has shortcomings: for Guaranteed pods, memory cannot be fully reserved, and the page cache is at risk of being recycled. For Burstable pods, overcommitting memory (setting request less than limit) could increase the risk of a container being killed when the Linux kernel detects an out-of-memory condition.

    As an alpha feature, Kubernetes v1.22 can use the cgroups v2 API to control memory allocation and isolation. This feature is designed to improve workload and node availability when there is contention for memory resources.

    API changes and improvements for ephemeral containers

    The API used to create Ephemeral Containers changed in 1.22. The Ephemeral Containers feature is alpha and disabled by default, and the new API does not work with clients that attempt to use the old API.

    For stable features, the kubectl tool follows the Kubernetes version skew policy; however, kubectl v1.21 and older do not support the new API for ephemeral containers. Users who create ephemeral containers using kubectl debug should note that kubectl version 1.22 will attempt to fall back to the old API; older versions of kubectl will not work with cluster versions of 1.22 or later. Please update kubectl to 1.22 if you wish to use kubectl debug with a mix of cluster versions.

    Known Issues

    CPU and Memory manager are not working correctly for Guaranteed Pods with multiple containers

    A regression bug was found where Guaranteed Pods with multiple containers do not work properly with set allocations for CPU, Memory, and Device manager. The fix will be available in upcoming releases.

    CSIMigrationvSphere feature gate has not migrated to new CRD APIs

    If the CSIMigrationvSphere feature gate is enabled, users should not upgrade to Kubernetes v1.22. The vSphere CSI Driver does not support Kubernetes v1.22 yet because it uses v1beta1 CRD APIs. Support for v1.22 will be added in a later release. Check the following document for supported Kubernetes releases for a given vSphere CSI Driver version.

    Urgent Upgrade Notes

    (No, really, you MUST read this before you upgrade)
    • Audit log files are now created with a mode of 0600. Existing file permissions will not be changed. If you need the audit file to be readable by a non-root user, you can pre-create the file with the desired permissions. (#95387, @JAORMX) [SIG API Machinery and Auth]
    • CSI migration of AWS EBS volumes requires AWS EBS CSI driver ver. 1.0 that supports allowAutoIOPSPerGBIncrease parameter in StorageClass. (#101082, @jsafrane)
    • Conformance image is now built with Distroless. Users running Conformance testing should rely on container entrypoint instead of manual invocation to /run_e2e.sh or /gorunner, as they are now deprecated and will be removed in 1.25 release. Invoking ginkgo and e2e.test are still supported through overriding entrypoint (docker) or defining container spec.command (kubernetes). (#99178, @wilsonehusin)
    • Default StreamingProxyRedirects to disabled. If there is a >= 2 version skew between master and nodes, and the old nodes were enabling --redirect-container-streaming, this will break them. In this case, the StreamingProxyRedirects can still be manually enabled. (#101647, @pacoxu)
    • Intree volume plugin scaleIO support has been completely removed from Kubernetes. (#101685, @Jiawei0227)
    • Kubeadm: remove the automatic detection and matching of cgroup drivers for Docker. For new clusters if you have not configured the cgroup driver explicitly you might get a failure in the kubelet on driver mismatch (kubeadm clusters should be using the systemd driver). Also remove the IsDockerSystemdCheck preflight check (warning) that checks if the Docker cgroup driver is set to systemd. Ideally such detection / coordination should be on the side of CRI implementers and the kubelet (tracked here). Please see the page on how to configure cgroup drivers with kubeadm manually (#99647, @neolit123)
    • Kubeadm: the flag --cri-socket is no longer allowed in a mixture with the flag --config. Please use the kubeadm configuration for setting the CRI socket for a node using {Init|Join}Configuration.nodeRegistration.criSocket. (#101600, @KofClubs)
    • Newly provisioned PVs by Azure disk will no longer have the beta FailureDomain label. Azure disk volume plugin will start to have GA topology label instead. (#101534, @kassarl)
    • Scheduler’s CycleState now embeds internal read/write locking inside its Read() and Write() functions. Meanwhile, the Lock() and Unlock() functions are removed. Scheduler plugin developers are now required to remove calls to CycleState#Lock() and CycleState#Unlock() and simply use Read() and Write(), as they are natively thread-safe now. (#101542, @Huang-Wei)
    • The CSIMigrationVSphereComplete feature flag is removed. InTreePluginvSphereUnregister will be the way moving forward. (#101272, @Jiawei0227)
    • The flag --experimental-patches is now deprecated and will be removed in a future release. You can migrate to using the new flag --patches. Add a new field {Init|Join}Configuration.patches.directory that can be used for the same purpose. For init and join it is now recommended that you migrate to configure patches via {Init|Join}Configuration.patches.directory. For the time being, these flags can be mixed with --config, but that might change in the future. On a command line, the last *patches flag takes precedence over previous flags and the value in config. kubeadm upgrade --patches will continue to be the only available option, since upgrade does not support a configuration file yet. (#103063, @neolit123)
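
    The audit log note above suggests pre-creating the file when a non-root user needs to read it. A minimal sketch of that step, assuming a Unix host; the function name, the 0640 mode and the file path are illustrative choices, not values from the release notes:

```python
import os
import stat
import tempfile


def precreate_audit_log(path: str, mode: int = 0o640) -> int:
    """Create an empty audit log file with the given permission bits.

    Returns the permission bits the file ends up with. An existing file is
    left untouched, mirroring the note that existing permissions are kept.
    """
    try:
        # O_EXCL makes creation fail instead of touching an existing file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, mode)
    except FileExistsError:
        return stat.S_IMODE(os.stat(path).st_mode)
    try:
        # The process umask may have cleared bits at creation time,
        # so set the mode explicitly on the open descriptor.
        os.fchmod(fd, mode)
    finally:
        os.close(fd)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "audit.log")  # hypothetical location
        print(oct(precreate_audit_log(log_path)))
```

    Group-readable permissions only help if the file's group includes the reading user, so a real setup would likely also adjust ownership.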

    Important Security Information

    This release contains changes that address the following vulnerabilities:

    A security issue was discovered in Kubernetes where a user may be able to create a container with subpath volume mounts to access files & directories outside of the volume, including on the host filesystem.

    Affected Versions:

    • kubelet v1.22.0 - v1.22.1
    • kubelet v1.21.0 - v1.21.4
    • kubelet v1.20.0 - v1.20.10
    • kubelet <= v1.19.14

    Fixed Versions:

    • kubelet v1.22.2
    • kubelet v1.21.5
    • kubelet v1.20.11
    • kubelet v1.19.15

    This vulnerability was reported by Fabricio Voznika and Mark Wolters of Google.

    CVSS Rating: High (8.8) CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
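
    The affected and fixed version lists above can be turned into a quick check. The sketch below is illustrative only: the function names are made up, pre-release version strings are rejected, and minors newer than 1.22 are assumed to ship with the fix.

```python
import re

# First fixed patch release per minor version, taken from the lists above.
FIXED_PATCH = {22: 2, 21: 5, 20: 11, 19: 15}


def parse_version(version: str) -> tuple:
    """Parse 'v1.22.1' or '1.22.1' into (major, minor, patch)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if this kubelet version falls in an affected range."""
    major, minor, patch = parse_version(version)
    if major != 1:
        raise ValueError("only 1.x kubelet versions are covered")
    if minor > 22:
        return False  # assumption: later minors already contain the fix
    if minor < 19:
        return True  # covered by "kubelet <= v1.19.14"
    return patch < FIXED_PATCH[minor]


if __name__ == "__main__":
    for v in ("v1.22.1", "v1.22.2", "v1.19.14"):
        print(v, "affected" if is_affected(v) else "fixed")
```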

    Deprecation

    • Controller-manager: the following flags have no effect and would be removed in v1.24:

      • --port
      • --address

      The insecure port flag --port may only be set to 0 now.

      In addition, please be careful that:

      • controller-manager MUST start with --authorization-kubeconfig and --authentication-kubeconfig correctly set to get authentication/authorization working.
      • liveness/readiness probes to controller-manager MUST use HTTPS now, and the default port has been changed to 10257.
      • Applications that fetch metrics from controller-manager should use a dedicated service account which is allowed to access nonResourceURLs /metrics. (#96216, @knight42) [SIG API Machinery, Cloud Provider, Instrumentation and Testing]
    • Deprecate --record flag in kubectl. The --record flag is being replaced with the mechanism which annotates HTTP requests with kubectl command details. (#102873, @soltysh)

    • E2e.test: removed the --viper-config flag. If you were previously using this to pass flags to e2e.test via a file, you will need to pass them directly on the command line, e.g. e2e.test --e2e-output-dir. (#102598, @dims)

    • For kubeadm: remove the ClusterStatus API from v1beta3 and its management in the kube-system/kubeadm-config ConfigMap. This method of keeping track of what API endpoints exist in the cluster was replaced (in a prior release) by a method to annotate the etcd Pods that kubeadm creates in “stacked etcd” clusters. The following CLI sub-phases are deprecated and are now a NO-OP: for kubeadm join: “control-plane-join/update-status”, for kubeadm reset: “update-cluster-status”. Unless you are using these phases explicitly, you should not be affected. (#101915, @neolit123)

    • Kubeadm: remove the deprecated --csr-only and --csr-dir flags from kubeadm init phase certs. Deprecate the same flags under kubeadm certs renew. In both the cases the command kubeadm certs generate-csr should be used instead. (#102108, @neolit123)

    • Kubeadm: Remove the deprecated command kubeadm alpha kubeconfig. Please use kubeadm kubeconfig instead. (#101938, @knight42)

    • Kubeadm: Remove the deprecated hyperkube image support in v1beta3. This implies removal of ClusterConfiguration.UseHyperKubeImage. (#101537, @neolit123)

    • Kubeadm: Remove the field ClusterConfiguration.DNS.Type in v1beta3 since CoreDNS is the only supported DNS type. (#101547, @neolit123)

    • Kubeadm: remove the deprecated command kubeadm config view. A replacement for this command is kubectl get cm -n kube-system kubeadm-config -o=jsonpath="{.data.ClusterConfiguration}" (#102071, @neolit123)

    • Kubeadm: remove the deprecated flag --image-pull-timeout for the kubeadm upgrade apply command. (#102093, @SataQiu) [SIG Cluster Lifecycle]

    • Kubeadm: remove the deprecated flag --insecure-port from the kube-apiserver manifest that kubeadm manages. The flag had no effect since 1.20, since the insecure serving of the component was disabled in the same version. (#102121, @pacoxu)

    • Kubeadm: remove the deprecated kubeadm API v1beta1. Introduce a new kubeadm API v1beta3. See kubeadm/v1beta3 for a list of changes since v1beta2. Note that v1beta2 is not yet deprecated, but will be in a future release. (#101129, @neolit123)

    • Newly provisioned PVs by vSphere in-tree plugin will no longer have the beta FailureDomain label. vSphere volume plugin will start to have GA topology label (#102414, @divyenpatel)

    • Removal of the CSI NodePublish path by the kubelet is deprecated. This must be done by the CSI plugin according to the CSI spec. (#101441, @dobsonj)

    • Remove support for the Service topologyKeys field (alpha) and the kube-proxy implementation of it. This field was deprecated several cycles ago. This functionality is replaced by the combination of automatic topology hints per-endpoint (alpha) and the Service internalTrafficPolicy field (alpha). (#102412, @andrewsykim)

    • The PodUnknown phase is now deprecated. (#95286, @SergeyKanzhelev)

    • The storageos, quobyte and flocker storage volume plugins are deprecated and will be removed in a later release. (#101773, @Jiawei0227)

    • The deprecated flag --hard-pod-affinity-symmetric-weight and --scheduler-name have been removed from kube-scheduler. Use ComponentConfig instead to configure those parameters. (#102805, @ahg-g)

    • The feature Dynamic Kubelet Configuration is deprecated and kubelet will report warning when the flag --dynamic-config-dir is used. Feature gate DynamicKubeletConfig is disabled out of the box and needs to be explicitly enabled. (#102966, @SergeyKanzhelev) [SIG Cloud Provider, Instrumentation and Node]

    • The in-tree azure and gcp auth plugins have been deprecated. The https://github.com/Azure/kubelogin and gcloud commands serve as out-of-tree replacements via the kubectl/client-go credential plugin mechanism. (#102181, @enj) [SIG API Machinery and Auth]

    • The ingress v1beta1 has been deprecated. (#102030, @aojea)
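
    The controller-manager item at the top of this list requires liveness and readiness probes to use HTTPS on the new default port 10257. A rough sketch of updating an existing httpGet probe definition follows; the helper name is hypothetical, and the old insecure port 10252 is an assumption that does not come from the notes above.

```python
import copy

SECURE_PORT = 10257    # default secure port named in the note above
INSECURE_PORT = 10252  # assumption: the previous insecure default port


def migrate_probe(probe: dict) -> dict:
    """Return a copy of an httpGet probe that uses HTTPS on the secure port."""
    updated = copy.deepcopy(probe)
    http_get = updated.get("httpGet")
    if http_get is None:
        return updated  # not an HTTP probe; nothing to change
    http_get["scheme"] = "HTTPS"
    if http_get.get("port") == INSECURE_PORT:
        http_get["port"] = SECURE_PORT
    return updated


if __name__ == "__main__":
    old_probe = {
        "httpGet": {"host": "127.0.0.1", "path": "/healthz",
                    "port": 10252, "scheme": "HTTP"},
        "initialDelaySeconds": 15,
    }
    print(migrate_probe(old_probe))
```

    The input is copied rather than modified in place, so the original manifest data stays available for comparison.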

    API Change

    • A new score extension for the NodeResourcesFit plugin merges the functionality of the NodeResourcesLeastAllocated, NodeResourcesMostAllocated and RequestedToCapacityRatio plugins, which are marked as deprecated as of v1beta2. In v1beta1, the three plugins can still be used, but not at the same time as the score extension of NodeResourcesFit. (#101822, @yuzhiquan)

    • A value of Auto is now valid for the service.kubernetes.io/topology-aware-hints annotation. (#100728, @robscott)

    • Add DataSourceRef alpha field to PVC spec, which allows contents other than PVCs and VolumeSnapshots to be data sources. (#103276, @bswartz)

    • Add PersistentVolumeClaimDeletePolicy to StatefulSet API. (#99378, @mattcary)

    • Add a new Priority and Fairness rule that exempts all probes (/readyz, /healthz, /livez) to prevent restarting of healthy kube-apiserver instance by kubelet. (#100678, @tkashem)

    • Add alpha support for HostProcess containers on Windows (#99576, @marosset) [SIG API Machinery, Apps, Node, Testing and Windows]

    • Add distributed tracing to the kube-apiserver. It can be enabled with the feature gate APIServerTracing. (#94942, @dashpole)

    • Add three metrics to the job controller to monitor if a job works in healthy condition. IndexedJob has been promoted to Beta. (#101292, @AliceZhang2016)

    • Added field .status.uncountedTerminatedPods to the Job resource. This field is used by the job controller to keep track of finished pods before adding them to the Job status counters. Pods created by the job controller get the finalizer batch.kubernetes.io/job-tracking. Jobs that are tracked using this mechanism get the annotation batch.kubernetes.io/job-tracking. This is a temporary measure. Two releases after this feature graduates to beta, the annotation won’t be added to Jobs anymore. (#98817, @alculquicondor)

    • Added new kubelet alpha feature SeccompDefault. This feature enables falling back to the RuntimeDefault (former runtime/default) seccomp profile if nothing else is specified in the pod/container SecurityContext or the pod annotation level. To use the feature, enable the feature gate as well as set the kubelet configuration option SeccompDefault (--seccomp-default) to true. (#101943, @saschagrunert) [SIG Node]

    • Adds the ReadWriteOncePod access mode for PersistentVolumes and PersistentVolumeClaims. Restricts volume access to a single pod on a single node. (#102028, @chrishenzie)

    • Alpha swap support can now be enabled on Kubernetes nodes with the NodeSwapEnabled feature flag. See KEP-2400 for details. (#102823, @ehashman)

    • Because of the implementation logic of time.Format in golang, the displayed time zone is not consistent. (#102366, @cndoit18)

    • Corrected the documentation for escaping dollar signs in a container’s env, command and args property. (#101916, @MartinKanters) [SIG Apps]

    • Enable MaxSurge for DaemonSet by default. (#101742, @ravisantoshgudimetla)

    • Enforce the ReadWriteOncePod PVC access mode during scheduling (#103082, @chrishenzie)

    • Ephemeral containers are now allowed to configure a securityContext that differs from that of the Pod. Cluster administrators should ensure that security policy controllers support EphemeralContainers before enabling this feature in clusters. (#99023, @verb)

    • Exec plugin authors can override default handling of standard input via new interactiveMode kubeconfig field. (#99310, @ankeesler)

    • If someone had the ProbeTerminationGracePeriod alpha feature enabled in 1.21, they should update/delete any workloads/pods with probe terminationGracePeriods < 1 before upgrading (#103245, @wzshiming)

    • Improved parsing of label selectors (#102188, @alculquicondor) [SIG API Machinery]

    • Introduce minReadySeconds api to the StatefulSets. (#100842, @ravisantoshgudimetla)

    • Introducing Memory quality of service support with cgroups v2 (Alpha). The MemoryQoS feature is now in Alpha. This allows kubelet running with cgroups v2 to set memory QoS at container, pod and QoS level to protect and guarantee better memory quality. This feature can be enabled through the feature gate MemoryQoS. (#102970, @borgerli)

    • Kube API server accepts Impersonate-Uid header to impersonate a user with a specific UID, in the same way that you can currently use Impersonate-User, Impersonate-Group and Impersonate-Extra. (#99961, @margocrawf)

    • Kube-apiserver: --service-account-issuer can be specified multiple times now, to enable non-disruptive change of issuer. (#101155, @zshihang) [SIG API Machinery, Auth, Node and Testing]

    • Kube-controller-manager: the --horizontal-pod-autoscaler-use-rest-clients flag and Heapster support in the horizontal pod autoscaler, deprecated since 1.12, are removed. (#90368, @serathius)

    • Kube-scheduler: a plugin enabled in a v1beta2 configuration file takes precedence over the default configuration for that plugin. This simplifies enabling default plugins with custom configuration without needing to explicitly disable those default plugins. (#99582, @chendave)

    • New node-high priority-level has been added to Suggested API Priority and Fairness configuration. (#101151, @mborsz)

    • NodeSwapEnabled feature flag was renamed to NodeSwap

      The flag was only available in the 1.22.0-beta.1 release, and the new flag should be used going forward. (#103553, @ehashman) [SIG Node]

    • Omit comparison with boolean constant (#101523, @chuntaochen) [SIG CLI and Cloud Provider]

    • Removed the feature flag for probe-level termination grace period from Kubelet. If a user wants to disable this feature on already created pods, they will have to delete and recreate the pods. (#103168, @raisaat) [SIG Apps and Node]

    • Revert the addition of PersistentVolumeClaimDeletePolicy to the StatefulSet API. (#103747, @mattcary)

    • Scheduler could be configured to consider new resources beside CPU and memory, GPU for example, for the score plugin of NodeResourcesBalancedAllocation. (#101946, @chendave) [SIG Scheduling]

    • Server Side Apply now treats all Selector fields as atomic (meaning the entire selector is managed by a single writer and updated together), since they contain interrelated and inseparable fields that do not merge in intuitive ways. (#97989, @Danil-Grigorev) [SIG API Machinery]

    • Suspend Job feature graduated to beta. Added the action label to Job controller sync metrics job_sync_total and job_sync_duration_seconds. (#102022, @adtac)

    • The API documentation for the DaemonSet’s spec.updateStrategy.rollingUpdate.maxUnavailable field was corrected to state that the value is rounded up. (#101296, @Miciah)

    • The CSIServiceAccountToken feature graduates to GA and is unconditionally enabled. (#103001, @zshihang)

    • The CertificateSigningRequest.certificates.k8s.io API supports an optional expirationSeconds field to allow the client to request a particular duration for the issued certificate. The default signer implementations provided by the Kubernetes controller manager will honor this field as long as it does not exceed the --cluster-signing-duration flag. (#99494, @enj)

    • The EndpointSlice Mirroring controller no longer mirrors the last-applied-configuration annotation created by kubectl to update EndpointSlices. (#102731, @sharmarajdaksh)

    • The NetworkPolicyEndPort is graduated to beta and is enabled by default. (#102834, @rikatz)

    • The PodDeletionCost feature has been promoted to beta, and enabled by default. (#101080, @ahg-g)

    • Server Side Apply treats certain structs as atomic, meaning the entire selector field is managed by a single writer and updated together. (#100684, @Jefftree)

    • The ServiceAppProtocol feature gate has been removed. It reached GA in Kubernetes (#103190, @robscott)

    • The TerminationGracePeriodSeconds on pod specs and container probes should not be negative. Negative values of TerminationGracePeriodSeconds will be treated as the value 1s on the delete path. Immutable field validation will be relaxed in order to update negative values. In a future release, negative values will not be permitted. (#98866, @wzshiming)

    • The kube-scheduler component config v1beta2 API is now available. Three scheduler plugins are deprecated (NodeLabel, ServiceAffinity, NodePreferAvoidPods). (#99597, @adtac)

    • The pod/eviction subresource now accepts policy/v1 eviction requests in addition to policy/v1beta1 eviction requests (#100724, @liggitt)

    • The podAffinity, NamespaceSelector and the associated CrossNamespaceAffinity quota scope features graduate to Beta and they are now enabled by default. (#101496, @ahg-g)

    • The pods/ephemeralcontainers API now returns and expects a Pod object instead of EphemeralContainers. This is incompatible with the previous alpha-level API. (#101034, @verb) [SIG Apps, Auth, CLI and Testing]

    • The v1.Node .status.images[].names field is now optional. (#102159, @roycaihw)

    • The deprecated flag --algorithm-provider has been removed from kube-scheduler. Use ComponentConfig instead to configure the set of enabled plugins. (#102239, @Haleygo)

    • The options --ssh-user and --ssh-key are removed. They only functioned on GCE, and only in-tree. Use the apiserver network proxy instead. (#102297, @deads2k)

    • Track Job completion through status and Pod finalizers, removing dependency on Pod tombstones. (#98238, @alculquicondor) [SIG API Machinery, Apps, Auth and Testing]

    • Track ownership of scale subresource for all scalable resources i.e. Deployment, ReplicaSet, StatefulSet, ReplicationController, and Custom Resources. (#98377, @nodo) [SIG API Machinery and Testing]
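
    The expirationSeconds item above lets a client request a certificate lifetime, which the default signers honor as long as it does not exceed --cluster-signing-duration. The sketch below builds such a request as a plain dictionary and checks it against a configured signing duration. The helper names, the example signer and the 600-second lower bound are assumptions for illustration; the PEM bytes are a placeholder, not a real CSR.

```python
import base64

MIN_EXPIRATION_SECONDS = 600  # assumption: API-side minimum of 10 minutes


def build_csr(name: str, csr_pem: bytes, expiration_seconds: int) -> dict:
    """Build a CertificateSigningRequest manifest that requests a lifetime."""
    if expiration_seconds < MIN_EXPIRATION_SECONDS:
        raise ValueError("expirationSeconds is below the assumed minimum")
    return {
        "apiVersion": "certificates.k8s.io/v1",
        "kind": "CertificateSigningRequest",
        "metadata": {"name": name},
        "spec": {
            # The API expects the PEM-encoded CSR as base64 text.
            "request": base64.b64encode(csr_pem).decode("ascii"),
            "signerName": "kubernetes.io/kube-apiserver-client",
            "expirationSeconds": expiration_seconds,
            "usages": ["client auth"],
        },
    }


def within_signing_duration(manifest: dict, signing_duration_seconds: int) -> bool:
    """True if the requested lifetime does not exceed the signer's limit."""
    return manifest["spec"]["expirationSeconds"] <= signing_duration_seconds


if __name__ == "__main__":
    placeholder_pem = b"-----BEGIN CERTIFICATE REQUEST-----\n...\n"
    csr = build_csr("example-user", placeholder_pem, 3600)
    print(within_signing_duration(csr, 365 * 24 * 3600))
```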

    Feature

    • Kube-apiserver: when merging lists, Server Side Apply now prefers the order of the submitted request instead of the existing persisted object (#107568, @jiahuif) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Storage and Testing]

    • Kubernetes is now built with Golang 1.16.12 (#106982, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Update golang.org/x/net to v0.0.0-20211209124913-491a49abca63 (#106960, @cpanato) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Node and Storage]

    • Kubernetes is now built with Golang 1.16.10 (#106223, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Update debian-base, debian-iptables, setcap images to pick up CVE fixes

      • Debian-base to v1.9.0
      • Debian-iptables to v1.6.7
      • setcap to v2.0.4 (#106143, @cpanato) [SIG Release and Testing]
    • A system-cluster-critical pod should not get a low OOM Score.

      As of now both system-node-critical and system-cluster-critical pods have a -997 OOM score, making them among the last processes to be OOMKilled. By definition, system-cluster-critical pods can be scheduled elsewhere if there is a resource crunch on the node, whereas system-node-critical pods cannot be rescheduled. This was the reason for system-node-critical to have a higher priority value than system-cluster-critical. This change allows only the system-node-critical priority class to have a low OOMScore.

      Action required: if the user wants the pod to be OOMKilled last and the pod has the system-cluster-critical priority class, it has to be changed to the system-node-critical priority class to preserve the existing behavior. (#99729, @ravisantoshgudimetla)

    • API Server tracing can now trace re-entrant api requests. (#103218, @dashpole) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle and Instrumentation]

    • APIServerTracing now collects spans from etcd client calls, and propagates context to etcd. (#103216, @dashpole) [SIG API Machinery, Cloud Provider and Instrumentation]

    • APIServerTracing now collects spans from outgoing requests to admission webhooks. (#103601, @dashpole) [SIG API Machinery]

    • Add a namespace label for all apiserver_admission_* metrics. Expand the histogram range to 0-10s for all apiserver_admission_*_duration_seconds metrics. (#101208, @voutcn)

    • Add unified map on CRI to support cgroup v2. Refer to https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md#unified. (#102578, @payall4u)

    • Added BinaryData description to kubectl describe command. (#100568, @lauchokyip)

    • Added a new metric apiserver_flowcontrol_request_concurrency_in_use that shows the number of seats (concurrency) occupied by the currently executing requests in the API Priority and Fairness system. (#102795, @tkashem)

    • Added field-selector option for kubectl top pod (#102155, @lauchokyip) [SIG CLI]

    • Added new metrics about API Priority and Fairness. Each one has a label priority_level. The last two also have a label bound taking values min and max.

      • apiserver_flowcontrol_current_r: R(the time of the last change in state of the queues)
      • apiserver_flowcontrol_dispatch_r: R(the time of the latest request dispatch)
      • apiserver_flowcontrol_latest_s: S(the request last dispatched) = R(when that request starts executing in the virtual world)
      • apiserver_flowcontrol_next_s_bounds: min and max next S among non-empty queues
      • apiserver_flowcontrol_next_discounted_s_bounds: min and max next S - (sum [over requests executing] width * estimatedDuration) among non-empty queues (#102859, @MikeSpreitzer) [SIG API Machinery and Instrumentation]
    • Adding --restart-kubelet flag on E2E Node test suite (#97028, @knabben) [SIG Node and Testing]

    • Adds feature gate KubeletInUserNamespace which enables support for running kubelet in a user namespace.

      The user namespace has to be created before running kubelet. All the node components such as CRI need to be running in the same user namespace.

      When the feature gate is enabled, kubelet ignores errors that happen while setting the following sysctl values: vm.overcommit_memory, vm.panic_on_oom, kernel.panic, kernel.panic_on_oops, kernel.keys.root_maxkeys, kernel.keys.root_maxbytes. (These sysctl values apply to the host, not to the containers.)

      kubelet also ignores an error during opening /dev/kmsg. This feature gate also allows kube-proxy to ignore an error during setting RLIMIT_NOFILE.

      This feature gate is especially useful for running Kubernetes inside Rootless Docker/Podman with kind or minikube. (#92863, @AkihiroSuda) [SIG Network, Node and Testing]

    • Adds metrics for the delegated authenticator used by extension APIs that delegate authentication logic to the Kube API server. (#99364, @p0lyn0mial)

    • Adds metrics for the delegated authorizer used by extension APIs that delegate authorization logic to the Kube API server. (#100339, @p0lyn0mial)

    • Adds two kubemark flags, --max-pods and --extended-resources. (#100267, @Jeffwan)

    • An audit log entry will be generated when a ValidatingAdmissionWebhook is failing to open. (#92739, @cnphil)

    • Base images: Updated to

    • Base-images: Update to debian-base:buster-v1.7.1 (#102594, @mengjiao-liu)

    • Deprecated warning message for ignore-errors flag. (#102677, @yuzhiquan)

    • Endpoints that have more than 1000 endpoints will be truncated and the endpoints.kubernetes.io/over-capacity annotation on the Endpoints resource will be set to truncated. (#103520, @swetharepakula) [SIG Apps and Network]

    • Expose /debug/flags/v to allow dynamically setting log level for kube-proxy. (#98306, @borgerli) [SIG Network]

    • Expose container start time as container_start_time_seconds in the kubelet /metrics/resource endpoint. (#102444, @sanwishe)

    • Extended resources defined in LeastAllocated, MostAllocated and RequestedToCapacityRatio plugin argument are bypassed by the scheduler if the incoming Pod doesn’t request them in the pod spec. (#103169, @Huang-Wei)

    • Feat: change partition style to GPT on Windows (#101412, @andyzhangx) [SIG Storage and Windows]

    • Feature gates EndpointSliceProxying & WindowsEndpointSliceProxying graduate to GA and are unconditionally enabled. Kube-proxy will use EndpointSlices for endpoint information. (#103451, @swetharepakula)

    • Fluentd: isolate logging resources in separate namespace logging (#68004, @saravanan30erd)

    • For kubeadm: add --validity-period flag for kubeadm kubeconfig user command. (#100907, @SataQiu)

    • Implement minReadySeconds for the StatefulSets. (#101316, @ravisantoshgudimetla)

    • Improve logging of APIService availability changes in kube-apiserver. (#101420, @sttts)

    • Introduce a feature gate DisableCloudProviders allowing to disable cloud-provider initialization in KAPI, KCM and kubelet. The DisableCloudProviders feature gate is currently in Alpha, which means it is disabled by default. Once the feature gate moves to beta, in-tree cloud providers would be disabled by default, and a user won’t be able to specify --cloud-provider=<aws|openstack|azure|gcp|vsphere> anymore to any of KCM, KAPI or kubelet. Only --cloud-provider=external would be allowed. CCM would have to run out-of-tree with CSI. (#100136, @Danil-Grigorev)

    • JSON logging format is no longer available by default in non-core Kubernetes components and requires owners to opt in. (#102869, @mengjiao-liu) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Kube-apiserver: the alpha PodSecurity feature can be enabled by passing --feature-gates=PodSecurity=true, and enables controlling allowed pods using namespace labels. See https://git.k8s.io/enhancements/keps/sig-auth/2579-psp-replacement for more details. (#103099, @liggitt) [SIG API Machinery, Auth, Instrumentation, Release, Security and Testing]

    • Kube-proxy uses V1 EndpointSlices. (#103306, @swetharepakula)

    • Kubeadm: Add the RootlessControlPlane kubeadm specific feature gate (Alpha in 1.22, disabled by default). It can be used to enable an experimental feature that makes the control plane component static Pod containers for kube-apiserver, kube-controller-manager, kube-scheduler and etcd run as non-root users. (#102158, @vinayakankugoyal)

    • Kubeadm: Set the seccompProfile to runtime/default in the PodSecurityContext of the control-plane components that run as static Pods. (#100234, @vinayakankugoyal)

    • Kubeadm: add a new field skipPhases to v1beta3 InitConfiguration and JoinConfiguration that can contain a list of phases to skip during “kubeadm init” and “kubeadm join”. The flag --skip-phases takes precedence over this field. (#101923, @neolit123)

    • Kubeadm: add the --dry-run flag to the control-plane phase of “kubeadm init”. (#102722, @vinayakankugoyal)

    • Kubeadm: add the imagePullPolicy field in the nodeRegistration section of InitConfiguration and JoinConfiguration in v1beta3. This allows the user to specify the image pull policy during “kubeadm init” and “kubeadm join”. The value of this field must be one of Always, IfNotPresent or Never. The default behavior continues to be IfNotPresent. (#102901, @wangyysde)

    • Kubeadm: during “kubeadm init/join/upgrade”, always default the cgroupDriver value in the KubeletConfiguration to systemd, unless the user was explicit about the value. See configure-cgroup-driver for more details. (#102133, @pacoxu)

    • Kubeadm: update CoreDNS to 1.8.4. Grant CoreDNS permissions to “list” and “watch” EndpointSlice objects to accommodate dual-stack support. (#102466, @pacoxu)

    • Kubectl: add LAST RESTART column to kubectl get pods output. (#100142, @Ethyling)

    • Kubemark’s hollow-node will now print flags before starting. (#101181, @mm4tt)

    • Kubernetes is now built with Golang 1.16.3 (#101206, @justaugustus) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kubernetes is now built with Golang 1.16.4 (#101809, @justaugustus) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kubernetes is now built with Golang 1.16.5. (#102689, @cpanato)

    • Kubernetes is now built with Golang 1.16.6 (#103669, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Leader Migration for controller managers graduated to beta. (#103533, @jiahuif) [SIG API Machinery and Cloud Provider]

    • Make kubectl command headers default for beta. (#103238, @seans3) [SIG CLI]

    • Mark net.ipv4.ip_unprivileged_port_start as safe sysctl. (#103326, @pacoxu)

    • Metrics server nanny has now poll period set to 30s (previously 5 minutes) to allow faster scaling of metrics server. (#101869, @olagacek) [SIG Cloud Provider and Instrumentation]

    • NetworkPolicy validation framework support for windows. (#98077, @jayunit100)

    • New feature gate ExpandedDNSConfig is now available. This feature allows Kubernetes to have expanded DNS configuration. (#100651, @gjkim42)

    • New metrics: apiserver_kube_aggregator_x509_missing_san_total and apiserver_webhooks_x509_missing_san_total. These metrics measure the number of connections to webhooks/aggregated API servers that use certificates without Subject Alternative Names. A non-zero value is a warning sign that these connections will stop functioning in the future, since Golang is going to deprecate x509 certificate subject Common Names for server hostname verification. (#95396, @stlaz) [SIG API Machinery, Auth and Instrumentation]

    • Node Problem Detector is now available for GCE Windows nodes. (#101539, @jeremyje) [SIG Cloud Provider, Node and Windows]

    • Promote Cronjobs storage version to batch/v1. (#102363, @mengjiao-liu)

    • Promote CronJobControllerV2 flag to GA, with removal in 1.23. (#102529, @soltysh)

    • Promote EndpointSliceTerminatingCondition to Beta. This enables the terminating and serving conditions for EndpointSlice by default. (#103596, @andrewsykim)

    • Run etcd as non-root on GCE provider (#100635, @cindy52)

    • Scheduler now provides an option for plugin developers to move Pods to activeQ. (#103383, @Huang-Wei)

    • Secret values are now masked by default in kubectl diff output. (#96084, @loozhengyuan)

    • Services with externalTrafficPolicy: Local now support graceful termination when using the iptables or ipvs mode of kube-proxy with EndpointSlices enabled. Specifically, if a connection for such a service arrives on a node when there are no “Ready” endpoints for the service, but there is at least one Terminating pod for that service on the node, then kube-proxy will send the traffic to the Terminating pod rather than dropping it. This patches up a race condition between when a pod is killed and when the external load balancer notices that it has been killed. (#97238, @andrewsykim)

    • Shell completion has been migrated to Cobra’s go solution. kubectl is now smarter about disabling file completion when it does not apply. Furthermore, completion for the cp command does not show all files unless the user has started typing something. (#96087, @marckhouzam) [SIG CLI]

    • Some of the in-tree storage drivers indicate support for the MetricsProvider interface, but fail to configure this for BlockMode volumes. With a recent change, Kubelet will call GetMetrics() for BlockMode volumes, and the in-tree drivers that miss the support cause a Go panic. Now the in-tree storage drivers that support BlockMode volumes, will return the Capacity of the volume in the GetMetrics() call. (#101587, @nixpanic)

    • Support FakeClientset match subresource. (#100939, @wzshiming)

    • The “Leader Migration” now supports a wildcard component name and the default value. (#102711, @jiahuif)

    • If the CSI driver supports the NodeServiceCapability VOLUME_MOUNT_GROUP and the DelegateFSGroupToCSIDriver feature gate is enabled, kubelet will delegate applying FSGroup to the driver by passing it to NodeStageVolume and NodePublishVolume, regardless of what other FSGroup policies are set. This is an alpha feature. (#103244, @verult)

    • The Memory Manager feature graduates to Beta and it is enabled by default. (#101947, @cynepco3hahue)

    • The BoundServiceAccountTokenVolume graduates to GA and thus will be unconditionally enabled. The feature gate is going to be removed in 1.23. (#101992, @zshihang)

    • The EmptyDir memory backed volumes are sized as the minimum of pod allocatable memory on a host and an optional explicit user provided value. (#101048, @dims)

    • The HugePageStorageMediumSize feature graduates to GA and is unconditionally enabled, allowing unconditional usage of multiple sizes of huge page resources at the container level. (#99144, @bart0sh)

    • The IngressClassNamespacedParams feature gate has graduated to beta and is enabled by default. This means the IngressClass resource will now have two new fields: spec.parameters.namespace and spec.parameters.scope. (#101711, @hbagdi)

    • The LogarithmicScaleDown feature graduates to Beta and is enabled by default. (#101767, @damemi)

    • The NamespaceDefaultLabelName is promoted to GA in this release. All Namespace API objects have a kubernetes.io/metadata.name label matching their metadata.name field to allow selecting any namespace by its name using a label selector. (#101342, @rosenhouse)

    • The ServiceInternalTrafficPolicy feature graduates to Beta and is enabled by default, which enables the internalTrafficPolicy field of Service by default. (#103462, @andrewsykim)

    • The ServiceLBNodePortControl graduates to Beta and is enabled by default. (#100412, @hanlins)

    • The SetHostnameAsFQDN feature graduates to GA and thus its feature gate is unconditionally enabled. (#101294, @javidiaz)

    • The WarningHeader feature is now GA and is unconditionally enabled. The apiserver_requested_deprecated_apis metric has graduated to stable status. The WarningHeader feature-gate is no longer operative and will be removed in v1.24. (#100754, @liggitt) [SIG API Machinery, Instrumentation and Testing]

    • The kubectl debug is able to create ephemeral containers in pre-1.22 clusters with the EphemeralContainers feature enabled. Note that versions of kubectl prior to 1.22 are unable to create ephemeral containers in clusters version 1.22 and greater due to an API change. (#103292, @verb)

    • The client-go credential plugins are now GA and are enabled by default. (#102890, @ankeesler)

    • The feature gate SSA graduated to GA in v1.22 and therefore is unconditionally enabled. (#100139, @Jefftree)

    • The job controller removes running pods when the number of completions is achieved. (#99963, @alculquicondor)

    • The kubeconfig is now exposed in the kube-scheduler framework handle. Out-of-tree plugins can leverage that to build CRD informers easily. (#100644, @Huang-Wei)

    • The new flag --chunk-size=SIZE for kubectl drain has been promoted to beta, and enabled by default. This flag may be used to alter the number of items or disable this feature when 0 is passed. (#100148, @KnVerey)

    • The new flag --chunk-size=SIZE has been added to kubectl describe. This flag may be used to alter the number of items or disable this feature when 0 is passed. (#101171, @KnVerey)

    • The pod resource API will provide memory manager metrics in the case when the memory manager feature gate is enabled, and the memory manager policy is static. (#101030, @cynepco3hahue)

    • The prefer nominated node feature graduates to Beta and is enabled by default. (#102201, @chendave)

    • Update etcd version to 3.5.0-beta.3. (#102062, @serathius)

    • Update the Debian images to pick up CVE fixes in the base images:

      • Update the debian-base image to v1.7.0
      • Update the debian-iptables image to v1.6.1 (#102302, @xmudrii)
    • Update the setcap image to buster-v2.0.1. (#102377, @xmudrii)

    • Update the system-validators library to v1.5.0. Includes validation for seccomp and fixes a stdout/stderr problem in the Docker validator. (#103390, @ironyman)

    • Updates the following images to pick up CVE fixes:

      • debian to v1.8.0
      • debian-iptables to v1.6.5
      • setcap to v2.0.3 (#103235, @thejoycekung) [SIG API Machinery, Release and Testing]
    • Warnings for the use of deprecated and known-bad values in pod specs are now sent. (#101688, @liggitt)

    • Watch requests are now throttled by the priority and fairness filter in kube-apiserver. (#102171, @wojtek-t)

    • You can use this Builder function to create events Field Selector (#101817, @cndoit18) [SIG API Machinery and Scalability]

    • Scheduler now registers event handlers dynamically. (#101394, @Huang-Wei)

    • kubectl: Enable using protocol buffers to request Metrics API. (#102039, @serathius)
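
    The graceful-termination fallback described above for Services with externalTrafficPolicy: Local can be pictured with a small sketch. This is a simplified Python illustration, not kube-proxy's actual Go code; the Endpoint type, its fields, and select_local_endpoints are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Endpoint:
    """Hypothetical, simplified view of a node-local endpoint."""
    ip: str
    ready: bool
    terminating: bool


def select_local_endpoints(endpoints: List[Endpoint]) -> List[Endpoint]:
    """Pick the local endpoints that should receive traffic.

    Ready endpoints are preferred. If there are none, fall back to
    terminating endpoints instead of dropping the connection.
    """
    ready = [e for e in endpoints if e.ready]
    if ready:
        return ready
    return [e for e in endpoints if e.terminating]


if __name__ == "__main__":
    local = [
        Endpoint("10.0.0.1", ready=False, terminating=True),
        Endpoint("10.0.0.2", ready=False, terminating=False),
    ]
    print([e.ip for e in select_local_endpoints(local)])
```

    With no Ready endpoint on the node, the sketch returns only the terminating one, which mirrors how the race between pod deletion and the load balancer's health check is bridged.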

    Documentation

    • The command kubectl debug will now print a warning message when using the --target option since many container runtimes do not support this yet. (#101074, @verb)

    Failing Test

    • Fixes hostpath storage e2e tests within SELinux enabled env (#105786, @Elbehery) [SIG Testing]
    • Fixed generic ephemeral volumes with OwnerReferencesPermissionEnforcement admission plugin enabled. (#101186, @jsafrane)
    • Fixes kubectl drain --dry-run=server. (#100206, @KnVerey)
    • Fixes an overly restrictive conformance test to accept service account tokens signed by an ECDSA key (#100680, @smira) [SIG Architecture, Auth and Testing]
    • Fixes the should receive events on concurrent watches in same order conformance test to work properly on clusters that auto-create additional configmaps in namespaces. (#101950, @liggitt)
    • Resolves an issue with the “ServiceAccountIssuerDiscovery should support OIDC discovery” conformance test failing on clusters which are configured with issuers outside the cluster (#101589, @mtaufen) [SIG Auth and Testing]

    Other (Cleanup or Flake)

    • Updates konnectivity-network-proxy to v0.0.27. This includes a memory leak fix for the network proxy (#107187, @rata) [SIG API Machinery, Auth and Cloud Provider]

    Bug or Regression

    • An inefficient lock in EndpointSlice controller metrics cache has been reworked. Network programming latency may be significantly reduced in certain scenarios, especially in clusters with a large number of Services. (#107168, @robscott) [SIG Apps and Network]

    • Client-go: fix that paged list calls with ResourceVersionMatch set would fail once paging kicked in. (#107335, @fasaxc) [SIG API Machinery]

    • Fix a panic when using invalid output format in kubectl create secret command (#107346, @rikatz) [SIG CLI]

    • Fix: azuredisk parameter lowercase translation issue (#107429, @andyzhangx) [SIG Cloud Provider and Storage]

    • Fixes a rare race condition handling requests that timeout (#107459, @liggitt) [SIG API Machinery]

    • Mount-utils: Detect potential stale file handle (#107039, @andyzhangx) [SIG Storage]

    • A pod that the Kubelet rejects was still considered as being accepted for a brief period of time after rejection, which might cause some pods to be rejected briefly that could fit on the node. A pod that is still terminating (but has status indicating it has failed) may also still be consuming resources and so should also be considered. (#104918, @ehashman) [SIG Node]

    • Fix: skip instance not found when decoupling vmss from lb (#105836, @nilo19) [SIG Cloud Provider]

    • Kubeadm: allow the “certs check-expiration” command to not require the existence of the cluster CA key (ca.key file) when checking the expiration of managed certificates in kubeconfig files. (#106930, @neolit123) [SIG Cluster Lifecycle]

    • Kubeadm: during execution of the “check expiration” command, treat the etcd CA as external if there is a missing etcd CA key file (etcd/ca.key) and perform the proper validation on certificates signed by the etcd CA. Additionally, make sure that the CA for all entries in the output table is included - for both certificates on disk and in kubeconfig files. (#106925, @neolit123) [SIG Cluster Lifecycle]

    • Respect grace period when updating static pods. (#106394, @gjkim42) [SIG Node and Testing]

    • Reverts graceful node shutdown to match 1.21 behavior of setting pods that have not yet successfully completed to “Failed” phase if the GracefulNodeShutdown feature is enabled in kubelet. The GracefulNodeShutdown feature is beta and must be explicitly configured via kubelet config to be enabled in 1.21+. This changes 1.22 and 1.23 behavior on node shutdown to match 1.21. If you do not want pods to be marked terminated on node shutdown in 1.22 and 1.23, disable the GracefulNodeShutdown feature. (#106899, @bobbypage) [SIG Node]

    • Scheduler’s assumed pods have 2min instead of 30s to receive nodeName pod updates (#106633, @ahg-g) [SIG Scheduling]

    • EndpointSlice Mirroring controller now cleans up managed EndpointSlices when a Service selector is added (#106132, @robscott) [SIG Apps, Network and Testing]

    • Fix a bug that --disabled-metrics doesn’t function well. (#105793, @Huang-Wei) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Fix a panic in kubectl when creating secrets with an improper output type (#106356, @lauchokyip) [SIG CLI]

    • Fix concurrent map access causing panics when logging timed-out API calls. (#106112, @marseel) [SIG API Machinery]

    • Fix kube-proxy regression on UDP services because the logic to detect stale connections was not considering if the endpoint was ready. (#106239, @aojea) [SIG Network and Testing]

    • Fix scoring for NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. (#106081, @ahmad-diaa) [SIG Scheduling]

    • Support more than 100 disk mounts on Windows (#105673, @andyzhangx) [SIG Storage and Windows]

    • The --leader-elect* CLI args are now honored correctly in scheduler. (#106130, @Huang-Wei) [SIG Scheduling]

    • The kube-proxy sync_proxy_rules_iptables_total metric now gives the correct number of rules, rather than being off by one.

      Fixed multiple iptables proxy regressions introduced in 1.22:

      • When using Services with SessionAffinity, client affinity for an endpoint now gets broken when that endpoint becomes non-ready (rather than continuing until the endpoint is fully deleted).

      • Traffic to a service IP now starts getting rejected (as opposed to merely dropped) as soon as there are no longer any usable endpoints, rather than waiting until all of the terminating endpoints have terminated even when those terminating endpoints were not being used.

      • Chains for endpoints that won’t be used are no longer output to iptables, saving a bit of memory/time/cpu. (#106373, @aojea) [SIG Network]

    • Watch requests that are delegated to aggregated apiservers no longer reserve concurrency units (seats) in the API Priority and Fairness dispatcher for their entire duration. (#105827, @benluddy) [SIG API Machinery]

    • Fix Job tracking with finalizers for more than 500 pods, ensuring all finalizers are removed before counting the Pod. (#104876, @alculquicondor) [SIG Apps]

    • Fix: skip case sensitivity when checking Azure NSG rules. Fix: ensure InstanceShutdownByProviderID returns false for creating Azure VMs. (#104446, @feiskyer) [SIG Cloud Provider]

    • Fixed occasional pod cgroup freeze when using cgroup v1 and systemd driver. (#104529, @kolyshkin) [SIG Node]

    • Fixes a regression that could cause panics in LRU caches in controller-manager, kubelet, kube-apiserver, or client-go EventSourceObjectSpamFilter (#104469, @liggitt) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation and Storage]

    • When using kubectl replace (or the equivalent API call) on a Service, the caller no longer needs to do a read-modify-write cycle to fetch the allocated values for .spec.clusterIP and .spec.ports[].nodePort. Instead the API server will automatically carry these forward from the original object when the new object does not specify them. (#104672, @thockin) [SIG Network]

    • Fix kube-apiserver metric reporting for the deprecated watch path of /api//watch/… (#104188, @wojtek-t) [SIG API Machinery and Instrumentation]

    • Kube-proxy: delete stale conntrack UDP entries for loadbalancer ingress IP. (#104009, @aojea) [SIG Network]

    • Pass additional flags to subpath mount to avoid flakes in certain conditions (#104346, @mauriciopoppe) [SIG Storage]

    • Added jitter factor to lease controller that better smears load on kube-apiserver over time. (#101652, @marseel) [SIG API Machinery and Scalability]

    • Added privileges for EndpointSlice to the default view & edit RBAC roles. (#101203, @mtougeron)

    • After DBus restarts, make GracefulNodeShutdown work again (#100369, @wzshiming)

    • Aggregate errors when putting vmss. (#98350, @nilo19)

    • Aggregate write permissions on events to users with edit and admin role. (#102858, @tumido)

    • Aggregated roles no longer include write access to EndpointSlices. This rolls back part of a change that was introduced earlier in the Kubernetes 1.22 cycle. (#103703, @robscott)

    • Applying fix for not deleting existing public IP when a service is deleted in Azure. (#100694, @nilo19)

    • Applying fix for not tagging static public IP. (#101752, @nilo19)

    • Applying fix so that deleting non-existing disk returns success. (#102083, @andyzhangx)

    • Applying fix: cleanup outdated routes. (#102935, @nilo19)

    • Avoid caching the Azure VMSS instances whose network profile is nil (#100948, @feiskyer) [SIG Cloud Provider]

    • Azure: Avoid setting cached Sku when updating VMSS and VMSS instances. (#102005, @feiskyer)

    • Azurefile: Normalize share name to not include the capital letters (#100731, @kassarl)

    • Chain the field manager creation calls in newDefaultFieldManager to be explicit about the order of operations. (#101076, @kevindelgado)

    • Disruption controller shouldn’t error while syncing for unmanaged pods. (#103414, @ravisantoshgudimetla) [SIG Apps and Testing]

    • Ensure service is deleted when the Azure resource group has been deleted. (#100944, @feiskyer)

    • Ensures ExecProbeTimeout=false kubelet feature gate with dockershim is taken into account, when the exec probe takes longer than timeoutSeconds configuration. (#100200, @jackfrancis)

    • Expose rest_client_rate_limiter_duration_seconds metric to component-base to track client side rate limiter latency in seconds. Broken down by verb and URL. (#100311, @IonutBajescu) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Fire an event when failing to open NodePort. (#100599, @masap)

    • Fix Azure node public IP fetching issues from instance metadata service when the node is part of standard load balancer backend pool. (#100690, @feiskyer) [SIG Cloud Provider]

    • Fix EndpointSlice describe panic when an Endpoint doesn’t have zone. (#101025, @tnqn)

    • Fix kubectl set env or resources not working for initcontainers. (#101669, @carlory)

    • Fix kubectl alpha debug node not working on tainted (NoExecute) nodes by tolerating everything. (#98431, @wawa0210)

    • Fix a bug on the endpointslicemirroring controller where endpoint NotReadyAddresses were mirrored as Ready to the corresponding EndpointSlice. (#102683, @aojea)

    • Fix a bug that a preemptor pod may exist as a phantom in the scheduler. (#102498, @Huang-Wei)

    • Fix a number of race conditions in the kubelet when pods are starting up or shutting down that might cause pods to take a long time to shut down. (#102344, @smarterclayton) [SIG Apps, Node, Storage and Testing]

    • Fix an issue with kubectl on certain older versions of Windows or when legacy console mode is enabled on Windows 8 which causes kubectl exec to crash. (#102825, @n4j)

    • Fix availability set cache in vmss cache (#100110, @CecileRobertMichon) [SIG Cloud Provider]

    • Fix how nulls are handled in array and objects in json patches. (#102467, @pacoxu)

    • Fix panic when kubectl create ingress has annotation flag and an empty value set. (#101377, @rikatz)

    • Fix performance regression for update and apply operations on large CRDs. (#103318, @jpbetz) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation and Storage]

    • Fix raw block mode CSI NodePublishVolume stage miss pod info. (#99069, @phantooom)

    • Fix resource enforcement when using systemd cgroup driver (#102147, @kolyshkin)

    • Fix rounding of volume storage requests. (#100100, @maxlaverse)

    • Fix runtime container status for PostStart hook error. (#100608, @pacoxu)

    • Fix scoring for NodeResourcesMostAllocated and NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. This was leading to under-utilization of small nodes. (#102925, @alculquicondor)

    • Fix code that leaked defaulting between unrelated pod instances. (#103284, @kebe7jun) [SIG CLI]

    • Fix winkernel kube-proxy to only use dual stack when host and networking supports it (#101047, @jsturtevant) [SIG Network and Windows]

    • Fix: Azure file inline volume namespace issue in CSI migration translation (#101235, @andyzhangx)

    • Fix: Bug in kube-proxy latency metrics to calculate only the latency value for the Endpoints that are created after it starts running. This is needed because all the Endpoints objects are processed on restarts, independently of when they were created. (#100861, @aojea)

    • Fix: avoid nil-pointer panic when checking the frontend IP configuration (#101739, @nilo19) [SIG Cloud Provider]

    • Fix: display of Job completion mode in kubectl describe. (#101160, @alculquicondor)

    • Fix: return empty VMAS name if using standalone VM (#103470, @nilo19) [SIG Cloud Provider]

    • Fix: treat “host is down” as a corrupted mount. Previously, when the SMB server was down, a pod using the SMB mount could not be terminated and an error was returned. (#101398, @andyzhangx)

    • Fix: using NVMe AWS EBS volumes partitions. (#100500, @jsafrane)

    • Fixed ‘kubelet’ runtime panic for timed-out portforward streams. (#102489, @saschagrunert)

    • Fixed SELinux relabeling of CSI volumes after CSI driver failure. (#103154, @jsafrane) [SIG Node and Storage]

    • Fixed garbage collection of dangling VolumeAttachments for PersistentVolumes migrated to CSI on startup of kube-controller-manager. (#102176, @timebertt)

    • Fixed port-forward memory leak for long-running and heavily used connections. (#99839, @saschagrunert)

    • Fixed a bug due to which the controller was not populating the lastSuccessfulTime field added to cronjob.status in batch/v1. (#102642, @alaypatel07)

    • Fixed a bug that kubectl create configmap always returns zero exit code when failed. (#101780, @nak3) [SIG CLI]

    • Fixed a bug that scheduler extenders are not called on preemptions. (#103019, @ordovicia)

    • Fixed a bug where startupProbe stopped working after a container’s first restart. (#101093, @wzshiming)

    • Fixed an issue blocking azure auth to prompt to device code authentication flow when refresh token expires. (#102063, @tdihp)

    • Fixed false-positive uncertain volume attachments, which led to unexpected detachment of CSI migrated volumes (#101737, @Jiawei0227) [SIG Apps and Storage]

    • Fixed mounting of NFS volumes when IPv6 address is used as a server. (#101067, @Elbehery) [SIG Storage]

    • Fixed starting new pods after previous pod timed out unmounting its volumes. (#100183, @jsafrane)

    • Fixed very rare volume corruption when a pod is deleted while kubelet is offline. (#102059, @jsafrane)

    • Fixes a data race issue in the priority and fairness API server filter. (#100638, @tkashem)

    • Fixes issue with websocket-based watches of Service objects not closing correctly on timeout. (#102539, @liggitt)

    • For kubeadm: support for custom imagetags for etcd images which contain build metadata, when imagetags are in the form of version_metadata. For instance, if the etcd version is v3.4.13+patch.0, the supported imagetag would be v3.4.13_patch.0 (#100350, @jr0d)

    • For vSphere: fix regression during attach disk if datastore is within a storage folder or datastore cluster. (#102892, @gnufied)

    • GCE Windows clusters now have their TCP/IP parameters set to GCE’s recommended values. (#103057, @jeremyje) [SIG Cloud Provider and Windows]

    • GCE Windows will no longer install Docker on containerd nodes. (#101747, @jeremyje) [SIG Cloud Provider and Windows]

    • Generated OpenAPI now correctly specifies 201 as a possible response code for PATCH operations. (#100141, @brendandburns)

    • Graceful termination will now be honored when deleting a collection of pods. (#100101, @deads2k)

    • If kube-proxy mode is userspace do not enable EndpointSlices. (#100913, @JornShen)

    • Kubeadm: allow passing the flag --log-file if --config is passed. If you wish to log to a file you must also pass --logtostderr=false or --alsologtostderr=true. Alternatively you can pipe to a file using “kubeadm … | tee …”. (#101449, @CaoDonghui123)

    • Kubeadm: enable --experimental-patches flag for kubeadm join phase control-plane-join all command. (#101110, @SataQiu)

    • Kubeadm: fix a bug where kubeadm join for control plane nodes would download certificates and keys from the cluster, but would not write publicly readable certificates and public keys with mode 0644 and instead use mode 0600. (#103313, @neolit123)

    • Kubeadm: fix the bug that kubeadm only uses the first hash in caCertHashes to verify the root CA. (#101977, @SataQiu)

    • Kubeadm: remove the “ephemeral_storage” request from the etcd static pod that kubeadm deploys on stacked etcd control plane nodes. This request has caused sporadic failures on some setups due to a problem in the kubelet with cadvisor and the LocalStorageCapacityIsolation feature gate. See this issue for more details: https://github.com/kubernetes/kubernetes/issues/99305 (#102673, @jackfrancis) [SIG Cluster Lifecycle]

    • Kubeadm: when using a custom image repository for CoreDNS, kubeadm will now append the coredns image name instead of coredns/coredns, thus restoring the behaviour existing before the v1.21 release. Users who rely on a nested folder for the coredns image should set the clusterConfiguration.dns.imageRepository value including the nested path name (e.g. using registry.company.xyz/coredns will force kubeadm to use the registry.company.xyz/coredns/coredns image). No action is needed if using the default registry (k8s.gcr.io). (#102502, @ykakarap)

    • Kubelet: improve the performance when waiting for a synchronization of the node list with the kube-apiserver. (#99336, @neolit123)

    • Kubelet: the returned value for PodIPs is the same in the Downward API and in the pod.status.PodIPs field (#103307, @aojea)

    • Limit vSphere volume name to 63 characters long. (#100404, @gnufied)

    • Logging for GCE Windows clusters will be more accurate and complete when using Fluent bit. (#101271, @jeremyje)

    • Metrics Server will use Addon Manager 1.8.3 (#103541, @jbartosik) [SIG Cloud Provider and Instrumentation]

    • Output for kubectl describe podsecuritypolicy is now kind specific and cleaner (#101436, @KnVerey)

    • Parsing of cpuset information now properly detects more invalid input such as 1--3 or 10-6. (#100565, @lack)

    • Pods that are known to the kubelet to have previously been Running should not revert to Pending state; the kubelet will now infer a termination. (#102821, @ehashman)

    • Prevent Kubelet stuck in DiskPressure when imagefs.minReclaim is set (#99095, @maxlaverse)

    • Reduces the initialization delay of the docker runtime on non-AWS platforms. (#93260, @nckturner) [SIG Cloud Provider]

    • Register/Deregister Targets in chunks for AWS TargetGroup (#101592, @M00nF1sh) [SIG Cloud Provider]

    • Removed /sbin/apparmor_parser requirement for the AppArmor host validation. This allows using AppArmor on distributions which ship the binary in a different path. (#97968, @saschagrunert) [SIG Node and Testing]

    • Renames the timeout field for the DelegatingAuthenticationOptions to TokenRequestTimeout and sets the timeout only for the token review client. Previously the timeout was also applied to watches, making them reconnect every 10 seconds. (#100959, @p0lyn0mial)

    • Reorganized iptables rules to reduce rules in KUBE-SERVICES and KUBE-NODEPORTS. (#96959, @tssurya)

    • Respect annotation size limit for server-side apply updates to the client-side apply annotation. Also, fix opt-out of this behavior by setting the client-side apply annotation to the empty string. (#102105, @julianvmodesto) [SIG API Machinery]

    • Retry FibreChannel devices cleanup after error to ensure FibreChannel device is detached before it can be used on another node. (#101862, @jsafrane)

    • Support correct sorting for cpu, memory, storage, ephemeral-storage, hugepages, and attachable-volumes. (#100435, @lauchokyip)

    • Switch scheduler to generate the merge patch on pod status instead of the full pod (#103133, @marwanad) [SIG Scheduling]

    • The EndpointSlice IP validation now matches Endpoints IP validation. (#101084, @robscott)

    • The kube-apiserver now reports the synthetic verb when logging requests, better explaining the user intent and matching what is reported in the metrics. (#102934, @lavalamp)

    • The kube-controller-manager sets the upper-bound timeout limit for outgoing requests to 70s. Previously (#99358, @p0lyn0mial)

    • The kube-proxy log now shows the “Skipping topology aware endpoint filtering since no hints were provided for zone” warning under the right conditions. (#101857, @dervoeti)

    • The kubectl create service now respects the namespace flag. (#101005, @zxh326)

    • The kubectl get now truncates multi-line strings to avoid breaking printing (#103514, @soltysh)

    • The kubectl wait --for=delete command now ignores the not found error correctly. (#96702, @lingsamuel)

    • The kubelet now distinguishes log messages about certificate rotation for its client cert and server cert separately to make debugging problems with one or the other easier. (#101252, @smarterclayton)

    • The serviceOwnsFrontendIP shouldn’t report error when the public IP doesn’t match. (#102516, @nilo19)

    • The system:aggregate-to-edit role no longer includes write access to the Endpoints API. For new Kubernetes 1.22 clusters, the edit and admin roles will no longer include that access. This will have no effect on existing clusters upgrading to Kubernetes 1.22. To retain write access to Endpoints in the aggregated edit and admin roles for newly created 1.22 clusters, refer to https://github.com/kubernetes/website/pull/29025. (#103704, @robscott) [SIG Auth and Network]

    • The conformance tests:

      • Services should serve multiport endpoints from pods
      • Services should serve a basic endpoint from pods

      were only validating the API objects, not performing any validation on the actual Services implementation. Those tests now validate that the Services under test are able to forward traffic to the endpoints. (#101709, @aojea) [SIG Network and Testing]
    • Changed the behavior for Services that have IPFamilyPolicy set to PreferDualstack. The current behavior when the cluster is upgraded to dual-stack is:

      • Services that have been set to IPFamilyPolicy = PreferDualstack will be upgraded when the service object is updated, e.g., when a user changes a label.

      This behavior will change to:

      • Services that have been set to IPFamilyPolicy = PreferDualstack will not be upgraded when the service object is updated. Users can still change policy, type, etc., and existing behaviors remain the same. (#102898, @khenidak) [SIG Network and Testing]
    • The reason and message fields for pod status are no longer reset unless the phase also changes. (#103785, @smarterclayton) [SIG Node]

    • Treat VSphere “File (vmdk path here) was not found” errors as success during volume deletion (#92372, @breunigs) [SIG Cloud Provider and Storage]

    • Update kube-proxy base image debian-iptables to v1.6.2 to pick up the change “debian-iptables: select nft mode if nft lines > legacy lines, matching iptables-wrappers”. (#102590, @BenTheElder)

    • Update klog v2.9.0. (#102332, @pacoxu)

    • Updated the Graceful Node Shutdown Pod termination reason and message. Updated the Graceful Node Shutdown Pod rejection reason and message. (#102840, @Kissy)

    • Updates dependency sigs.k8s.io/structured-merge-diff to v4.1.1. (#100784, @kevindelgado)

    • Updates hostprocess tests to specify user. (#102965, @jsturtevant)

    • Upgrades functionality of kubectl kustomize as described at https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv4.2.0 (#103419, @natasha41575) [SIG CLI]

    • Upgrades functionality of kubectl kustomize as described at kustomize/v4.1.2 (#101120, @monopole)

    • Upgrading etcd: kubeadm upgrade etcd to 3.4.13-3 (#100612, @pacoxu)

    • Use default timeout of 10s for Azure ACR credential provider. (#100686, @hasheddan) [SIG Cloud Provider]

    • We no longer allow the cluster operator to delete any suggested priority & fairness bootstrap configuration object. If a cluster operator removes a suggested configuration, it will be restored by the apiserver. (#102067, @tkashem)

    • When DisableAcceleratorUsageMetrics is set, do not collect accelerator metrics using cAdvisor. (#101712, @SergeyKanzhelev) [SIG Instrumentation and Node]

    • YAML document separators ("---") can now be followed by whitespace and comments ("# ….") on the same line. This fixes a bug where documents starting with a comment after the separator were ignored. Other types of content on the same line will result in an error. (#103457, @codearky) [SIG API Machinery]

    • oc describe quota used has the same unit format as hard (#102177, @atiratree) [SIG CLI]
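
    To illustrate the stricter cpuset parsing noted in the list above (rejecting input such as 1--3 or 10-6), here is a minimal Python sketch. It is not the kubelet's Go implementation; parse_cpuset is a hypothetical helper written for this example, and its exact error messages are assumptions.

```python
def parse_cpuset(text: str) -> set:
    """Parse a cpuset list such as "0-2,5" into a set of CPU ids.

    Hypothetical helper: raises ValueError on malformed elements,
    including doubled dashes ("1--3") and reversed ranges ("10-6").
    """
    cpus = set()
    text = text.strip()
    if not text:
        return cpus
    for part in text.split(","):
        bounds = part.split("-")
        # A valid element is either "N" or "N-M" with decimal digits only.
        if len(bounds) > 2 or not all(b.isdigit() for b in bounds):
            raise ValueError(f"invalid cpuset element: {part!r}")
        start, end = int(bounds[0]), int(bounds[-1])
        if start > end:
            raise ValueError(f"range start exceeds end: {part!r}")
        cpus.update(range(start, end + 1))
    return cpus


if __name__ == "__main__":
    print(sorted(parse_cpuset("0-2,5")))
    for bad in ("1--3", "10-6"):
        try:
            parse_cpuset(bad)
        except ValueError as err:
            print("rejected:", err)
```

    Splitting each element on the dash and checking both the piece count and the ordering of the bounds is enough to catch the two malformed forms mentioned in the note.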

    Other (Cleanup or Flake)

    • Kube-apiserver: sets an upper-bound on the lifetime of idle keep-alive connections and time to read the headers of incoming requests (#103958, @liggitt) [SIG API Machinery and Node]
    • After the deprecation period, the Kubelet’s --chaos-chance flag has been removed. (#101057, @wangyysde) [SIG Node]
    • Allow CSI drivers to just run offline expansion tests. (#102665, @gnufied)
    • Changed buildmode of non static Kubernetes binaries to produce position independent executables (PIE). (#102323, @saschagrunert)
    • Clarified the description of a test in the e2e suite that mentions “SCTP” but is actually intended to be testing the behavior of network plugins that don’t implement SCTP. (#102509, @danwinship)
    • Client-go: reduce verbosity of Starting/Stopping reflector messages to 3 again. (#102788, @pohly)
    • Disable log sampling when using json logging format. (#102620, @serathius)
    • Exposes WithCustomRoundTripper method for specifying a middleware function for custom HTTP behaviour for the delegated auth clients. (#99775, @p0lyn0mial)
    • Fake clients now implement a FakeClient interface (#100940, @markusthoemmes) [SIG API Machinery and Instrumentation]
    • Feature gate ServiceLoadBalancerClass graduates to Beta and is enabled by default. (#103129, @XudongLiuHarold)
    • Improve func ToSelectableFields’ performance for event. (#102461, @goodluckbot)
    • Increased CSINodeIDMaxLength from 128 bytes to 192 bytes. Prepare to increase the length limit to 256 bytes in 1.23 release. (#101256, @Jiawei0227)
    • JSON logging now supports having information about source code location in the logging format, source code information is available under the key “caller”. (#102437, @MadhavJivrajani)
    • Kubeadm: move the BootstrapToken* API and related utilities from v1beta3 to a separate API group/version - bootstraptoken/v1. (#102964, @neolit123) [SIG Cluster Lifecycle]
    • Kubeadm: the CriticalAddonsOnly toleration has been removed from kube-proxy DaemonSet (#101966, @SataQiu) [SIG Cluster Lifecycle]
    • Metrics Server updated to use 0.4.4 image that doesn’t depend on deprecated authorization.k8s.io/v1beta1 subjectaccessreviews API version. (#101477, @x13n)
    • Migrate proxy/ipvs/proxier.go logs to structured logging. (#97796, @JornShen)
    • Migrate staging/src/k8s.io/apiserver/pkg/registry logs to structured logging. (#98287, @lala123912)
    • Migrate some log messages to structured logging in pkg/volume/plugins.go. (#101510, @huchengze)
    • Migrate some log messages to structured logging in pkg/volume/volume_linux.go. (#99566, @huchengze)
    • Official binaries now include the golang generated build ID buildid instead of an empty string. (#101411, @saschagrunert)
    • Remove balanced attached node volumes feature. (#102443, @ravisantoshgudimetla)
    • Remove deprecated --generator flag from kubectl autoscale. (#99900, @MadhavJivrajani)
    • Remove the deprecated flag --generator from kubectl create deployment command. (#99915, @BLasan)
    • Remove the duplicate packet import. (#101187, @chuntaochen)
    • Replace go-bindata with //go:embed. (#99829, @palnabarun)
    • The DynamicFakeClient now exposes its tracker via a Tracker() function. (#100085, @markusthoemmes)
    • The VolumeSnapshotDataSource feature gate, which has been GA since v1.20, is unconditionally enabled and can no longer be specified via the --feature-gates argument. (#101531, @ialidzhikov) [SIG Storage]
    • The deprecated CRIContainerLogRotation feature-gate has been removed, since the CRIContainerLogRotation feature graduated to GA in 1.21 and was unconditionally enabled. (#101578, @carlory)
    • The deprecated RootCAConfigMap feature-gate has been removed, since the RootCAConfigMap feature graduated to GA in 1.21 and is unconditionally enabled. (#101579, @carlory)
    • The deprecated runAsGroup feature-gate has been removed, since the runAsGroup feature graduated to GA in 1.21. (#101581, @carlory)
    • The etcd client has been updated to 3.5.0; github.com/golang/protobuf, google.golang.org/protobuf, and google.golang.org/grpc have been updated to current versions. (#100488, @liggitt)
    • Update Azure Go SDK to v55.0.0. (#102441, @feiskyer)
    • Update Azure Go SDK version to v53.1.0 (#101357, @feiskyer) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle and Instrumentation]
    • Update CNI plugins to v0.9.1. (#102328, @lentzi90)
    • Update Calico to v3.19.1. (#102386, @JornShen)
    • Update cri-tools dependency to v1.21.0. (#100956, @saschagrunert)
    • Update the google/gnostic and google/go-cmp dependencies to v0.5.5 and update the transitive protobuf dependencies. (#102783, @mcbenjemaa)
    • Update golang.org/x/net to v0.0.0-20210520170846-37e1c6afe023 (#103176, @CaoDonghui123) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Node and Storage]
    • Updated command descriptions and examples for grammar and punctuation consistency. (#103524, @bergerhoffer) [SIG Auth and CLI]
    • Updated pause image to version 3.5, which now runs by default as pseudo user and group 65535:65535. This does not have any effect on remote container runtimes like CRI-O and containerd, which set up the pod sandbox user and group on their own. (#100292, @saschagrunert)
    • Upgrade functionality of kubectl kustomize as described at kustomize/v4.1.3. (#102193, @gautierdelorme)

    Dependencies

    Added

    • github.com/antihax/optional: v1.0.0
    • github.com/benbjohnson/clock: v1.0.3
    • github.com/bits-and-blooms/bitset: v1.2.0
    • github.com/certifi/gocertifi: 2c3bb06
    • github.com/checkpoint-restore/go-criu/v5: v5.0.0
    • github.com/cncf/udpa/go: 5459f2c
    • github.com/cockroachdb/errors: v1.2.4
    • github.com/cockroachdb/logtags: eb05cc2
    • github.com/coredns/caddy: v1.1.0
    • github.com/felixge/httpsnoop: v1.0.1
    • github.com/frankban/quicktest: v1.11.3
    • github.com/getsentry/raven-go: v0.2.0
    • github.com/go-kit/log: v0.1.0
    • github.com/gofrs/uuid: v4.0.0+incompatible
    • github.com/josharian/intern: v1.0.0
    • github.com/jpillora/backoff: v1.0.0
    • github.com/nxadm/tail: v1.4.4
    • github.com/opentracing/opentracing-go: v1.1.0
    • github.com/robfig/cron/v3: v3.0.1
    • github.com/stoewer/go-strcase: v1.2.0
    • go.etcd.io/etcd/api/v3: v3.5.0
    • go.etcd.io/etcd/client/pkg/v3: v3.5.0
    • go.etcd.io/etcd/client/v2: v2.305.0
    • go.etcd.io/etcd/client/v3: v3.5.0
    • go.etcd.io/etcd/pkg/v3: v3.5.0
    • go.etcd.io/etcd/raft/v3: v3.5.0
    • go.etcd.io/etcd/server/v3: v3.5.0
    • go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc: v0.20.0
    • go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp: v0.20.0
    • go.opentelemetry.io/contrib: v0.20.0
    • go.opentelemetry.io/otel/exporters/otlp: v0.20.0
    • go.opentelemetry.io/otel/metric: v0.20.0
    • go.opentelemetry.io/otel/oteltest: v0.20.0
    • go.opentelemetry.io/otel/sdk/export/metric: v0.20.0
    • go.opentelemetry.io/otel/sdk/metric: v0.20.0
    • go.opentelemetry.io/otel/sdk: v0.20.0
    • go.opentelemetry.io/otel/trace: v0.20.0
    • go.opentelemetry.io/otel: v0.20.0
    • go.opentelemetry.io/proto/otlp: v0.7.0
    • go.uber.org/goleak: v1.1.10

    Changed

    Removed

    • github.com/agnivade/levenshtein: v1.0.1
    • github.com/alecthomas/template: fb15b89
    • github.com/andreyvit/diff: c7f18ee
    • github.com/bifurcation/mint: 93c51c6
    • github.com/caddyserver/caddy: v1.0.3
    • github.com/cenkalti/backoff: v2.1.1+incompatible
    • github.com/checkpoint-restore/go-criu/v4: v4.1.0
    • github.com/cheekybits/genny: 9127e81
    • github.com/go-acme/lego: v2.5.0+incompatible
    • github.com/go-bindata/go-bindata: v3.1.1+incompatible
    • github.com/go-openapi/analysis: v0.19.5
    • github.com/go-openapi/errors: v0.19.2
    • github.com/go-openapi/loads: v0.19.4
    • github.com/go-openapi/runtime: v0.19.4
    • github.com/go-openapi/spec: v0.19.5
    • github.com/go-openapi/strfmt: v0.19.5
    • github.com/go-openapi/validate: v0.19.8
    • github.com/gobuffalo/here: v0.6.0
    • github.com/hpcloud/tail: v1.0.0
    • github.com/jimstudt/http-authentication: 3eca13d
    • github.com/klauspost/cpuid: v1.2.0
    • github.com/kr/logfmt: b84e30a
    • github.com/kylelemons/godebug: d65d576
    • github.com/lucas-clemente/aes12: cd47fb3
    • github.com/lucas-clemente/quic-clients: v0.1.0
    • github.com/lucas-clemente/quic-go-certificates: d2f8652
    • github.com/lucas-clemente/quic-go: v0.10.2
    • github.com/markbates/pkger: v0.17.1
    • github.com/marten-seemann/qtls: v0.2.3
    • github.com/mholt/certmagic: 6a42ef9
    • github.com/naoina/go-stringutil: v0.1.0
    • github.com/naoina/toml: v0.1.1
    • github.com/robfig/cron: v1.1.0
    • github.com/satori/go.uuid: v1.2.0
    • github.com/thecodeteam/goscaleio: v0.1.0
    • github.com/tidwall/pretty: v1.0.0
    • github.com/vektah/gqlparser: v1.1.2
    • github.com/willf/bitset: v1.1.11
    • go.etcd.io/etcd: dd1b699
    • go.mongodb.org/mongo-driver: v1.1.2
    • gopkg.in/cheggaaa/pb.v1: v1.0.25
    • gopkg.in/fsnotify.v1: v1.4.7
    • gopkg.in/mcuadros/go-syslog.v2: v2.2.1
    • gopkg.in/resty.v1: v1.12.0
    • k8s.io/heapster: v1.2.0-beta.1

    containerlinux 3033.2.2

    Breaking changes

    • CGroupsV2 are enabled by default. Applications might need to be updated if they don’t have support. There are several known issues:
      • Java applications must use JRE >= 15. Please see the OpenJDK upstream issue for more details.

    Security fixes

    Bug fixes

    • SDK: Fixed build error popping up in the new SDK Container because policycoreutils used the wrong ROOT to update the SELinux store (flatcar-linux/coreos-overlay#1502)
    • Fixed leak of SELinux policy store to the root filesystem top directory due to wrong store path in policycoreutils instead of /var/lib/selinux (flatcar-linux/Flatcar#596)
    • Ensured that the /run/xtables.lock coordination file exists for modifications of the xtables backend from containers (must be bind-mounted) or the iptables-legacy binaries on the host (flatcar-linux/init#57)
    • dev container: Fix github URL for coreos-overlay and portage-stable to use repos from flatcar-linux org directly instead of relying on redirects from the kinvolk org. This fixes checkouts with emerge-gitclone inside dev-container. (flatcar-linux/scripts#194)
    • arm64: the Polkit service does not crash anymore. (flatcar-linux/Flatcar#156)
    • toolbox: fixed support for multi-layered docker images (toolbox#5)
    • Run emergency.target on ignition/torcx service unit failure in dracut (bootengine#28)
    • Fix vim warnings on missing file, when built with USE="minimal" (portage-stable#260)
    • The Torcx profile docker-1.12-no got fixed to reference the current Docker version instead of 19.03 which wasn’t found on the image, causing Torcx to fail to provide Docker (PR#1456)
    • Use https protocol instead of git for Github URLs (flatcar-linux/coreos-overlay#1394)

    Changes

    • Backported elf support for iproute2 (flatcar-linux/coreos-overlay#1256)
    • Added GPIO support (coreos-overlay#1236)
    • Enabled SELinux in permissive mode on ARM64 (coreos-overlay#1245)
    • The iptables command now uses the nftables kernel backend instead of the iptables backend. You can also migrate to the nft tool instead of iptables. Containers whose iptables binaries use the iptables backend will mix both kernel backends; this is supported, but you have to look up the rules separately (on the host you can use iptables-legacy and friends).
    • Added missing SELinux rule as initial step to resolve Torcx unpacking issue (coreos-overlay#1426)

    Updates

    calico 3.21.3

    BGP Improvements

    BGP users can now view the status of their BGP routers, including session status, RIB / FIB contents, and agent health, via the new CalicoNodeStatus API. See the API documentation for more details.

    In addition, you can control BGP advertisement of certain prefixes using the new disableBGPExport option on each IP pool, allowing greater control of your route sharing scheme.

    Pull requests:

    • Added Calico node status resource (CalicoNodeStatus) which represents a collection of status information for a node that Calico reports back to the user for use during troubleshooting. libcalico-go #1502 (@song-jiang)
    • Report node BGP status from calico/node. node #1234 (@song-jiang)
    • Add new syncer for BGP status API. typha #662 (@song-jiang)
    • Don’t export BGP routes for IP pools that have disableBGPExport==true confd #647 (@coutinhop)
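
    As a rough illustration of the two BGP features above, the manifests below sketch an IP pool excluded from BGP export and a status request for a single node. All names, the CIDR, and the node name are placeholders chosen for this example, not values from this release; consult the Calico API reference for the authoritative field list.

    ```yaml
    # Hypothetical IP pool whose routes should not be advertised over BGP.
    apiVersion: projectcalico.org/v3
    kind: IPPool
    metadata:
      name: example-no-export-pool   # placeholder name
    spec:
      cidr: 10.200.0.0/24            # placeholder CIDR
      disableBGPExport: true
    ---
    # Ask calico/node to report agent, BGP, and route status for one node.
    apiVersion: projectcalico.org/v3
    kind: CalicoNodeStatus
    metadata:
      name: example-node-status      # placeholder name
    spec:
      node: example-node-0           # placeholder node name
      classes:
        - Agent
        - BGP
        - Routes
      updatePeriodSeconds: 10
    ```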

    Service-based network policy improvements

    In v3.20, we introduced egress policy rules that can match on Kubernetes services. In v3.21, we improved upon that in two ways. First, you can now use service matches in Calico NetworkPolicy and GlobalNetworkPolicy ingress rules. Second, you can now use service-based network policy rules on Windows nodes.

    Pull requests:

    • Policy ingress rules now support service selectors. felix #3024 (@mgleung)
    • Windows data plane support for Service-based network policy rules felix #2917 (@caseydavenport)
    • Allow services to be specified in the Source field of Ingress rules libcalico-go #1517 (@mgleung)
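
    A minimal sketch of an ingress rule that matches on a Kubernetes service might look like the following. The policy name, namespaces, selector, and service name are assumptions made up for the example.

    ```yaml
    # Allow traffic into "backend" pods only from endpoints of the "frontend" service.
    apiVersion: projectcalico.org/v3
    kind: NetworkPolicy
    metadata:
      name: allow-frontend-service   # placeholder name
      namespace: backend             # placeholder namespace
    spec:
      selector: app == 'backend'     # placeholder selector
      ingress:
        - action: Allow
          source:
            services:
              name: frontend         # placeholder service name
              namespace: frontend    # placeholder service namespace
    ```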

    Option to run Calico as non-privileged and non-root

    Calico can now optionally run in non-privileged and non-root mode, with some limitations. See the documentation for more information.

    Pull requests:

    • Change node and supporting binary permissions so that they can be run as a non-root user node #1224 (@mgleung)
    • CNI plugin now sets route_localnet=1 for container interfaces cni-plugin #1168 (@mgleung)
    • CNI plugins now have SUID bit set in order to run as non-root cni-plugin #1168 (@mgleung)

    IPReservations API

    You can use the new IPReservations API to reserve certain IP addresses so that they will not be used by Calico IPAM. This allows for fine-grained control of the IP space in your cluster.

    Pull requests:

    • Add support for IPReservations libcalico-go #1509 (@fasaxc)
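
    For orientation, a reservation could be declared roughly as follows. The resource name and addresses are placeholders for this example; the reserved entries are what Calico IPAM would skip when assigning addresses.

    ```yaml
    # Keep Calico IPAM from handing out these addresses.
    apiVersion: projectcalico.org/v3
    kind: IPReservation
    metadata:
      name: example-reservation      # placeholder name
    spec:
      reservedCIDRs:
        - 192.168.2.3                # single address (placeholder)
        - 10.0.2.0/29                # small range (placeholder)
    ```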

    Bug fixes

    • Fix a serious regression introduced in v3.21.0 where the datastore watcher could get stuck and report stale information in clusters with >500 policies/pods/etc. The bug was triggered by needing to do a resync (for example after an etcd compaction) when there were enough resources to trigger the list pager. calico #5332 (@robbrockbank)
    • Pass ExceptUpgradeService param to stop-calico.ps1 as well node #1372 (@lmm)
    • Restrict Typha server to FIPS compliant cipher suites. typha #696 (@caseydavenport)
    • Fix log spam from Calico upgrade service for Windows node #1343 (@song-jiang)
    • Increase timeout for setting NetworkUnavailable on shutdown node #1341 (@caseydavenport)
    • Fix potential panic and memory leak in kube-controllers caused by adding and subsequently deleting IPAM blocks kube-controllers #912 (@caseydavenport)
    • IPAM GC correctly handles multiple IP addresses allocated with the same handle ID. kube-controllers #903 (@caseydavenport)
    • Fix bug where invalid port structures were being sent to Felix, preventing pods with hostPorts specified from working. libcalico-go #1545 (@caseydavenport)
    • Downgrade repetitive info level logging in calico/node autodetection code node #1237 (@caseydavenport)
    • Updated ubi base images and CentOS repos to stop CVE false positives from being reported. node #1136 (@coutinhop)
    • Fixed typo in umount command pod2daemon #64 (@ScheererJ)
    • Fixed a bug that caused WireGuard stats to be collected even when WireGuard was disabled. Additionally, the wgctrl dependency has been updated, as the previous version caused thread leaks. felix #3057 (@mikestephen)
    • Fix blackhole route table interface matches to handle empty interface regexes. felix #3007 (@robbrockbank)
    • Fix slow performance when updating a Kubernetes namespace when there are many Pods (and in turn, slow startup performance when there are many namespaces). felix #2964 (@fasaxc)
    • Close race condition that could result in an extra IPAM block being allocated to a node. libcalico-go #1488 (@caseydavenport)
    • Fix that podIP annotation could be incorrectly clobbered for stateful set pods: https://github.com/projectcalico/calico/issues/4710 libcalico-go #1472 (@fasaxc)
    • Fix removal of old CNI configuration on name-change cni-plugin #1153 (@caseydavenport)
    • Readiness depends on all syncers typha #613 (@robbrockbank)
    • Exclude RR nodes from BGP full mesh confd #619 (@coutinhop)
    • Fixed a bug in ExternalTrafficPolicy=Local that led to connection stalling. felix #3015 (@tomastigera)
    • Fixed broken connections when a client used the same port to connect to the same backend via a nodeport on different nodes. felix #2983 (@tomastigera)
    • The eBPF mode implementation of DoNotTrack policy was incorrectly allowing an inbound connection through a HostEndpoint, when the HostEndpoint had DoNotTrack policy for the ingress direction but not for egress. For precise compatibility with Calico’s established DoNotTrack semantics, that connection should be disallowed, and now is. (Because of the lack of connection tracking, successful use of DoNotTrack policy to allow flows requires configuring the DoNotTrack policy symmetrically in both directions.) felix #2982 (@neiljerram)

    Other changes

    • Replace github.com/dgrijalva/jwt-go with active fork github.com/golang-jwt/jwt that resolves vulnerability flagged by scanners. libcalico-go #1554 (@lmm)
    • calico/node logs write to /var/log/calico within the container by default, in addition to stdout node #1133 (@song-jiang)
    • Read pod IP information from Amazon VPC CNI annotation, if present on the pod. libcalico-go #1523 (@caseydavenport)
    • Update etcd client version to v3.5.0 libcalico-go #1495 (@Aceralon)
    • Optimize lists and watches made against the Kubernetes API libcalico-go #1484 (@caseydavenport)
    • WorkloadEndpoints now support hostPorts libcalico-go #1471 (@AloysAugustin)
    • Include CNI plugin release v1.0.0 cni-plugin #1141 (@caseydavenport)
    • Allow configuration of num_queues for Calico created veth interfaces cni-plugin #1116 (@arikachen)
    • Typha now gives newly connected clients an extra grace period to catch up after sending the snapshot to reduce the possibility of cyclic disconnects. typha #614 (@fasaxc)
    • Add calico-node upgrade service for upgrades on Windows node #1254 (@lmm)
    • eBPF arm64/aarch64 node #1044 (@frozenprocess)
    • BPF: Endpoints in EndpointsSlices that are not ready are excluded from NAT felix #3017 (@tomastigera)
    • Calico’s eBPF dataplane now fully implements DoNotTrack policy felix #2910 (@neiljerram)
    • Add HostPort support in the gRPC dataplane cni-plugin #1119 (@AloysAugustin)

    app-operator 5.6.0

    Added

    • Support watching app CRs in organization namespace with cluster label selector.

    Changed

    • Get tarball URL for chart CRs from index.yaml for better community app catalog support.

    Fixed

    • Embed Chart CRD in app-operator to prevent hitting GitHub API rate limits.
    • When bootstrapping chart-operator the helm release should not include the cluster ID.
    • Fix getting kubeconfig in chart CR watcher.
    • Fix error handling in chart CR watcher when chart CRD not installed.

    aws-cni 1.10.2

    Upgraded from version 1.10.1. Please check upstream changelog for details.

    aws-ebs-csi-driver 2.8.1

    Fixed

    • Use node selector according to control-plane and nodepool labels.

    aws-operator 10.17.0

    Added

    • New flatcar releases.
    • Add support for feature that enables forcing cgroups v1 for Flatcar version 3033.2.0 and above.

    Changed

    • Bumped k8scloudconfig to disable rpc-statd service.
    • Max pods setting per for new EC2 instances.
    • Bump etcd-cluster-migrator version to v1.1.0.
    • Bump k8scloudconfig version to v11.0.1.
    • Changes to EncryptionConfig in order to fully work with encryption-provider-operator.

    Fixed

    • Autoselect region ARN for ebs snapshots.

    cluster-autoscaler 1.22.2-gs3

    Added

    • Added support for specifying balance-similar-node-groups flag

    Changed

    • Updated cluster-autoscaler to version 1.22.2.

    Fixed

    • Fix RBAC for version 1.22.

    external-dns 2.9.0

    This release contains some changes to mitigate rate limiting on AWS clusters. Please take note of the defaults for values aws.batchChangeInterval, aws.zonesCacheDuration, externalDNS.interval and externalDNS.minEventSyncInterval.

    If you already specify --aws-batch-change-interval or --aws-zones-cache-duration, please migrate to the new values aws.batchChangeInterval and aws.zonesCacheDuration.

    Added

    • Allow setting --aws-batch-change-interval through the aws.batchChangeInterval value. Default 10s.
    • Allow setting --aws-zones-cache-duration through the aws.zonesCacheDuration value. Default 3h.

    Changed

    • Set default externalDNS.interval to 5m.
    • Set default externalDNS.minEventSyncInterval to 30s.
    • Allow setting Route53 credentials (externalDNS.aws_access_key_id and externalDNS.aws_secret_access_key) independent of the aws.access value.
    • Allow setting the AWS default region (aws.region) independent of the aws.access value.
    • Allow omitting the --domain-filter flag completely by setting externalDNS.domainFilterList to null.
    • Add ability to specify extra arguments to the external-dns deployment through externalDNS.extraArgs.
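
    As a sketch only, the values named above could be combined in a user values override along these lines. The intervals simply restate the defaults listed in these notes. The domainFilterList entry is included only to illustrate omitting --domain-filter, so drop it if you rely on a domain filter.

    ```yaml
    # Illustrative values override for the external-dns app.
    aws:
      batchChangeInterval: 10s     # maps to --aws-batch-change-interval
      zonesCacheDuration: 3h       # maps to --aws-zones-cache-duration
    externalDNS:
      interval: 5m
      minEventSyncInterval: 30s
      domainFilterList: null       # omits the --domain-filter flag entirely
    ```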

    kiam-watchdog 0.5.1

    Added

    • Added --probe-mode flag to allow using either ‘route53’ or ‘sts’ to probe AWS API.

    Fixed

    • Fix sts successful check.

    kube-state-metrics 1.7.0

    Changed

    • Raise priorityClass to system-cluster-critical to increase scheduling chances in master-only clusters.

    Fixed

    • Fixed missing labels from kube__labels

    vertical-pod-autoscaler 2.1.1

    Fixed

    • Fix naming of VPA deployments in workload clusters.