Workload cluster releases for Azure

  • This release fixes a bug that could cause the azure-cloud-controller-manager app to be configured with the wrong Pod CIDR. Upgrading to this release from 18.0.0 does not require rolling nodes.

    Change details

    cluster-operator 4.6.2

    Fixed

    • Use AzureConfig’s Spec.Azure.VirtualNetwork.CalicoSubnetCIDR field for Calico CIDR rather than Spec.Cluster.Calico.Subnet.
  • This is the first Azure release featuring Kubernetes 1.23. Furthermore, this release is the first to use out-of-tree controller manager and CSI providers.

    Change details

    azure-operator 6.0.2

    Fixed

    • Ensure giantswarm system user is always created first.

    Changed

    • Support for k8s 1.23.
    • Config changes needed to run out-of-tree controller manager and CSI providers.

    kubernetes 1.23.9

    Bug or Regression

    • Fix a bug that caused the wrong result length when using --chunk-size and --selector together (#110757, @Abirdcfly) [SIG API Machinery and Testing]
    • Fix bug that prevented the job controller from enforcing activeDeadlineSeconds when set (#110545, @harshanarayana) [SIG Apps]
    • Fix image pulling failure when IMDS is unavailable in kubelet startup (#110523, @andyzhangx) [SIG Cloud Provider]
    • Fix printing resources with int64 fields (#110602, @sanchezl) [SIG API Machinery]
    • Fixed a regression introduced in 1.23.0 where Azure load balancers were not kept up to date with the state of cluster nodes. In particular, nodes that are not in the ready state and are not newly created (i.e. not having the node.cloudprovider.kubernetes.io/uninitialized taint) now get removed from Azure load balancers. (#109932, @ricky-rav) [SIG Cloud Provider]
    • Fixed potential scheduler crash when scheduling with unsatisfied nodes in PodTopologySpread. (#110853, @kerthcet) [SIG Scheduling]
    • Kubeadm: fix a bug where the configurable KubernetesVersion was not respected during kubeadm join (#111022, @SataQiu) [SIG Cluster Lifecycle]
    • Reduced time taken to sync proxy rules on Windows kube-proxy with kernelspace mode (#110702, @daschott) [SIG Network and Windows]
    • Updated cAdvisor to v0.43.1 to pick up a kubelet fix where network metrics can be missing in some cases when used with containerd (#111013, @bobbypage) [SIG Node]
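
    The Azure load balancer fix in the list above boils down to a simple rule about which nodes are removed. The following is a minimal sketch of that rule only, not the cloud provider's actual code; the Node class and the function name are hypothetical, and only the taint key comes from the changelog entry.

```python
from dataclasses import dataclass, field

# Taint key quoted in the changelog entry; marks nodes that are newly created.
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"


@dataclass
class Node:
    """Hypothetical, simplified stand-in for a Kubernetes Node object."""
    name: str
    ready: bool
    taints: set = field(default_factory=set)


def should_remove_from_load_balancer(node: Node) -> bool:
    """Return True for nodes that are not ready and not newly created."""
    newly_created = UNINITIALIZED_TAINT in node.taints
    return (not node.ready) and (not newly_created)
```

    Under this sketch, a not-ready node that still carries the uninitialized taint stays in the load balancer, because it is treated as newly created.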

    Dependencies

    Added

    Nothing has changed.

    Changed

    Removed

    Nothing has changed.

    chart-operator 2.29.0

    Upgraded from version 2.24.1.

    Changed

    • Added support for k8s 1.24.
    • Allow running on clusters where critical workloads are not running yet.

    Fixed

    • Better handling of priority class creation.

    coredns 1.11.0

    Changed

    • Update coredns to upstream version 1.9.3.

    external-dns 2.15.2

    Changed

    • Bumped base image to address security issues.

    kube-state-metrics 1.12.1

    Changed

    • Bump to upstream version 2.5.0.

    metrics-server 1.8.0

    Changed

    • Updated metrics-server version to 0.6.1.

    cluster-autoscaler 1.23.1

    Changed

    • Update cluster-autoscaler to version 1.23.1.

    vertical-pod-autoscaler 2.5.0

    Changed

    • Upgrade vertical-pod-autoscaler to 0.11.0

      Potentially breaking change:

      • Added validation - CPU values will be accepted only with a resolution of 1 mCPU, memory values only with a resolution of 1 byte

      Other changes:

      • Switch to go 1.16
      • Admission controller now logs when it fails to start
      • Increase resolution of admission_latency_seconds metric
      • Reduce verbosity of some logs
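
    The potentially breaking validation above can be illustrated with a small check. This is a sketch under stated assumptions, not the vertical-pod-autoscaler implementation: values are assumed to be already converted to cores and bytes, and both function names are hypothetical.

```python
from decimal import Decimal

MILLI_CPU = Decimal("0.001")  # 1 mCPU expressed in cores


def cpu_has_valid_resolution(cores: Decimal) -> bool:
    """True if the CPU value is a whole number of mCPU."""
    return cores % MILLI_CPU == 0


def memory_has_valid_resolution(size_in_bytes: Decimal) -> bool:
    """True if the memory value is a whole number of bytes."""
    return size_in_bytes % 1 == 0
```

    For example, a CPU value of 0.25 cores (250 mCPU) passes this check, while 0.0005 cores (half an mCPU) does not.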

    app-operator 6.3.0

    Added

    • Added support for k8s 1.24.
    • Watch config maps and secrets listed in the extraConfigs section of App CR for multi layer configs, see: https://github.com/giantswarm/rfc/tree/main/multi-layer-app-config#enhancing-app-cr
    • If no userconfig configmap or secret reference is specified but one is found following the default naming convention (*-user-values / *-user-secrets) then the App resource is updated to reference the found configmap/secret.
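
    The default naming convention in the last item can be sketched as follows. This is an illustration only: it assumes the "*" prefix is the App name, and the function names and the dictionary keys are hypothetical rather than app-operator's real API.

```python
def default_user_config_names(app_name: str) -> dict:
    """Default names, assuming the '*' prefix is the App name."""
    return {
        "configMap": f"{app_name}-user-values",
        "secret": f"{app_name}-user-secrets",
    }


def resolve_user_config(app_name: str, referenced: dict, existing_names: set) -> dict:
    """Fill in missing references when a default-named resource exists."""
    resolved = dict(referenced)
    for kind, default_name in default_user_config_names(app_name).items():
        # Only fall back to the default when no explicit reference is set.
        if not resolved.get(kind) and default_name in existing_names:
            resolved[kind] = default_name
    return resolved
```

    Explicit references are left untouched; only missing ones are filled in from resources that already exist.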

    cluster-operator 4.6.1

    Changed

    • Enable cni.install mode for Chart Operator.

    Fixed

    • Fix handling of managed apps renamed to remove the -app suffix.

    calico 3.21.6

    Please refer to the upstream changelog.

    containerlinux 3227.2.1

    New Stable Release 3227.2.1

    Changes since Stable 3227.2.0

    Security fixes:

    Bug fixes:

    • Added support for Openstack for cloud-init activation (flatcar-linux/init#76)
    • Excluded Wireguard interface from systemd-networkd default management (Flatcar#808)
    • Fixed /etc/resolv.conf symlink by pointing it at resolv.conf instead of stub-resolv.conf. This bug was present since the update to systemd v250 (coreos-overlay#2057)
    • Fixed excluded interface type from default systemd-networkd configuration (flatcar-linux/init#78)
    • Fixed space escaping in the networkd Ignition translation (Flatcar#812)

    Updates:

    kubernetes 1.23.9

    Upgraded from version 1.22. Please refer to the official changelog for all details.

    What’s New (Major Themes)

    Deprecation of FlexVolume

    FlexVolume is deprecated. Out-of-tree CSI driver is the recommended way to write volume drivers in Kubernetes. See this doc for more information. Maintainers of FlexVolume drivers should implement a CSI driver and move users of FlexVolume to CSI. Users of FlexVolume should move their workloads to CSI driver.

    Deprecation of klog specific flags

    To simplify the code base, several logging flags got marked as deprecated in Kubernetes 1.23. The code which implements them will be removed in a future release, so users of those need to start replacing the deprecated flags with some alternative solutions.

    Software Supply Chain SLSA Level 1 Compliance in the Kubernetes Release Process

    Kubernetes releases are now generating provenance attestation files describing the staging and release phases of the release process and artifacts are verified as they are handed over from one phase to the next. This final piece completes the work needed to comply with Level 1 of the SLSA security framework (Supply-chain Levels for Software Artifacts).

    IPv4/IPv6 Dual-stack Networking graduates to GA

    IPv4/IPv6 dual-stack networking graduates to GA. Since 1.21, Kubernetes clusters are enabled to support dual-stack networking by default. In 1.23, the IPv6DualStack feature gate is removed. The use of dual-stack networking is not mandatory. Although clusters are enabled to support dual-stack networking, Pods and Services continue to default to single-stack. To use dual-stack networking, Kubernetes nodes must have routable IPv4/IPv6 network interfaces, a dual-stack capable CNI network plugin must be used, Pods must be configured to be dual-stack, and Services must have their .spec.ipFamilyPolicy field set to either PreferDualStack or RequireDualStack.
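
    As an illustration of the Service requirement, the sketch below represents a Service manifest as a Python dictionary and checks whether it requests dual-stack. The Service name, selector and port are hypothetical, and the helper function is not part of any Kubernetes client.

```python
DUAL_STACK_POLICIES = {"PreferDualStack", "RequireDualStack"}

# Hypothetical Service manifest expressed as a dictionary.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "example"},
    "spec": {
        "ipFamilyPolicy": "PreferDualStack",
        "selector": {"app": "example"},
        "ports": [{"port": 80}],
    },
}


def requests_dual_stack(svc: dict) -> bool:
    """True if the Service explicitly asks for dual-stack networking."""
    return svc.get("spec", {}).get("ipFamilyPolicy") in DUAL_STACK_POLICIES
```

    A Service that omits the field does not count as dual-stack in this sketch, matching the single-stack default described above.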

    HorizontalPodAutoscaler v2 graduates to GA

    Version 2 of the HorizontalPodAutoscaler API graduates to stable in the 1.23 release. The HorizontalPodAutoscaler autoscaling/v2beta2 API is deprecated in favor of the new autoscaling/v2 API, which the Kubernetes project recommends for all use cases.
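
    To find manifests that still use the deprecated API version, a check along these lines may help. It is a sketch that only inspects apiVersion and kind in manifests already loaded as dictionaries; it does not convert any fields, and the function name is hypothetical.

```python
DEPRECATED_HPA_API = "autoscaling/v2beta2"


def deprecated_hpas(manifests: list) -> list:
    """Return the names of HorizontalPodAutoscalers on the deprecated API."""
    return [
        m["metadata"]["name"]
        for m in manifests
        if m.get("kind") == "HorizontalPodAutoscaler"
        and m.get("apiVersion") == DEPRECATED_HPA_API
    ]
```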

    This release does not deprecate the v1 HorizontalPodAutoscaler API.

    Generic Ephemeral Volume feature graduates to GA

    The generic ephemeral volume feature moved to GA in 1.23. This feature allows any existing storage driver that supports dynamic provisioning to be used as an ephemeral volume with the volume’s lifecycle bound to the Pod. All StorageClass parameters for volume provisioning and all features supported with PersistentVolumeClaims are supported.

    Skip Volume Ownership change graduates to GA

    The feature to configure volume permission and ownership change policy for Pods moved to GA in 1.23. This allows users to skip recursive permission changes on mount and speeds up the pod start up time.
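
    The policy is set through the fsGroupChangePolicy field of a Pod's securityContext. The sketch below shows such a security context as a Python dictionary; the fsGroup value is an arbitrary example and the helper function is hypothetical.

```python
# Pod-level securityContext expressed as a dictionary; 2000 is an arbitrary group ID.
pod_security_context = {
    "fsGroup": 2000,
    "fsGroupChangePolicy": "OnRootMismatch",
}


def may_skip_recursive_change(security_context: dict) -> bool:
    """True if the policy allows skipping the recursive ownership change."""
    # "Always" (also the behaviour when the field is unset) changes ownership
    # on every mount; "OnRootMismatch" only does so when the volume root
    # does not already match.
    return security_context.get("fsGroupChangePolicy") == "OnRootMismatch"
```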

    Allow CSI drivers to opt-in to volume ownership and permission change graduates to GA

    The feature to allow CSI Drivers to declare support for fsGroup based permissions graduates to GA in 1.23.

    PodSecurity graduates to Beta

    PodSecurity moves to Beta. PodSecurity replaces the deprecated PodSecurityPolicy admission controller. PodSecurity is an admission controller that enforces Pod Security Standards on Pods in a Namespace based on specific namespace labels that set the enforcement level. In 1.23, the PodSecurity feature gate is enabled by default.
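
    The enforcement level is set with the pod-security.kubernetes.io/enforce label on the Namespace. The following sketch builds such a Namespace manifest as a Python dictionary; the function name and the namespace name used with it are hypothetical.

```python
# Levels defined by the Pod Security Standards.
POD_SECURITY_LEVELS = ("privileged", "baseline", "restricted")


def namespace_with_enforcement(name: str, level: str) -> dict:
    """Build a Namespace manifest that enforces the given level."""
    if level not in POD_SECURITY_LEVELS:
        raise ValueError(f"unknown Pod Security level: {level!r}")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {"pod-security.kubernetes.io/enforce": level},
        },
    }
```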

    Container Runtime Interface (CRI) v1 is default

    The Kubelet now supports the CRI v1 API, which is now the project-wide default. If a container runtime does not support the v1 API, Kubernetes will fall back to the v1alpha2 implementation. There is no intermediate action required by end-users, because v1 and v1alpha2 do not differ in their implementation. It is likely that v1alpha2 will be removed in one of the future Kubernetes releases to be able to develop v1.

    Structured logging graduates to Beta

    Structured logging reached its Beta milestone. Most log messages from kubelet and kube-scheduler have been converted. Users are encouraged to try out JSON output or parsing of the structured text format and provide feedback on possible solutions for the open issues, such as handling of multi-line strings in log values.

    Simplified Multi-point plugin configuration for scheduler

    The kube-scheduler is adding a new, simplified config field for Plugins to allow multiple extension points to be enabled in one spot. The new multiPoint plugin field is intended to simplify most scheduler setups for administrators. Plugins that are enabled via multiPoint will automatically be registered for each individual extension point that they implement. For example, a plugin that implements Score and Filter extensions can be simultaneously enabled for both. This means entire plugins can be enabled and disabled without having to manually edit individual extension point settings. These extension points can now be abstracted away due to their irrelevance for most users.
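
    The expansion described above can be sketched as follows. This only illustrates the idea and is not kube-scheduler code: the plugin names, the extension point names used as keys, and the function are all hypothetical.

```python
# Hypothetical plugins mapped to the extension points they implement.
PLUGIN_EXTENSION_POINTS = {
    "ExamplePlugin": ["filter", "score"],
    "OtherPlugin": ["preFilter"],
}


def expand_multipoint(enabled: list, implemented: dict) -> dict:
    """Register each enabled plugin at every extension point it implements."""
    registered = {}
    for plugin in enabled:
        for point in implemented[plugin]:
            registered.setdefault(point, []).append(plugin)
    return registered
```

    Enabling ExamplePlugin once through multiPoint registers it for both filter and score here, without listing either extension point by hand.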

    CSI Migration updates

    CSI Migration enables the replacement of existing in-tree storage plugins such as kubernetes.io/gce-pd or kubernetes.io/aws-ebs with a corresponding CSI driver. If CSI Migration is working properly, Kubernetes end users shouldn’t notice a difference. After migration, Kubernetes users may continue to rely on all the functionality of in-tree storage plugins using the existing interface.

    • CSI Migration feature is turned on by default but stays in Beta for GCE PD, AWS EBS, and Azure Disk in 1.23.
    • CSI Migration is introduced as an Alpha feature for Ceph RBD and Portworx in 1.23.

    Urgent Upgrade Notes

    (No, really, you MUST read this before you upgrade)
    • Kubeadm: remove the deprecated flag --experimental-patches for the init|join|upgrade commands. The flag --patches is no longer allowed in a mixture with the flag --config. Please use the kubeadm configuration for setting patches for a node using {Init|Join}Configuration.patches. (#104065, @pacoxu)
    • Log messages in JSON format are written to stderr by default now (same as text format) instead of stdout. Users who expected JSON output on stdout must now capture stderr instead of or in addition to stdout. (#106146, @pohly) [SIG API Machinery, Architecture, Cluster Lifecycle and Instrumentation]
    • Support for the seccomp annotations seccomp.security.alpha.kubernetes.io/pod and container.seccomp.security.alpha.kubernetes.io/[name] has been deprecated since 1.19 and will be dropped in 1.25. Transition to using the seccompProfile API field. (#104389, @saschagrunert)
    • kube-log-runner is included in release tarballs. It can be used to replace the deprecated --log-file parameter. (#106123, @pohly) [SIG API Machinery, Architecture, Cloud Provider, Cluster Lifecycle and Instrumentation]
    • Kubernetes is built using golang 1.17. This version of go removes the ability to use a GODEBUG=x509ignoreCN=0 environment setting to re-enable deprecated legacy behavior of treating the CommonName of X.509 serving certificates as a host name. This behavior has been disabled by default since Kubernetes 1.19 / go 1.15. Serving certificates used by admission webhooks, custom resource conversion webhooks, and aggregated API servers must now include valid Subject Alternative Names. If you are running Kubernetes 1.22 with GODEBUG=x509ignoreCN=0 set, check the apiserver_kube_aggregator_x509_missing_san_total and apiserver_webhooks_x509_missing_san_total metrics for non-zero values to see if the API server is connecting to webhooks or aggregated API servers using certificates that will be considered invalid in Kubernetes 1.23+.
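
    The metrics check from the last item can be sketched as below. The code assumes the API server metrics have already been fetched as Prometheus text and that samples carry no trailing timestamp; the function name is hypothetical, and only the two metric names come from the note above.

```python
MISSING_SAN_METRICS = (
    "apiserver_kube_aggregator_x509_missing_san_total",
    "apiserver_webhooks_x509_missing_san_total",
)


def nonzero_missing_san_metrics(metrics_text: str) -> dict:
    """Return the missing-SAN metrics that have a non-zero value."""
    found = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]  # drop any label set
        if name in MISSING_SAN_METRICS and float(value) != 0:
            found[name] = found.get(name, 0.0) + float(value)
    return found
```

    An empty result suggests that no certificates without Subject Alternative Names were seen; any entry points at a webhook or aggregated API server certificate that needs replacing before the upgrade.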

    Known Issues

    Etcd v3.5.[0-2] data corruption

    A data corruption issue was found in the etcd v3.5.0 release that shipped with Kubernetes 1.22. Please read up-to-date production recommendations for etcd.

    Deprecation
    • A deprecation notice has been added when using the kube-proxy userspace proxier, which will be removed in v1.25. (#103860) (#104631, @perithompson)
    • Added apiserver_longrunning_requests metric to replace the soon to be deprecated apiserver_longrunning_gauge metric. (#103799, @jyz0309)
    • Controller-manager: the following flags have no effect and would be removed in v1.24:
      • --port
      • --address

      The insecure port flag --port may only be set to 0 now.
    • Kube-scheduler: the --port and --address flags have no effect and would be removed in v1.24. The insecure port flags --port may only be set to 0 now. Also metricsBindAddress and healthzBindAddress fields from kubescheduler.config.k8s.io/v1beta1 are no-op and expected to be empty. Removed in kubescheduler.config.k8s.io/v1beta2 completely. (#96345, @ingvagabund) In addition, please be careful that:
      • kube-scheduler MUST start with --authorization-kubeconfig and --authentication-kubeconfig correctly set to get authentication/authorization working.
      • liveness/readiness probes to kube-scheduler MUST use HTTPS now, and the default port has been changed to 10259.
      • Applications that fetch metrics from kube-scheduler should use a dedicated service account which is allowed to access nonResourceURLs /metrics. (#96345, @ingvagabund) [SIG Cloud Provider, Scheduling and Testing]
    • Feature-gate VolumeSubpath has been deprecated and cannot be disabled. It will be completely removed in 1.25 (#105474, @mauriciopoppe)
    • Kubeadm: add a new output/v1alpha2 API that is identical to the output/v1alpha1, but attempts to resolve some internal dependencies with the kubeadm/v1beta2 API. The output/v1alpha1 API is now deprecated and will be removed in a future release. (#105295, @neolit123)
    • Kubeadm: add the kubeadm specific, Alpha (disabled by default) feature gate UnversionedKubeletConfigMap. When this feature is enabled kubeadm will start using a new naming format for the ConfigMap where it stores the KubeletConfiguration structure. The old format included the Kubernetes version - “kube-system/kubelet-config-1.22”, while the new format does not - “kube-system/kubelet-config”. A similar formatting change is done for the related RBAC rules. The old format is now DEPRECATED and will be removed after the feature graduates to GA. When writing the ConfigMap kubeadm (init, upgrade apply) will respect the value of UnversionedKubeletConfigMap, while when reading it (join, reset, upgrade), it would attempt to use new format first and fallback to the legacy format if needed. (#105741, @neolit123) [SIG Cluster Lifecycle and Testing]
    • Kubeadm: remove the deprecated / NO-OP phase update-cluster-status in kubeadm reset (#105888, @neolit123)
    • Remove ‘master’ as a valid EgressSelection type in the EgressSelectorConfiguration API. (#102242, @pacoxu)
    • Removed kubectl --dry-run empty default value and boolean values. kubectl --dry-run usage must be specified with --dry-run=(server|client|none). (#105327, @julianvmodesto)
    • Removed deprecated metric scheduler_volume_scheduling_duration_seconds. (#104518, @dntosas)
    • The deprecated --experimental-bootstrap-kubeconfig flag has been removed. This can be set via --bootstrap-kubeconfig. (#103172, @niulechuan)
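
    Scripts that still pass the removed forms of kubectl --dry-run can be spotted with a check like the one below. It is a sketch that assumes the value is always attached with "=" (a bare flag followed by a separate value is reported as the legacy form); the function name is hypothetical.

```python
VALID_DRY_RUN_VALUES = {"server", "client", "none"}


def legacy_dry_run_args(argv: list) -> list:
    """Return the --dry-run arguments that are no longer accepted."""
    bad = []
    for arg in argv:
        if arg == "--dry-run":
            bad.append(arg)  # empty default value was removed
        elif arg.startswith("--dry-run="):
            value = arg.split("=", 1)[1]
            if value not in VALID_DRY_RUN_VALUES:
                bad.append(arg)  # e.g. boolean values such as "true"
    return bad
```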

    API Change

    • A new field omitManagedFields has been added to both audit.Policy and audit.PolicyRule so cluster operators can opt in to omit managed fields of the request and response bodies from being written to the API audit log. (#94986, @tkashem) [SIG API Machinery, Auth, Cloud Provider and Testing]
    • A small regression in Service updates was fixed. The circumstances are so unlikely that probably nobody would ever hit it. (#104601, @thockin)
    • Added a feature gate StatefulSetAutoDeletePVC, which allows PVCs automatically created for StatefulSet pods to be automatically deleted. (#99728, @mattcary)
    • Client-go impersonation config can specify a UID to pass impersonated uid information through in requests. (#104483, @margocrawf)
    • Create HPA v2 from v2beta2 with some fields changed. (#102534, @wangyysde) [SIG API Machinery, Apps, Auth, Autoscaling and Testing]
    • Ephemeral containers graduated to beta and are now available by default. (#105405, @verb)
    • Fix kube-proxy regression on UDP services because the logic to detect stale connections was not considering if the endpoint was ready. (#106163, @aojea) [SIG API Machinery, Apps, Architecture, Auth, Autoscaling, CLI, Cloud Provider, Contributor Experience, Instrumentation, Network, Node, Release, Scalability, Scheduling, Storage, Testing and Windows]
    • If a conflict occurs when creating an object with generateName, the server now returns an “AlreadyExists” error with a retry option. (#104699, @vincepri)
    • Implement support for recovering from volume expansion failures (#106154, @gnufied) [SIG API Machinery, Apps and Storage]
    • In kubelet, log verbosity and flush frequency can also be configured via the configuration file and not just via command line flags. In other commands (kube-apiserver, kube-controller-manager), the flags are listed in the “Logs flags” group and not under “Global” or “Misc”. The type for -vmodule was made a bit more descriptive (pattern=N,... instead of moduleSpec). (#106090, @pohly) [SIG API Machinery, Architecture, CLI, Cluster Lifecycle, Instrumentation, Node and Scheduling]
    • Introduce OS field in the PodSpec (#104693, @ravisantoshgudimetla)
    • Introduce v1beta3 API for scheduler. This version
      • increases the weight of user specifiable priorities. The weights of following priority plugins are increased

        • TaintTolerations to 3 - as leveraging node tainting to group nodes in the cluster is becoming a widely-adopted practice
        • NodeAffinity to 2
        • InterPodAffinity to 2
      • Won’t have HealthzBindAddress, MetricsBindAddress fields (#104251, @ravisantoshgudimetla)

    • Introduce v1beta2 for Priority and Fairness with no changes in API spec. (#104399, @tkashem)
    • JSON log output is configurable and now supports writing info messages to stdout and error messages to stderr. Info messages can be buffered in memory. The default is to write both to stdout without buffering, as before. (#104873, @pohly)
    • JobTrackingWithFinalizers graduates to beta. Feature is enabled by default. (#105687, @alculquicondor)
    • Kube-apiserver: Fixes handling of CRD schemas containing literal null values in enums. (#104969, @liggitt)
    • Kube-apiserver: The rbac.authorization.k8s.io/v1alpha1 API version is removed; use the rbac.authorization.k8s.io/v1 API, available since v1.8. The scheduling.k8s.io/v1alpha1 API version is removed; use the scheduling.k8s.io/v1 API, available since v1.14. (#104248, @liggitt)
    • Kube-scheduler: support for configuration file version v1beta1 is removed. Update configuration files to v1beta2 (xref: https://github.com/kubernetes/enhancements/issues/2901) or v1beta3 before upgrading to 1.23. (#104782, @kerthcet)
    • KubeSchedulerConfiguration provides a new field MultiPoint which will register a plugin for all valid extension points (#105611, @damemi) [SIG Scheduling and Testing]
    • Kubelet should reject pods whose OS doesn’t match the node’s OS label. (#105292, @ravisantoshgudimetla) [SIG Apps and Node]
    • Kubelet: turn the KubeletConfiguration v1beta1 ResolverConfig field from a string to *string. (#104624, @Haleygo)
    • Kubernetes is now built using go 1.17. (#103692, @justaugustus)
    • Performs strict server side schema validation requests via the fieldValidation=[Strict,Warn,Ignore]. (#105916, @kevindelgado)
    • Promote IPv6DualStack feature to stable. Controller Manager flags for the node IPAM controller have slightly changed:
      1. When configuring a dual-stack cluster, the user must specify both --node-cidr-mask-size-ipv4 and --node-cidr-mask-size-ipv6 to set the per-node IP mask sizes, instead of the previous --node-cidr-mask-size flag.
      2. The --node-cidr-mask-size flag is mutually exclusive with --node-cidr-mask-size-ipv4 and --node-cidr-mask-size-ipv6.
      3. Single-stack clusters do not need to change, but may choose to use the more specific flags. Users can use either the older --node-cidr-mask-size flag or one of the newer --node-cidr-mask-size-ipv4 or --node-cidr-mask-size-ipv6 flags to configure the per-node IP mask size, provided that the flag’s IP family matches the cluster’s IP family (--cluster-cidr). (#104691, @khenidak)
    • Remove NodeLease feature gate that was graduated and locked to stable in 1.17 release. (#105222, @cyclinder)
    • Removed deprecated --seccomp-profile-root/seccompProfileRoot config. (#103941, @saschagrunert)
    • Since golang 1.17, both net.ParseIP and net.ParseCIDR reject leading zeros in the dot-decimal notation of IPv4 addresses. Kubernetes will keep allowing leading zeros in IPv4 addresses so as not to break compatibility. IMPORTANT: Kubernetes interprets leading zeros in IPv4 addresses as decimal; users must not rely on parser alignment to avoid being impacted by the associated security advisory: CVE-2021-29923 golang standard library “net” - Improper Input Validation of octal literals in golang 1.16.2 and below standard library “net” results in indeterminate SSRF & RFI vulnerabilities. Reference: https://nvd.nist.gov/vuln/detail/CVE-2021-29923 (#104368, @aojea)
    • StatefulSet minReadySeconds is promoted to beta. (#104045, @ravisantoshgudimetla)
    • Support pod priority based node graceful shutdown. (#102915, @wzshiming)
    • The “Generic Ephemeral Volume” feature graduates to GA. It is now enabled unconditionally. (#105609, @pohly)
    • The Kubelet’s --register-with-taints option is now available via the Kubelet config file field registerWithTaints (#105437, @cmssczy) [SIG Node and Scalability]
    • The CSIDriver.Spec.StorageCapacity can now be modified. (#101789, @pohly)
    • The CSIVolumeFSGroupPolicy feature has moved from beta to GA. (#105940, @dobsonj)
    • The IngressClass.Spec.Parameters.Namespace field is now GA. (#104636, @hbagdi)
    • The Service.spec.ipFamilyPolicy field is now required in order to create or update a Service as dual-stack. This is a breaking change from the beta behavior. Previously the server would try to infer the value of that field from either ipFamilies or clusterIPs, but that caused ambiguity on updates. Users who want a dual-stack Service MUST specify ipFamilyPolicy as either “PreferDualStack” or “RequireDualStack”. (#96684, @thockin)
    • The TTLAfterFinished feature gate is now GA and enabled by default. (#105219, @sahilvv)
    • The kube-controller-manager supports --concurrent-ephemeralvolume-syncs flag to set the number of ephemeral volume controller workers. (#102981, @SataQiu)
    • The legacy scheduler policy config is removed in v1.23, the associated flags policy-config-file, policy-configmap, policy-configmap-namespace and use-legacy-policy-config are also removed. Migrate to Component Config instead, see https://kubernetes.io/docs/reference/scheduling/config/ for details. (#105424, @kerthcet)
    • Track the number of Pods with a Ready condition in Job status. The feature is alpha and needs the feature gate JobReadyPods to be enabled. (#104915, @alculquicondor)
    • Users of LogFormatRegistry in component-base must update their code to use the logr v1.0.0 API. The JSON log output now uses the format from go-logr/zapr (no v field for error messages, additional information for invalid calls) and has some fixes (correct source code location for warnings about invalid log calls). (#104103, @pohly)
    • Validation rules for Custom Resource Definitions can be written in the CEL expression language using the x-kubernetes-validations extension in OpenAPIv3 schemas (alpha). This is gated by the alpha “CustomResourceValidationExpressions” feature gate. (#106051, @jpbetz) [SIG API Machinery, Architecture, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Node, Storage and Testing]
    • Fix OpenAPI serialization of the x-kubernetes-validations field (#108030, @liggitt) [SIG API Machinery]
    • Fixes a regression in v1beta1 PodDisruptionBudget handling of “strategic merge patch”-type API requests for the selector field. Prior to 1.21, these requests would merge matchLabels content and replace matchExpressions content. In 1.21, patch requests touching the selector field started replacing the entire selector. This is consistent with server-side apply and the v1 PodDisruptionBudget behavior, but should not have been changed for v1beta1. (#108139, @liggitt) [SIG Auth and Testing]
    • Omits alpha-level enums from the static openapi file captured in api/openapi-spec (#109179, @liggitt) [SIG Apps and Auth]
    • Sets JobTrackingWithFinalizers, beta feature, as disabled by default, due to unresolved bug https://github.com/kubernetes/kubernetes/issues/109485 (#109491, @alculquicondor) [SIG Apps, Auth, CLI, Network, Node, Scheduling, Storage and Testing]
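
    The note on leading zeros in IPv4 addresses in the list above is easier to see with an example. The two parsers below are a sketch written for illustration, not Kubernetes or Go code: one reads each octet as decimal (the interpretation the note attributes to Kubernetes), the other reads zero-prefixed octets as octal, as some other parsers do.

```python
def _split_octets(address: str) -> list:
    """Split a dotted IPv4 string into four ASCII-digit parts."""
    parts = address.split(".")
    if len(parts) != 4 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a dotted IPv4 address: {address!r}")
    return parts


def parse_ipv4_decimal(address: str) -> tuple:
    """Interpret every octet as decimal, ignoring leading zeros."""
    octets = tuple(int(p, 10) for p in _split_octets(address))
    if any(o > 255 for o in octets):
        raise ValueError(f"octet out of range: {address!r}")
    return octets


def parse_ipv4_octal(address: str) -> tuple:
    """Interpret octets with a leading zero as octal."""
    octets = tuple(
        int(p, 8) if len(p) > 1 and p.startswith("0") else int(p, 10)
        for p in _split_octets(address)
    )
    if any(o > 255 for o in octets):
        raise ValueError(f"octet out of range: {address!r}")
    return octets
```

    With these sketches, "010.0.0.1" parses to 10.0.0.1 under the decimal reading and to 8.0.0.1 under the octal reading, which is the kind of disagreement between parsers that the advisory describes.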

    Feature

    • (beta feature) If the CSI driver supports the NodeServiceCapability VOLUME_MOUNT_GROUP and the DelegateFSGroupToCSIDriver feature gate is enabled, kubelet will delegate applying FSGroup to the driver by passing it to NodeStageVolume and NodePublishVolume, regardless of what other FSGroup policies are set. (#106330, @verult) [SIG Storage]

    • Add a new distribute-cpus-across-numa option to the static CPUManager policy. When enabled, this will trigger the CPUManager to evenly distribute CPUs across NUMA nodes in cases where more than one NUMA node is required to satisfy the allocation. (#105631, @klueska)

    • Add fish shell completion to kubectl. (#92989, @WLun001)

    • Add mechanism to load simple sniffer class into fluentd-elasticsearch image (#92853, @cosmo0920)

    • Add support for Portworx plugin to csi-translation-lib. Alpha release

      Portworx CSI driver is required to enable migration. This PR adds support of the CSIMigrationPortworx feature gate, which can be enabled by:

      1. Adding the feature flag to the kube-controller-manager --feature-gates=CSIMigrationPortworx=true
      2. Adding the feature flag to the kubelet config:

         featureGates:
           CSIMigrationPortworx: true

      (#103447, @trierra) [SIG API Machinery, Apps, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Network, Node, Release, Scalability, Scheduling, Storage, Testing and Windows]

    • Add support to generate client-side binaries for windows/arm64 platform (#104894, @pacoxu)

    • Added PowerShell completion generation by running kubectl completion powershell. (#103758, @zikhan)

    • Added a Processing condition for the workqueue API. Changed Shutdown for the workqueue API to wait until the work queue finishes processing all in-flight items. (#101928, @alexanderConstantinescu)

    • Added a new feature gate CustomResourceValidationExpressions to enable expression validation for Custom Resource. (#105107, @cici37)

    • Added a new flag --append-server-path to kubectl proxy that will automatically append the kube context server path to each request. (#97350, @FabianKramm)

    • Added ability for kubectl wait to wait on arbitrary JSON path (#105776, @lauchokyip)

    • Added support for PodAndContainerStatsFromCRI feature gate, which allows a user to specify their pod stats must also come from the CRI, not cAdvisor. (#103095, @haircommander)

    • Added support for setting controller-manager log level online. (#104571, @h4ghhh)

    • Added the ability to specify whether to use an RFC7396 JSON Merge Patch, an RFC6902 JSON Patch, or a Strategic Merge Patch to perform an override of the resources created by kubectl run and kubectl expose. (#105140, @brianpursley)

    • Adding option for kubectl cp to resume on network errors until completion, requires tar in addition to tail inside the container image (#104792, @matthyx)

    • Adding support for multiple --from-env-file flags. (#104232, @lauchokyip)

    • Adding support for multiple --from-env-file flags. (#101646, @lauchokyip)

    • Adds --as-uid flag to kubectl to allow uid impersonation in the same way as user and group impersonation. (#105794, @margocrawf)

    • Adds new [alpha] command ‘kubectl events’ (#99557, @bboreham)

    • Allow node expansion of local volumes. (#102886, @gnufied)

    • Allow to build kubernetes with a custom kube-cross image. (#104185, @dims)

    • Allows users to prevent garbage collection on pinned images (#103299, @wgahnagl) [SIG Node]

    • CRI v1 is now the project default. If a container runtime does not support the v1 API, Kubernetes will fall back to the v1alpha2 implementation. (#106501, @ehashman)

    • Changed feature CSIMigrationAWS to on by default. This feature requires the AWS EBS CSI driver to be installed. (#106098, @wongma7)

    • Client-go: pass DeleteOptions down to the fake client Reactor (#102945, @chenchun)

    • Cloud providers can set service account names for cloud controllers. (#103178, @nckturner)

    • Display Labels when kubectl describe ingress. (#103894, @kabab)

    • Enhance scheduler VolumeBinding plugin to handle Lost PVC as UnschedulableAndUnresolvable (#105245, @yibozhuang)

    • Ensures that volume is deleted from the storage backend when the user tries to delete the PV object manually and the PV ReclaimPolicy is set to Delete. (#105773, @deepakkinni)

    • Expose a NewUnstructuredExtractor from apply configurations meta/v1 package that enables extracting objects into unstructured apply configurations. (#103564, @kevindelgado)

    • Feature gate StorageObjectInUseProtection has been deprecated and cannot be disabled. It will be completely removed in 1.25 (#105495, @ikeeip)

    • Graduating controller_admission_duration_seconds, step_admission_duration_seconds, webhook_admission_duration_seconds, apiserver_current_inflight_requests and apiserver_response_sizes metrics to stable. (#106122, @rezakrimi) [SIG API Machinery, Instrumentation and Testing]

    • Graduating pending_pods, preemption_attempts_total, preemption_victims and schedule_attempts_total metrics to stable. Also e2e_scheduling_duration_seconds is renamed to scheduling_attempt_duration_seconds and the latter is graduated to stable. (#105941, @rezakrimi) [SIG Instrumentation, Scheduling and Testing]

    • Health check of kube-controller-manager now includes each controller. (#104667, @jiahuif)

    • Integration testing now takes periodic Prometheus scrapes from the etcd server. There is a new script, hack/run-prometheus-on-etcd-scrapes.sh, that runs a containerized Prometheus server against an archive of such scrapes. (#106190, @MikeSpreitzer) [SIG API Machinery and Testing]

    • Introduce a feature gate DisableKubeletCloudCredentialProviders which allows disabling the in-tree kubelet credential providers.

      The feature gate DisableKubeletCloudCredentialProviders is currently in Alpha, which means it is disabled by default. Once this feature gate moves to beta, in-tree credential providers will be disabled by default, and users will need to migrate to use external credential providers. (#102507, @ostrain)

    • Introduces a new metric: admission_webhook_request_total with the following labels: name (string) - the webhook name, type (string) - the admission type, operation (string) - the requested verb, code (int) - the HTTP status code, rejected (bool) - whether the request was rejected, namespace (string) - the namespace of the requested resource. (#103162, @rmoriar1)

    • Kubeadm: add support for dry running kubeadm join. The new flag kubeadm join --dry-run is similar to the existing flag for kubeadm init/upgrade and allows you to see what changes would be applied. (#103027, @Haleygo)

    • Kubeadm: do not check if the /etc/kubernetes/manifests folder is empty on joining worker nodes during preflight (#104942, @SataQiu)

    • Kubectl will now provide shell completion choices for the --output/-o flag (#105851, @marckhouzam)

    • Kubelet should reconcile kubernetes.io/os and kubernetes.io/arch labels on the node object. As a side effect, the kubelet denies admission to any Pod whose nodeSelector sets kubernetes.io/os or kubernetes.io/arch to a value that doesn’t match the underlying OS or architecture of the host.

      • The label reconciliation happens as part of periodic status update which can be configured via flag --node-status-update-frequency (#104613, @ravisantoshgudimetla) [SIG Node, Testing and Windows]
    • Kubernetes is now built with Golang 1.16.7. (#104199, @cpanato)

    • Kubernetes is now built with Golang 1.17.1. (#104904, @cpanato)

    • Kubernetes is now built with Golang 1.17.2 (#105563, @mengjiao-liu)

    • Kubernetes is now built with Golang 1.17.3 (#106209, @cpanato) [SIG API Machinery, Cloud Provider, Instrumentation, Release and Testing]

    • Move ConfigurableFSGroupPolicy to GA and rename metric volume_fsgroup_recursive_apply to volume_apply_access_control (#105885, @gnufied)

    • Move the GetAllocatableResources endpoint in the PodResource API to beta, which enables it by default. (#105003, @swatisehgal)

    • Moving WindowsHostProcessContainers feature to beta (#106058, @marosset)

    • Node affinity, Node selectors, and tolerations are now mutable for Jobs that are suspended and have never been started (#105479, @ahg-g)

    • Pod template annotations and labels are now mutable for Jobs that are suspended and have never been started (#105980, @ahg-g)

    • PodSecurity: in 1.23+ restricted policy levels, Pods and containers which set runAsUser=0 are forbidden at admission-time; previously, they would be rejected at runtime (#105857, @liggitt)

    • Shell completion now knows to continue suggesting resource names when the command supports it. For example kubectl get pod pod1 <TAB> will suggest more Pod names. (#105711, @marckhouzam)

    • Support to enable Hyper-V in GCE Windows Nodes created with kube-up (#105999, @mauriciopoppe)

    • The CPUManager policy options are now enabled, and we introduce a graduation path for the new CPU Manager policy options. (#105012, @fromanirh)

    • The Pods and Pod controllers that are exempted from the PodSecurity admission process are now marked with the pod-security.kubernetes.io/exempt: user/namespace/runtimeClass annotation, based on what caused the exemption.

      The enforcement level that allowed or denied a Pod during PodSecurity admission is now marked by the pod-security.kubernetes.io/enforce-policy annotation.

      The annotation that informs about audit policy violations changed from pod-security.kubernetes.io/audit to pod-security.kubernetes.io/audit-violation. (#105908, @stlaz)

    • The /openapi/v3 endpoint will be populated with OpenAPI v3 if the feature flag is enabled (#105945, @Jefftree)

    • The CSIMigrationGCE feature flag is turned ON by default (#104722, @leiyiz)

    • The DownwardAPIHugePages feature is now enabled by default. (#106271, @mysunshine92)

    • The PodSecurity admission plugin has graduated to beta and is enabled by default. The admission configuration version has been promoted to pod-security.admission.config.k8s.io/v1beta1. See https://kubernetes.io/docs/concepts/security/pod-security-admission/ for usage guidelines. (#106089, @liggitt)

    • The ServiceAccountIssuerDiscovery feature gate is removed. It reached GA in Kubernetes 1.21. (#103685, @mengjiao-liu)

    • The constants/variables from k8s.io for STABLE metrics are now supported. (#103654, @coffeepac)

    • The kubectl describe namespace now shows Conditions (#106219, @dlipovetsky)

    • The etcd container image now supports Windows. (#92433, @claudiubelu)

    • The kube-apiserver’s Prometheus metrics have been extended with some that describe the costs of handling LIST requests. They are as follows.

      • apiserver_cache_list_total: Counter of LIST requests served from watch cache, broken down by resource_prefix and index_name
      • apiserver_cache_list_fetched_objects_total: Counter of objects read from watch cache in the course of serving a LIST request, broken down by resource_prefix and index_name
      • apiserver_cache_list_evaluated_objects_total: Counter of objects tested in the course of serving a LIST request from watch cache, broken down by resource_prefix
      • apiserver_cache_list_returned_objects_total: Counter of objects returned for a LIST request from watch cache, broken down by resource_prefix
      • apiserver_storage_list_total: Counter of LIST requests served from etcd, broken down by resource
      • apiserver_storage_list_fetched_objects_total: Counter of objects read from etcd in the course of serving a LIST request, broken down by resource
      • apiserver_storage_list_evaluated_objects_total: Counter of objects tested in the course of serving a LIST request from etcd, broken down by resource
      • apiserver_storage_list_returned_objects_total: Counter of objects returned for a LIST request from etcd, broken down by resource (#104983, @MikeSpreitzer)
    • The pause image list now contains Windows Server 2022. (#104438, @nick5616)

    • The script kube-up.sh installs csi-proxy v1.0.1-gke.0. (#104426, @mauriciopoppe)

    • This PR adds the following metrics for API Priority and Fairness.

      • apiserver_flowcontrol_priority_level_seat_count_samples: histograms of seats occupied by executing requests (both regular and final-delay phases included), broken down by priority_level; the observations are taken once per millisecond.
      • apiserver_flowcontrol_priority_level_seat_count_watermarks: histograms of high and low watermarks of number of seats occupied by executing requests (both regular and final-delay phases included), broken down by priority_level.
      • apiserver_flowcontrol_watch_count_samples: histograms of number of watches relevant to a given mutating request, broken down by that request’s priority_level and flow_schema. (#105873, @MikeSpreitzer) [SIG API Machinery, Instrumentation and Testing]
    • Topology Aware Hints have graduated to beta. (#106433, @robscott) [SIG Network]

    • Turn on CSIMigrationAzureDisk by default on 1.23 (#104670, @andyzhangx)

    • Update the system-validators library to v1.6.0 (#106323, @neolit123) [SIG Cluster Lifecycle and Node]

    • Updated Cluster Autoscaler to version 1.22.0. Release notes: https://github.com/kubernetes/autoscaler/releases/tag/cluster-autoscaler-1.22.0. (#104293, @x13n)

    • Updates debian-iptables to v1.6.7 to pick up CVE fixes. (#104970, @PushkarJ)

    • Updates the following images to pick up CVE fixes:

    • Upgrade etcd to 3.5.1 (#105706, @uthark) [SIG Cloud Provider, Cluster Lifecycle and Testing]

    • When feature gate JobTrackingWithFinalizers is enabled:

      • Limit the number of Pods tracked in a single Job sync to avoid starvation of small Jobs.
      • The metric job_pod_finished_total counts the number of finished Pods tracked by the Job controller. (#105197, @alculquicondor)
    • When using the RequestedToCapacityRatio ScoringStrategy, an empty shape will cause an error. (#106169, @kerthcet) [SIG Scheduling]

    • client-go event library allows customizing spam filtering function. It is now possible to override SpamKeyFunc, which is used by event filtering to detect spam in the events. (#103918, @olagacek)

    • client-go, using log level 9, traces the following events of an HTTP request:

      • DNS lookup
      • TCP dialing
      • TLS handshake
      • Time to get a connection from the pool
      • Time to process a request (#105156, @aojea)
    • Allow KUBE_TEST_REPO_LIST to be a remote url (#109512, @eddiezane) [SIG Cloud Provider and Testing]

    • Kubernetes is now built with Golang 1.17.11 (#110423, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kube-apiserver: when merging lists, Server Side Apply now prefers the order of the submitted request instead of the existing persisted object (#107567, @jiahuif) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Storage and Testing]
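
The kubelet item above about reconciling the kubernetes.io/os and kubernetes.io/arch labels implies a simple admission rule. The following is a minimal Python sketch of that rule as an illustrative model only; the function name kubelet_would_admit and the plain-dictionary inputs are hypothetical and are not taken from the kubelet source.

```python
# Illustrative model only; this is not the kubelet implementation.
# Labels named in the release note that the kubelet reconciles on the node object.
RECONCILED_LABELS = ("kubernetes.io/os", "kubernetes.io/arch")


def kubelet_would_admit(host_labels: dict, node_selector: dict) -> bool:
    """Return False when the Pod selects an OS or arch the host does not have."""
    for key in RECONCILED_LABELS:
        wanted = node_selector.get(key)
        # Only a selector that names one of the reconciled labels can be denied.
        if wanted is not None and wanted != host_labels.get(key):
            return False
    return True


host = {"kubernetes.io/os": "linux", "kubernetes.io/arch": "amd64"}
print(kubelet_would_admit(host, {"kubernetes.io/os": "windows"}))
print(kubelet_would_admit(host, {"disktype": "ssd"}))
```

Selectors on other labels are outside the scope of this check and are left to the scheduler, which is why the second call is not rejected by the sketch.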

    Documentation

    • Graduating pod_scheduling_duration_seconds, pod_scheduling_attempts, framework_extension_point_duration_seconds, plugin_execution_duration_seconds and queue_incoming_pods_total metrics to stable. (#106266, @ahg-g) [SIG Instrumentation, Scheduling and Testing]
    • The test “[sig-network] EndpointSlice should have Endpoints and EndpointSlices pointing to API Server [Conformance]” only requires that there is an EndpointSlice that references the “kubernetes.default” service; it no longer requires that it is named “kubernetes”. (#104664, @aojea)
    • Update description of --audit-log-maxbackup to describe behavior when value = 0. (#103843, @Arkessler)
    • Users should not rely on the unsupported CRON_TZ variable when specifying a schedule. Both the API server and the cronjob controller will emit warnings pointing to https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/, which contains an explanation (#106455, @soltysh) [SIG Apps]
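
To find CronJob schedules affected by the CRON_TZ warning before upgrading, a small lint helper can be enough. This is a hedged sketch: the release note does not describe how the API server detects the variable, so the prefix check below and the helper name uses_cron_tz are assumptions made for illustration.

```python
def uses_cron_tz(schedule: str) -> bool:
    """Return True when a schedule string begins with a CRON_TZ= prefix.

    Assumption: the variable appears as a prefix, e.g. "CRON_TZ=UTC 0 5 * * *".
    """
    return schedule.lstrip().startswith("CRON_TZ=")


schedules = ["CRON_TZ=Europe/Berlin 0 5 * * *", "0 5 * * *"]
flagged = [s for s in schedules if uses_cron_tz(s)]
print(flagged)
```

Running such a check over the spec.schedule field of existing CronJob manifests gives a list of objects to revisit.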

    Failing Test

    • Fixes hostPath storage E2E tests within SELinux enabled env (#104551, @Elbehery)

    Bug or Regression

    • (PodSecurity admission) errors validating workload resources (deployment, replicaset, etc.) no longer block admission. (#106017, @tallclair) [SIG Auth]
    • A pod that the Kubelet rejects was still considered as being accepted for a brief period of time after rejection, which might cause some pods to be rejected briefly that could fit on the node. A pod that is still terminating (but has status indicating it has failed) may also still be consuming resources and so should also be considered. (#104817, @smarterclayton)
    • Add Kubernetes Events to the Kubelet Graceful Shutdown feature. (#101081, @rphillips)
    • Add Pod Security admission metrics: pod_security_evaluations_total, pod_security_exemptions_total, pod_security_errors_total (#105898, @tallclair)
    • Add support for Windows Network stats in Containerd (#105744, @jsturtevant) [SIG Node, Testing and Windows]
    • Added show-capacity option to kubectl top node to show Capacity resource usage (#102917, @bysnupy) [SIG CLI]
    • Apimachinery: Pretty printed JSON and YAML output is now indented consistently. (#105466, @liggitt)
    • Be able to create a Pod with Generic Ephemeral Volumes as raw block devices. (#105682, @pohly)
    • CA, certificate and key bundles for the generic-apiserver based servers will be reloaded immediately after the files are changed. (#104102, @tnqn)
    • Change kubectl diff --invalid-arg status code from 1 to 2 to match docs (#105445, @ardaguclu)
    • Changed kubectl describe to compute age of an event using the EventSeries.count and EventSeries.lastObservedTime. (#104482, @harjas27)
    • Changes behaviour of kube-proxy start; does not attempt to set specific sysctl values (which does not work in recent Kernel versions anymore in non-init namespaces), when the current sysctl values are already set higher. (#103174, @Napsty)
    • Client-go uses the same HTTP client for all the generated groups and versions, allowing customized transports to be shared across multiple groups and versions. (#105490, @aojea)
    • Disable aufs module for gce clusters. (#103831, @lizhuqi)
    • Do not unmount and mount subpath bind mounts during container creation unless bind mount changes (#105512, @gnufied) [SIG Storage]
    • Don’t prematurely close reflectors in case of slow initialization in watch based manager to fix issues with inability to properly mount secrets/configmaps. (#104604, @wojtek-t)
    • Don’t use a custom dialer for the kubelet if it is not rotating certificates, so we can reuse TCP connections and have only one between the apiserver and the kubelet. If users experience problems with stale connections using HTTP1.1, they can force the previous behavior of the kubelet by setting the environment variable DISABLE_HTTP2. (#104844, @aojea) [SIG API Machinery, Auth and Node]
    • EndpointSlice Mirroring controller now cleans up managed EndpointSlices when a Service selector is added (#105997, @robscott) [SIG Apps, Network and Testing]
    • Enhanced event messages for pod failed for exec probe timeout (#106201, @yxxhero) [SIG Node]
    • Ensure Pods are removed from the scheduler cache when the scheduler misses deletion events due to transient errors (#106102, @alculquicondor) [SIG Scheduling]
    • Ensure InstanceShutdownByProviderID return false for creating Azure VMs. (#104382, @feiskyer)
    • Evicted and other terminated Pods will no longer revert to the Running phase. (#105462, @ehashman)
    • Fix kube-apiserver metric reporting for the deprecated watch path of /api/<version>/watch/.... (#104161, @wojtek-t)
    • Fix a regression where the Kubelet failed to exclude already completed pods from calculations about how many resources it was currently using when deciding whether to allow more pods. (#104577, @smarterclayton)
    • Fix detach disk issue on deleting vmss node. (#104572, @andyzhangx)
    • Fix job controller syncs: In case of conflicts, ensure that the sync happens with the most up to date information. Improves reliability of JobTrackingWithFinalizers. (#105214, @alculquicondor)
    • Fix job tracking with finalizers for more than 500 pods, ensuring all finalizers are removed before counting the Pod. (#104666, @alculquicondor)
    • Fix pod name of NonIndexed Jobs to not include rogue -1 substring (#105676, @alculquicondor)
    • Fix scoring for NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. (#105845, @ahmad-diaa)
    • Fix system default topology spreading when nodes don’t have zone labels. Pods correctly spread by default now. (#105046, @alculquicondor)
    • Fix: do not try to delete a LoadBalancer that does not exist (#105777, @nilo19)
    • Fix: ignore non-VMSS error for VMAS nodes in reconcileBackendPools. (#103997, @nilo19)
    • Fix: leave the probe path empty for TCP probes (#105253, @nilo19)
    • Fix: remove VMSS and VMSS instances from SLB backend pool only when necessary (#105839, @nilo19)
    • Fix: skip instance not found when decoupling VMSSs from LB (#105666, @nilo19)
    • Fix: skip case sensitivity when checking Azure NSG rules. (#104384, @feiskyer)
    • Fixed a bug that prevents a PersistentVolume that has a PersistentVolumeClaim UID which doesn’t exist in local cache but exists in etcd from being updated to the Released phase. (#105211, @xiaopingrubyist)
    • Fixed a bug where using kubectl patch with $deleteFromPrimitiveList on a nonexistent or empty list would add the item to the list (#105421, @brianpursley)
    • Fixed a bug which could cause webhooks to have an incorrect copy of the old object after an Apply or Update (#106195, @alexzielenski) [SIG API Machinery]
    • Fixed a bug where kubectl would emit duplicate warning messages for flag names that contain an underscore and recommend using a nonexistent flag in some cases. (#103852, @brianpursley)
    • Fixed a panic in kubectl when creating secrets with an improper output type (#106317, @lauchokyip)
    • Fixed a regression setting --audit-log-path=- to log to stdout in 1.22 pre-release. (#103875, @andrewrynhard)
    • Fixed an issue where the OS’s environment variables were not appended to the ones provided in the Credential Provider Config file, which could cause execution of the external credential provider binary to fail. See https://github.com/kubernetes/kubernetes/issues/102750. (#103231, @n4j)
    • Fixed applying of SELinux labels to CSI volumes on very busy systems (with “error checking for SELinux support: could not get consistent content of /proc/self/mountinfo after 3 attempts”) (#105934, @jsafrane) [SIG Storage]
    • Fixed architecture within manifest for non amd64 etcd images. (#104116, @saschagrunert)
    • Fixed architecture within manifest for non amd64 etcd images. (#105484, @saschagrunert)
    • Fixed azure disk translation issue due to lower case managed kind. (#103439, @andyzhangx)
    • Fixed client IP preservation for NodePort service with protocol SCTP in ipvs. (#104756, @tnqn)
    • Fixed concurrent map access causing panics when logging timed-out API calls. (#105734, @marseel)
    • Fixed consolidate logs for instance not found error
    • Fixed skip not found nodes when reconciling LB backend address pools (#105188, @nilo19)
    • Fixed occasional pod cgroup freeze when using cgroup v1 and systemd driver. (#104528, @kolyshkin)
    • Fixed the issue where logging output of kube-scheduler configuration files included line breaks and escape characters. The output also attempted to output the configuration file in one section without showing the user a more readable format. (#106228, @sanchayanghosh) [SIG Scheduling]
    • Fixes a bug that could result in the EndpointSlice controller unnecessarily updating EndpointSlices associated with a Service that had Topology Aware Hints enabled. (#105267, @llhuii)
    • Fixes a regression that could cause panics in LRU caches in controller-manager, kubelet, kube-apiserver, or client-go. (#104466, @stbenjam)
    • Fixes an issue where an admission webhook can observe a v1 Pod object that does not have the defaultMode field set in the injected service account token volume in kube-api-server. (#104523, @liggitt)
    • Fixes the should support building a client with a CSR E2E test to work with clusters configured with short certificate lifetimes (#105396, @liggitt)
    • Graceful node shutdown, allow the actual inhibit delay to be greater than the expected inhibit delay. (#103137, @wzshiming)
    • Handle Generic Ephemeral Volumes properly in the node limits scheduler filter and the kubelet hostPath check. (#100482, @pohly)
    • Headless Services with no selector which were created without dual-stack enabled will be defaulted to RequireDualStack instead of PreferDualStack. This is consistent with such Services which are created with dual-stack enabled. (#104986, @thockin)
    • Ignore not a vmss instance error for VMAS nodes in EnsureBackendPoolDeleted. (#105185, @ialidzhikov)
    • Ignore the case when comparing azure tags in service annotation. (#104705, @nilo19)
    • Ignore the case when updating Azure tags. (#104593, @nilo19)
    • Introduce a new server run option ‘shutdown-send-retry-after’. If true the HTTP Server will continue listening until all non longrunning request(s) in flight have been drained, during this window all incoming requests will be rejected with a status code 429 and a ‘Retry-After’ response header. (#101257, @tkashem)
    • Kube-apiserver: Avoid unnecessary repeated calls to admission webhooks that reject an update or delete request. (#104182, @liggitt)
    • Kube-apiserver: Server Side Apply merge order is reverted to match v1.22 behavior until http://issue.k8s.io/104641 is resolved. (#106661, @liggitt)
    • Kube-apiserver: events created via the events.k8s.io API group for cluster-scoped objects are now permitted in the default namespace as well for compatibility with events clients and the v1 API (#100125, @h4ghhh)
    • Kube-apiserver: fix a memory leak when deleting multiple objects with a deletecollection. (#105606, @sxllwx)
    • Kube-proxy health check ports used to listen to :<port> for each of the services. This is not needed and opens ports in addresses the cluster user may not have intended. The PR limits listening to all node addresses which are controlled by the --nodeport-addresses flag. If no addresses are provided then we default to existing behavior by listening to :<port> for each service (#104742, @khenidak)
    • Kube-proxy: delete stale conntrack UDP entries for loadbalancer ingress IP. (#104009, @aojea)
    • Kube-scheduler now doesn’t print any usage message when unknown flag is specified. (#104503, @sanposhiho)
    • Kube-up now includes CoreDNS version v1.8.6 (#106091, @rajansandeep) [SIG Cloud Provider]
    • Kubeadm: When adding an etcd peer to an existing cluster, if an error is returned indicating the peer has already been added, this is accepted and a ListMembers call is used instead to return the existing cluster. This helps to diminish the exponential backoff when the first AddMember call times out, while still retaining a similar performance when the peer has already been added from a previous call. (#104134, @ihgann)
    • Kubeadm: do not allow empty --config paths to be passed to kubeadm kubeconfig user (#105649, @navist2020)
    • Kubeadm: fix a bug on Windows worker nodes, where the downloaded KubeletConfiguration from the cluster can contain Linux paths that do not work on Windows and can trip the kubelet binary. (#105992, @hwdef) [SIG Cluster Lifecycle and Windows]
    • Kubeadm: switch the preflight check (called ‘Swap’) that verifies if swap is enabled on Linux hosts to report a warning instead of an error. This is related to the graduation of the NodeSwap feature gate in the kubelet to Beta and being enabled by default in 1.23 - allows swap support on Linux hosts. In the next release of kubeadm (1.24) the preflight check will be removed, thus we recommend that you stop using it - e.g. via --ignore-preflight-errors or the kubeadm config. (#104854, @pacoxu)
    • Kubelet did not report kubelet_volume_stats_* metrics for Generic Ephemeral Volumes. (#105569, @pohly)
    • Kubelet’s Node Grace Shutdown will terminate probes when shutting down (#105215, @rphillips)
    • Kubelet: fixes a file descriptor leak in log rotation (#106382, @rphillips) [SIG Node]
    • Kubelet: the printing of flags at the start of kubelet now uses the final logging configuration. (#106520, @pohly)
    • Make the etcd client (used by the API server) retry certain types of errors. The full list of retriable (codes.Unavailable) errors can be found at https://github.com/etcd-io/etcd/blob/main/api/v3rpc/rpctypes/error.go#L72 (#105069, @p0lyn0mial)
    • Metrics changes: Fix exposed buckets of scheduler_volume_scheduling_duration_seconds_bucket metric. (#100720, @dntosas)
    • Migrated kubernetes object references (= name + namespace) to structured logging when using JSON as log output format (#104877, @pohly)
    • Pass additional flags to subpath mount to avoid flakes in certain conditions. (#104253, @mauriciopoppe)
    • Pod SecurityContext sysctls name parameter for update requests where the existing object’s sysctl contains slashes and kubelet sysctl whitelist support contains slashes. (#102393, @mengjiao-liu) [SIG Apps, Auth, Node, Storage and Testing]
    • Pod will not start when Init container was OOM killed. (#104650, @yxxhero) [SIG Node]
    • PodResources interface was changed, now it returns only isolated CPUs (#97415, @AlexeyPerevalov)
    • Provide IPv6 support for internal load balancer. (#103794, @nilo19)
    • Reduce the number of calls to docker for stats via dockershim. For Windows this reduces the latency when calling docker, for Linux this saves cpu cycles. (#104287, @jsturtevant) [SIG Node and Windows]
    • Removed the error message label from the kubelet_started_pods_errors_total metric (#105213, @yxxhero)
    • Resolves a potential issue with GC and NS controllers which may delete objects after getting a 404 response from the server during its startup. This PR ensures that requests to aggregated APIs will get 503, not 404 while the APIServiceRegistrationController hasn’t finished its job. (#104748, @p0lyn0mial)
    • Respect grace period when updating static pods. (#104743, @gjkim42) [SIG Node and Testing]
    • Revert building binaries with PIE mode. (#105352, @ehashman)
    • Reverts adding namespace label to admission metrics (and histogram expansion) due to cardinality issues. (#104033, @s-urbaniak)
    • Reverts the CRI API version surfaced by dockershim to v1alpha2. (#106808, @saschagrunert)
    • Scheduler resource metrics over fractional binary quantities (2.5Gi, 1.1Ki) were incorrectly reported as very small values. (#103751, @y-tag)
    • Support more than 100 disk mounts on Windows (#105673, @andyzhangx)
    • Support using negative array index in JSON patch replace operations. (#105896, @zqzten)
    • The --leader-elect* CLI args are now honored in scheduler. (#105915, @Huang-Wei)
    • The --leader-elect* CLI args are now honored in the scheduler. (#105712, @Huang-Wei)
    • The client-go dynamic client sets the header Content-Type: application/json by default (#104327, @sxllwx)
    • The kube-proxy now correctly filters out unready endpoints for Services with Topology. (#106507, @robscott)
    • The pods/binding subresource now honors metadata.uid and metadata.resourceVersion (#105913, @aholic)
    • The kube-proxy sync_proxy_rules_iptables_total metric now gives the correct number of rules, rather than being off by one. Fixed multiple iptables proxy regressions introduced in 1.22:
      • When using Services with SessionAffinity, client affinity for an endpoint now gets broken when that endpoint becomes non-ready (rather than continuing until the endpoint is fully deleted).
      • Traffic to a service IP now starts getting rejected (as opposed to merely dropped) as soon as there are no longer any usable endpoints, rather than waiting until all of the terminating endpoints have terminated even when those terminating endpoints were not being used.
      • Chains for endpoints that won’t be used are no longer output to iptables, saving a bit of memory/time/cpu. (#106030, @danwinship) [SIG Network]
    • Topology Aware Hints now ignores unready endpoints when assigning hints. (#106510, @robscott)
    • Topology Hints now excludes control plane nodes from capacity calculations. (#104744, @robscott)
    • Update Go used to build migrate script in etcd image to v1.16.7. (#104301, @serathius)
    • Updated json representation for a conflicted taint to Key=Effect when a conflicted taint occurs in kubectl taint. (#104011, @manugupt1)
    • Upgrades functionality of kubectl kustomize as described at https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv4.4.1 (#106389, @natasha41575) [SIG CLI]
    • Watch requests that are delegated to aggregated API servers no longer reserve concurrency units (seats) in the API Priority and Fairness dispatcher for their entire duration. (#105511, @benluddy)
    • When a static pod file is deleted and recreated while using a fixed UID, the pod was not properly restarted. (#104847, @smarterclayton)
    • XFS-filesystems are now force-formatted (option -f) in order to avoid problems being formatted due to detection of magic super-blocks. This aligns with the behaviour of formatting of ext3/4 filesystems. (#104923, @davidkarlsen)
    • --log-flush-frequency had no effect in several commands or was missing. Help and warning texts were not always using the right format for a command (add_dir_header instead of add-dir-header). Fixing this included cleaning up flag handling in component-base/logs: that package no longer adds flags to the global flag sets. Commands which want the klog and --log-flush-frequency flags must explicitly call logs.AddFlags; the new cli.Run does that for commands. That helper function also covers flag normalization and printing of usage and errors in a consistent way (print usage text first if parsing failed, then the error). (#105076, @pohly)
    • Kubeadm: allow the “certs check-expiration” command to not require the existence of the cluster CA key (ca.key file) when checking the expiration of managed certificates in kubeconfig files. (#106931, @neolit123) [SIG Cluster Lifecycle]
    • Kubeadm: during execution of the “check expiration” command, treat the etcd CA as external if there is a missing etcd CA key file (etcd/ca.key) and perform the proper validation on certificates signed by the etcd CA. Additionally, make sure that the CA for all entries in the output table is included - for both certificates on disk and in kubeconfig files. (#106926, @neolit123) [SIG Cluster Lifecycle]
    • Kubectl: restores --dry-run, --dry-run=true, and --dry-run=false for compatibility with pre-1.23 invocations. (#107021, @liggitt) [SIG CLI and Testing]
    • Reverts graceful node shutdown to match 1.21 behavior of setting pods that have not yet successfully completed to “Failed” phase if the GracefulNodeShutdown feature is enabled in kubelet. The GracefulNodeShutdown feature is beta and must be explicitly configured via kubelet config to be enabled in 1.21+. This changes 1.22 and 1.23 behavior on node shutdown to match 1.21. If you do not want pods to be marked terminated on node shutdown in 1.22 and 1.23, disable the GracefulNodeShutdown feature. (#106900, @bobbypage) [SIG Node and Testing]
    • An inefficient lock in EndpointSlice controller metrics cache has been reworked. Network programming latency may be significantly reduced in certain scenarios, especially in clusters with a large number of Services. (#107167, @robscott) [SIG Apps and Network]
    • Client-go: fix that paged list calls with ResourceVersionMatch set would fail once paging kicked in. (#107334, @fasaxc) [SIG API Machinery]
    • Fix a panic when using invalid output format in kubectl create secret command (#107347, @rikatz) [SIG CLI]
    • Fix: azuredisk parameter lowercase translation issue (#107429, @andyzhangx) [SIG Cloud Provider and Storage]
    • Fixed a bug where a pod’s .status.nominatedNodeName was not cleared properly, and thus over-occupied system resources. (#107109, @Huang-Wei) [SIG Scheduling and Testing]
    • Fixes a rare race condition handling requests that timeout (#107458, @liggitt) [SIG API Machinery]
    • Mount-utils: Detect potential stale file handle (#106988, @andyzhangx) [SIG Storage]
    • The feature gate was mentioned as csiMigrationRBD where it should have been CSIMigrationRBD to be in parity with other migration plugins. This release corrects this and keeps it as CSIMigrationRBD. Users who have configured this feature gate as csiMigrationRBD have to reconfigure it to CSIMigrationRBD from this release. (#107554, @humblec) [SIG Storage]
    • Fix: delete non existing Azure disk issue (#107406, @andyzhangx) [SIG Cloud Provider]
    • Fixes a regression in 1.23 that incorrectly pruned data from array items of a custom resource that set x-kubernetes-preserve-unknown-fields: true (#107689, @liggitt) [SIG API Machinery]
    • Fix Azurefile volumeid collision issue in csi migration (#107575, @andyzhangx) [SIG Cloud Provider and Storage]
    • Fix e2e test “Services should respect internalTrafficPolicy=Local Pod and Node, to Pod (hostNetwork: true)” (#107902, @xueqzhan) [SIG Network and Testing]
    • Fixes a regression in 1.23 where update requests to previously persisted Service objects that have not been modified since 1.19 can be rejected with an incorrect spec.clusterIPs: Required value error (#107875, @liggitt) [SIG Network and Testing]
    • Fixes static pod add and removes restarts in certain cases. (#107761, @rphillips) [SIG Node]
    • Bump sigs.k8s.io/apiserver-network-proxy/konnectivity-client to v0.0.30, fixing goroutine leaks in kube-apiserver. (#108438, @andrewsykim) [SIG API Machinery, Auth and Cloud Provider]
    • Fix kubectl config flags incorrectly setting burst and discovery limits (#108401, @ulucinar) [SIG CLI]
    • Fix static pod restarts in cases where the container is not present. (#108164, @rphillips) [SIG Node]
    • Fixes a bug where a partial EndpointSlice update could cause node name information to be dropped from endpoints that were not updated. (#108201, @robscott) [SIG Network]
    • Fixes a regression in the kubelet restarting static pods. (#107931, @rphillips) [SIG Node and Testing]
    • Fixes error handling in a kubectl method used in downstream packages. (#107938, @heybronson) [SIG CLI]
    • Increase Azure ACR credential provider timeout (#108209, @andyzhangx) [SIG Cloud Provider]
    • Kube-apiserver: removed apf_fd from server logs (added in 1.23.0) which could contain data identifying the requesting user (#108634, @jupblb) [SIG API Machinery and Scalability]
    • Bug: client-go clientset was not defaulting the user agent, using the default golang agent for all the requests. (#108791, @aojea) [SIG API Machinery and Instrumentation]
    • E2e tests wait for kube-root-ca.crt to be populated in namespaces for use with projected service account tokens, reducing delays starting those test pods and errors in the logs. (#108860, @eddiezane) [SIG Testing]
    • Failure to start a container cannot accidentally result in the pod being considered “Succeeded” in the presence of deletion. (#108882, @rphillips) [SIG Node]
    • Fix indexer bug that resulted in incorrect index updates if number of index values for a given object was changing during update (#109137, @wojtek-t) [SIG API Machinery]
    • Fix the overestimated cost of delegated API requests in kube-apiserver API priority&fairness (#109216, @wojtek-t) [SIG API Machinery]
    • Fixed a regression that could incorrectly reject pods with OutOfCpu errors if they were rapidly scheduled after other pods were reported as complete in the API. The Kubelet now waits to report the phase of a pod as terminal in the API until all running containers are guaranteed to have stopped and no new containers can be started. Short-lived pods may take slightly longer (~1s) to report Succeeded or Failed after this change. (#108723, @bobbypage) [SIG Apps, Node and Testing]
    • Correct event registration for multiple scheduler plugins; this fixes a potential significant delay in re-queueing unschedulable pods. (#109446, @ahg-g) [SIG Scheduling and Testing]
    • Existing InTree AzureFile PVs which don’t have a secret namespace defined will now work properly after enabling CSI migration - the namespace will be obtained from ClaimRef. (#108000, @RomanBednar) [SIG Cloud Provider and Storage]
    • Fix JobTrackingWithFinalizers that:
      • was declaring a job finished before counting all the created pods in the status
      • was leaving pods with finalizers, blocking pod and job deletions
      • JobTrackingWithFinalizers is still disabled by default. (#109486, @alculquicondor) [SIG Apps and Testing]
    • Fix a bug that out-of-tree plugin is misplaced when using scheduler v1beta3 config (#108890, @Huang-Wei) [SIG Scheduling]
    • Fix kubectl completion zsh to use any command name rather than hardcoded kubectl (#109235, @soltysh) [SIG CLI]
    • Kubeadm: add the flag “--experimental-initial-corrupt-check” to etcd static Pod manifests to ensure etcd member data consistency (#109075, @neolit123) [SIG Cluster Lifecycle]
    • EndpointSlices marked for deletion are now ignored during reconciliation. (#110483, @aryan9600) [SIG Apps and Network]
    • Fixed a kubelet issue that could result in invalid pod status updates to be sent to the api-server where pods would be reported in a terminal phase but also report a ready condition of true in some cases. (#110480, @bobbypage) [SIG Node and Testing]
    • Pods will now post their readiness during termination. (#110417, @aojea) [SIG Network, Node and Testing]
    • Fix a bug that caused the wrong result length when using --chunk-size and --selector together (#110757, @Abirdcfly) [SIG API Machinery and Testing]
    • Fix bug that prevented the job controller from enforcing activeDeadlineSeconds when set (#110545, @harshanarayana) [SIG Apps]
    • Fix image pulling failure when IMDS is unavailable in kubelet startup (#110523, @andyzhangx) [SIG Cloud Provider]
    • Fix printing resources with int64 fields (#110602, @sanchezl) [SIG API Machinery]
    • Fixed a regression introduced in 1.23.0 where Azure load balancers were not kept up to date with the state of cluster nodes. In particular, nodes that are not in the ready state and are not newly created (i.e. not having the node.cloudprovider.kubernetes.io/uninitialized taint) now get removed from Azure load balancers. (#109932, @ricky-rav) [SIG Cloud Provider]
    • Fixed potential scheduler crash when scheduling with unsatisfied nodes in PodTopologySpread. (#110853, @kerthcet) [SIG Scheduling]
    • Kubeadm: fix the bug that configurable KubernetesVersion not respected during kubeadm join (#111022, @SataQiu) [SIG Cluster Lifecycle]
    • Reduced time taken to sync proxy rules on Windows kube-proxy with kernelspace mode (#110702, @daschott) [SIG Network and Windows]
    • Updated cAdvisor to v0.43.1 to pick up a kubelet fix where network metrics can be missing in some cases when used with containerd (#111013, @bobbypage) [SIG Node]

    Other (Cleanup or Flake)

    • All klog flags except for -v and -vmodule are deprecated. Support for -vmodule is only guaranteed for the text log format. (#105042, @pohly)
    • Better pod events (“waiting for ephemeral volume controller to create the persistentvolumeclaim” instead of “persistentvolumeclaim not found”) when using generic ephemeral volumes. (#104605, @pohly)
    • Changed buckets in apiserver_request_duration_seconds metric from [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60] to [0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 1.25, 1.5, 2, 3, 4, 5, 6, 8, 10, 15, 20, 30, 45, 60] (#106306, @pawbana) [SIG API Machinery, Instrumentation and Testing]
    • Deprecate apiserver_longrunning_gauge and apiserver_register_watchers in 1.23.0. (#103793, @yan-lgtm)
    • Enhanced error message for nodes not selected by scheduler due to pod’s PersistentVolumeClaim(s) bound to PersistentVolume(s) that do not exist. (#105196, @yibozhuang)
    • Fix an issue in cleaning up CertificateSigningRequest objects with an unparseable status.certificate field. (#103823, @liggitt)
    • Kube-apiserver: requests to node, Service, and Pod /proxy subresources with no additional URL path now only automatically redirect GET and HEAD requests. (#95128, @Riaankl)
    • Kube-apiserver: sets an upper-bound on the lifetime of idle keep-alive connections and the time to read the headers of incoming requests. (#103958, @liggitt)
    • Kubeadm: external etcd endpoints passed in the ClusterConfiguration that have Unicode characters are no longer IDNA encoded (converted to Punycode). They are now just URL encoded as per Go’s implementation of RFC-3986, have duplicate “/” removed from the URL paths, and passed like that directly to the kube-apiserver --etcd-servers flag. If you have etcd endpoints that have Unicode characters, it is advisable to encode them in advance with tooling that is fully IDNA compliant. If you don’t do that, the Go standard library (used in k8s and etcd) would do it for you when making requests to the endpoints. (#103801, @gkarthiks)
    • Kubeadm: remove the --port flag from the manifest for the kube-controller-manager since the flag has been a NO-OP since 1.22 and insecure serving was removed for the component. (#104157, @knight42)
    • Kubeadm: remove the --port flag from the manifest for the kube-scheduler since the flag has been a NO-OP since 1.23 and insecure serving was removed for the component. (#105034, @pacoxu)
    • Kubeadm: update references to legacy artifacts locations, the ci-cross prefix has been removed from the version match as it does not exist in the new gs://k8s-release-dev bucket. (#103813, @SataQiu)
    • Kubectl: deprecated command line flags (like several of the klog flags) now have a DEPRECATED: <explanation> comment. (#106172, @pohly) [SIG CLI]
    • Kubemark is now built as a portable, static binary. (#106150, @pohly) [SIG Scalability and Testing]
    • Migrate cmd/proxy/{config, healthcheck, winkernel} to structured logging (#104944, @jyz0309)
    • Migrate pkg/proxy to structured logging (#104908, @CIPHERTron)
    • Migrate pkg/scheduler/framework/plugins/interpodaffinity/filtering.go, pkg/scheduler/framework/plugins/podtopologyspread/filtering.go, pkg/scheduler/framework/plugins/volumezone/volume_zone.go to structured logging (#105931, @mengjiao-liu)
    • Migrate pkg/scheduler to structured logging. (#99273, @yangjunmyfm192085)
    • Migrate cmd/proxy/app and pkg/proxy/meta_proxier to structured logging (#104928, @jyz0309)
    • Migrated cmd/kube-scheduler/app/server.go, pkg/scheduler/framework/plugins/nodelabel/node_label.go, pkg/scheduler/framework/plugins/nodevolumelimits/csi.go, pkg/scheduler/framework/plugins/nodevolumelimits/non_csi.go to structured logging (#105855, @shivanshu1333)
    • Migrated pkg/proxy/ipvs to structured logging (#104932, @shivanshu1333)
    • Migrated pkg/proxy/userspace to structured logging. (#104931, @shivanshu1333)
    • Migrated pkg/proxy to structured logging (#104891, @shivanshu1333)
    • Migrated pkg/scheduler/framework/plugins/volumebinding/assume_cache.go to structured logging. (#105904, @mengjiao-liu) [SIG Instrumentation, Scheduling and Storage]
    • Migrated pkg/scheduler/framework/preemption/preemption.go, pkg/scheduler/framework/plugins/examples/stateful/stateful.go, and pkg/scheduler/framework/plugins/noderesources/resource_allocation.go to structured logging (#105967, @shivanshu1333) [SIG Instrumentation, Node and Scheduling]
    • Migrated pkg/proxy/winuserspace to structured logging (#105035, @shivanshu1333)
    • Migrated scheduler file cache.go to structured logging (#105969, @shivanshu1333) [SIG Instrumentation and Scheduling]
    • Migrated scheduler files comparer.go, dumper.go, node_tree.go to structured logging (#105968, @shivanshu1333) [SIG Instrumentation and Scheduling]
    • More detailed logging has been added to the EndpointSlice controller for Topology. (#104741, @robscott)
    • Remove deprecated and not supported old cronjob controller. (#106126, @soltysh) [SIG Apps]
    • Remove ignore error flag for drain, and set this feature as default (#105571, @yuzhiquan) [SIG CLI]
    • Remove the deprecated flags --csr-only and --csr-dir from kubeadm certs renew. Please use kubeadm certs generate-csr instead. (#104796, @RA489)
    • Support allocating whole NUMA nodes in the CPUManager when there is not a 1:1 mapping between socket and NUMA node (#102015, @klueska)
    • Support for Windows Server 2022 was added to the k8s.gcr.io/pause:3.6 image. (#104711, @claudiubelu)
    • Surface warning when users don’t set propagationPolicy for jobs while deleting. (#104080, @ravisantoshgudimetla)
    • The AllowInsecureBackendProxy feature gate is removed. It reached GA in Kubernetes 1.21. (#103796, @mengjiao-liu)
    • The BoundServiceAccountTokenVolume feature gate that is GA since v1.22 is unconditionally enabled, and can no longer be specified via the --feature-gates argument. (#104167, @ialidzhikov)
    • The StartupProbe feature gate that is GA since v1.20 is unconditionally enabled, and can no longer be specified via the --feature-gates argument. (#104168, @ialidzhikov)
    • The SupportPodPidsLimit and SupportNodePidsLimit feature gates that are GA since v1.20 are unconditionally enabled, and can no longer be specified via the --feature-gates argument. (#104163, @ialidzhikov)
    • The apiserver exposes 4 new metrics for tracking the status of Service CIDR allocations:
      • current number of available IPs per Service CIDR
      • current number of used IPs per Service CIDR
      • total number of allocations per Service CIDR
      • total number of allocation errors per Service CIDR (#104119, @aojea)
    • The flag --deployment-controller-sync-period has been deprecated and will be removed in v1.24. (#103538, @Pingan2017)
    • The image gcr.io/kubernetes-e2e-test-images will no longer be used in E2E / CI testing, k8s.gcr.io/e2e-test-images will be used instead. (#103724, @claudiubelu)
    • The kube-proxy image contains /go-runner as a replacement for deprecated klog flags. (#106301, @pohly)
    • The maximum length of the CSINode id field has increased to 256 bytes to match the CSI spec. (#104160, @pacoxu)
    • Troubleshooting: informers log handlers that take more than 100 milliseconds to process an object if the DeltaFIFO queue starts to grow beyond 10 elements. (#103917, @aojea)
    • Update cri-tools dependency to v1.22.0. (#104430, @saschagrunert)
    • Migrate cmd/kube-proxy/app logs to structured logging. (#98913, @yxxhero)
    • Update build images to Debian 11 (Bullseye)
      • debian-base:bullseye-v1.0.0
      • debian-iptables:bullseye-v1.0.0
      • go-runner:v2.3.1-go1.17.1-bullseye.0
      • kube-cross:v1.23.0-go1.17.1-bullseye.1
      • setcap:bullseye-v1.0.0
      • cluster/images/etcd: Build 3.5.0-2 image
      • test/conformance/image: Update runner image to base-debian11 (#105158, @justaugustus)
    • Update conformance image to use debian-base:buster-v1.9.0. (#104696, @PushkarJ)
    • volume.kubernetes.io/storage-provisioner annotation will be added to dynamic provisioning required PVC. volume.beta.kubernetes.io/storage-provisioner annotation is deprecated. (#104590, @Jiawei0227)
    • Updates konnectivity-network-proxy to v0.0.27. This includes a memory leak fix for the network proxy (#107037, @jdnurme) [SIG API Machinery, Auth and Cloud Provider]
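
    One entry in the list above changes the buckets of the apiserver_request_duration_seconds histogram. The sketch below shows what that means for a single observation: the same request latency is reported under a coarser upper bound with the new layout. The bucket lists are copied from that entry; the helper name bucket_upper_bound is hypothetical and exists only for this illustration.

```python
from bisect import bisect_left

# Bucket upper bounds (seconds) copied from the changelog entry.
OLD_BUCKETS = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5,
               0.6, 0.7, 0.8, 0.9, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5,
               3.0, 3.5, 4.0, 4.5, 5, 6, 7, 8, 9, 10,
               15, 20, 25, 30, 40, 50, 60]
NEW_BUCKETS = [0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 1.25, 1.5, 2,
               3, 4, 5, 6, 8, 10, 15, 20, 30, 45, 60]


def bucket_upper_bound(buckets, seconds):
    """Return the smallest bucket bound that is >= seconds.

    Hypothetical helper: it mirrors the "less than or equal" semantics of
    histogram buckets and falls back to +Inf above the last bound.
    """
    index = bisect_left(buckets, seconds)
    return buckets[index] if index < len(buckets) else float("inf")


if __name__ == "__main__":
    for latency in (0.27, 2.2, 42.0):
        print(latency,
              bucket_upper_bound(OLD_BUCKETS, latency),
              bucket_upper_bound(NEW_BUCKETS, latency))
```

    Dashboards or alerts that reference a bound present only in the old list (for example 0.25 or 2.5) would need to switch to a bound that still exists after the upgrade.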

    Dependencies

    Added

    • bazil.org/fuse: 371fbbd
    • github.com/OneOfOne/xxhash: v1.2.2
    • github.com/antlr/antlr4/runtime/Go/antlr: b48c857
    • github.com/cespare/xxhash: v1.1.0
    • github.com/cncf/xds/go: fbca930
    • github.com/getkin/kin-openapi: v0.76.0
    • github.com/go-logr/zapr: v1.2.0
    • github.com/google/cel-go: v0.9.0
    • github.com/google/cel-spec: v0.6.0
    • github.com/google/martian/v3: v3.1.0
    • github.com/kr/fs: v0.1.0
    • github.com/pkg/sftp: v1.10.1
    • github.com/spaolacci/murmur3: f09979e
    • sigs.k8s.io/json: c049b76

    Changed

    Removed

  • This is a security release featuring the latest version of Flatcar Container Linux (3139.2.3), Kubernetes (1.22.11), and all of Giant Swarm applications. It also enables auditd monitoring of execve syscalls to ease audit logging.

    Change details

    azure-operator 5.22.0

    Changed

    • Tighten pod and container security contexts for PSS restricted policies.

    Fixed

    • Fix handling of MachinePools' status fields for empty node pools.

    Changed

    • Bump k8scc to enable auditd monitoring for execve syscalls.

    containerlinux 3139.2.3

    New Stable Release 3139.2.3

    Changes since Stable 3139.2.2

    Security fixes:

    Updates:

    kubernetes 1.22.11

    Bug or Regression

    • Bug Fix: Kube-proxy dropped endpointSlice’s local endpoints when upgrading from 1.20 to 1.22 (#110245, @xh4n3) [SIG Network]
    • EndpointSlices marked for deletion are now ignored during reconciliation. (#110482, @aryan9600) [SIG Apps and Network]
    • Fixed a kubelet issue that could result in invalid pod status updates to be sent to the api-server where pods would be reported in a terminal phase but also report a ready condition of true in some cases. (#110481, @bobbypage) [SIG Node and Testing]
    • Pods will now post their readiness during termination. (#110418, @aojea) [SIG Network, Node and Testing]

    Dependencies

    Added

    Nothing has changed.

    Changed

    Nothing has changed.

    Removed

    Nothing has changed.

    external-dns 2.15.0

    Changed

    • Update test dependencies and py-helm-charts version to 0.7.0 (#173)
    • Ignore IRSA annotation for service account when using AWS external access.

    chart-operator 2.24.1

    Changed

    • Update helmclient to v4.10.1.

    node-exporter 1.13.0

    Changed

    • Disable boot partition from the filesystem exporter.

    cluster-autoscaler 1.22.2-gs7

    Changed

    • Enable balance similar nodepools by default

    Fixed

    • Ignore labels to consider nodepools similar groups

    Added

    • Support to add extra arguments

    coredns 1.10.0

    Added

    • Add app.kubernetes.io/component on deployments so that management-cluster-admission controller does not complain.

    kube-state-metrics 1.11.0

    Added

    • Allow application.giantswarm.io/team label.
  • This maintenance Azure workload cluster release provides the latest Kubernetes 1.22 version as well as the latest version of all Giant Swarm components. It also uses etcd 3.5 for improved performance and reliability.

    Highlights

    • Automated rotation of the Kubernetes API key used to encrypt secret data in etcd;
    • Kubernetes v1.22.10;
    • etcd 3.5.4.

    How does the Kubernetes API key rotation work?

    • The rotation is disabled by default and has to be enabled by setting the encryption.giantswarm.io/enable-rotation annotation on the ${CLUSTER-ID}-encryption-provider-config secret;
    • The key rotation happens if the key is at least 180 days old (counting from creation timestamp on ${CLUSTER-ID}-encryption-provider-config secret or from last key rotation). It can also be forced by setting the encryption.giantswarm.io/force-rotation annotation to start the rotation process immediately;
    • A new config is generated containing the new and old keys as soon as the process starts;
    • The next step requires a roll of the control plane nodes (either manually or during an update);
    • After the control plane nodes have been rolled and are using the new encryption configuration, the encryption-provider-operator will rewrite all secrets. This leads to the re-encryption of all secrets with the new key;
    • The operator will remove the old encryption key after all the secrets are rewritten.
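
    The trigger conditions described above can be sketched as a small decision function. This is only an illustration of the rules as stated in this release note, not the operator's actual code. The function name should_rotate and its parameters are hypothetical, annotation values are not inspected because the note does not specify them, and the sketch assumes that the force annotation only takes effect when rotation is enabled.

```python
from datetime import datetime, timedelta, timezone

# Annotation names as given in the release note.
ENABLE_ANNOTATION = "encryption.giantswarm.io/enable-rotation"
FORCE_ANNOTATION = "encryption.giantswarm.io/force-rotation"
# Minimum key age before an automatic rotation, per the release note.
ROTATION_PERIOD = timedelta(days=180)


def should_rotate(annotations, created_at, last_rotation=None, now=None):
    """Decide whether a key rotation should start (hypothetical helper).

    annotations:   annotations of the encryption provider config secret
    created_at:    creation timestamp of that secret
    last_rotation: timestamp of the last key rotation, if there was one
    """
    now = now or datetime.now(timezone.utc)
    # Rotation is disabled unless the enable annotation is present.
    if ENABLE_ANNOTATION not in annotations:
        return False
    # Assumption: forcing requires rotation to be enabled as well.
    if FORCE_ANNOTATION in annotations:
        return True
    # Otherwise the key must be at least 180 days old.
    reference = last_rotation or created_at
    return now - reference >= ROTATION_PERIOD
```

    A True result corresponds only to the start of the process (generating the new configuration); the control plane roll and the rewrite of secrets remain separate steps.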

    Change details

    app-operator 6.0.1

    Upgraded from 5.10.2.

    This upgrade brings a bunch of bug fixes, including one to better handle the “app bundle” use case (used for example by the Giant Swarm security stack).

    For detailed changelog please refer to the changelog.

    azure-operator 5.21.0

    Upgraded from 5.17.0.

    This upgrade brings many improvements to the bootstrap and upgrade processes and fixes a few minor bugs.

    It also includes a migration process for MachinePool and AzureMachinePool CRs from the old experimental API group to the new stable one.

    For detailed changelog please refer to the changelog.

    cluster-operator 4.3.0

    Upgraded from 3.12.0.

    This upgrade brings a few bug fixes as well as support for automated rotation of Secret encryption keys.

    For detailed changelog please refer to the changelog.

    cert-operator 2.0.1

    Upgraded from 1.3.0.

    This upgrade brings updated dependencies in the golang binary to address security issues in earlier versions.

    For detailed changelog please refer to the changelog.

    containerlinux 3139.2.2

    Upgraded from 3033.2.2.

    This upgrade brings fixes for a lot of security issues in all main operating system components, including Linux, golang, containerd and openssl.

    Please refer to the official changelog for all details.

    calico 3.21.5

    Upgraded from 3.12.3.

    This upgrade brings security and bug fixes.

    Please refer to the official changelog for all details.

    etcd 3.5.4

    Upgraded from 3.4.18.

    This is a minor release bump, bringing several security and bug fixes and important performance improvements.

    Please refer to the official changelog and the announcement blog post for all details.

    kubernetes 1.22.10

    Upgraded from 1.22.6.

    This release brings bug and security fixes.

    Please refer to the official changelog for all the details.

    cert-exporter 2.2.0

    Upgraded from 2.0.1.

    This release brings improved reliability to the exporter.

    Please refer to the changelog for all the details.

    chart-operator 2.24.0

    Upgraded from 2.20.1.

    This release brings improved reliability to the operator.

    Please refer to the changelog for all the details.

    coredns 1.10.1

    Upgraded from 1.8.0.

    This upgrade provides coredns version 1.8.7 as well as improvements in the helm chart.

    Please refer to the changelog for all the details.

    external-dns 2.14.0

    Upgraded from 2.9.0.

    This upgrade provides external-dns version 0.11.0 as well as improvements in the helm chart.

    Please refer to the changelog for all the details.

    cluster-autoscaler 1.22.2-gs6

    Upgraded from 1.22.2-gs4.

    This upgrade provides an improved helm chart.

    Please refer to the changelog for all the details.

    metrics-server 1.7.0

    Upgraded from 1.5.0.

    This upgrade provides metrics-server version 0.5.2 as well as improvements in the helm chart.

    Please refer to the changelog for all the details.

    net-exporter 1.12.0

    Upgraded from 1.11.0.

    This upgrade provides an improved helm chart.

    Please refer to the changelog for all the details.

    node-exporter 1.12.0

    Upgraded from 1.8.0.

    This upgrade provides node-exporter 1.3.1 and enables the diskstats exporter to expose node IO metrics.

    Please refer to the changelog for all the details.

    azure-scheduled-events 0.7.0

    Upgraded from 0.6.1.

    This upgrade provides an improved helm chart.

    Please refer to the changelog for all the details.

    vertical-pod-autoscaler 2.4.0

    Upgraded from 2.1.1.

    This upgrade provides vertical-pod-autoscaler 0.10.0 including an internal fix to circumvent a bug in containerd that prevented VPA from working properly.

    Please refer to the changelog for all the details.

    vertical-pod-autoscaler-crd 1.0.1

    Upgraded from 1.0.0.

    Added

    • Add cluster singleton restriction so app can only be installed once.

    kube-state-metrics 1.10.0

    Upgraded from 1.7.0.

    This upgrade provides an improved helm chart.

    Please refer to the changelog for all the details.

  • This release provides security hardening of app-operator to tighten RBAC permissions as well as honoring write contexts more accurately.

    Change details

    app-operator 5.10.2

    Fixed

    • Add missing permissions for apps/deployments.
  • This release provides support for Kubernetes 1.22, has Control Groups v2 enabled by default and includes the Vertical Pod autoscaler.

    Highlights

    • Kubernetes 1.22 support;
    • Control Groups v2 are enabled by default;
    • rpcbind is disabled by default to mitigate security risks. NFS v2 and v3 are not supported anymore;
    • Security fixes:
      • 44 Linux CVEs;
      • 10 expat CVEs;
      • 8 Go CVEs;
      • 5 glibc CVEs;
      • 4 Docker CVEs;
      • 3 curl CVEs;
      • 3 vim CVEs;
      • 2 polkit CVEs;
      • 2 bash CVEs;
      • 2 binutils CVEs;
      • 3 containerd CVEs;
      • 2 nettle CVEs;
      • 2 SDK: bison CVEs;
      • 1 ca-certificates CVE;
      • 1 util-linux CVE;
      • 1 git CVE;
      • 1 gnupg CVE;
      • 1 libgcrypt CVE;
      • 1 sssd CVE;
      • 1 SDK: perl CVE;

    Warning: Kubernetes v1.22 removed certain APIs and features. More details are available in the upstream blog post.

    Warning: rpcbind is disabled by default to mitigate security risks. Any application which requires it will no longer work. NFS v2 and v3 are such applications and are no longer supported. Please check whether you have any applications that depend on rpcbind before you upgrade.

    Known Issues

    • Java applications are unable to identify memory limits when using a JRE prior to v15 in a Control Groups v2 environment. Support was added in JRE v15 and later. More details are available in the upstream issue. We recommend using the latest LTS JRE available (currently v17) to ensure continued compatibility with future releases;
    • Go applications, which use an older version of uber-go/automaxprocs, are unable to properly set GOMAXPROCS. Such applications need to be updated to use at least v1.5.1 of uber-go/automaxprocs.
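
    To check whether a workload is affected by the issues above, it can help to see which cgroup version a container runs under and which memory limit it actually gets. The following is a minimal sketch, assuming the conventional mount point /sys/fs/cgroup and the standard kernel file names (cgroup.controllers and memory.max for v2, memory/memory.limit_in_bytes for v1). The function names are hypothetical.

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if `root` looks like a cgroup v2 unified hierarchy, else 1."""
    # cgroup v2 exposes a cgroup.controllers file at the top of the hierarchy.
    return 2 if (Path(root) / "cgroup.controllers").is_file() else 1


def memory_limit_bytes(root="/sys/fs/cgroup"):
    """Return the memory limit in bytes, or None if no limit can be read.

    On cgroup v1 an unlimited cgroup reports a very large number rather
    than a marker value, so that number is returned unchanged.
    """
    root = Path(root)
    if cgroup_version(root) == 2:
        limit_file = root / "memory.max"
    else:
        limit_file = root / "memory" / "memory.limit_in_bytes"
    try:
        raw = limit_file.read_text().strip()
    except OSError:
        return None
    # cgroup v2 writes the literal string "max" when no limit is set.
    if raw == "max":
        return None
    return int(raw)


if __name__ == "__main__":
    print("cgroup version:", cgroup_version())
    print("memory limit (bytes):", memory_limit_bytes())
```

    Running this inside a pod and comparing the result with what the application runtime reports is one way to spot a runtime that ignores the Control Groups v2 limit.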

    Control Groups v1

    To ensure a smooth transition, in case you need time to modify applications to make them compatible with Control Groups v2, we provide a mechanism that will allow using Control Groups v1 on specific node pools. More details are available in our documentation.

    Change details

    kubernetes 1.22.6

    What’s New (Major Themes)

    Removal of several beta Kubernetes APIs

    A number of APIs are no longer serving specific Beta versions in favour of the GA version of those APIs. All existing objects can be interacted with via general availability APIs. This removal includes beta versions of ValidatingWebhookConfiguration, MutatingWebhookConfiguration, CustomResourceDefinition, APIService, TokenReview, SubjectAccessReview, CertificateSigningRequest, Lease, Ingress, and IngressClass APIs. For the full list check out Deprecated API Migration Guide and the blog post Kubernetes API and Feature Removals In 1.22: Here’s What You Need To Know.
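
    As a rough aid for auditing manifests before the upgrade, the removed beta versions of the kinds named above can be looked up in a small table. This is a sketch, not an official migration tool: the helper name ga_replacement is hypothetical, the table covers only the kinds listed in this paragraph, and moving to the GA version can require schema changes beyond the apiVersion field (see the Deprecated API Migration Guide).

```python
from typing import Optional

# (removed beta apiVersion, kind) -> GA apiVersion that replaces it.
REMOVED_BETA_APIS = {
    ("admissionregistration.k8s.io/v1beta1", "ValidatingWebhookConfiguration"): "admissionregistration.k8s.io/v1",
    ("admissionregistration.k8s.io/v1beta1", "MutatingWebhookConfiguration"): "admissionregistration.k8s.io/v1",
    ("apiextensions.k8s.io/v1beta1", "CustomResourceDefinition"): "apiextensions.k8s.io/v1",
    ("apiregistration.k8s.io/v1beta1", "APIService"): "apiregistration.k8s.io/v1",
    ("authentication.k8s.io/v1beta1", "TokenReview"): "authentication.k8s.io/v1",
    ("authorization.k8s.io/v1beta1", "SubjectAccessReview"): "authorization.k8s.io/v1",
    ("certificates.k8s.io/v1beta1", "CertificateSigningRequest"): "certificates.k8s.io/v1",
    ("coordination.k8s.io/v1beta1", "Lease"): "coordination.k8s.io/v1",
    ("extensions/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "IngressClass"): "networking.k8s.io/v1",
}


def ga_replacement(manifest: dict) -> Optional[str]:
    """Return the GA apiVersion to migrate to, or None if no change is needed.

    `manifest` is a parsed Kubernetes object (for example loaded from YAML).
    """
    key = (manifest.get("apiVersion"), manifest.get("kind"))
    return REMOVED_BETA_APIS.get(key)


if __name__ == "__main__":
    old_ingress = {"apiVersion": "extensions/v1beta1", "kind": "Ingress"}
    print(ga_replacement(old_ingress))
```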

    Kubernetes release cadence change

    We all have to adapt to change in our lives, and especially so in the past year. The Kubernetes release team was also affected by the COVID-19 pandemic and has listened to its user base regarding the number of releases in a calendar year. On April 23, 2021 it was made official that the Kubernetes release cadence is reduced from 4 releases per year to 3 releases per year.

    You can read more in the official blog post Kubernetes Release Cadence Change: Here’s What You Need To Know.

    External credential providers

    Kubernetes client credential plugins have been in beta since 1.11, a few eons ago. With the release of Kubernetes 1.22, this feature set graduates to stable. The GA feature set includes improved support for plugins that provide interactive login flows. This release also contains a number of bug fixes to the feature set. Aspiring plugin authors can look at sample-exec-plugin as a way to get started.

    Related to this topic, the in-tree Azure and GCP authentication plugins have been deprecated in favor of out-of-tree implementations.

    Server-side Apply graduates to GA

    Server-side Apply is a new object merge algorithm, as well as tracking of field ownership, running on the Kubernetes API server. Server-side Apply helps users and controllers manage their resources via declarative configurations. It allows them to create and/or modify their objects declaratively, simply by sending their fully specified intent. After being in beta for a couple releases, Server-side Apply is now generally available.

    Container Storage Interface graduations

    CSI support for Windows nodes moves to GA in the 1.22 release. In Kubernetes v1.22, Windows privileged containers are only an alpha feature. To allow using CSI storage on Windows nodes, CSIProxy enables CSI node plugins to be deployed as unprivileged pods, using the proxy to perform privileged storage operations on the node.

    Another feature moving to GA in v1.22 is CSI Service Account Token support. This feature allows CSI drivers to use pods’ bound service account tokens instead of a more privileged identity. It also provides control over re-publishing these volumes, so that short-lived tokens can be refreshed.

    SIG Windows development tools

    To grow the developer community, SIG Windows released multiple tools. The new tools support multiple CNI providers (Antrea, Calico) and can run on multiple platforms (any vagrant compatible provider, such as Hyper-V, VirtualBox, or vSphere). There is also a new way to run bleeding edge Windows features from scratch by compiling the Windows kubelet and kube-proxy, then using them along with daily builds of other Kubernetes components.

    Deploy a more secure control plane with kubeadm

    A new alpha feature allows running the kubeadm control plane components as non-root users. This is a long requested security measure in kubeadm. To try it you must enable the kubeadm-specific RootlessControlPlane feature gate. When you deploy a cluster using this alpha feature, your control plane runs with lower privileges.

    Kubeadm also gains a new v1beta3 configuration API. It iterates over v1beta2 by adding some long requested features and deprecating some existing ones. v1beta3 is now the preferred API version; the v1beta2 API also remains available and is not yet deprecated.

    etcd moves to version 3.5.0

    Kubernetes’ default backend storage, etcd, has a new release 3.5.0 and the community embraced it. The new release comes with improvements to security, performance, monitoring, and developer experience. It includes numerous bug fixes, for example for lease objects causing memory leaks and compact operations causing deadlocks. A couple of new features are also introduced, such as the migration to structured logging and built-in log rotation. The release comes with a detailed future roadmap to implement a solution to traffic overload. A full and detailed list of changes can be read in the 3.5.0 release announcement.

    Kubernetes Node system swap support

    Every system administrator or Kubernetes user has been in the same boat regarding setting up and using Kubernetes: disable swap space. With the release of Kubernetes 1.22, alpha support is available to run nodes with swap memory. This change lets administrators opt in to configuring swap on Linux nodes, treating a portion of block storage as additional virtual memory.

    Cluster-wide seccomp defaults

    A new alpha feature gate SeccompDefault has been added to the kubelet, together with a corresponding command line flag --seccomp-default and kubelet configuration. If both are enabled, then the kubelet’s behavior changes for pods that don’t explicitly set a seccomp profile. With cluster-wide seccomp defaults, the kubelet uses the RuntimeDefault seccomp profile by default, rather than Unconfined. This allows enhancing the default cluster wide workload security of the Kubernetes deployment. Security administrators will now sleep better knowing there is some security by default for the workloads.

    To learn more about the feature, please refer to the official seccomp tutorial.
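
    The rule described above can be modelled in a few lines. This is a simplified sketch of the decision only, not kubelet code; the function name effective_seccomp_profile and its parameters are hypothetical.

```python
def effective_seccomp_profile(pod_profile, feature_gate_enabled, seccomp_default_flag):
    """Return the seccomp profile that applies to a pod (simplified model).

    pod_profile:          profile set explicitly on the pod, or None
    feature_gate_enabled: whether the SeccompDefault feature gate is on
    seccomp_default_flag: whether the kubelet runs with --seccomp-default
    """
    # An explicitly configured profile always wins.
    if pod_profile is not None:
        return pod_profile
    # Both the feature gate and the flag must be enabled for the new default.
    if feature_gate_enabled and seccomp_default_flag:
        return "RuntimeDefault"
    # Default without the feature: no seccomp filtering.
    return "Unconfined"


if __name__ == "__main__":
    print(effective_seccomp_profile(None, True, True))
    print(effective_seccomp_profile(None, True, False))
```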

    Quality of Service for memory resources

    Originally, Kubernetes used the v1 cgroups API. With that design, the QoS class for a pod only applied to CPU resources (such as cpu_shares). The Kubernetes cgroup manager uses memory.limit_in_bytes in v1 cgroups to limit the memory capacity for a container, and uses oom_scores to recommend an order for killing container processes if an out-of-memory event occurs. This implementation has shortcomings: for Guaranteed pods, memory can not be fully reserved, and the page cache is at risk of being recycled. For Burstable pods, overcommitting memory (setting request less than limit) could increase the risk of a container being killed when the Linux kernel detects an out of memory condition.

    As an alpha feature, Kubernetes v1.22 can use the cgroups v2 API to control memory allocation and isolation. This feature is designed to improve workload and node availability when there is contention for memory resources.
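
    For readers less familiar with the QoS classes mentioned above, the following sketch shows how a pod ends up as Guaranteed, Burstable, or BestEffort based on its CPU and memory requests and limits. It is a simplified model: the function name qos_class is hypothetical, and quantities are compared as plain strings, so equivalent notations such as "1" and "1000m" are treated as different.

```python
def qos_class(containers):
    """Classify a pod's QoS class from its containers (simplified model).

    containers: list of dicts with optional "requests" and "limits" dicts,
                each mapping "cpu" / "memory" to a quantity string.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A request that is not set defaults to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            # Guaranteed needs a limit and an equal request for both resources.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    overcommitted = [{"requests": {"memory": "1Gi"}, "limits": {"memory": "2Gi"}}]
    print(qos_class(overcommitted))
```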

    API changes and improvements for ephemeral containers

    The API used to create Ephemeral Containers changed in 1.22. The Ephemeral Containers feature is alpha and disabled by default, and the new API does not work with clients that attempt to use the old API.

    For stable features, the kubectl tool follows the Kubernetes version skew policy; however, kubectl v1.21 and older do not support the new API for ephemeral containers. Users who create ephemeral containers using kubectl debug should note that kubectl version 1.22 will attempt to fall back to the old API; older versions of kubectl will not work with cluster versions of 1.22 or later. Please update kubectl to 1.22 if you wish to use kubectl debug with a mix of cluster versions.
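
    The compatibility rule in the paragraph above can be written down as a small check. This is a sketch under stated assumptions: versions are (major, minor) tuples, the function name is hypothetical, only kubectl 1.22 and older are modeled because newer clients are not covered by this note, and the ephemeral containers feature is assumed to be enabled on the cluster.

```python
from typing import Tuple

Version = Tuple[int, int]  # (major, minor)

NEW_API_SINCE: Version = (1, 22)


def kubectl_debug_supported(kubectl: Version, cluster: Version) -> bool:
    """Model of the kubectl debug compatibility described in this note."""
    if kubectl == NEW_API_SINCE:
        # kubectl 1.22 uses the new API and attempts to fall back to the old one.
        return True
    if kubectl < NEW_API_SINCE:
        # Older kubectl only knows the old API, which 1.22+ clusters no longer accept.
        return cluster < NEW_API_SINCE
    raise ValueError("kubectl versions newer than 1.22 are not covered by this note")
```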

    Known Issues

    CPU and Memory manager are not working correctly for Guaranteed Pods with multiple containers

    A regression bug was found where guaranteed Pods with multiple containers do not work properly with set allocations for CPU, Memory, and Device manager. The fix will be available in coming releases.

    CSIMigrationvSphere feature gate has not migrated to new CRD APIs

    If the CSIMigrationvSphere feature gate is enabled, users should not upgrade to Kubernetes v1.22. vSphere CSI Driver does not support Kubernetes v1.22 yet because it uses v1beta1 CRD APIs. Support for v1.22 will be added in a later release. Check the following document for supported Kubernetes releases for a given vSphere CSI Driver version.

    Urgent Upgrade Notes

    (No, really, you MUST read this before you upgrade)
    • Audit log files are now created with a mode of 0600. Existing file permissions will not be changed. If you need the audit file to be readable by a non-root user, you can pre-create the file with the desired permissions. (#95387, @JAORMX) [SIG API Machinery and Auth]
    • CSI migration of AWS EBS volumes requires AWS EBS CSI driver ver. 1.0 that supports allowAutoIOPSPerGBIncrease parameter in StorageClass. (#101082, @jsafrane)
    • Conformance image is now built with Distroless. Users running Conformance testing should rely on container entrypoint instead of manual invocation to /run_e2e.sh or /gorunner, as they are now deprecated and will be removed in 1.25 release. Invoking ginkgo and e2e.test are still supported through overriding entrypoint (docker) or defining container spec.command (kubernetes). (#99178, @wilsonehusin)
    • Default StreamingProxyRedirects to disabled. If there is a >= 2 version skew between master and nodes, and the old nodes were enabling --redirect-container-streaming, this will break them. In this case, the StreamingProxyRedirects can still be manually enabled. (#101647, @pacoxu)
    • Intree volume plugin scaleIO support has been completely removed from Kubernetes. (#101685, @Jiawei0227)
    • Kubeadm: remove the automatic detection and matching of cgroup drivers for Docker. For new clusters if you have not configured the cgroup driver explicitly you might get a failure in the kubelet on driver mismatch (kubeadm clusters should be using the systemd driver). Also remove the IsDockerSystemdCheck preflight check (warning) that checks if the Docker cgroup driver is set to systemd. Ideally such detection / coordination should be on the side of CRI implementers and the kubelet (tracked here). Please see the page on how to configure cgroup drivers with kubeadm manually (#99647, @neolit123)
    • Kubeadm: the flag --cri-socket is no longer allowed in a mixture with the flag --config. Please use the kubeadm configuration for setting the CRI socket for a node using {Init|Join}Configuration.nodeRegistration.criSocket. (#101600, @KofClubs)
    • Newly provisioned PVs by Azure disk will no longer have the beta FailureDomain label. Azure disk volume plugin will start to have GA topology label instead. (#101534, @kassarl)
    • Scheduler’s CycleState now embeds internal read/write locking inside its Read() and Write() functions. Meanwhile, the Lock() and Unlock() functions are removed. Scheduler plugin developers are now required to remove calls to CycleState#Lock() and CycleState#Unlock(). Simply use Read() and Write(), as they are natively thread-safe now. (#101542, @Huang-Wei)
    • The CSIMigrationVSphereComplete feature flag is removed. InTreePluginvSphereUnregister will be the way moving forward. (#101272, @Jiawei0227)
    • The flag --experimental-patches is now deprecated and will be removed in a future release. You can migrate to using the new flag --patches. Add a new field {Init|Join}Configuration.patches.directory that can be used for the same purpose. For init and join it is now recommended that you migrate to configure patches via {Init|Join}Configuration.patches.directory. For the time being, these flags can be mixed with --config, but that might change in the future. On a command line, the last *patches flag takes precedence over previous flags and the value in config. kubeadm upgrade --patches will continue to be the only available option, since upgrade does not support a configuration file yet. (#103063, @neolit123)
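
    For the audit log note at the top of this list, pre-creating the file could look like the following sketch. The function name, the example path, and the 0640 mode are hypothetical. Making the file readable by a specific non-root user would also require setting suitable ownership, which is not shown here.

```python
import os


def precreate_audit_log(path: str, mode: int = 0o640) -> None:
    """Create the audit log file ahead of time with the desired permissions."""
    # O_CREAT without O_TRUNC leaves the contents of an existing file untouched.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, mode)
    try:
        # The mode given to os.open is filtered by the umask, so set it explicitly.
        os.fchmod(fd, mode)
    finally:
        os.close(fd)
```

    A call such as precreate_audit_log("/var/log/kubernetes/audit.log") would run before the API server starts, with the path matching wherever the audit log is configured to be written.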

    Important Security Information

    This release contains changes that address the following vulnerabilities:

    A security issue was discovered in Kubernetes where a user may be able to create a container with subpath volume mounts to access files & directories outside of the volume, including on the host filesystem.

    Affected Versions:

    • kubelet v1.22.0 - v1.22.1
    • kubelet v1.21.0 - v1.21.4
    • kubelet v1.20.0 - v1.20.10
    • kubelet <= v1.19.14

    Fixed Versions:

    • kubelet v1.22.2
    • kubelet v1.21.5
    • kubelet v1.20.11
    • kubelet v1.19.15

    This vulnerability was reported by Fabricio Voznika and Mark Wolters of Google.

    CVSS Rating: High (8.8) CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
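
    The affected and fixed version lists above can be turned into a quick check for a given kubelet version. This is a minimal sketch: the function names are hypothetical, only plain "vMAJOR.MINOR.PATCH" strings are handled (no pre-release suffixes), and release lines newer than those listed are rejected rather than guessed at.

```python
from typing import Tuple

# First fixed patch release per minor line, taken from the lists above.
FIXED_PATCH = {
    (1, 22): 2,
    (1, 21): 5,
    (1, 20): 11,
    (1, 19): 15,
}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'v1.22.1' or '1.22.1' into (1, 22, 1)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def kubelet_is_affected(version: str) -> bool:
    """Return True if the kubelet version falls in an affected range."""
    major, minor, patch = parse_version(version)
    if (major, minor) < (1, 19):
        # "kubelet <= v1.19.14" covers every older release line.
        return True
    fixed_patch = FIXED_PATCH.get((major, minor))
    if fixed_patch is None:
        raise ValueError(f"{version} is not covered by this advisory")
    return patch < fixed_patch
```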

    Deprecation

    • Controller-manager: the following flags have no effect and will be removed in v1.24:

      • --port
      • --address

      The insecure port flag --port may only be set to 0 now.

      In addition, please be careful that:

      • controller-manager MUST start with --authorization-kubeconfig and --authentication-kubeconfig correctly set to get authentication/authorization working.
      • liveness/readiness probes to controller-manager MUST use HTTPS now, and the default port has been changed to 10257.
      • Applications that fetch metrics from controller-manager should use a dedicated service account which is allowed to access nonResourceURLs /metrics. (#96216, @knight42) [SIG API Machinery, Cloud Provider, Instrumentation and Testing]
    • Deprecate --record flag in kubectl. The --record flag is being replaced with the mechanism which annotates HTTP requests with kubectl command details. (#102873, @soltysh)

    • E2e.test: removed the --viper-config flag. If you were previously using this to pass flags to e2e.test via a file, you will need to pass them directly on the command line, e.g. e2e.test --e2e-output-dir. (#102598, @dims)

    • For kubeadm: remove the ClusterStatus API from v1beta3 and its management in the kube-system/kubeadm-config ConfigMap. This method of keeping track of what API endpoints exists in the cluster was replaced (in a prior release) by a method to annotate the etcd Pods that kubeadm creates in “stacked etcd” clusters. The following CLI sub-phases are deprecated and are now a NO-OP: for kubeadm join: “control-plane-join/update-status”, for kubeadm reset: “update-cluster-status”. Unless you are using these phases explicitly, you should not be affected. (#101915, @neolit123)

    • Kubeadm: remove the deprecated --csr-only and --csr-dir flags from kubeadm init phase certs. Deprecate the same flags under kubeadm certs renew. In both the cases the command kubeadm certs generate-csr should be used instead. (#102108, @neolit123)

    • Kubeadm: Remove the deprecated command kubeadm alpha kubeconfig. Please use kubeadm kubeconfig instead. (#101938, @knight42)

    • Kubeadm: Remove the deprecated hyperkube image support in v1beta3. This implies removal of ClusterConfiguration.UseHyperKubeImage. (#101537, @neolit123)

    • Kubeadm: Remove the field ClusterConfiguration.DNS.Type in v1beta3 since CoreDNS is the only supported DNS type. (#101547, @neolit123)

    • Kubeadm: remove the deprecated command kubeadm config view. A replacement for this command is kubectl get cm -n kube-system kubeadm-config -o=jsonpath="{.data.ClusterConfiguration}" (#102071, @neolit123)

    • Kubeadm: remove the deprecated flag --image-pull-timeout for the kubeadm upgrade apply command (#102093, @SataQiu) [SIG Cluster Lifecycle]

    • Kubeadm: remove the deprecated flag --insecure-port from the kube-apiserver manifest that kubeadm manages. The flag had no effect since 1.20, since the insecure serving of the component was disabled in the same version. (#102121, @pacoxu)

    • Kubeadm: remove the deprecated kubeadm API v1beta1. Introduce a new kubeadm API v1beta3. See kubeadm/v1beta3 for a list of changes since v1beta2. Note that v1beta2 is not yet deprecated, but will be in a future release. (#101129, @neolit123)

    • Newly provisioned PVs by vSphere in-tree plugin will no longer have the beta FailureDomain label. vSphere volume plugin will start to have GA topology label (#102414, @divyenpatel)

    • Removal of the CSI NodePublish path by the kubelet is deprecated. This must be done by the CSI plugin according to the CSI spec. (#101441, @dobsonj)

    • Remove support for the Service topologyKeys field (alpha) and the kube-proxy implementation of it. This field was deprecated several cycles ago. This functionality is replaced by the combination of automatic topology hints per-endpoint (alpha) and the Service internalTrafficPolicy field (alpha). (#102412, @andrewsykim)

    • The PodUnknown phase is now deprecated. (#95286, @SergeyKanzhelev)

    • The storageos, quobyte and flocker storage volume plugins are deprecated and will be removed in a later release. (#101773, @Jiawei0227)

    • The deprecated flags --hard-pod-affinity-symmetric-weight and --scheduler-name have been removed from kube-scheduler. Use ComponentConfig instead to configure those parameters. (#102805, @ahg-g)

    • The feature Dynamic Kubelet Configuration is deprecated and kubelet will report a warning when the flag --dynamic-config-dir is used. Feature gate DynamicKubeletConfig is disabled out of the box and needs to be explicitly enabled. (#102966, @SergeyKanzhelev) [SIG Cloud Provider, Instrumentation and Node]

    • The in-tree azure and gcp auth plugins have been deprecated. The https://github.com/Azure/kubelogin and gcloud commands serve as out-of-tree replacements via the kubectl/client-go credential plugin mechanism. (#102181, @enj) [SIG API Machinery and Auth]

    • The ingress v1beta1 has been deprecated. (#102030, @aojea)
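
    For the controller-manager note at the top of this list, the dedicated service account needs permission on the /metrics non-resource URL. The following sketch builds the two RBAC objects as Python dictionaries. The function name, the role name "metrics-reader", and the example service account are hypothetical.

```python
from typing import Any, Dict, List


def metrics_reader_rbac(service_account: str, namespace: str) -> List[Dict[str, Any]]:
    """Build a ClusterRole and ClusterRoleBinding granting GET on /metrics."""
    role_name = "metrics-reader"  # hypothetical name
    cluster_role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": role_name},
        # Non-resource URLs can only be granted through cluster-scoped roles.
        "rules": [{"nonResourceURLs": ["/metrics"], "verbs": ["get"]}],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": role_name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": role_name,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
    }
    return [cluster_role, binding]
```

    The returned objects can be serialized with json.dumps and applied with kubectl, which accepts JSON manifests as well as YAML.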

    API Change

    • Add a new score extension for the NodeResourcesFit plugin that merges the functionality of the NodeResourcesLeastAllocated, NodeResourcesMostAllocated and RequestedToCapacityRatio plugins, which are marked as deprecated as of v1beta2. In v1beta1, the three plugins can still be used, but not at the same time as the score extension of NodeResourcesFit. (#101822, @yuzhiquan)

    • A value of Auto is now valid for the service.kubernetes.io/topology-aware-hints annotation. (#100728, @robscott)

    • Add DataSourceRef alpha field to PVC spec, which allows contents other than PVCs and VolumeSnapshots to be data sources. (#103276, @bswartz)

    • Add PersistentVolumeClaimDeletePolicy to StatefulSet API. (#99378, @mattcary)

    • Add a new Priority and Fairness rule that exempts all probes (/readyz, /healthz, /livez) to prevent restarting of healthy kube-apiserver instance by kubelet. (#100678, @tkashem)

    • Add alpha support for HostProcess containers on Windows (#99576, @marosset) [SIG API Machinery, Apps, Node, Testing and Windows]

    • Add distributed tracing to the kube-apiserver. It can be enabled with the feature gate APIServerTracing (#94942, @dashpole)

    • Add three metrics to the job controller to monitor if a job works in healthy condition. IndexedJob has been promoted to Beta. (#101292, @AliceZhang2016)

    • Added field .status.uncountedTerminatedPods to the Job resource. This field is used by the job controller to keep track of finished pods before adding them to the Job status counters. Pods created by the job controller get the finalizer batch.kubernetes.io/job-tracking. Jobs that are tracked using this mechanism get the annotation batch.kubernetes.io/job-tracking. This is a temporary measure. Two releases after this feature graduates to beta, the annotation won’t be added to Jobs anymore. (#98817, @alculquicondor)

    • Added new kubelet alpha feature SeccompDefault. This feature enables falling back to the RuntimeDefault (former runtime/default) seccomp profile if nothing else is specified in the pod/container SecurityContext or the pod annotation level. To use the feature, enable the feature gate as well as set the kubelet configuration option SeccompDefault (--seccomp-default) to true. (#101943, @saschagrunert) [SIG Node]

    • Adds the ReadWriteOncePod access mode for PersistentVolumes and PersistentVolumeClaims. Restricts volume access to a single pod on a single node. (#102028, @chrishenzie)

    • Alpha swap support can now be enabled on Kubernetes nodes with the NodeSwapEnabled feature flag. See KEP-2400 for details. (#102823, @ehashman)

    • Because of the implementation logic of time.Format in golang, the displayed time zone is not consistent. (#102366, @cndoit18)

    • Corrected the documentation for escaping dollar signs in a container’s env, command and args property. (#101916, @MartinKanters) [SIG Apps]

    • Enable MaxSurge for DaemonSet by default. (#101742, @ravisantoshgudimetla)

    • Enforce the ReadWriteOncePod PVC access mode during scheduling (#103082, @chrishenzie)

    • Ephemeral containers are now allowed to configure a securityContext that differs from that of the Pod. Cluster administrators should ensure that security policy controllers support EphemeralContainers before enabling this feature in clusters. (#99023, @verb)

    • Exec plugin authors can override default handling of standard input via new interactiveMode kubeconfig field. (#99310, @ankeesler)

    • If someone had the ProbeTerminationGracePeriod alpha feature enabled in 1.21, they should update/delete any workloads/pods with probe terminationGracePeriods < 1 before upgrading (#103245, @wzshiming)

    • Improved parsing of label selectors (#102188, @alculquicondor) [SIG API Machinery]

    • Introduce minReadySeconds api to the StatefulSets. (#100842, @ravisantoshgudimetla)

    • Introducing Memory quality of service support with cgroups v2 (Alpha). The MemoryQoS feature is now in Alpha. This allows kubelet running with cgroups v2 to set memory QoS at container, pod and QoS level to protect and guarantee better memory quality. This feature can be enabled through the feature gate MemoryQoS. (#102970, @borgerli)

    • Kube API server accepts Impersonate-Uid header to impersonate a user with a specific UID, in the same way that you can currently use Impersonate-User, Impersonate-Group and Impersonate-Extra. (#99961, @margocrawf)

    • Kube-apiserver: --service-account-issuer can be specified multiple times now, to enable non-disruptive change of issuer. (#101155, @zshihang) [SIG API Machinery, Auth, Node and Testing]

    • Kube-controller-manager: the --horizontal-pod-autoscaler-use-rest-clients flag and Heapster support in the horizontal pod autoscaler, deprecated since 1.12, is removed. (#90368, @serathius)

    • Kube-scheduler: a plugin enabled in a v1beta2 configuration file takes precedence over the default configuration for that plugin. This simplifies enabling default plugins with custom configuration without needing to explicitly disable those default plugins. (#99582, @chendave)

    • New node-high priority-level has been added to Suggested API Priority and Fairness configuration. (#101151, @mborsz)

    • NodeSwapEnabled feature flag was renamed to NodeSwap

      The flag was only available in the 1.22.0-beta.1 release, and the new flag should be used going forward. (#103553, @ehashman) [SIG Node]

    • Omit comparison with boolean constant (#101523, @chuntaochen) [SIG CLI and Cloud Provider]

    • Removed the feature flag for probe-level termination grace period from Kubelet. If a user wants to disable this feature on already created pods, they will have to delete and recreate the pods. (#103168, @raisaat) [SIG Apps and Node]

    • Revert the addition of PersistentVolumeClaimDeletePolicy to the StatefulSet API. (#103747, @mattcary)

    • Scheduler could be configured to consider new resources beside CPU and memory, GPU for example, for the score plugin of NodeResourcesBalancedAllocation. (#101946, @chendave) [SIG Scheduling]

    • Server Side Apply now treats all Selector fields as atomic (meaning the entire selector is managed by a single writer and updated together), since they contain interrelated and inseparable fields that do not merge in intuitive ways. (#97989, @Danil-Grigorev) [SIG API Machinery]

    • Suspend Job feature graduated to beta. Added the action label to Job controller sync metrics job_sync_total and job_sync_duration_seconds. (#102022, @adtac)

    • The API documentation for the DaemonSet’s spec.updateStrategy.rollingUpdate.maxUnavailable field was corrected to state that the value is rounded up. (#101296, @Miciah)

    • The CSIServiceAccountToken graduates to GA and is unconditionally enabled. (#103001, @zshihang)

    • The CertificateSigningRequest.certificates.k8s.io API supports an optional expirationSeconds field to allow the client to request a particular duration for the issued certificate. The default signer implementations provided by the Kubernetes controller manager will honor this field as long as it does not exceed the --cluster-signing-duration flag. (#99494, @enj)

    • The EndpointSlice Mirroring controller no longer mirrors the last-applied-configuration annotation created by kubectl to update EndpointSlices. (#102731, @sharmarajdaksh)

    • The NetworkPolicyEndPort is graduated to beta and is enabled by default. (#102834, @rikatz)

    • The PodDeletionCost feature has been promoted to beta, and enabled by default. (#101080, @ahg-g)

    • Server Side Apply treats certain structs as atomic, meaning the entire selector field is managed by a single writer and updated together. (#100684, @Jefftree)

    • The ServiceAppProtocol feature gate has been removed. It reached GA in Kubernetes (#103190, @robscott)

    • The TerminationGracePeriodSeconds on pod specs and container probes should not be negative. Negative values of TerminationGracePeriodSeconds will be treated as the value 1s on the delete path. Immutable field validation will be relaxed in order to update negative values. In a future release, negative values will not be permitted. (#98866, @wzshiming)

    • The kube-scheduler component config v1beta2 API is available. Three scheduler plugins are deprecated (NodeLabel, ServiceAffinity, NodePreferAvoidPods). (#99597, @adtac)

    • The pod/eviction subresource now accepts policy/v1 eviction requests in addition to policy/v1beta1 eviction requests (#100724, @liggitt)

    • The podAffinity, NamespaceSelector and the associated CrossNamespaceAffinity quota scope features graduate to Beta and they are now enabled by default. (#101496, @ahg-g)

    • The pods/ephemeralcontainers API now returns and expects a Pod object instead of EphemeralContainers. This is incompatible with the previous alpha-level API. (#101034, @verb) [SIG Apps, Auth, CLI and Testing]

    • The v1.Node .status.images[].names field is now optional. (#102159, @roycaihw)

    • The deprecated flag --algorithm-provider has been removed from kube-scheduler. Use ComponentConfig instead to configure the set of enabled plugins. (#102239, @Haleygo)

    • The options --ssh-user and --ssh-key are removed. They only functioned on GCE, and only in-tree. Use the apiserver network proxy instead. (#102297, @deads2k)

    • Track Job completion through status and Pod finalizers, removing dependency on Pod tombstones. (#98238, @alculquicondor) [SIG API Machinery, Apps, Auth and Testing]

    • Track ownership of scale subresource for all scalable resources i.e. Deployment, ReplicaSet, StatefulSet, ReplicationController, and Custom Resources. (#98377, @nodo) [SIG API Machinery and Testing]
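
    For the expirationSeconds note in the list above, the relationship between the requested duration and the signer's limit can be sketched as follows. The note only says the field is honored as long as it does not exceed the signing duration. This sketch additionally assumes that larger requests are capped at that limit and that an unset field results in the signer's configured duration. The function name is hypothetical.

```python
from typing import Optional


def issued_certificate_seconds(
    expiration_seconds: Optional[int],
    cluster_signing_duration_seconds: int,
) -> int:
    """Pick the lifetime of the issued certificate (illustrative model)."""
    if expiration_seconds is None:
        # Assumption: no request means the signer's configured duration applies.
        return cluster_signing_duration_seconds
    # Assumption: requests above the signer's limit are capped, not rejected.
    return min(expiration_seconds, cluster_signing_duration_seconds)
```

    A client asking for a one-hour certificate from a signer configured for one year would get one hour in this model, while a request for two years would be shortened to one year.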

    Feature

    • Kube-apiserver: when merging lists, Server Side Apply now prefers the order of the submitted request instead of the existing persisted object (#107568, @jiahuif) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Storage and Testing]

    • Kubernetes is now built with Golang 1.16.12 (#106982, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Update golang.org/x/net to v0.0.0-20211209124913-491a49abca63 (#106960, @cpanato) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Node and Storage]

    • Kubernetes is now built with Golang 1.16.10 (#106223, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Update debian-base, debian-iptables, setcap images to pick up CVE fixes

      • Debian-base to v1.9.0
      • Debian-iptables to v1.6.7
      • setcap to v2.0.4 (#106143, @cpanato) [SIG Release and Testing]
    • A system-cluster-critical pod should not get a low OOM Score.

      As of now both system-node-critical and system-cluster-critical pods have an OOM score of -997, making them among the last processes to be OOMKilled. By definition system-cluster-critical pods can be scheduled elsewhere if there is a resource crunch on the node, whereas system-node-critical pods cannot be rescheduled. This was the reason for system-node-critical to have a higher priority value than system-cluster-critical. This change allows only the system-node-critical priority class to have a low OOMScore.

      Action required: if the user wants the pod to be OOMKilled last and the pod has the system-cluster-critical priority class, it has to be changed to the system-node-critical priority class to preserve the existing behavior (#99729, @ravisantoshgudimetla)

    • API Server tracing can now trace re-entrant api requests. (#103218, @dashpole) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle and Instrumentation]

    • APIServerTracing now collects spans from etcd client calls, and propagates context to etcd. (#103216, @dashpole) [SIG API Machinery, Cloud Provider and Instrumentation]

    • APIServerTracing now collects spans from outgoing requests to admission webhooks. (#103601, @dashpole) [SIG API Machinery]

    • Add a namespace label for all apiserver_admission_* metrics. Expand the histogram range to 0-10s for all apiserver_admission_*_duration_seconds metrics. (#101208, @voutcn)

    • Add unified map on CRI to support cgroup v2. Refer to https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md#unified. (#102578, @payall4u)

    • Added BinaryData description to kubectl describe command. (#100568, @lauchokyip)

    • Added a new metric apiserver_flowcontrol_request_concurrency_in_use that shows the number of seats (concurrency) occupied by the currently executing requests in the API Priority and Fairness system. (#102795, @tkashem)

    • Added field-selector option for kubectl top pod (#102155, @lauchokyip) [SIG CLI]

    • Added new metrics about API Priority and Fairness. Each one has a label priority_level. The last two also have a label bound taking values min and max.

      • apiserver_flowcontrol_current_r: R(the time of the last change in state of the queues)
      • apiserver_flowcontrol_dispatch_r: R(the time of the latest request dispatch)
      • apiserver_flowcontrol_latest_s: S(the request last dispatched) = R(when that request starts executing in the virtual world)
      • apiserver_flowcontrol_next_s_bounds: min and max next S among non-empty queues
      • apiserver_flowcontrol_next_discounted_s_bounds: min and max next S - (sum [over requests executing] width * estimatedDuration) among non-empty queues (#102859, @MikeSpreitzer) [SIG API Machinery and Instrumentation]
    • Adding --restart-kubelet flag on E2E Node test suite (#97028, @knabben) [SIG Node and Testing]

    • Adds feature gate KubeletInUserNamespace which enables support for running kubelet in a user namespace.

      The user namespace has to be created before running kubelet. All the node components such as CRI need to be running in the same user namespace.

      When the feature gate is enabled, kubelet ignores errors that happen while setting the following sysctl values: vm.overcommit_memory, vm.panic_on_oom, kernel.panic, kernel.panic_on_oops, kernel.keys.root_maxkeys, kernel.keys.root_maxbytes. (These sysctl values are for the host, not for the containers.)

      kubelet also ignores an error during opening /dev/kmsg. This feature gate also allows kube-proxy to ignore an error during setting RLIMIT_NOFILE.

      This feature gate is especially useful for running Kubernetes inside Rootless Docker/Podman with kind or minikube. (#92863, @AkihiroSuda) [SIG Network, Node and Testing]

    • Adds metrics for the delegated authenticator used by extension APIs that delegate authentication logic to the Kube API server. (#99364, @p0lyn0mial)

    • Adds metrics for the delegated authorizer used by extension APIs that delegate authorization logic to the Kube API server. (#100339, @p0lyn0mial)

    • Adds two kubemark flags, --max-pods and --extended-resources. (#100267, @Jeffwan)

    • An audit log entry will be generated when a ValidatingAdmissionWebhook is failing to open. (#92739, @cnphil)

    • Base images: Updated to

    • Base-images: Update to debian-base:buster-v1.7.1 (#102594, @mengjiao-liu)

    • Deprecated warning message for ignore-errors flag. (#102677, @yuzhiquan)

    • Endpoints that have more than 1000 endpoints will be truncated and the endpoints.kubernetes.io/over-capacity annotation on the Endpoints resource will be set to truncated. (#103520, @swetharepakula) [SIG Apps and Network]

    • Expose /debug/flags/v to allow dynamically setting log level for kube-proxy. (#98306, @borgerli) [SIG Network]

    • Expose container start time as container_start_time_seconds in the kubelet /metrics/resource endpoint. (#102444, @sanwishe)

    • Extended resources defined in LeastAllocated, MostAllocated and RequestedToCapacityRatio plugin argument are bypassed by the scheduler if the incoming Pod doesn’t request them in the pod spec. (#103169, @Huang-Wei)

    • Feat: change partition style to GPT on Windows (#101412, @andyzhangx) [SIG Storage and Windows]

    • Feature gates EndpointSliceProxying & WindowsEndpointSliceProxying graduate to GA and are unconditionally enabled. Kube-proxy will use EndpointSlices for endpoint information. (#103451, @swetharepakula)

    • Fluentd: isolate logging resources in separate namespace logging (#68004, @saravanan30erd)

    • For kubeadm: add --validity-period flag for kubeadm kubeconfig user command. (#100907, @SataQiu)

    • Implement minReadySeconds for the StatefulSets. (#101316, @ravisantoshgudimetla)

    • Improve logging of APIService availability changes in kube-apiserver. (#101420, @sttts)

    • Introduce a feature gate DisableCloudProviders allowing to disable cloud-provider initialization in KAPI, KCM and kubelet. The DisableCloudProviders feature gate is currently in Alpha, which means it is currently disabled by default. Once the feature gate moves to beta, in-tree cloud providers would be disabled by default, and a user won’t be able to specify --cloud-provider=<aws|openstack|azure|gcp|vsphere> anymore to any of KCM, KAPI or kubelet. Only --cloud-provider=external would be allowed. CCM would have to run out-of-tree with CSI. (#100136, @Danil-Grigorev)

    • JSON logging format is no longer available by default in non-core Kubernetes Components and requires owners to opt in. (#102869, @mengjiao-liu) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Kube-apiserver: the alpha PodSecurity feature can be enabled by passing --feature-gates=PodSecurity=true, and enables controlling allowed pods using namespace labels. See https://git.k8s.io/enhancements/keps/sig-auth/2579-psp-replacement for more details. (#103099, @liggitt) [SIG API Machinery, Auth, Instrumentation, Release, Security and Testing]

    • Kube-proxy uses V1 EndpointSlices. (#103306, @swetharepakula)

    • Kubeadm: Add the RootlessControlPlane kubeadm specific feature gate (Alpha in 1.22, disabled by default). It can be used to enable an experimental feature that makes the control plane component static Pod containers for kube-apiserver, kube-controller-manager, kube-scheduler and etcd run as non-root users. (#102158, @vinayakankugoyal)

    • Kubeadm: Set the seccompProfile to runtime/default in the PodSecurityContext of the control-plane components that run as static Pods. (#100234, @vinayakankugoyal)

    • Kubeadm: add a new field skipPhases to v1beta3 InitConfiguration and JoinConfiguration that can contain a list of phases to skip during “kubeadm init” and “kubeadm join”. The flag --skip-phases takes precedence over this field. (#101923, @neolit123)

    • Kubeadm: add the --dry-run flag to the control-plane phase of “kubeadm init”. (#102722, @vinayakankugoyal)

    • Kubeadm: add the imagePullPolicy field in the nodeRegistration section of InitConfiguration and JoinConfiguration in v1beta3. This allows the user to specify the image pull policy during “kubeadm init” and “kubeadm join”. The value of this field must be one of Always, IfNotPresent or Never. The default behavior continues to be IfNotPresent. (#102901, @wangyysde)

    • Kubeadm: during “kubeadm init/join/upgrade”, always default the cgroupDriver value in the KubeletConfiguration to systemd, unless the user was explicit about the value. See configure-cgroup-driver for more details. (#102133, @pacoxu)

    • Kubeadm: update CoreDNS to 1.8.4. Grant CoreDNS permissions to “list” and “watch” EndpointSlice objects to accommodate dual-stack support. (#102466, @pacoxu)

    • Kubectl: add LAST RESTART column to kubectl get pods output. (#100142, @Ethyling)

    • Kubemark’s hollow-node will now print flags before starting. (#101181, @mm4tt)

    • Kubernetes is now built with Golang 1.16.3 (#101206, @justaugustus) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kubernetes is now built with Golang 1.16.4 (#101809, @justaugustus) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kubernetes is now built with Golang 1.16.5. (#102689, @cpanato)

    • Kubernetes is now built with Golang 1.16.6 (#103669, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Leader Migration for controller managers graduated to beta. (#103533, @jiahuif) [SIG API Machinery and Cloud Provider]

    • Make kubectl command headers default for beta. (#103238, @seans3) [SIG CLI]

    • Mark net.ipv4.ip_unprivileged_port_start as safe sysctl. (#103326, @pacoxu)

    • Metrics server nanny has now poll period set to 30s (previously 5 minutes) to allow faster scaling of metrics server. (#101869, @olagacek) [SIG Cloud Provider and Instrumentation]

    • NetworkPolicy validation framework support for windows. (#98077, @jayunit100)

    • New feature gate ExpandedDNSConfig is now available. This feature allows Kubernetes to have expanded DNS configuration. (#100651, @gjkim42)

    • New metrics: apiserver_kube_aggregator_x509_missing_san_total and apiserver_webhooks_x509_missing_san_total. These metrics measure the number of connections to webhooks/aggregated API servers that use certificates without Subject Alternative Names. A non-zero value is a warning sign that these connections will stop functioning in the future since Golang is going to deprecate x509 certificate subject Common Names for server hostname verification. (#95396, @stlaz) [SIG API Machinery, Auth and Instrumentation]

    • Node Problem Detector is now available for GCE Windows nodes. (#101539, @jeremyje) [SIG Cloud Provider, Node and Windows]

    • Promote Cronjobs storage version to batch/v1. (#102363, @mengjiao-liu)

    • Promote CronJobControllerV2 flag to GA, with removal in 1.23. (#102529, @soltysh)

    • Promote EndpointSliceTerminatingCondition to Beta. This enables the terminating and serving conditions for EndpointSlice by default. (#103596, @andrewsykim)

    • Run etcd as non-root on GCE provider (#100635, @cindy52)

    • Scheduler now provides an option for plugin developers to move Pods to activeQ. (#103383, @Huang-Wei)

    • Secret values are now masked by default in kubectl diff output. (#96084, @loozhengyuan)

    • Services with externalTrafficPolicy: Local now support graceful termination when using the iptables or ipvs mode of kube-proxy with EndpointSlices enabled. Specifically, if a connection for such a service arrives on a node when there are no “Ready” endpoints for the service, but there is at least one Terminating pod for that service on the node, then kube-proxy will send the traffic to the Terminating pod rather than dropping it. This patches up a race condition between when a pod is killed and when the external load balancer notices that it has been killed. (#97238, @andrewsykim)

    • Shell completion has been migrated to Cobra’s go solution. kubectl is now smarter about disabling file completion when it does not apply. Furthermore, completion for the cp command does not show all files unless the user has started typing something. (#96087, @marckhouzam) [SIG CLI]

    • Some of the in-tree storage drivers indicate support for the MetricsProvider interface, but fail to configure this for BlockMode volumes. With a recent change, Kubelet will call GetMetrics() for BlockMode volumes, and the in-tree drivers that lack this support cause a Go panic. Now the in-tree storage drivers that support BlockMode volumes will return the Capacity of the volume in the GetMetrics() call. (#101587, @nixpanic)

    • Support FakeClientset match subresource. (#100939, @wzshiming)

    • The “Leader Migration” now supports a wildcard component name and the default value. (#102711, @jiahuif)

    • If the CSI driver supports the NodeServiceCapability VOLUME_MOUNT_GROUP and the DelegateFSGroupToCSIDriver feature gate is enabled, kubelet will delegate applying FSGroup to the driver by passing it to NodeStageVolume and NodePublishVolume, regardless of what other FSGroup policies are set. This is an alpha feature. (#103244, @verult)

    • The Memory Manager feature graduates to Beta and it is enabled by default. (#101947, @cynepco3hahue)

    • The BoundServiceAccountTokenVolume graduates to GA and thus will be unconditionally enabled. The feature gate is going to be removed in 1.23. (#101992, @zshihang)

    • The EmptyDir memory backed volumes are sized as the minimum of pod allocatable memory on a host and an optional explicit user provided value. (#101048, @dims)

    • The HugePageStorageMediumSize feature graduates to GA and is unconditionally enabled, allowing unconditional usage of multiple sizes of huge page resources at the container level. (#99144, @bart0sh)

    • The IngressClassNamespacedParams feature gate has graduated to beta and is enabled by default. This means the IngressClass resource will now have two new fields: spec.parameters.namespace and spec.parameters.scope. (#101711, @hbagdi)

    • The LogarithmicScaleDown feature graduates to Beta and is enabled by default. (#101767, @damemi)

    • The NamespaceDefaultLabelName is promoted to GA in this release. All Namespace API objects have a kubernetes.io/metadata.name label matching their metadata.name field to allow selecting any namespace by its name using a label selector. (#101342, @rosenhouse)

    • The ServiceInternalTrafficPolicy feature graduates to Beta and is enabled by default, which enables the internalTrafficPolicy field of Service by default. (#103462, @andrewsykim)

    • The ServiceLBNodePortControl graduates to Beta and is enabled by default. (#100412, @hanlins)

    • The SetHostnameAsFQDN graduates to GA and thus will be unconditionally enabled. (#101294, @javidiaz)

    • The WarningHeader feature is now GA and is unconditionally enabled. The apiserver_requested_deprecated_apis metric has graduated to stable status. The WarningHeader feature-gate is no longer operative and will be removed in v1.24. (#100754, @liggitt) [SIG API Machinery, Instrumentation and Testing]

    • The kubectl debug is able to create ephemeral containers in pre-1.22 clusters with the EphemeralContainers feature enabled. Note that versions of kubectl prior to 1.22 are unable to create ephemeral containers in clusters version 1.22 and greater due to an API change. (#103292, @verb)

    • The client-go credential plugins are now GA and are enabled by default. (#102890, @ankeesler)

    • The feature gate SSA graduated to GA in v1.22 and therefore is unconditionally enabled. (#100139, @Jefftree)

    • The job controller removes running pods when the number of completions is achieved. (#99963, @alculquicondor)

    • The kubeconfig is now exposed in the kube-scheduler framework handle. Out-of-tree plugins can leverage that to build CRD informers easily. (#100644, @Huang-Wei)

    • The new flag --chunk-size=SIZE for kubectl drain has been promoted to beta, and enabled by default. This flag may be used to alter the number of items or disable this feature when 0 is passed. (#100148, @KnVerey)

    • The new flag --chunk-size=SIZE has been added to kubectl describe. This flag may be used to alter the number of items or disable this feature when 0 is passed. (#101171, @KnVerey)

    • The pod resource API will provide memory manager metrics in the case when the memory manager feature gate is enabled, and the memory manager policy is static. (#101030, @cynepco3hahue)

    • The prefer nominated node feature graduates to Beta and is enabled by default. (#102201, @chendave)

    • Update etcd version to 3.5.0-beta.3. (#102062, @serathius)

    • Update the Debian images to pick up CVE fixes in the base images:

      • Update the debian-base image to v1.7.0
      • Update the debian-iptables image to v1.6.1 (#102302, @xmudrii)
    • Update the setcap image to buster-v2.0.1. (#102377, @xmudrii)

    • Update the system-validators library to v1.5.0. Includes validation for seccomp and fixes a stdout/stderr problem in the Docker validator. (#103390, @ironyman)

    • Updates the following images to pick up CVE fixes:

      • debian to v1.8.0
      • debian-iptables to v1.6.5
      • setcap to v2.0.3 (#103235, @thejoycekung) [SIG API Machinery, Release and Testing]
    • Warnings for the use of deprecated and known-bad values in pod specs are now sent. (#101688, @liggitt)

    • Watch requests are now throttled by the priority and fairness filter in kube-apiserver. (#102171, @wojtek-t)

    • You can use this Builder function to create events Field Selector (#101817, @cndoit18) [SIG API Machinery and Scalability]

    • Scheduler now registers event handlers dynamically. (#101394, @Huang-Wei)

    • kubectl: Enable using protocol buffers to request Metrics API. (#102039, @serathius)
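
    The graceful-termination note above for Services with externalTrafficPolicy: Local describes a fallback rule: prefer Ready endpoints on the node, and only when there are none, send traffic to Terminating endpoints on the node instead of dropping it. The following Go sketch illustrates that rule only. The Endpoint type and the selectLocalEndpoints function are hypothetical names invented for this example and are not kube-proxy's actual types or code.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified stand-in for the per-endpoint
// state a proxy tracks. It is NOT the real kube-proxy type.
type Endpoint struct {
	IP          string
	Local       bool // endpoint runs on this node
	Ready       bool
	Terminating bool
}

// selectLocalEndpoints returns the endpoints on this node that should
// receive traffic: Ready ones if any exist, otherwise Terminating ones.
func selectLocalEndpoints(eps []Endpoint) []Endpoint {
	var ready, terminating []Endpoint
	for _, ep := range eps {
		if !ep.Local {
			continue // externalTrafficPolicy: Local ignores remote endpoints
		}
		switch {
		case ep.Ready:
			ready = append(ready, ep)
		case ep.Terminating:
			terminating = append(terminating, ep)
		}
	}
	if len(ready) > 0 {
		return ready
	}
	// No Ready endpoint on this node: fall back to Terminating ones.
	return terminating
}

func main() {
	eps := []Endpoint{
		{IP: "10.0.0.1", Local: true, Terminating: true},
		{IP: "10.0.0.2", Local: false, Ready: true},
	}
	for _, ep := range selectLocalEndpoints(eps) {
		fmt.Println(ep.IP)
	}
}
```

    With the sample data, the only local endpoint is Terminating, so it is selected rather than the traffic being dropped.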

    Documentation

    • The command kubectl debug will now print a warning message when using the --target option since many container runtimes do not support this yet. (#101074, @verb)

    Failing Test

    • Fixes hostpath storage e2e tests within SELinux enabled env (#105786, @Elbehery) [SIG Testing]
    • Fixed generic ephemeral volumes with OwnerReferencesPermissionEnforcement admission plugin enabled. (#101186, @jsafrane)
    • Fixes kubectl drain --dry-run=server. (#100206, @KnVerey)
    • Fixes an overly restrictive conformance test to accept service account tokens signed by an ECDSA key (#100680, @smira) [SIG Architecture, Auth and Testing]
    • Fixes the should receive events on concurrent watches in same order conformance test to work properly on clusters that auto-create additional configmaps in namespaces. (#101950, @liggitt)
    • Resolves an issue with the “ServiceAccountIssuerDiscovery should support OIDC discovery” conformance test failing on clusters which are configured with issuers outside the cluster (#101589, @mtaufen) [SIG Auth and Testing]

    Other (Cleanup or Flake)

    • Updates konnectivity-network-proxy to v0.0.27. This includes a memory leak fix for the network proxy (#107187, @rata) [SIG API Machinery, Auth and Cloud Provider]

    Bug or Regression

    • An inefficient lock in EndpointSlice controller metrics cache has been reworked. Network programming latency may be significantly reduced in certain scenarios, especially in clusters with a large number of Services. (#107168, @robscott) [SIG Apps and Network]

    • Client-go: fix that paged list calls with ResourceVersionMatch set would fail once paging kicked in. (#107335, @fasaxc) [SIG API Machinery]

    • Fix a panic when using invalid output format in kubectl create secret command (#107346, @rikatz) [SIG CLI]

    • Fix: azuredisk parameter lowercase translation issue (#107429, @andyzhangx) [SIG Cloud Provider and Storage]

    • Fixes a rare race condition handling requests that timeout (#107459, @liggitt) [SIG API Machinery]

    • Mount-utils: Detect potential stale file handle (#107039, @andyzhangx) [SIG Storage]

    • A pod that the Kubelet rejects was still considered as being accepted for a brief period of time after rejection, which might cause some pods to be rejected briefly that could fit on the node. A pod that is still terminating (but has status indicating it has failed) may also still be consuming resources and so should also be considered. (#104918, @ehashman) [SIG Node]

    • Fix: skip instance not found when decoupling vmss from lb (#105836, @nilo19) [SIG Cloud Provider]

    • Kubeadm: allow the “certs check-expiration” command to not require the existence of the cluster CA key (ca.key file) when checking the expiration of managed certificates in kubeconfig files. (#106930, @neolit123) [SIG Cluster Lifecycle]

    • Kubeadm: during execution of the “check expiration” command, treat the etcd CA as external if there is a missing etcd CA key file (etcd/ca.key) and perform the proper validation on certificates signed by the etcd CA. Additionally, make sure that the CA for all entries in the output table is included - for both certificates on disk and in kubeconfig files. (#106925, @neolit123) [SIG Cluster Lifecycle]

    • Respect grace period when updating static pods. (#106394, @gjkim42) [SIG Node and Testing]

    • Reverts graceful node shutdown to match 1.21 behavior of setting pods that have not yet successfully completed to “Failed” phase if the GracefulNodeShutdown feature is enabled in kubelet. The GracefulNodeShutdown feature is beta and must be explicitly configured via kubelet config to be enabled in 1.21+. This changes 1.22 and 1.23 behavior on node shutdown to match 1.21. If you do not want pods to be marked terminated on node shutdown in 1.22 and 1.23, disable the GracefulNodeShutdown feature. (#106899, @bobbypage) [SIG Node]

    • Scheduler’s assumed pods have 2min instead of 30s to receive nodeName pod updates (#106633, @ahg-g) [SIG Scheduling]

    • EndpointSlice Mirroring controller now cleans up managed EndpointSlices when a Service selector is added (#106132, @robscott) [SIG Apps, Network and Testing]

    • Fix a bug that --disabled-metrics doesn’t function well. (#105793, @Huang-Wei) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Fix a panic in kubectl when creating secrets with an improper output type (#106356, @lauchokyip) [SIG CLI]

    • Fix concurrent map access causing panics when logging timed-out API calls. (#106112, @marseel) [SIG API Machinery]

    • Fix kube-proxy regression on UDP services because the logic to detect stale connections was not considering if the endpoint was ready. (#106239, @aojea) [SIG Network and Testing]

    • Fix scoring for NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. (#106081, @ahmad-diaa) [SIG Scheduling]

    • Support more than 100 disk mounts on Windows (#105673, @andyzhangx) [SIG Storage and Windows]

    • The --leader-elect* CLI args are now honored correctly in scheduler. (#106130, @Huang-Wei) [SIG Scheduling]

    • The kube-proxy sync_proxy_rules_iptables_total metric now gives the correct number of rules, rather than being off by one.

      Fixed multiple iptables proxy regressions introduced in 1.22:

      • When using Services with SessionAffinity, client affinity for an endpoint now gets broken when that endpoint becomes non-ready (rather than continuing until the endpoint is fully deleted).

      • Traffic to a service IP now starts getting rejected (as opposed to merely dropped) as soon as there are no longer any usable endpoints, rather than waiting until all of the terminating endpoints have terminated even when those terminating endpoints were not being used.

      • Chains for endpoints that won’t be used are no longer output to iptables, saving a bit of memory/time/cpu. (#106373, @aojea) [SIG Network]

    • Watch requests that are delegated to aggregated apiservers no longer reserve concurrency units (seats) in the API Priority and Fairness dispatcher for their entire duration. (#105827, @benluddy) [SIG API Machinery]

    • Fix Job tracking with finalizers for more than 500 pods, ensuring all finalizers are removed before counting the Pod. (#104876, @alculquicondor) [SIG Apps]

    • Fix: skip case sensitivity when checking Azure NSG rules. Fix: ensure InstanceShutdownByProviderID returns false for creating Azure VMs. (#104446, @feiskyer) [SIG Cloud Provider]

    • Fixed occasional pod cgroup freeze when using cgroup v1 and systemd driver. (#104529, @kolyshkin) [SIG Node]

    • Fixes a regression that could cause panics in LRU caches in controller-manager, kubelet, kube-apiserver, or client-go EventSourceObjectSpamFilter (#104469, @liggitt) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation and Storage]

    • When using kubectl replace (or the equivalent API call) on a Service, the caller no longer needs to do a read-modify-write cycle to fetch the allocated values for .spec.clusterIP and .spec.ports[].nodePort. Instead the API server will automatically carry these forward from the original object when the new object does not specify them. (#104672, @thockin) [SIG Network]

    • Fix kube-apiserver metric reporting for the deprecated watch path of /api//watch/… (#104188, @wojtek-t) [SIG API Machinery and Instrumentation]

    • Kube-proxy: delete stale conntrack UDP entries for loadbalancer ingress IP. (#104009, @aojea) [SIG Network]

    • Pass additional flags to subpath mount to avoid flakes in certain conditions (#104346, @mauriciopoppe) [SIG Storage]

    • Added jitter factor to lease controller that better smears load on kube-apiserver over time. (#101652, @marseel) [SIG API Machinery and Scalability]

    • Added privileges for EndpointSlice to the default view & edit RBAC roles. (#101203, @mtougeron)

    • After DBus restarts, make GracefulNodeShutdown work again (#100369, @wzshiming)

    • Aggregate errors when putting vmss. (#98350, @nilo19)

    • Aggregate write permissions on events to users with edit and admin role. (#102858, @tumido)

    • Aggregated roles no longer include write access to EndpointSlices. This rolls back part of a change that was introduced earlier in the Kubernetes 1.22 cycle. (#103703, @robscott)

    • Applying fix for not deleting existing public IP when a service is deleted in Azure. (#100694, @nilo19)

    • Applying fix for not tagging static public IP. (#101752, @nilo19)

    • Applying fix so that deleting non-existing disk returns success. (#102083, @andyzhangx)

    • Applying fix: cleanup outdated routes. (#102935, @nilo19)

    • Avoid caching the Azure VMSS instances whose network profile is nil (#100948, @feiskyer) [SIG Cloud Provider]

    • Azure: Avoid setting cached Sku when updating VMSS and VMSS instances. (#102005, @feiskyer)

    • Azurefile: Normalize share name to not include the capital letters (#100731, @kassarl)

    • Chain the field manager creation calls in newDefaultFieldManager to be explicit about the order of operations. (#101076, @kevindelgado)

    • Disruption controller shouldn’t error while syncing for unmanaged pods. (#103414, @ravisantoshgudimetla) [SIG Apps and Testing]

    • Ensure service is deleted when the Azure resource group has been deleted. (#100944, @feiskyer)

    • Ensures ExecProbeTimeout=false kubelet feature gate with dockershim is taken into account, when the exec probe takes longer than timeoutSeconds configuration. (#100200, @jackfrancis)

    • Expose rest_client_rate_limiter_duration_seconds metric to component-base to track client side rate limiter latency in seconds. Broken down by verb and URL. (#100311, @IonutBajescu) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Fire an event when failing to open NodePort. (#100599, @masap)

    • Fix Azure node public IP fetching issues from instance metadata service when the node is part of standard load balancer backend pool. (#100690, @feiskyer) [SIG Cloud Provider]

    • Fix EndpointSlice describe panic when an Endpoint doesn’t have zone. (#101025, @tnqn)

    • Fix kubectl set env or resources not working for initcontainers. (#101669, @carlory)

    • Fix kubectl alpha debug node not working on tainted (NoExecute) nodes by tolerating everything. (#98431, @wawa0210)

    • Fix a bug on the endpointslicemirroring controller where endpoint NotReadyAddresses were mirrored as Ready to the corresponding EndpointSlice. (#102683, @aojea)

    • Fix a bug that a preemptor pod may exist as a phantom in the scheduler. (#102498, @Huang-Wei)

    • Fix a number of race conditions in the kubelet when pods are starting up or shutting down that might cause pods to take a long time to shut down. (#102344, @smarterclayton) [SIG Apps, Node, Storage and Testing]

    • Fix an issue with kubectl on certain older versions of Windows, or when legacy console mode is enabled on Windows 8, which causes kubectl exec to crash. (#102825, @n4j)

    • Fix availability set cache in vmss cache (#100110, @CecileRobertMichon) [SIG Cloud Provider]

    • Fix how nulls are handled in array and objects in json patches. (#102467, @pacoxu)

    • Fix panic when kubectl create ingress has annotation flag and an empty value set. (#101377, @rikatz)

    • Fix performance regression for update and apply operations on large CRDs. (#103318, @jpbetz) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation and Storage]

    • Fix raw block mode CSI NodePublishVolume stage miss pod info. (#99069, @phantooom)

    • Fix resource enforcement when using systemd cgroup driver (#102147, @kolyshkin)

    • Fix rounding of volume storage requests. (#100100, @maxlaverse)

    • Fix runtime container status for PostStart hook error. (#100608, @pacoxu)

    • Fix scoring for NodeResourcesMostAllocated and NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. This was leading to under-utilization of small nodes. (#102925, @alculquicondor)

    • Fix code that leaked defaulting between unrelated pod instances. (#103284, @kebe7jun) [SIG CLI]

    • Fix winkernel kube-proxy to only use dual stack when host and networking supports it (#101047, @jsturtevant) [SIG Network and Windows]

    • Fix: Azure file inline volume namespace issue in CSI migration translation (#101235, @andyzhangx)

    • Fix: Bug in kube-proxy latency metrics to calculate only the latency value for the Endpoints that are created after it starts running. This is needed because all the Endpoints objects are processed on restarts, independently of when they were created. (#100861, @aojea)

    • Fix: avoid nil-pointer panic when checking the frontend IP configuration (#101739, @nilo19) [SIG Cloud Provider]

    • Fix: display of Job completion mode in kubectl describe. (#101160, @alculquicondor)

    • Fix: return empty VMAS name if using standalone VM (#103470, @nilo19) [SIG Cloud Provider]

    • Fix: set “host is down” as corrupted mount. When the SMB server is down, there is no way to terminate a pod that is using the SMB mount; the attempt returns an error. (#101398, @andyzhangx)

    • Fix: using NVMe AWS EBS volumes partitions. (#100500, @jsafrane)

    • Fixed ‘kubelet’ runtime panic for timed-out portforward streams. (#102489, @saschagrunert)

    • Fixed SELinux relabeling of CSI volumes after CSI driver failure. (#103154, @jsafrane) [SIG Node and Storage]

    • Fixed garbage collection of dangling VolumeAttachments for PersistentVolumes migrated to CSI on startup of kube-controller-manager. (#102176, @timebertt)

    • Fixed port-forward memory leak for long-running and heavily used connections. (#99839, @saschagrunert)

    • Fixed a bug due to which the controller was not populating the lastSuccessfulTime field added to cronjob.status in batch/v1. (#102642, @alaypatel07)

    • Fixed a bug that kubectl create configmap always returns zero exit code when failed. (#101780, @nak3) [SIG CLI]

    • Fixed a bug that scheduler extenders are not called on preemptions. (#103019, @ordovicia)

    • Fixed a bug where startupProbe stopped working after a container’s first restart. (#101093, @wzshiming)

    • Fixed an issue that blocked azure auth from prompting the device code authentication flow when the refresh token expires. (#102063, @tdihp)

    • Fixed false-positive uncertain volume attachments, which led to unexpected detachment of CSI migrated volumes (#101737, @Jiawei0227) [SIG Apps and Storage]

    • Fixed mounting of NFS volumes when IPv6 address is used as a server. (#101067, @Elbehery) [SIG Storage]

    • Fixed starting new pods after previous pod timed out unmounting its volumes. (#100183, @jsafrane)

    • Fixed very rare volume corruption when a pod is deleted while kubelet is offline. (#102059, @jsafrane)

    • Fixes a data race issue in the priority and fairness API server filter. (#100638, @tkashem)

    • Fixes issue with websocket-based watches of Service objects not closing correctly on timeout. (#102539, @liggitt)

    • For kubeadm: support for custom imagetags for etcd images which contain build metadata, when imagetags are in the form of version_metadata. For instance, if the etcd version is v3.4.13+patch.0, the supported imagetag would be v3.4.13_patch.0 (#100350, @jr0d)

    • For vSphere: fix regression during attach disk if datastore is within a storage folder or datastore cluster. (#102892, @gnufied)

    • GCE Windows clusters have their TCP/IP parameters set to GCE’s recommended values. (#103057, @jeremyje) [SIG Cloud Provider and Windows]

    • GCE Windows will no longer install Docker on containerd nodes. (#101747, @jeremyje) [SIG Cloud Provider and Windows]

    • Generated OpenAPI now correctly specifies 201 as a possible response code for PATCH operations. (#100141, @brendandburns)

    • Graceful termination will now be honored when deleting a collection of pods. (#100101, @deads2k)

    • If kube-proxy mode is userspace do not enable EndpointSlices. (#100913, @JornShen)

    • Kubeadm: allow passing the flag --log-file if --config is passed. If you wish to log to a file you must also pass --logtostderr=false or --alsologtostderr=true. Alternatively you can pipe to a file using “kubeadm … | tee …”. (#101449, @CaoDonghui123)

    • Kubeadm: enable --experimental-patches flag for kubeadm join phase control-plane-join all command. (#101110, @SataQiu)

    • Kubeadm: fix a bug where kubeadm join for control plane nodes would download certificates and keys from the cluster, but would not write publicly readable certificates and public keys with mode 0644 and instead use mode 0600. (#103313, @neolit123)

    • Kubeadm: fix the bug that kubeadm only uses the first hash in caCertHashes to verify the root CA. (#101977, @SataQiu)

    • Kubeadm: remove the “ephemeral_storage” request from the etcd static pod that kubeadm deploys on stacked etcd control plane nodes. This request has caused sporadic failures on some setups due to a problem in the kubelet with cadvisor and the LocalStorageCapacityIsolation feature gate. See this issue for more details: https://github.com/kubernetes/kubernetes/issues/99305 (#102673, @jackfrancis) [SIG Cluster Lifecycle]

    • Kubeadm: when using a custom image repository for CoreDNS kubeadm now will append the coredns image name instead of coredns/coredns, thus restoring the behaviour existing before the v1.21 release. Users who rely on nested folder for the coredns image should set the clusterConfiguration.dns.imageRepository value including the nested path name (e.g using registry.company.xyz/coredns will force kubeadm to use registry.company.xyz/coredns/coredns image). No action is needed if using the default registry (k8s.gcr.io). (#102502, @ykakarap)

    • Kubelet: improve the performance when waiting for a synchronization of the node list with the kube-apiserver. (#99336, @neolit123)

    • Kubelet: the returned value for PodIPs is the same in the Downward API and in the pod.status.PodIPs field (#103307, @aojea)

    • Limit vSphere volume name to 63 characters long. (#100404, @gnufied)

    • Logging for GCE Windows clusters will be more accurate and complete when using Fluent bit. (#101271, @jeremyje)

    • Metrics Server will use Addon Manager 1.8.3 (#103541, @jbartosik) [SIG Cloud Provider and Instrumentation]

    • Output for kubectl describe podsecuritypolicy is now kind specific and cleaner (#101436, @KnVerey)

    • Parsing of cpuset information now properly detects more invalid input such as 1--3 or 10-6. (#100565, @lack)

    • Pods that are known to the kubelet to have previously been Running should not revert to Pending state; the kubelet will now infer a termination. (#102821, @ehashman)

    • Prevent Kubelet stuck in DiskPressure when imagefs.minReclaim is set (#99095, @maxlaverse)

    • Reduces initialization delay for the docker runtime on non-AWS platforms. (#93260, @nckturner) [SIG Cloud Provider]

    • Register/Deregister Targets in chunks for AWS TargetGroup (#101592, @M00nF1sh) [SIG Cloud Provider]

    • Removed /sbin/apparmor_parser requirement for the AppArmor host validation. This allows using AppArmor on distributions which ship the binary in a different path. (#97968, @saschagrunert) [SIG Node and Testing]

    • Renames the timeout field for the DelegatingAuthenticationOptions to TokenRequestTimeout and sets the timeout only for the token review client. Previously the timeout was also applied to watches, making them reconnect every 10 seconds. (#100959, @p0lyn0mial)

    • Reorganized iptables rules to reduce rules in KUBE-SERVICES and KUBE-NODEPORTS. (#96959, @tssurya)

    • Respect annotation size limit for server-side apply updates to the client-side apply annotation. Also, fix opt-out of this behavior by setting the client-side apply annotation to the empty string. (#102105, @julianvmodesto) [SIG API Machinery]

    • Retry FibreChannel devices cleanup after error to ensure FibreChannel device is detached before it can be used on another node. (#101862, @jsafrane)

    • Support correct sorting for cpu, memory, storage, ephemeral-storage, hugepages, and attachable-volumes. (#100435, @lauchokyip)

    • Switch scheduler to generate the merge patch on pod status instead of the full pod (#103133, @marwanad) [SIG Scheduling]

    • The EndpointSlice IP validation now matches Endpoints IP validation. (#101084, @robscott)

    • The kube-apiserver now reports the synthetic verb when logging requests, better explaining the user intent and matching what is reported in the metrics. (#102934, @lavalamp)

    • The kube-controller-manager sets the upper-bound timeout limit for outgoing requests to 70s. Previously (#99358, @p0lyn0mial)

    • The kube-proxy log now shows the “Skipping topology aware endpoint filtering since no hints were provided for zone” warning under the right conditions. (#101857, @dervoeti)

    • The kubectl create service now respects the namespace flag. (#101005, @zxh326)

    • The kubectl get now truncates multi-line strings to avoid breaking printing (#103514, @soltysh)

    • The kubectl wait --for=delete command now ignores the not found error correctly. (#96702, @lingsamuel)

    • The kubelet now distinguishes log messages about certificate rotation for its client cert and server cert, making it easier to debug problems with one or the other. (#101252, @smarterclayton)

    • The serviceOwnsFrontendIP shouldn’t report error when the public IP doesn’t match. (#102516, @nilo19)

    • The system:aggregate-to-edit role no longer includes write access to the Endpoints API. In newly created Kubernetes 1.22 clusters, the edit and admin roles will no longer include that access. This will have no effect on existing clusters upgrading to Kubernetes 1.22. To retain write access to Endpoints in the aggregated edit and admin roles for newly created 1.22 clusters, refer to https://github.com/kubernetes/website/pull/29025. (#103704, @robscott) [SIG Auth and Network]

    • The conformance tests:

      • Services should serve multiport endpoints from pods
      • Services should serve a basic endpoint from pods

      were only validating the API objects, not performing any validation on the actual Services implementation. Those tests now validate that the Services under test are able to forward traffic to the endpoints. (#101709, @aojea) [SIG Network and Testing]
    • Changes the behavior of Services that have IPFamilyPolicy set to PreferDualstack. The current behavior when the cluster is upgraded to dual-stack is:

      • Services that have been set to IPFamilyPolicy = PreferDualstack will be upgraded when the service object is updated, e.g. when a user changes a label.

      This behavior will change to:

      • Services that have been set to IPFamilyPolicy = PreferDualstack will not be upgraded when the service object is updated. Users can still change policy, type, etc., and existing behaviors remain the same. (#102898, @khenidak) [SIG Network and Testing]
    • The reason and message fields for pod status are no longer reset unless the phase also changes. (#103785, @smarterclayton) [SIG Node]

    • Treat VSphere “File (vmdk path here) was not found” errors as success during volume deletion (#92372, @breunigs) [SIG Cloud Provider and Storage]

    • Update kube-proxy base image debian-iptables to v1.6.2 to pick up the following change: “debian-iptables: select nft mode if nft lines > legacy lines, matching iptables-wrappers”. (#102590, @BenTheElder)

    • Update klog v2.9.0. (#102332, @pacoxu)

    • Updated the Graceful Node Shutdown Pod termination reason and message. Updated the Graceful Node Shutdown Pod rejection reason and message. (#102840, @Kissy)

    • Updates dependency sigs.k8s.io/structured-merge-diff to v4.1.1. (#100784, @kevindelgado)

    • Updates hostprocess tests to specify user. (#102965, @jsturtevant)

    • Upgrades functionality of kubectl kustomize as described at https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv4.2.0 (#103419, @natasha41575) [SIG CLI]

    • Upgrades functionality of kubectl kustomize as described at kustomize/v4.1.2 (#101120, @monopole)

    • Upgrading etcd: kubeadm upgrade etcd to 3.4.13-3 (#100612, @pacoxu)

    • Use default timeout of 10s for Azure ACR credential provider. (#100686, @hasheddan) [SIG Cloud Provider]

    • We no longer allow the cluster operator to delete any suggested priority & fairness bootstrap configuration object. If a cluster operator removes a suggested configuration, it will be restored by the apiserver. (#102067, @tkashem)

    • When DisableAcceleratorUsageMetrics is set, do not collect accelerator metrics using cAdvisor. (#101712, @SergeyKanzhelev) [SIG Instrumentation and Node]

    • YAML document separators ("---") can now be followed by whitespace and comments ("# ….") on the same line. This fixes a bug where documents starting with a comment after the separator were ignored. Other types of content on the same line will result in an error. (#103457, @codearky) [SIG API Machinery]

    • oc describe quota now reports used values in the same unit format as hard values (#102177, @atiratree) [SIG CLI]
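
    The YAML separator rule from the list above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the apimachinery code; the function name split_yaml_documents is made up for this example.

```python
def split_yaml_documents(stream: str) -> list[str]:
    """Split a YAML stream into documents on '---' separator lines.

    Hypothetical helper: a separator may be followed only by whitespace
    and an optional '#' comment; anything else on that line is an error.
    """
    documents: list[str] = []
    current: list[str] = []
    for line in stream.splitlines():
        if line.startswith("---"):
            rest = line[3:].strip()
            if rest and not rest.startswith("#"):
                raise ValueError(f"invalid content after separator: {line!r}")
            # Close the previous document if it has any non-blank content.
            if any(part.strip() for part in current):
                documents.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):
        documents.append("\n".join(current))
    return documents
```

    With this sketch, a stream whose first line is "--- # first document" keeps the document that follows it, which is the case the bug fix addresses.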

    Other (Cleanup or Flake)

    • Kube-apiserver: sets an upper-bound on the lifetime of idle keep-alive connections and time to read the headers of incoming requests (#103958, @liggitt) [SIG API Machinery and Node]
    • After the deprecation period, the kubelet’s --chaos-chance flag has been removed. (#101057, @wangyysde) [SIG Node]
    • Allow CSI drivers to just run offline expansion tests. (#102665, @gnufied)
    • Changed buildmode of non static Kubernetes binaries to produce position independent executables (PIE). (#102323, @saschagrunert)
    • Clarified the description of a test in the e2e suite that mentions “SCTP” but is actually intended to be testing the behavior of network plugins that don’t implement SCTP. (#102509, @danwinship)
    • Client-go: reduce verbosity of Starting/Stopping reflector messages to 3 again. (#102788, @pohly)
    • Disable log sampling when using json logging format. (#102620, @serathius)
    • Exposes WithCustomRoundTripper method for specifying a middleware function for custom HTTP behaviour for the delegated auth clients. (#99775, @p0lyn0mial)
    • Fake clients now implement a FakeClient interface (#100940, @markusthoemmes) [SIG API Machinery and Instrumentation]
    • Feature gate ServiceLoadBalancerClass graduates to Beta and is enabled by default. (#103129, @XudongLiuHarold)
    • Improve func ToSelectableFields’ performance for event. (#102461, @goodluckbot)
    • Increased CSINodeIDMaxLength from 128 bytes to 192 bytes. Prepare to increase the length limit to 256 bytes in 1.23 release. (#101256, @Jiawei0227)
    • JSON logging now supports having information about source code location in the logging format, source code information is available under the key “caller”. (#102437, @MadhavJivrajani)
    • Kubeadm: move the BootstrapToken* API and related utilities from v1beta3 to a separate API group/version - bootstraptoken/v1. (#102964, @neolit123) [SIG Cluster Lifecycle]
    • Kubeadm: the CriticalAddonsOnly toleration has been removed from kube-proxy DaemonSet (#101966, @SataQiu) [SIG Cluster Lifecycle]
    • Metrics Server updated to use 0.4.4 image that doesn’t depend on deprecated authorization.k8s.io/v1beta1 subjectaccessreviews API version. (#101477, @x13n)
    • Migrate proxy/ipvs/proxier.go logs to structured logging. (#97796, @JornShen)
    • Migrate staging/src/k8s.io/apiserver/pkg/registry logs to structured logging. (#98287, @lala123912)
    • Migrate some log messages to structured logging in pkg/volume/plugins.go. (#101510, @huchengze)
    • Migrate some log messages to structured logging in pkg/volume/volume_linux.go. (#99566, @huchengze)
    • Official binaries now include the golang generated build ID buildid instead of an empty string. (#101411, @saschagrunert)
    • Remove balanced attached node volumes feature. (#102443, @ravisantoshgudimetla)
    • Remove deprecated --generator flag from kubectl autoscale. (#99900, @MadhavJivrajani)
    • Remove the deprecated flag --generator from kubectl create deployment command. (#99915, @BLasan)
    • Remove the duplicate package import. (#101187, @chuntaochen)
    • Replace go-bindata with //go:embed. (#99829, @palnabarun)
    • The DynamicFakeClient now exposes its tracker via a Tracker() function. (#100085, @markusthoemmes)
    • The VolumeSnapshotDataSource feature gate, which has been GA since v1.20, is unconditionally enabled and can no longer be specified via the --feature-gates argument. (#101531, @ialidzhikov) [SIG Storage]
    • The deprecated CRIContainerLogRotation feature-gate has been removed, since the CRIContainerLogRotation feature graduated to GA in 1.21 and was unconditionally enabled. (#101578, @carlory)
    • The deprecated RootCAConfigMap feature-gate has been removed, since the RootCAConfigMap feature graduated to GA in 1.21 and is unconditionally enabled. (#101579, @carlory)
    • The deprecated runAsGroup feature-gate has been removed, since the runAsGroup feature graduated to GA in 1.21. (#101581, @carlory)
    • The etcd client has been updated to 3.5.0; github.com/golang/protobuf, google.golang.org/protobuf, and google.golang.org/grpc have been updated to current versions. (#100488, @liggitt)
    • Update Azure Go SDK to v55.0.0. (#102441, @feiskyer)
    • Update Azure Go SDK version to v53.1.0 (#101357, @feiskyer) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle and Instrumentation]
    • Update CNI plugins to v0.9.1. (#102328, @lentzi90)
    • Update Calico to v3.19.1. (#102386, @JornShen)
    • Update cri-tools dependency to v1.21.0. (#100956, @saschagrunert)
    • Update dependencies google/gnostic and google/go-cmp to v0.5.5, and update the transitive protobuf dependencies. (#102783, @mcbenjemaa)
    • Update golang.org/x/net to v0.0.0-20210520170846-37e1c6afe023 (#103176, @CaoDonghui123) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Node and Storage]
    • Updated command descriptions and examples for grammar and punctuation consistency. (#103524, @bergerhoffer) [SIG Auth and CLI]
    • Updated pause image to version 3.5, which now runs by default as pseudo user and group 65535:65535. This does not have any effect on remote container runtimes like CRI-O and containerd, which set up the pod sandbox user and group on their own. (#100292, @saschagrunert)
    • Upgrade functionality of kubectl kustomize as described at kustomize/v4.1.3. (#102193, @gautierdelorme)

    Dependencies

    Added

    • github.com/antihax/optional: v1.0.0
    • github.com/benbjohnson/clock: v1.0.3
    • github.com/bits-and-blooms/bitset: v1.2.0
    • github.com/certifi/gocertifi: 2c3bb06
    • github.com/checkpoint-restore/go-criu/v5: v5.0.0
    • github.com/cncf/udpa/go: 5459f2c
    • github.com/cockroachdb/errors: v1.2.4
    • github.com/cockroachdb/logtags: eb05cc2
    • github.com/coredns/caddy: v1.1.0
    • github.com/felixge/httpsnoop: v1.0.1
    • github.com/frankban/quicktest: v1.11.3
    • github.com/getsentry/raven-go: v0.2.0
    • github.com/go-kit/log: v0.1.0
    • github.com/gofrs/uuid: v4.0.0+incompatible
    • github.com/josharian/intern: v1.0.0
    • github.com/jpillora/backoff: v1.0.0
    • github.com/nxadm/tail: v1.4.4
    • github.com/opentracing/opentracing-go: v1.1.0
    • github.com/robfig/cron/v3: v3.0.1
    • github.com/stoewer/go-strcase: v1.2.0
    • go.etcd.io/etcd/api/v3: v3.5.0
    • go.etcd.io/etcd/client/pkg/v3: v3.5.0
    • go.etcd.io/etcd/client/v2: v2.305.0
    • go.etcd.io/etcd/client/v3: v3.5.0
    • go.etcd.io/etcd/pkg/v3: v3.5.0
    • go.etcd.io/etcd/raft/v3: v3.5.0
    • go.etcd.io/etcd/server/v3: v3.5.0
    • go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc: v0.20.0
    • go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp: v0.20.0
    • go.opentelemetry.io/contrib: v0.20.0
    • go.opentelemetry.io/otel/exporters/otlp: v0.20.0
    • go.opentelemetry.io/otel/metric: v0.20.0
    • go.opentelemetry.io/otel/oteltest: v0.20.0
    • go.opentelemetry.io/otel/sdk/export/metric: v0.20.0
    • go.opentelemetry.io/otel/sdk/metric: v0.20.0
    • go.opentelemetry.io/otel/sdk: v0.20.0
    • go.opentelemetry.io/otel/trace: v0.20.0
    • go.opentelemetry.io/otel: v0.20.0
    • go.opentelemetry.io/proto/otlp: v0.7.0
    • go.uber.org/goleak: v1.1.10

    Changed

    Removed

    • github.com/agnivade/levenshtein: v1.0.1
    • github.com/alecthomas/template: fb15b89
    • github.com/andreyvit/diff: c7f18ee
    • github.com/bifurcation/mint: 93c51c6
    • github.com/caddyserver/caddy: v1.0.3
    • github.com/cenkalti/backoff: v2.1.1+incompatible
    • github.com/checkpoint-restore/go-criu/v4: v4.1.0
    • github.com/cheekybits/genny: 9127e81
    • github.com/go-acme/lego: v2.5.0+incompatible
    • github.com/go-bindata/go-bindata: v3.1.1+incompatible
    • github.com/go-openapi/analysis: v0.19.5
    • github.com/go-openapi/errors: v0.19.2
    • github.com/go-openapi/loads: v0.19.4
    • github.com/go-openapi/runtime: v0.19.4
    • github.com/go-openapi/spec: v0.19.5
    • github.com/go-openapi/strfmt: v0.19.5
    • github.com/go-openapi/validate: v0.19.8
    • github.com/gobuffalo/here: v0.6.0
    • github.com/hpcloud/tail: v1.0.0
    • github.com/jimstudt/http-authentication: 3eca13d
    • github.com/klauspost/cpuid: v1.2.0
    • github.com/kr/logfmt: b84e30a
    • github.com/kylelemons/godebug: d65d576
    • github.com/lucas-clemente/aes12: cd47fb3
    • github.com/lucas-clemente/quic-clients: v0.1.0
    • github.com/lucas-clemente/quic-go-certificates: d2f8652
    • github.com/lucas-clemente/quic-go: v0.10.2
    • github.com/markbates/pkger: v0.17.1
    • github.com/marten-seemann/qtls: v0.2.3
    • github.com/mholt/certmagic: 6a42ef9
    • github.com/naoina/go-stringutil: v0.1.0
    • github.com/naoina/toml: v0.1.1
    • github.com/robfig/cron: v1.1.0
    • github.com/satori/go.uuid: v1.2.0
    • github.com/thecodeteam/goscaleio: v0.1.0
    • github.com/tidwall/pretty: v1.0.0
    • github.com/vektah/gqlparser: v1.1.2
    • github.com/willf/bitset: v1.1.11
    • go.etcd.io/etcd: dd1b699
    • go.mongodb.org/mongo-driver: v1.1.2
    • gopkg.in/cheggaaa/pb.v1: v1.0.25
    • gopkg.in/fsnotify.v1: v1.4.7
    • gopkg.in/mcuadros/go-syslog.v2: v2.2.1
    • gopkg.in/resty.v1: v1.12.0
    • k8s.io/heapster: v1.2.0-beta.1

    containerlinux 3033.2.2

    Breaking changes

    • CGroupsV2 are enabled by default. Applications might need to be updated if they don’t have support. There are several known issues:
      • Java applications must use JRE >= 15; Please see OpenJDK upstream issue for more details.
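
    To check which cgroups version a node or container actually runs with, a minimal Python sketch follows. It relies on the kernel convention that a cgroups v2 (unified) mount exposes a cgroup.controllers file at its root; the helper name uses_cgroup_v2 and the default mount path are assumptions for this example.

```python
from pathlib import Path


def uses_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """Return True if cgroup_root looks like a cgroups v2 unified hierarchy.

    On a v2 mount the kernel exposes a cgroup.controllers file at the root;
    v1 mounts have per-controller directories instead.
    """
    return (Path(cgroup_root) / "cgroup.controllers").is_file()


if __name__ == "__main__":
    print("cgroups v2" if uses_cgroup_v2() else "cgroups v1 (or hybrid)")
```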

    Security fixes

    Bug fixes

    • SDK: Fixed build error popping up in the new SDK Container because policycoreutils used the wrong ROOT to update the SELinux store (flatcar-linux/coreos-overlay#1502)
    • Fixed leak of SELinux policy store to the root filesystem top directory due to wrong store path in policycoreutils instead of /var/lib/selinux (flatcar-linux/Flatcar#596)
    • Ensured that the /run/xtables.lock coordination file exists for modifications of the xtables backend from containers (must be bind-mounted) or the iptables-legacy binaries on the host (flatcar-linux/init#57)
    • dev container: Fix github URL for coreos-overlay and portage-stable to use repos from flatcar-linux org directly instead of relying on redirects from the kinvolk org. This fixes checkouts with emerge-gitclone inside dev-container. (flatcar-linux/scripts#194)
    • arm64: the Polkit service does not crash anymore. (flatcar-linux/Flatcar#156)
    • toolbox: fixed support for multi-layered docker images (toolbox#5)
    • Run emergency.target on ignition/torcx service unit failure in dracut (bootengine#28)
    • Fix vim warnings on missing file, when built with USE="minimal" (portage-stable#260)
    • The Torcx profile docker-1.12-no got fixed to reference the current Docker version instead of 19.03 which wasn’t found on the image, causing Torcx to fail to provide Docker (PR#1456)
    • Use https protocol instead of git for Github URLs (flatcar-linux/coreos-overlay#1394)

    Changes

    • Backported elf support for iproute2 (flatcar-linux/coreos-overlay#1256)
    • Added GPIO support (coreos-overlay#1236)
    • Enabled SELinux in permissive mode on ARM64 (coreos-overlay#1245)
    • The iptables command uses the nftables kernel backend instead of the iptables backend; you can also migrate to the nft tool instead of iptables. Containers whose iptables binaries use the iptables backend will mix both kernel backends, which is supported, but you have to look up the rules separately (on the host you can use iptables-legacy and friends).
    • Added missing SELinux rule as initial step to resolve Torcx unpacking issue (coreos-overlay#1426)
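
    To see which backend a given iptables binary uses (for example inside a container image), the version string can be inspected. The sketch below assumes iptables 1.8 or later, which appends "(nf_tables)" or "(legacy)" to the output of iptables --version; the helper names are hypothetical.

```python
import re
import subprocess


def parse_iptables_backend(version_output: str) -> str:
    """Extract the backend name from `iptables --version` output.

    iptables 1.8+ prints e.g. "iptables v1.8.7 (nf_tables)" or
    "iptables v1.8.7 (legacy)"; older versions print no backend.
    """
    match = re.search(r"\((nf_tables|legacy)\)", version_output)
    return match.group(1) if match else "unknown"


def current_iptables_backend() -> str:
    """Run `iptables --version` locally and report the backend it uses."""
    result = subprocess.run(
        ["iptables", "--version"], capture_output=True, text=True, check=True
    )
    return parse_iptables_backend(result.stdout)
```

    A container reporting "legacy" on a host reporting "nf_tables" is the mixed-backend situation described above, where rules have to be looked up separately.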

    Updates

    calico 3.21.3

    BGP Improvements

    Users of BGP can now view the status of their BGP routers, including session status, RIB / FIB contents, and agent health, via the new CalicoNodeStatus API. See the API documentation for more details.

    In addition, you can control BGP advertisement of certain prefixes using the new disableBGPExport option on each IP pool, allowing greater control of your route sharing scheme.

    Pull requests:

    • Added Calico node status resource (CalicoNodeStatus) which represents a collection of status information for a node that Calico reports back to the user for use during troubleshooting. libcalico-go #1502 (@song-jiang)
    • Report node BGP status from calico/node. node #1234 (@song-jiang)
    • Add new syncer for BGP status API. typha #662 (@song-jiang)
    • Don’t export BGP routes for IP pools that have disableBGPExport==true confd #647 (@coutinhop)

    Service-based network policy improvements

    In v3.20, we introduced egress policy rules that can match on Kubernetes services. In v3.21, we improved upon that in two ways. First, you can now use service matches in Calico NetworkPolicy and GlobalNetworkPolicy ingress rules. Second, you can now use service-based network policy rules on Windows nodes.

    Pull requests:

    • Policy ingress rules now support service selectors. felix #3024 (@mgleung)
    • Windows data plane support for Service-based network policy rules felix #2917 (@caseydavenport)
    • Allow services to be specified in the Source field of Ingress rules libcalico-go #1517 (@mgleung)

    Option to run Calico as non-privileged and non-root

    Calico can now optionally run in non-privileged and non-root mode, with some limitations. See the documentation for more information.

    Pull requests:

    • Change node and supporting binary permissions so that they can be run as a non-root user node #1224 (@mgleung)
    • CNI plugin now sets route_localnet=1 for container interfaces cni-plugin #1168 (@mgleung)
    • CNI plugins now have SUID bit set in order to run as non-root cni-plugin #1168 (@mgleung)

    IPReservations API

    You can use the new IPReservations API to reserve certain IP addresses so that they will not be used by Calico IPAM. This allows for fine-grained control of the IP space in your cluster.
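
    The effect of a reservation can be illustrated with Python's ipaddress module. This is a conceptual sketch only: Calico IPAM allocates addresses in blocks and is not implemented this way, and the function name and example CIDRs are made up. It enumerates every address, so it is only suitable for small pools.

```python
import ipaddress


def allocatable_addresses(pool_cidr: str, reserved_cidrs: list[str]) -> list[str]:
    """List addresses in pool_cidr that fall outside every reserved CIDR.

    Illustrative only: shows which addresses remain usable once the
    reserved ranges are excluded from the pool.
    """
    pool = ipaddress.ip_network(pool_cidr)
    reserved = [ipaddress.ip_network(cidr) for cidr in reserved_cidrs]
    return [
        str(addr)
        for addr in pool
        if not any(addr in net for net in reserved)
    ]


if __name__ == "__main__":
    # Hypothetical pool with two reserved ranges.
    print(allocatable_addresses("10.0.0.0/29", ["10.0.0.0/30", "10.0.0.7/32"]))
```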

    Pull requests:

    • Add support for IPReservations libcalico-go #1509 (@fasaxc)

    Bug fixes

    • Fix a serious regression introduced in v3.21.0 where the datastore watcher could get stuck and report stale information in clusters with >500 policies/pods/etc. The bug was triggered by needing to do a resync (for example after an etcd compaction) when there were enough resources to trigger the list pager. calico #5332 (@robbrockbank)
    • Pass ExceptUpgradeService param to stop-calico.ps1 as well node #1372 (@lmm)
    • Restrict Typha server to FIPS compliant cipher suites. typha #696 (@caseydavenport)
    • Fix log spam from Calico upgrade service for Windows node #1343 (@song-jiang)
    • Increase timeout for setting NetworkUnavailable on shutdown node #1341 (@caseydavenport)
    • Fix potential panic and memory leak in kube-controllers caused by adding and subsequently deleting IPAM blocks kube-controllers #912 (@caseydavenport)
    • IPAM GC correctly handles multiple IP addresses allocated with the same handle ID. kube-controllers #903 (@caseydavenport)
    • Fix bug where invalid port structures were being sent to Felix, preventing pods with hostPorts specified from working. libcalico-go #1545 (@caseydavenport)
    • Downgrade repetitive info level logging in calico/node autodetection code node #1237 (@caseydavenport)
    • Updated ubi base images and CentOS repos to stop CVE false positives from being reported. node #1136 (@coutinhop)
    • Fixed typo in umount command pod2daemon #64 (@ScheererJ)
    • Fixes a bug that caused WireGuard stats to be collected even when WireGuard was disabled. Additionally, the version of the wgctrl dependency has been updated as the previous version caused thread leaks. felix #3057 (@mikestephen)
    • Fix blackhole route table interface matches to handle empty interface regexes. felix #3007 (@robbrockbank)
    • Fix slow performance when updating a Kubernetes namespace when there are many Pods (and in turn, slow startup performance when there are many namespaces). felix #2964 (@fasaxc)
    • Close race condition that could result in an extra IPAM block being allocated to a node. libcalico-go #1488 (@caseydavenport)
    • Fix that podIP annotation could be incorrectly clobbered for stateful set pods: https://github.com/projectcalico/calico/issues/4710 libcalico-go #1472 (@fasaxc)
    • Fix removal of old CNI configuration on name-change cni-plugin #1153 (@caseydavenport)
    • Readiness depends on all syncers typha #613 (@robbrockbank)
    • Exclude RR nodes from BGP full mesh confd #619 (@coutinhop)
    • Fixed a bug in ExternalTrafficPolicy=Local that led to connection stalling. felix #3015 (@tomastigera)
    • Fixed broken connections when a client used the same port to connect to the same backend via a nodeport on different nodes. felix #2983 (@tomastigera)
    • The eBPF mode implementation of DoNotTrack policy was incorrectly allowing an inbound connection through a HostEndpoint, when the HostEndpoint had DoNotTrack policy for the ingress direction but not for egress. For precise compatibility with Calico’s established DoNotTrack semantics, that connection should be disallowed, and now is. (Because of the lack of connection tracking, successful use of DoNotTrack policy to allow flows requires configuring the DoNotTrack policy symmetrically in both directions.) felix #2982 (@neiljerram)

    Other changes

    • Replace github.com/dgrijalva/jwt-go with active fork github.com/golang-jwt/jwt that resolves vulnerability flagged by scanners. libcalico-go #1554 (@lmm)
    • calico/node logs write to /var/log/calico within the container by default, in addition to stdout node #1133 (@song-jiang)
    • Read pod IP information from Amazon VPC CNI annotation, if present on the pod. libcalico-go #1523 (@caseydavenport)
    • Update etcd client version to v3.5.0 libcalico-go #1495 (@Aceralon)
    • Optimize lists and watches made against the Kubernetes API libcalico-go #1484 (@caseydavenport)
    • WorkloadEndpoints now support hostPorts libcalico-go #1471 (@AloysAugustin)
    • Include CNI plugin release v1.0.0 cni-plugin #1141 (@caseydavenport)
    • Allow configuration of num_queues for Calico created veth interfaces cni-plugin #1116 (@arikachen)
    • Typha now gives newly connected clients an extra grace period to catch up after sending the snapshot to reduce the possibility of cyclic disconnects. typha #614 (@fasaxc)
    • Add calico-node upgrade service for upgrades on Windows node #1254 (@lmm)
    • eBPF arm64/aarch64 node #1044 (@frozenprocess)
    • BPF: Endpoints in EndpointsSlices that are not ready are excluded from NAT felix #3017 (@tomastigera)
    • Calico’s eBPF dataplane now fully implements DoNotTrack policy felix #2910 (@neiljerram)
    • Add HostPort support in the gRPC dataplane cni-plugin #1119 (@AloysAugustin)

    app-operator 5.8.0

    Added

    • Support watching app CRs in organization namespace with cluster label selector.
    • Annotate App CRs after bootstrapping chart-operator to trigger reconciliation.
    • Add support for relative URLs in catalog indexes.

    Changed

    • Get tarball URL for chart CRs from index.yaml for better community app catalog support.

    Fixed

    • Embed Chart CRD in app-operator to prevent hitting GitHub API rate limits.
    • When bootstrapping chart-operator the helm release should not include the cluster ID.
    • Fix getting kubeconfig in chart CR watcher.
    • Fix error handling in chart CR watcher when chart CRD not installed.
    • Restrict PSP usage to only named resource.
    • Remove compatible providers validation for AppCatalogEntry as it’s overly strict.
    • Push image to Docker Hub to not rely on crsync.
    • Fixing patch to not reset fields.
    • Allow usage of chart-operator PSP so it can be bootstrapped.
    • Fix label selector in app values watcher so it supports CAPI clusters.
    • Strip cluster name from App CR name to determine Chart CR name in chart/current.go resource to fix WC app updates.
    • Continue processing AppCatalogEntry CRs if an error occurs.
    • Only show AppCatalogEntry CRs that are compatible with the current provider.
    • For internal catalogs, generate tarball URLs instead of checking index.yaml to prevent chicken-and-egg problems in new clusters.

    cert-operator 1.3.0

    Changed

    • Use RenewSelf instead of LookupSelf to prevent expiration of Vault token.

    azure-operator 5.17.0

    Added

    • Add GiantSwarmCluster tag to Vnet.
    • Add support for feature that enables forcing cgroups v1 for Flatcar version 3033.2.0 and above.

    Changed

    • Upgraded to giantswarm/exporterkit v1.0.0
    • Upgraded to giantswarm/microendpoint v1.0.0
    • Upgraded to giantswarm/microkit v1.0.0
    • Upgraded to giantswarm/micrologger v0.6.0
    • Upgraded to giantswarm/versionbundle v1.0.0
    • Upgraded to spf13/viper v1.10.0
    • Make nodepool nodes roll in case the user switches between cgroups v1 and v2
    • Drop dependency on giantswarm/apiextensions/v2
    • Bump k8scloudconfig to disable rpc-statd

    Fixed

    • Fix panic while checking for cgroups version during upgrade.

    chart-operator 2.20.1

    Changed

    • Update Helm to v3.6.3.
    • Use controller-runtime client to remove CAPI dependency.
    • Use apptestctl to install CRDs in integration tests to avoid hitting GitHub rate limits.

    Fixed

    • Fix status resource to use Helm release status if it exists.

    Removed

    • Remove unused helm 2 release collector.

    external-dns 2.9.0

    This release contains some changes to mitigate rate limiting on AWS clusters. Please take note of the defaults for values aws.batchChangeInterval, aws.zonesCacheDuration, externalDNS.interval and externalDNS.minEventSyncInterval. If you already specify --aws-batch-change-interval or --aws-zones-cache-duration, please migrate to the new values aws.batchChangeInterval and aws.zonesCacheDuration.

    Added

    • Allow setting --aws-batch-change-interval through the aws.batchChangeInterval value. Default: 10s.
    • Allow setting --aws-zones-cache-duration through the aws.zonesCacheDuration value. Default: 3h.

    Changed

    • Set default externalDNS.interval to 5m.
    • Set default externalDNS.minEventSyncInterval to 30s.
    • Allow setting Route53 credentials (externalDNS.aws_access_key_id and externalDNS.aws_secret_access_key) independent of the aws.access value.
    • Allow setting the AWS default region (aws.region) independent of the aws.access value.
    • Allow omitting the --domain-filter flag completely by setting externalDNS.domainFilterList to null.
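
    The migration from the two legacy flags to the new values can be sketched as follows. The value names and defaults are the ones listed in this release; the nested-dictionary layout, the "flag=value" argument form, and the helper name migrate_flags are assumptions made for illustration.

```python
# Mapping from the legacy command-line flags to the values named in this
# release. The nested layout of the values is an assumption.
FLAG_TO_VALUE = {
    "--aws-batch-change-interval": ("aws", "batchChangeInterval"),
    "--aws-zones-cache-duration": ("aws", "zonesCacheDuration"),
}

# Defaults as stated in the release notes above.
DEFAULTS = {
    "aws": {"batchChangeInterval": "10s", "zonesCacheDuration": "3h"},
    "externalDNS": {"interval": "5m", "minEventSyncInterval": "30s"},
}


def migrate_flags(extra_args: list[str]) -> dict:
    """Build values from 'flag=value' arguments, starting from the defaults.

    Flags without a known mapping are ignored by this sketch.
    """
    values = {section: dict(keys) for section, keys in DEFAULTS.items()}
    for arg in extra_args:
        flag, _, value = arg.partition("=")
        if flag in FLAG_TO_VALUE and value:
            section, key = FLAG_TO_VALUE[flag]
            values[section][key] = value
    return values
```

    For example, migrate_flags(["--aws-batch-change-interval=2s"]) keeps every default except aws.batchChangeInterval, which becomes "2s".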

    azure-scheduled-events 0.6.1

    Added

    • Add priorityClassName: "system-node-critical" to Daemonset to give higher priority during scheduling.

    Changed

    • Reduce resource requests.

    vertical-pod-autoscaler 2.1.1

    Fixed

    • Fix naming of VPA deployments in workload clusters.

    cluster-autoscaler 1.22.2-gs4

    Fixed

    • Updated to correct cluster-autoscaler version
    • Use GS-built 1.22 image to deliver upstream unreleased fix kubernetes/autoscaler#4600

    Added

    • Added support for specifying balance-similar-node-groups flag
  • This release provides support for Kubernetes 1.22, has Control Groups v2 enabled by default and includes the Vertical Pod Autoscaler.

    Highlights

    • Kubernetes 1.22 support;
    • Control Groups v2 are enabled by default;
    • rpcbind is disabled by default to mitigate security risks. NFS v2 and v3 are not supported anymore;
    • Security fixes:
      • 44 Linux CVEs;
      • 10 expat CVEs;
      • 8 Go CVEs;
      • 5 glibc CVEs;
      • 4 Docker CVEs;
      • 3 curl CVEs;
      • 3 vim CVEs;
      • 2 polkit CVEs;
      • 2 bash CVEs;
      • 2 binutils CVEs;
      • 3 containerd CVEs;
      • 2 nettle CVEs;
      • 2 SDK: bison CVEs;
      • 1 ca-certificates CVE;
      • 1 util-linux CVE;
      • 1 git CVE;
      • 1 gnupg CVE;
      • 1 libgcrypt CVE;
      • 1 sssd CVE;
      • 1 SDK: perl CVE;

    Warning: This is an alpha preview release intended only for testing Kubernetes v1.22 changes and Control Groups v2 compatibility. Upgrading to or from this version is not supported.

    Warning: Kubernetes v1.22 removed certain APIs and features. More details are available in the upstream blog post.

    Warning: rpcbind is disabled by default to mitigate security risks. Any application which requires it will no longer work. NFS v2 and v3 are such applications and are no longer supported. Please check whether you have any applications that depend on rpcbind before you upgrade.

    Known Issues

    • Java applications are unable to identify memory limits when using a JRE prior to v15 in a Control Groups v2 environment. Support was added in JRE v15 and later. More details are available in the upstream issue. We recommend using the latest LTS JRE available (currently v17) to ensure continued compatibility with future releases.
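
    The underlying reason is that the memory limit lives in a different file under each cgroups version. The following Python sketch reads it from either location; it assumes it runs inside a container with the cgroup filesystem mounted at /sys/fs/cgroup, and the helper name read_memory_limit is hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_memory_limit(cgroup_root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the memory limit in bytes for the current cgroup, or None.

    cgroups v2 exposes the limit as memory.max (the literal string "max"
    means unlimited); cgroups v1 uses memory/memory.limit_in_bytes.
    """
    root = Path(cgroup_root)
    v2_file = root / "memory.max"
    v1_file = root / "memory" / "memory.limit_in_bytes"
    if v2_file.is_file():
        raw = v2_file.read_text().strip()
        return None if raw == "max" else int(raw)
    if v1_file.is_file():
        return int(v1_file.read_text().strip())
    return None
```

    A runtime that only looks at the v1 path finds nothing on a Control Groups v2 node and falls back to the host's total memory, which matches the symptom described above.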

    Control Groups v1

    To ensure a smooth transition, in case you need time to modify applications to make them compatible with Control Groups v2, we provide a mechanism that will allow using Control Groups v1 on specific node pools. More details are available in our documentation.

    Change details

    kubernetes 1.22.6

    What’s New (Major Themes)

    Removal of several beta Kubernetes APIs

    A number of APIs are no longer serving specific Beta versions in favour of the GA version of those APIs. All existing objects can be interacted with via general availability APIs. This removal includes beta versions of ValidatingWebhookConfiguration, MutatingWebhookConfiguration, CustomResourceDefinition, APIService, TokenReview, SubjectAccessReview, CertificateSigningRequest, Lease, Ingress, and IngressClass APIs. For the full list check out Deprecated API Migration Guide and the blog post Kubernetes API and Feature Removals In 1.22: Here’s What You Need To Know.
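
    Before upgrading, manifests can be scanned for the removed beta versions. The sketch below covers only the kinds named in the paragraph above; the API group names and GA replacements are taken from the upstream Deprecated API Migration Guide rather than from these notes, and the names REMOVED_IN_1_22 and find_removed_apis are hypothetical. Manifests are assumed to be already parsed into dictionaries.

```python
# Beta API versions of the kinds named above that Kubernetes 1.22 no longer
# serves, keyed by (apiVersion, kind), with the GA replacement apiVersion.
REMOVED_IN_1_22 = {
    ("admissionregistration.k8s.io/v1beta1", "ValidatingWebhookConfiguration"): "admissionregistration.k8s.io/v1",
    ("admissionregistration.k8s.io/v1beta1", "MutatingWebhookConfiguration"): "admissionregistration.k8s.io/v1",
    ("apiextensions.k8s.io/v1beta1", "CustomResourceDefinition"): "apiextensions.k8s.io/v1",
    ("apiregistration.k8s.io/v1beta1", "APIService"): "apiregistration.k8s.io/v1",
    ("authentication.k8s.io/v1beta1", "TokenReview"): "authentication.k8s.io/v1",
    ("authorization.k8s.io/v1beta1", "SubjectAccessReview"): "authorization.k8s.io/v1",
    ("certificates.k8s.io/v1beta1", "CertificateSigningRequest"): "certificates.k8s.io/v1",
    ("coordination.k8s.io/v1beta1", "Lease"): "coordination.k8s.io/v1",
    ("extensions/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "IngressClass"): "networking.k8s.io/v1",
}


def find_removed_apis(manifests: list[dict]) -> list[tuple[str, str, str]]:
    """Return (name, kind, replacement apiVersion) for affected manifests."""
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        replacement = REMOVED_IN_1_22.get(key)
        if replacement is not None:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append((name, key[1], replacement))
    return findings
```

    Note that moving to the GA version is not always a pure apiVersion change; some kinds, such as Ingress, also changed their schema, so the migration guide should be consulted for each finding.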

    Kubernetes release cadence change

    We all have to adapt to change in our lives, and especially so in the past year. The Kubernetes release team was also affected by the COVID-19 pandemic and has listened to its user base regarding the number of releases in a calendar year. As of April 23, 2021, it is official that the Kubernetes release cadence has been reduced from 4 releases per year to 3 releases per year.

    You can read more in the official blog post Kubernetes Release Cadence Change: Here’s What You Need To Know.

    External credential providers

    Kubernetes client credential plugins have been in beta since 1.11, a few eons ago. With the release of Kubernetes 1.22, this feature set graduates to stable. The GA feature set includes improved support for plugins that provide interactive login flows. This release also contains a number of bug fixes to the feature set. Aspiring plugin authors can look at sample-exec-plugin as a way to get started.

    Related to this topic, the in-tree Azure and GCP authentication plugins have been deprecated in favor of out-of-tree implementations.

    Server-side Apply graduates to GA

    Server-side Apply is a new object merge algorithm, as well as tracking of field ownership, running on the Kubernetes API server. Server-side Apply helps users and controllers manage their resources via declarative configurations. It allows them to create and/or modify their objects declaratively, simply by sending their fully specified intent. After being in beta for a couple releases, Server-side Apply is now generally available.

    Cluster Storage Interface graduations

    CSI support for Windows nodes moves to GA in the 1.22 release. In Kubernetes v1.22, Windows privileged containers are only an alpha feature. To allow using CSI storage on Windows nodes, CSIProxy enables CSI node plugins to be deployed as unprivileged pods, using the proxy to perform privileged storage operations on the node.

    Another feature moving to GA in v1.22 is CSI Service Account Token support. This feature allows CSI drivers to use pods’ bound service account tokens instead of a more privileged identity. It also provides control over re-publishing these volumes, so that short-lived tokens can be refreshed.

    SIG Windows development tools

    To grow the developer community, SIG Windows released multiple tools. The new tools support multiple CNI providers (Antrea, Calico) and can run on multiple platforms (any Vagrant-compatible provider, such as Hyper-V, VirtualBox, or vSphere). There is also a new way to run bleeding-edge Windows features from scratch by compiling the Windows kubelet and kube-proxy, then using them along with daily builds of other Kubernetes components.

    Deploy a more secure control plane with kubeadm

    A new alpha feature allows running the kubeadm control plane components as non-root users. This is a long requested security measure in kubeadm. To try it you must enable the kubeadm-specific RootlessControlPlane feature gate. When you deploy a cluster using this alpha feature, your control plane runs with lower privileges.

    Kubeadm also gets a new v1beta3 configuration API. It iterates over v1beta2 by adding some long-requested features and deprecating some existing ones. v1beta3 is now the preferred API version; the v1beta2 API also remains available and is not yet deprecated.

    etcd moves to version 3.5.0

    Kubernetes’ default backend storage, etcd, has a new release 3.5.0 and the community embraced it. The new release comes with improvements to security, performance, monitoring, and developer experience. There are numerous bug fixes for lease objects causing memory leaks, compact operations causing deadlocks, and more. A couple of new features are also introduced, like the migration to structured logging and built-in log rotation. The release comes with a detailed future roadmap to implement a solution to traffic overload. A full and detailed list of changes can be read in the 3.5.0 release announcement.

    Kubernetes Node system swap support

    Every system administrator or Kubernetes user has been in the same boat regarding setting up and using Kubernetes: disable swap space. With the release of Kubernetes 1.22, alpha support is available to run nodes with swap memory. This change lets administrators opt in to configuring swap on Linux nodes, treating a portion of block storage as additional virtual memory.

    Cluster-wide seccomp defaults

    A new alpha feature gate SeccompDefault has been added to the kubelet, together with a corresponding command line flag --seccomp-default and kubelet configuration. If both are enabled, then the kubelet’s behavior changes for pods that don’t explicitly set a seccomp profile. With cluster-wide seccomp defaults, the kubelet uses the RuntimeDefault seccomp profile by default, rather than Unconfined. This allows enhancing the default cluster-wide workload security of the Kubernetes deployment. Security administrators will now sleep better knowing there is some security by default for the workloads.

    To learn more about the feature, please refer to the official seccomp tutorial.
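
    The decision described above can be written down as a small Python function. This is a sketch of the documented behaviour only, not kubelet code; the function and parameter names are made up for this example.

```python
from typing import Optional


def effective_seccomp_profile(
    pod_profile: Optional[str],
    feature_gate_enabled: bool,
    seccomp_default_flag: bool,
) -> str:
    """Return the seccomp profile a pod ends up with.

    A profile set explicitly on the pod always wins. Otherwise the default
    is RuntimeDefault only when both the SeccompDefault feature gate and
    the --seccomp-default kubelet flag are enabled, and Unconfined if not.
    """
    if pod_profile is not None:
        return pod_profile
    if feature_gate_enabled and seccomp_default_flag:
        return "RuntimeDefault"
    return "Unconfined"
```

    Workloads that depend on system calls blocked by the runtime's default profile would therefore need an explicit profile before the gate and flag are switched on.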

    Quality of Service for memory resources

    Originally, Kubernetes used the v1 cgroups API. With that design, the QoS class for a pod only applied to CPU resources (such as cpu_shares). The Kubernetes cgroup manager uses memory.limit_in_bytes in v1 cgroups to limit the memory capacity for a container, and uses oom_scores to recommend an order for killing container processes if an out-of-memory event occurs. This implementation has shortcomings: for Guaranteed pods, memory cannot be fully reserved, and the page cache is at risk of being recycled. For Burstable pods, overcommitting memory (setting request less than limit) could increase the risk of a container being killed when the Linux kernel detects an out-of-memory condition.

    As an alpha feature, Kubernetes v1.22 can use the cgroups v2 API to control memory allocation and isolation. This feature is designed to improve workload and node availability when there is contention for memory resources.
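
    The Guaranteed and Burstable classes mentioned above follow from how container requests and limits are set. The sketch below is a simplified illustration of that classification, not the kubelet's code: plain dictionaries stand in for container resource specs, values are plain integers (for example millicores and bytes), and an unset request is assumed to equal the limit.

```python
from typing import Dict, List

Resources = Dict[str, Dict[str, int]]  # {"requests": {...}, "limits": {...}}


def qos_class(containers: List[Resources]) -> str:
    """Classify a pod as Guaranteed, Burstable, or BestEffort (simplified)."""
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            req = requests.get(resource)
            lim = limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # Guaranteed needs a limit on both resources, with request == limit.
            # An unset request is treated here as equal to the limit.
            if lim is None or (req is not None and req != lim):
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"
```

    A pod whose memory request is lower than its limit is classified as Burstable here, which is the overcommit case the text describes.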

    API changes and improvements for ephemeral containers

    The API used to create Ephemeral Containers changed in 1.22. The Ephemeral Containers feature is alpha and disabled by default, and the new API does not work with clients that attempt to use the old API.

    For stable features, the kubectl tool follows the Kubernetes version skew policy; however, kubectl v1.21 and older do not support the new API for ephemeral containers. Users who create ephemeral containers using kubectl debug should note that kubectl version 1.22 will attempt to fall back to the old API; older versions of kubectl will not work with cluster versions of 1.22 or later. Please update kubectl to 1.22 if you wish to use kubectl debug with a mix of cluster versions.
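
    The compatibility rule in the paragraph above can be written down as a small check. This is only a restatement of the text, under the assumption that the Ephemeral Containers feature is enabled on the cluster; the function name is hypothetical, and kubectl versions newer than 1.22 are deliberately left out because the text does not cover them.

```python
def kubectl_debug_supported(kubectl_minor: int, cluster_minor: int) -> bool:
    """Return True if `kubectl debug` is expected to work for this 1.x pairing."""
    if kubectl_minor > 22:
        raise ValueError("kubectl newer than 1.22 is not covered by the text above")
    if kubectl_minor == 22:
        # kubectl 1.22 uses the new API and attempts to fall back to the old one.
        return True
    # kubectl 1.21 and older only use the old API, which 1.22+ clusters do not accept.
    return cluster_minor < 22
```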

    Known Issues

    CPU and Memory manager are not working correctly for Guaranteed Pods with multiple containers

    A regression bug was found where Guaranteed Pods with multiple containers do not work properly with set allocations for CPU, Memory, and Device manager. The fix will be available in coming releases.

    CSIMigrationvSphere feature gate has not migrated to new CRD APIs

    If the CSIMigrationvSphere feature gate is enabled, users should not upgrade to Kubernetes v1.22. The vSphere CSI Driver does not support Kubernetes v1.22 yet because it uses v1beta1 CRD APIs. Support for v1.22 will be added in a later release. Check the following document for supported Kubernetes releases for a given vSphere CSI Driver version.

    Urgent Upgrade Notes

    (No, really, you MUST read this before you upgrade)
    • Audit log files are now created with a mode of 0600. Existing file permissions will not be changed. If you need the audit file to be readable by a non-root user, you can pre-create the file with the desired permissions. (#95387, @JAORMX) [SIG API Machinery and Auth]
    • CSI migration of AWS EBS volumes requires AWS EBS CSI driver ver. 1.0 that supports allowAutoIOPSPerGBIncrease parameter in StorageClass. (#101082, @jsafrane)
    • Conformance image is now built with Distroless. Users running Conformance testing should rely on container entrypoint instead of manual invocation to /run_e2e.sh or /gorunner, as they are now deprecated and will be removed in 1.25 release. Invoking ginkgo and e2e.test are still supported through overriding entrypoint (docker) or defining container spec.command (kubernetes). (#99178, @wilsonehusin)
    • Default StreamingProxyRedirects to disabled. If there is a >= 2 version skew between master and nodes, and the old nodes were enabling --redirect-container-streaming, this will break them. In this case, the StreamingProxyRedirects can still be manually enabled. (#101647, @pacoxu)
    • Intree volume plugin scaleIO support has been completely removed from Kubernetes. (#101685, @Jiawei0227)
    • Kubeadm: remove the automatic detection and matching of cgroup drivers for Docker. For new clusters if you have not configured the cgroup driver explicitly you might get a failure in the kubelet on driver mismatch (kubeadm clusters should be using the systemd driver). Also remove the IsDockerSystemdCheck preflight check (warning) that checks if the Docker cgroup driver is set to systemd. Ideally such detection / coordination should be on the side of CRI implementers and the kubelet (tracked here). Please see the page on how to configure cgroup drivers with kubeadm manually (#99647, @neolit123)
    • Kubeadm: the flag --cri-socket is no longer allowed in a mixture with the flag --config. Please use the kubeadm configuration for setting the CRI socket for a node using {Init|Join}Configuration.nodeRegistration.criSocket. (#101600, @KofClubs)
    • Newly provisioned PVs by Azure disk will no longer have the beta FailureDomain label. Azure disk volume plugin will start to have GA topology label instead. (#101534, @kassarl)
    • Scheduler’s CycleState now embeds internal read/write locking inside its Read() and Write() functions. Meanwhile, the Lock() and Unlock() functions are removed. Scheduler plugin developers are now required to remove CycleState#Lock() and CycleState#Unlock(). Simply use Read() and Write(), as they’re natively thread-safe now. (#101542, @Huang-Wei)
    • The CSIMigrationVSphereComplete feature flag is removed. InTreePluginvSphereUnregister will be the way moving forward. (#101272, @Jiawei0227)
    • The flag --experimental-patches is now deprecated and will be removed in a future release. You can migrate to using the new flag --patches. Add a new field {Init|Join}Configuration.patches.directory that can be used for the same purpose. For init and join it is now recommended that you migrate to configure patches via {Init|Join}Configuration.patches.directory. For the time being, these flags can be mixed with --config, but that might change in the future. On a command line, the last *patches flag takes precedence over previous flags and the value in config. kubeadm upgrade --patches will continue to be the only available option, since upgrade does not support a configuration file yet. (#103063, @neolit123)
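
    For the audit log note at the top of this list, pre-creating the file could look like the sketch below. It is an illustration under assumptions: the helper name, the path, and the mode 0640 are examples rather than values from the release notes, and granting a specific non-root user access would also require suitable file ownership, which is not shown.

```python
import os


def precreate_audit_log(path: str, mode: int = 0o640) -> None:
    """Create an empty log file with the given mode if it does not exist yet."""
    try:
        # O_EXCL makes os.open fail instead of touching an existing file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, mode)
    except FileExistsError:
        return
    try:
        # The process umask may have masked bits in os.open; set the mode explicitly.
        os.fchmod(fd, mode)
    finally:
        os.close(fd)
```

    Because existing file permissions are not changed by the API server, the helper also leaves an already existing file untouched.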

    Important Security Information

    This release contains changes that address the following vulnerabilities:

    A security issue was discovered in Kubernetes where a user may be able to create a container with subpath volume mounts to access files & directories outside of the volume, including on the host filesystem.

    Affected Versions:

    • kubelet v1.22.0 - v1.22.1
    • kubelet v1.21.0 - v1.21.4
    • kubelet v1.20.0 - v1.20.10
    • kubelet <= v1.19.14

    Fixed Versions:

    • kubelet v1.22.2
    • kubelet v1.21.5
    • kubelet v1.20.11
    • kubelet v1.19.15

    This vulnerability was reported by Fabricio Voznika and Mark Wolters of Google.

    CVSS Rating: High (8.8) CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
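
    The affected and fixed version lists above can be turned into a simple check. The following is a minimal sketch with hypothetical function names; it only understands plain release versions such as v1.21.4 and raises an error for anything else, including pre-release tags.

```python
import re
from typing import Tuple

# Last affected patch release per minor line, taken from the list above.
LAST_AFFECTED_PATCH = {22: 1, 21: 4, 20: 10, 19: 14}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'v1.21.4' or '1.21.4' into (major, minor, patch)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def kubelet_is_affected(version: str) -> bool:
    """Return True if the kubelet version falls in an affected range."""
    major, minor, patch = parse_version(version)
    if (major, minor) < (1, 19):
        # Covered by "kubelet <= v1.19.14".
        return True
    if major != 1 or minor not in LAST_AFFECTED_PATCH:
        # Newer release lines are not listed as affected.
        return False
    return patch <= LAST_AFFECTED_PATCH[minor]
```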

    Deprecation

    • Controller-manager: the following flags have no effect and will be removed in v1.24:

      • --port
      • --address

      The insecure port flag --port may only be set to 0 now.

      In addition, please be careful that:

      • controller-manager MUST start with --authorization-kubeconfig and --authentication-kubeconfig correctly set to get authentication/authorization working.
      • liveness/readiness probes to controller-manager MUST use HTTPS now, and the default port has been changed to 10257.
      • Applications that fetch metrics from controller-manager should use a dedicated service account which is allowed to access nonResourceURLs /metrics. (#96216, @knight42) [SIG API Machinery, Cloud Provider, Instrumentation and Testing]
    • Deprecate --record flag in kubectl. The --record flag is being replaced with the mechanism which annotates HTTP requests with kubectl command details. (#102873, @soltysh)

    • E2e.test: removed the --viper-config flag. If you were previously using this to pass flags to e2e.test via a file, you will need to pass them directly on the command line, e.g. e2e.test --e2e-output-dir. (#102598, @dims)

    • For kubeadm: remove the ClusterStatus API from v1beta3 and its management in the kube-system/kubeadm-config ConfigMap. This method of keeping track of what API endpoints exist in the cluster was replaced (in a prior release) by a method to annotate the etcd Pods that kubeadm creates in “stacked etcd” clusters. The following CLI sub-phases are deprecated and are now a NO-OP: for kubeadm join: “control-plane-join/update-status”, for kubeadm reset: “update-cluster-status”. Unless you are using these phases explicitly, you should not be affected. (#101915, @neolit123)

    • Kubeadm: remove the deprecated --csr-only and --csr-dir flags from kubeadm init phase certs. Deprecate the same flags under kubeadm certs renew. In both cases, the command kubeadm certs generate-csr should be used instead. (#102108, @neolit123)

    • Kubeadm: Remove the deprecated command kubeadm alpha kubeconfig. Please use kubeadm kubeconfig instead. (#101938, @knight42)

    • Kubeadm: Remove the deprecated hyperkube image support in v1beta3. This implies removal of ClusterConfiguration.UseHyperKubeImage. (#101537, @neolit123)

    • Kubeadm: Remove the field ClusterConfiguration.DNS.Type in v1beta3 since CoreDNS is the only supported DNS type. (#101547, @neolit123)

    • Kubeadm: remove the deprecated command kubeadm config view. A replacement for this command is kubectl get cm -n kube-system kubeadm-config -o=jsonpath="{.data.ClusterConfiguration}" (#102071, @neolit123)

    • Kubeadm: remove the deprecated flag --image-pull-timeout for the kubeadm upgrade apply command (#102093, @SataQiu) [SIG Cluster Lifecycle]

    • Kubeadm: remove the deprecated flag --insecure-port from the kube-apiserver manifest that kubeadm manages. The flag had no effect since 1.20, since the insecure serving of the component was disabled in the same version. (#102121, @pacoxu)

    • Kubeadm: remove the deprecated kubeadm API v1beta1. Introduce a new kubeadm API v1beta3. See kubeadm/v1beta3 for a list of changes since v1beta2. Note that v1beta2 is not yet deprecated, but will be in a future release. (#101129, @neolit123)

    • Newly provisioned PVs by vSphere in-tree plugin will no longer have the beta FailureDomain label. vSphere volume plugin will start to have GA topology label (#102414, @divyenpatel)

    • Removal of the CSI NodePublish path by the kubelet is deprecated. This must be done by the CSI plugin according to the CSI spec. (#101441, @dobsonj)

    • Remove support for the Service topologyKeys field (alpha) and the kube-proxy implementation of it. This field was deprecated several cycles ago. This functionality is replaced by the combination of automatic topology hints per-endpoint (alpha) and the Service internalTrafficPolicy field (alpha). (#102412, @andrewsykim)

    • The PodUnknown phase is now deprecated. (#95286, @SergeyKanzhelev)

    • The storageos, quobyte and flocker storage volume plugins are deprecated and will be removed in a later release. (#101773, @Jiawei0227)

    • The deprecated flag --hard-pod-affinity-symmetric-weight and --scheduler-name have been removed from kube-scheduler. Use ComponentConfig instead to configure those parameters. (#102805, @ahg-g)

    • The feature Dynamic Kubelet Configuration is deprecated and the kubelet will report a warning when the flag --dynamic-config-dir is used. Feature gate DynamicKubeletConfig is disabled out of the box and needs to be explicitly enabled. (#102966, @SergeyKanzhelev) [SIG Cloud Provider, Instrumentation and Node]

    • The in-tree azure and gcp auth plugins have been deprecated. The https://github.com/Azure/kubelogin and gcloud commands serve as out-of-tree replacements via the kubectl/client-go credential plugin mechanism. (#102181, @enj) [SIG API Machinery and Auth]

    • The ingress v1beta1 has been deprecated. (#102030, @aojea)
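
    Several entries in the list above name command line flags that are deprecated, ineffective, or removed. A quick way to audit existing component arguments is sketched below. The table only contains flags named in this list, the messages paraphrase the notes, and the function and variable names are hypothetical.

```python
from typing import Dict, List

# Flags named in the deprecation notes above, keyed by component.
FLAGGED: Dict[str, Dict[str, str]] = {
    "kube-controller-manager": {
        "--port": "has no effect; to be removed in v1.24",
        "--address": "has no effect; to be removed in v1.24",
    },
    "kube-scheduler": {
        "--hard-pod-affinity-symmetric-weight": "removed; use ComponentConfig",
        "--scheduler-name": "removed; use ComponentConfig",
    },
    "kubelet": {
        "--dynamic-config-dir": "deprecated; kubelet reports a warning",
    },
    "kubectl": {
        "--record": "deprecated",
    },
}


def find_flagged(component: str, args: List[str]) -> List[str]:
    """Return one message per argument that uses a flag listed in FLAGGED."""
    known = FLAGGED.get(component, {})
    findings = []
    for arg in args:
        name = arg.split("=", 1)[0]  # "--port=0" -> "--port"
        if name in known:
            findings.append(f"{name}: {known[name]}")
    return findings
```

    For example, find_flagged("kube-controller-manager", ["--port=0"]) reports the --port entry, while arguments for a component that is not in the table yield an empty list.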

    API Change

    • Add a new score extension for the NodeResourcesFit plugin that merges the functionality of the NodeResourcesLeastAllocated, NodeResourcesMostAllocated, and RequestedToCapacityRatio plugins, which are marked as deprecated as of v1beta2. In v1beta1, the three plugins can still be used, but not at the same time as the score extension of NodeResourcesFit. (#101822, @yuzhiquan)

    • A value of Auto is now valid for the service.kubernetes.io/topology-aware-hints annotation. (#100728, @robscott)

    • Add DataSourceRef alpha field to PVC spec, which allows contents other than PVCs and VolumeSnapshots to be data sources. (#103276, @bswartz)

    • Add PersistentVolumeClaimDeletePolicy to StatefulSet API. (#99378, @mattcary)

    • Add a new Priority and Fairness rule that exempts all probes (/readyz, /healthz, /livez) to prevent restarting of healthy kube-apiserver instance by kubelet. (#100678, @tkashem)

    • Add alpha support for HostProcess containers on Windows (#99576, @marosset) [SIG API Machinery, Apps, Node, Testing and Windows]

    • Add distributed tracing to the kube-apiserver. It can be enabled with the feature gate APIServerTracing (#94942, @dashpole)

    • Add three metrics to the job controller to monitor if a job works in healthy condition. IndexedJob has been promoted to Beta. (#101292, @AliceZhang2016)

    • Added field .status.uncountedTerminatedPods to the Job resource. This field is used by the job controller to keep track of finished pods before adding them to the Job status counters. Pods created by the job controller get the finalizer batch.kubernetes.io/job-tracking. Jobs that are tracked using this mechanism get the annotation batch.kubernetes.io/job-tracking. This is a temporary measure. Two releases after this feature graduates to beta, the annotation won’t be added to Jobs anymore. (#98817, @alculquicondor)

    • Added new kubelet alpha feature SeccompDefault. This feature enables falling back to the RuntimeDefault (former runtime/default) seccomp profile if nothing else is specified in the pod/container SecurityContext or the pod annotation level. To use the feature, enable the feature gate as well as set the kubelet configuration option SeccompDefault (--seccomp-default) to true. (#101943, @saschagrunert) [SIG Node]

    • Adds the ReadWriteOncePod access mode for PersistentVolumes and PersistentVolumeClaims. Restricts volume access to a single pod on a single node. (#102028, @chrishenzie)

    • Alpha swap support can now be enabled on Kubernetes nodes with the NodeSwapEnabled feature flag. See KEP-2400 for details. (#102823, @ehashman)

    • Because of the implementation logic of time.Format in golang, the displayed time zone is not consistent. (#102366, @cndoit18)

    • Corrected the documentation for escaping dollar signs in a container’s env, command and args property. (#101916, @MartinKanters) [SIG Apps]

    • Enable MaxSurge for DaemonSet by default. (#101742, @ravisantoshgudimetla)

    • Enforce the ReadWriteOncePod PVC access mode during scheduling (#103082, @chrishenzie)

    • Ephemeral containers are now allowed to configure a securityContext that differs from that of the Pod. Cluster administrators should ensure that security policy controllers support EphemeralContainers before enabling this feature in clusters. (#99023, @verb)

    • Exec plugin authors can override default handling of standard input via new interactiveMode kubeconfig field. (#99310, @ankeesler)

    • If someone had the ProbeTerminationGracePeriod alpha feature enabled in 1.21, they should update/delete any workloads/pods with probe terminationGracePeriods < 1 before upgrading (#103245, @wzshiming)

    • Improved parsing of label selectors (#102188, @alculquicondor) [SIG API Machinery]

    • Introduce minReadySeconds api to the StatefulSets. (#100842, @ravisantoshgudimetla)

    • Introducing Memory quality of service support with cgroups v2 (Alpha). The MemoryQoS feature is now in Alpha. This allows kubelet running with cgroups v2 to set memory QoS at container, pod and QoS level to protect and guarantee better memory quality. This feature can be enabled through the feature gate MemoryQoS. (#102970, @borgerli)

    • Kube API server accepts Impersonate-Uid header to impersonate a user with a specific UID, in the same way that you can currently use Impersonate-User, Impersonate-Group and Impersonate-Extra. (#99961, @margocrawf)

    • Kube-apiserver: --service-account-issuer can be specified multiple times now, to enable non-disruptive change of issuer. (#101155, @zshihang) [SIG API Machinery, Auth, Node and Testing]

    • Kube-controller-manager: the --horizontal-pod-autoscaler-use-rest-clients flag and Heapster support in the horizontal pod autoscaler, deprecated since 1.12, is removed. (#90368, @serathius)

    • Kube-scheduler: a plugin enabled in a v1beta2 configuration file takes precedence over the default configuration for that plugin. This simplifies enabling default plugins with custom configuration without needing to explicitly disable those default plugins. (#99582, @chendave)

    • New node-high priority-level has been added to Suggested API Priority and Fairness configuration. (#101151, @mborsz)

    • NodeSwapEnabled feature flag was renamed to NodeSwap

      The flag was only available in the 1.22.0-beta.1 release, and the new flag should be used going forward. (#103553, @ehashman) [SIG Node]

    • Omit comparison with boolean constant (#101523, @chuntaochen) [SIG CLI and Cloud Provider]

    • Removed the feature flag for probe-level termination grace period from Kubelet. If a user wants to disable this feature on already created pods, they will have to delete and recreate the pods. (#103168, @raisaat) [SIG Apps and Node]

    • Revert the addition of PersistentVolumeClaimDeletePolicy to the StatefulSet API. (#103747, @mattcary)

    • The scheduler can now be configured to consider resources besides CPU and memory, GPU for example, for the score plugin of NodeResourcesBalancedAllocation. (#101946, @chendave) [SIG Scheduling]

    • Server Side Apply now treats all Selector fields as atomic (meaning the entire selector is managed by a single writer and updated together), since they contain interrelated and inseparable fields that do not merge in intuitive ways. (#97989, @Danil-Grigorev) [SIG API Machinery]

    • Suspend Job feature graduated to beta. Added the action label to Job controller sync metrics job_sync_total and job_sync_duration_seconds. (#102022, @adtac)

    • The API documentation for the DaemonSet’s spec.updateStrategy.rollingUpdate.maxUnavailable field was corrected to state that the value is rounded up. (#101296, @Miciah)

    • The CSIServiceAccountToken graduates to GA and is unconditionally enabled. (#103001, @zshihang)

    • The CertificateSigningRequest.certificates.k8s.io API supports an optional expirationSeconds field to allow the client to request a particular duration for the issued certificate. The default signer implementations provided by the Kubernetes controller manager will honor this field as long as it does not exceed the --cluster-signing-duration flag. (#99494, @enj)

    • The EndpointSlice Mirroring controller no longer mirrors the last-applied-configuration annotation created by kubectl to update EndpointSlices. (#102731, @sharmarajdaksh)

    • The NetworkPolicyEndPort is graduated to beta and is enabled by default. (#102834, @rikatz)

    • The PodDeletionCost feature has been promoted to beta, and enabled by default. (#101080, @ahg-g)

    • Server Side Apply treats certain structs as atomic, meaning the entire selector field is managed by a single writer and updated together. (#100684, @Jefftree)

    • The ServiceAppProtocol feature gate has been removed. It reached GA in Kubernetes (#103190, @robscott)

    • The TerminationGracePeriodSeconds on pod specs and container probes should not be negative. Negative values of TerminationGracePeriodSeconds will be treated as the value 1s on the delete path. Immutable field validation will be relaxed in order to update negative values. In a future release, negative values will not be permitted. (#98866, @wzshiming)

    • The kube-scheduler component config v1beta2 API is available. Three scheduler plugins are deprecated (NodeLabel, ServiceAffinity, NodePreferAvoidPods). (#99597, @adtac)

    • The pod/eviction subresource now accepts policy/v1 eviction requests in addition to policy/v1beta1 eviction requests (#100724, @liggitt)

    • The podAffinity, NamespaceSelector and the associated CrossNamespaceAffinity quota scope features graduate to Beta and they are now enabled by default. (#101496, @ahg-g)

    • The pods/ephemeralcontainers API now returns and expects a Pod object instead of EphemeralContainers. This is incompatible with the previous alpha-level API. (#101034, @verb) [SIG Apps, Auth, CLI and Testing]

    • The v1.Node .status.images[].names field is now optional. (#102159, @roycaihw)

    • The deprecated flag --algorithm-provider has been removed from kube-scheduler. Use ComponentConfig instead to configure the set of enabled plugins. (#102239, @Haleygo)

    • The options --ssh-user and --ssh-key are removed. They only functioned on GCE, and only in-tree. Use the apiserver network proxy instead. (#102297, @deads2k)

    • Track Job completion through status and Pod finalizers, removing dependency on Pod tombstones. (#98238, @alculquicondor) [SIG API Machinery, Apps, Auth and Testing]

    • Track ownership of scale subresource for all scalable resources i.e. Deployment, ReplicaSet, StatefulSet, ReplicationController, and Custom Resources. (#98377, @nodo) [SIG API Machinery and Testing]
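
    The DaemonSet entry in the list above notes that a percentage value of spec.updateStrategy.rollingUpdate.maxUnavailable is rounded up. The arithmetic is sketched below for illustration only; the function name is hypothetical and this is not the controller's code.

```python
from typing import Union


def resolve_max_unavailable(value: Union[int, str], desired: int) -> int:
    """Resolve an absolute or percentage maxUnavailable, rounding percentages up."""
    if isinstance(value, int):
        return value
    if not value.endswith("%"):
        raise ValueError(f"expected an integer or a percentage string, got {value!r}")
    percent = int(value[:-1])
    # Integer ceiling of desired * percent / 100, avoiding floating point.
    return (desired * percent + 99) // 100
```

    With 15 desired pods and a value of 10%, the exact result is 1.5, which rounds up to 2.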

    Feature

    • Kube-apiserver: when merging lists, Server Side Apply now prefers the order of the submitted request instead of the existing persisted object (#107568, @jiahuif) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Storage and Testing]

    • Kubernetes is now built with Golang 1.16.12 (#106982, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Update golang.org/x/net to v0.0.0-20211209124913-491a49abca63 (#106960, @cpanato) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Node and Storage]

    • Kubernetes is now built with Golang 1.16.10 (#106223, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Update debian-base, debian-iptables, setcap images to pick up CVE fixes

      • Debian-base to v1.9.0
      • Debian-iptables to v1.6.7
      • setcap to v2.0.4 (#106143, @cpanato) [SIG Release and Testing]
    • A system-cluster-critical pod should not get a low OOM Score.

      As of now both system-node-critical and system-cluster-critical pods have -997 OOM score, making them one of the last processes to be OOMKilled. By definition system-cluster-critical pods can be scheduled elsewhere if there is a resource crunch on the node, whereas system-node-critical pods cannot be rescheduled. This was the reason for system-node-critical to have higher priority value than system-cluster-critical. This change allows only system-node-critical priority class to have low OOMScore.

      Action required: if the user wants the pod to be OOMKilled last and the pod has the system-cluster-critical priority class, it has to be changed to the system-node-critical priority class to preserve the existing behavior (#99729, @ravisantoshgudimetla)

    • API Server tracing can now trace re-entrant api requests. (#103218, @dashpole) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle and Instrumentation]

    • APIServerTracing now collects spans from etcd client calls, and propagates context to etcd. (#103216, @dashpole) [SIG API Machinery, Cloud Provider and Instrumentation]

    • APIServerTracing now collects spans from outgoing requests to admission webhooks. (#103601, @dashpole) [SIG API Machinery]

    • Add a namespace label for all apiserver_admission_* metrics. Expand the histogram range to 0-10s for all apiserver_admission_*_duration_seconds metrics. (#101208, @voutcn)

    • Add unified map on CRI to support cgroup v2. Refer to https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md#unified. (#102578, @payall4u)

    • Added BinaryData description to kubectl describe command. (#100568, @lauchokyip)

    • Added a new metric apiserver_flowcontrol_request_concurrency_in_use that shows the number of seats (concurrency) occupied by the currently executing requests in the API Priority and Fairness system. (#102795, @tkashem)

    • Added field-selector option for kubectl top pod (#102155, @lauchokyip) [SIG CLI]

    • Added new metrics about API Priority and Fairness. Each one has a label priority_level. The last two also have a label bound taking values min and max.

      • apiserver_flowcontrol_current_r: R(the time of the last change in state of the queues)
      • apiserver_flowcontrol_dispatch_r: R(the time of the latest request dispatch)
      • apiserver_flowcontrol_latest_s: S(the request last dispatched) = R(when that request starts executing in the virtual world)
      • apiserver_flowcontrol_next_s_bounds: min and max next S among non-empty queues
      • apiserver_flowcontrol_next_discounted_s_bounds: min and max next S - (sum [over requests executing] width * estimatedDuration) among non-empty queues (#102859, @MikeSpreitzer) [SIG API Machinery and Instrumentation]
    • Adding --restart-kubelet flag on E2E Node test suite (#97028, @knabben) [SIG Node and Testing]

    • Adds feature gate KubeletInUserNamespace which enables support for running kubelet in a user namespace.

      The user namespace has to be created before running kubelet. All the node components such as CRI need to be running in the same user namespace.

      When the feature gate is enabled, kubelet ignores errors that happen while setting the following sysctl values: vm.overcommit_memory, vm.panic_on_oom, kernel.panic, kernel.panic_on_oops, kernel.keys.root_maxkeys, kernel.keys.root_maxbytes. (These sysctl values are for the host, not for the containers.)

      kubelet also ignores an error during opening /dev/kmsg. This feature gate also allows kube-proxy to ignore an error during setting RLIMIT_NOFILE.

      This feature gate is especially useful for running Kubernetes inside Rootless Docker/Podman with kind or minikube. (#92863, @AkihiroSuda) [SIG Network, Node and Testing]

    • Adds metrics for the delegated authenticator used by extension APIs that delegate authentication logic to the Kube API server. (#99364, @p0lyn0mial)

    • Adds metrics for the delegated authorizer used by extension APIs that delegate authorization logic to the Kube API server. (#100339, @p0lyn0mial)

    • Adds two kubemark flags, --max-pods and --extended-resources. (#100267, @Jeffwan)

    • An audit log entry will be generated when a ValidatingAdmissionWebhook is failing to open. (#92739, @cnphil)

    • Base images: Updated to

    • Base-images: Update to debian-base:buster-v1.7.1 (#102594, @mengjiao-liu)

    • Deprecated warning message for ignore-errors flag. (#102677, @yuzhiquan)

    • Endpoints that have more than 1000 endpoints will be truncated and the endpoints.kubernetes.io/over-capacity annotation on the Endpoints resource will be set to truncated. (#103520, @swetharepakula) [SIG Apps and Network]

    • Expose /debug/flags/v to allow dynamically setting log level for kube-proxy. (#98306, @borgerli) [SIG Network]

    • Expose container start time as container_start_time_seconds in the kubelet /metrics/resource endpoint. (#102444, @sanwishe)

    • Extended resources defined in LeastAllocated, MostAllocated and RequestedToCapacityRatio plugin argument are bypassed by the scheduler if the incoming Pod doesn’t request them in the pod spec. (#103169, @Huang-Wei)

    • Feat: change partition style to GPT on Windows (#101412, @andyzhangx) [SIG Storage and Windows]

    • Feature gates EndpointSliceProxying & WindowsEndpointSliceProxying graduate to GA and are unconditionally enabled. Kube-proxy will use EndpointSlices for endpoint information. (#103451, @swetharepakula)

    • Fluentd: isolate logging resources in separate namespace logging (#68004, @saravanan30erd)

    • For kubeadm: add --validity-period flag for kubeadm kubeconfig user command. (#100907, @SataQiu)

    • Implement minReadySeconds for the StatefulSets. (#101316, @ravisantoshgudimetla)

    • Improve logging of APIService availability changes in kube-apiserver. (#101420, @sttts)

    • Introduce a feature gate DisableCloudProviders allowing to disable cloud-provider initialization in KAPI, KCM and kubelet. The DisableCloudProviders FeatureGate is currently in Alpha, which means it is currently disabled by default. Once the FeatureGate moves to beta, in-tree cloud providers would be disabled by default, and a user won’t be able to specify --cloud-provider=<aws|openstack|azure|gcp|vsphere> anymore to any of KCM, KAPI or kubelet. Only --cloud-provider=external would be allowed. CCM would have to run out-of-tree with CSI. (#100136, @Danil-Grigorev)

    • JSON logging format is no longer available by default in non-core Kubernetes Components and requires owners to opt in. (#102869, @mengjiao-liu) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Kube-apiserver: the alpha PodSecurity feature can be enabled by passing --feature-gates=PodSecurity=true, and enables controlling allowed pods using namespace labels. See https://git.k8s.io/enhancements/keps/sig-auth/2579-psp-replacement for more details. (#103099, @liggitt) [SIG API Machinery, Auth, Instrumentation, Release, Security and Testing]

    • Kube-proxy uses V1 EndpointSlices. (#103306, @swetharepakula)

    • Kubeadm: Add the RootlessControlPlane kubeadm specific feature gate (Alpha in 1.22, disabled by default). It can be used to enable an experimental feature that makes the control plane component static Pod containers for kube-apiserver, kube-controller-manager, kube-scheduler and etcd run as non-root users. (#102158, @vinayakankugoyal)

    • Kubeadm: Set the seccompProfile to runtime/default in the PodSecurityContext of the control-plane components that run as static Pods. (#100234, @vinayakankugoyal)

    • Kubeadm: add a new field skipPhases to v1beta3 InitConfiguration and JoinConfiguration that can contain a list of phases to skip during “kubeadm init” and “kubeadm join”. The flag --skip-phases takes precedence over this field. (#101923, @neolit123)

    • Kubeadm: add the --dry-run flag to the control-plane phase of “kubeadm init”. (#102722, @vinayakankugoyal)

    • Kubeadm: add the imagePullPolicy field in the nodeRegistration section of InitConfiguration and JoinConfiguration in v1beta3. This allows the user to specify the image pull policy during “kubeadm init” and “kubeadm join”. The value of this field must be one of Always, IfNotPresent or Never. The default behavior continues to be IfNotPresent. (#102901, @wangyysde)

    • Kubeadm: during “kubeadm init/join/upgrade”, always default the cgroupDriver value in the KubeletConfiguration to systemd, unless the user was explicit about the value. See configure-cgroup-driver for more details. (#102133, @pacoxu)

    • Kubeadm: update CoreDNS to 1.8.4. Grant CoreDNS permissions to “list” and “watch” EndpointSlice objects to accommodate dual-stack support. (#102466, @pacoxu)

    • Kubectl: add LAST RESTART column to kubectl get pods output. (#100142, @Ethyling)

    • Kubemark’s hollow-node will now print flags before starting. (#101181, @mm4tt)

    • Kubernetes is now built with Golang 1.16.3 (#101206, @justaugustus) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kubernetes is now built with Golang 1.16.4 (#101809, @justaugustus) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Kubernetes is now built with Golang 1.16.5. (#102689, @cpanato)

    • Kubernetes is now built with Golang 1.16.6 (#103669, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]

    • Leader Migration for controller managers graduated to beta. (#103533, @jiahuif) [SIG API Machinery and Cloud Provider]

    • Make kubectl command headers default for beta. (#103238, @seans3) [SIG CLI]

    • Mark net.ipv4.ip_unprivileged_port_start as safe sysctl. (#103326, @pacoxu)

    • Metrics server nanny now has its poll period set to 30s (previously 5 minutes) to allow faster scaling of metrics server. (#101869, @olagacek) [SIG Cloud Provider and Instrumentation]

    • NetworkPolicy validation framework support for windows. (#98077, @jayunit100)

    • New feature gate ExpandedDNSConfig is now available. This feature allows Kubernetes to have expanded DNS configuration. (#100651, @gjkim42)

    • New metrics: apiserver_kube_aggregator_x509_missing_san_total and apiserver_webhooks_x509_missing_san_total. These metrics measure the number of connections to webhooks/aggregated API servers that use certificates without Subject Alternative Names. A non-zero value is a warning sign that these connections will stop functioning in the future since Golang is going to deprecate x509 certificate subject Common Names for server hostname verification. (#95396, @stlaz) [SIG API Machinery, Auth and Instrumentation]

    • Node Problem Detector is now available for GCE Windows nodes. (#101539, @jeremyje) [SIG Cloud Provider, Node and Windows]

    • Promote Cronjobs storage version to batch/v1. (#102363, @mengjiao-liu)

    • Promote CronJobControllerV2 flag to GA, with removal in 1.23. (#102529, @soltysh)

    • Promote EndpointSliceTerminatingCondition to Beta. This enables the terminating and serving conditions for EndpointSlice by default. (#103596, @andrewsykim)

    • Run etcd as non-root on GCE provider (#100635, @cindy52)

    • Scheduler now provides an option for plugin developers to move Pods to activeQ. (#103383, @Huang-Wei)

    • Secret values are now masked by default in kubectl diff output. (#96084, @loozhengyuan)

    • Services with externalTrafficPolicy: Local now support graceful termination when using the iptables or ipvs mode of kube-proxy with EndpointSlices enabled. Specifically, if a connection for such a service arrives on a node when there are no “Ready” endpoints for the service, but there is at least one Terminating pod for that service on the node, then kube-proxy will send the traffic to the Terminating pod rather than dropping it. This patches up a race condition between when a pod is killed and when the external load balancer notices that it has been killed. (#97238, @andrewsykim)

    • Shell completion has been migrated to Cobra’s go solution. kubectl is now smarter about disabling file completion when it does not apply. Furthermore, completion for the cp command does not show all files unless the user has started typing something. (#96087, @marckhouzam) [SIG CLI]

    • Some of the in-tree storage drivers indicate support for the MetricsProvider interface, but fail to configure this for BlockMode volumes. With a recent change, Kubelet will call GetMetrics() for BlockMode volumes, and the in-tree drivers that lack this support cause a Go panic. Now the in-tree storage drivers that support BlockMode volumes will return the Capacity of the volume in the GetMetrics() call. (#101587, @nixpanic)

    • Support FakeClientset match subresource. (#100939, @wzshiming)

    • “Leader Migration” now supports a wildcard component name and the default value. (#102711, @jiahuif)

    • When the CSI driver supports the NodeServiceCapability VOLUME_MOUNT_GROUP and the DelegateFSGroupToCSIDriver feature gate is enabled, kubelet will delegate applying FSGroup to the driver by passing it to NodeStageVolume and NodePublishVolume, regardless of what other FSGroup policies are set. This is an alpha feature. (#103244, @verult)

    • The Memory Manager feature graduates to Beta and it is enabled by default. (#101947, @cynepco3hahue)

    • The BoundServiceAccountTokenVolume graduates to GA and thus will be unconditionally enabled. The feature gate is going to be removed in 1.23. (#101992, @zshihang)

    • The EmptyDir memory backed volumes are sized as the minimum of pod allocatable memory on a host and an optional explicit user provided value. (#101048, @dims)

    • The HugePageStorageMediumSize feature graduates to GA and is unconditionally enabled, allowing unconditional usage of multiple sizes of huge page resources at the container level. (#99144, @bart0sh)

    • The IngressClassNamespacedParams feature gate has graduated to beta and is enabled by default. This means the IngressClass resource will now have two new fields: spec.parameters.namespace and spec.parameters.scope. (#101711, @hbagdi)

    • The LogarithmicScaleDown feature graduates to Beta and is enabled by default. (#101767, @damemi)

    • The NamespaceDefaultLabelName is promoted to GA in this release. All Namespace API objects have a kubernetes.io/metadata.name label matching their metadata.name field to allow selecting any namespace by its name using a label selector. (#101342, @rosenhouse)

    • The ServiceInternalTrafficPolicy feature graduates to Beta and is enabled by default, which enables the internalTrafficPolicy field of Service by default. (#103462, @andrewsykim)

    • The ServiceLBNodePortControl graduates to Beta and is enabled by default. (#100412, @hanlins)

    • The SetHostnameAsFQDN feature graduates to GA and thus will be unconditionally enabled. (#101294, @javidiaz)

    • The WarningHeader feature is now GA and is unconditionally enabled. The apiserver_requested_deprecated_apis metric has graduated to stable status. The WarningHeader feature-gate is no longer operative and will be removed in v1.24. (#100754, @liggitt) [SIG API Machinery, Instrumentation and Testing]

    • The kubectl debug is able to create ephemeral containers in pre-1.22 clusters with the EphemeralContainers feature enabled. Note that versions of kubectl prior to 1.22 are unable to create ephemeral containers in clusters version 1.22 and greater due to an API change. (#103292, @verb)

    • The client-go credential plugins are now GA and are enabled by default. (#102890, @ankeesler)

    • The feature gate SSA graduated to GA in v1.22 and therefore is unconditionally enabled. (#100139, @Jefftree)

    • The job controller removes running pods when the number of completions is achieved. (#99963, @alculquicondor)

    • The kubeconfig is now exposed in the kube-scheduler framework handle. Out-of-tree plugins can leverage that to build CRD informers easily. (#100644, @Huang-Wei)

    • The new flag --chunk-size=SIZE for kubectl drain has been promoted to beta, and enabled by default. This flag may be used to alter the number of items or disable this feature when 0 is passed. (#100148, @KnVerey)

    • The new flag --chunk-size=SIZE has been added to kubectl describe. This flag may be used to alter the number of items or disable this feature when 0 is passed. (#101171, @KnVerey)

    • The pod resource API will provide memory manager metrics in the case when the memory manager feature gate is enabled, and the memory manager policy is static. (#101030, @cynepco3hahue)

    • The prefer nominated node feature graduates to Beta and is enabled by default. (#102201, @chendave)

    • Update etcd version to 3.5.0-beta.3. (#102062, @serathius)

    • Update the Debian images to pick up CVE fixes in the base images:

      • Update the debian-base image to v1.7.0
      • Update the debian-iptables image to v1.6.1 (#102302, @xmudrii)
    • Update the setcap image to buster-v2.0.1. (#102377, @xmudrii)

    • Update the system-validators library to v1.5.0. Includes validation for seccomp and fixes a stdout/stderr problem in the Docker validator. (#103390, @ironyman)

    • Updates the following images to pick up CVE fixes:

      • debian to v1.8.0
      • debian-iptables to v1.6.5
      • setcap to v2.0.3 (#103235, @thejoycekung) [SIG API Machinery, Release and Testing]
    • Warnings for the use of deprecated and known-bad values in pod specs are now sent. (#101688, @liggitt)

    • Watch requests are now throttled by the priority and fairness filter in kube-apiserver. (#102171, @wojtek-t)

    • You can use this Builder function to create events Field Selector (#101817, @cndoit18) [SIG API Machinery and Scalability]

    • Scheduler now registers event handlers dynamically. (#101394, @Huang-Wei)

    • kubectl: Enable using protocol buffers to request Metrics API. (#102039, @serathius)
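
    The graceful-termination entry above for Services with externalTrafficPolicy: Local describes a fallback rule for choosing endpoints. A minimal Python sketch of that rule follows, under stated assumptions: the Endpoint type, its fields, and select_local_endpoints are hypothetical names used for illustration only and do not reflect kube-proxy's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Endpoint:
    """Hypothetical, simplified view of one Service endpoint."""
    ip: str
    is_local: bool     # the backing pod runs on this node
    ready: bool        # the endpoint is Ready
    terminating: bool  # the backing pod is being terminated


def select_local_endpoints(endpoints: List[Endpoint]) -> List[Endpoint]:
    """Choose targets for a Service with externalTrafficPolicy: Local.

    Ready local endpoints are preferred. If there are none, traffic goes to
    local Terminating endpoints rather than being dropped.
    """
    local = [ep for ep in endpoints if ep.is_local]
    ready = [ep for ep in local if ep.ready and not ep.terminating]
    if ready:
        return ready
    # Fallback: no Ready endpoint on this node, use Terminating ones if any.
    return [ep for ep in local if ep.terminating]
```

    Preferring Ready endpoints and only falling back to Terminating ones leaves normal traffic unchanged while covering the window before the external load balancer notices that the pod has been killed.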

    Documentation

    • The command kubectl debug will now print a warning message when using the --target option since many container runtimes do not support this yet. (#101074, @verb)

    Failing Test

    • Fixes hostpath storage e2e tests within SELinux enabled env (#105786, @Elbehery) [SIG Testing]
    • Fixed generic ephemeral volumes with OwnerReferencesPermissionEnforcement admission plugin enabled. (#101186, @jsafrane)
    • Fixes kubectl drain --dry-run=server. (#100206, @KnVerey)
    • Fixes an overly restrictive conformance test to accept service account tokens signed by an ECDSA key (#100680, @smira) [SIG Architecture, Auth and Testing]
    • Fixes the should receive events on concurrent watches in same order conformance test to work properly on clusters that auto-create additional configmaps in namespaces. (#101950, @liggitt)
    • Resolves an issue with the “ServiceAccountIssuerDiscovery should support OIDC discovery” conformance test failing on clusters which are configured with issuers outside the cluster (#101589, @mtaufen) [SIG Auth and Testing]

    Other (Cleanup or Flake)

    • Updates konnectivity-network-proxy to v0.0.27. This includes a memory leak fix for the network proxy (#107187, @rata) [SIG API Machinery, Auth and Cloud Provider]

    Bug or Regression

    • An inefficient lock in EndpointSlice controller metrics cache has been reworked. Network programming latency may be significantly reduced in certain scenarios, especially in clusters with a large number of Services. (#107168, @robscott) [SIG Apps and Network]

    • Client-go: fix that paged list calls with ResourceVersionMatch set would fail once paging kicked in. (#107335, @fasaxc) [SIG API Machinery]

    • Fix a panic when using invalid output format in kubectl create secret command (#107346, @rikatz) [SIG CLI]

    • Fix: azuredisk parameter lowercase translation issue (#107429, @andyzhangx) [SIG Cloud Provider and Storage]

    • Fixes a rare race condition handling requests that timeout (#107459, @liggitt) [SIG API Machinery]

    • Mount-utils: Detect potential stale file handle (#107039, @andyzhangx) [SIG Storage]

    • A pod that the Kubelet rejects was still considered as being accepted for a brief period of time after rejection, which might cause some pods to be rejected briefly that could fit on the node. A pod that is still terminating (but has status indicating it has failed) may also still be consuming resources and so should also be considered. (#104918, @ehashman) [SIG Node]

    • Fix: skip instance not found when decoupling vmss from lb (#105836, @nilo19) [SIG Cloud Provider]

    • Kubeadm: allow the “certs check-expiration” command to not require the existence of the cluster CA key (ca.key file) when checking the expiration of managed certificates in kubeconfig files. (#106930, @neolit123) [SIG Cluster Lifecycle]

    • Kubeadm: during execution of the “check expiration” command, treat the etcd CA as external if there is a missing etcd CA key file (etcd/ca.key) and perform the proper validation on certificates signed by the etcd CA. Additionally, make sure that the CA for all entries in the output table is included - for both certificates on disk and in kubeconfig files. (#106925, @neolit123) [SIG Cluster Lifecycle]

    • Respect grace period when updating static pods. (#106394, @gjkim42) [SIG Node and Testing]

    • Reverts graceful node shutdown to match 1.21 behavior of setting pods that have not yet successfully completed to “Failed” phase if the GracefulNodeShutdown feature is enabled in kubelet. The GracefulNodeShutdown feature is beta and must be explicitly configured via kubelet config to be enabled in 1.21+. This changes 1.22 and 1.23 behavior on node shutdown to match 1.21. If you do not want pods to be marked terminated on node shutdown in 1.22 and 1.23, disable the GracefulNodeShutdown feature. (#106899, @bobbypage) [SIG Node]

    • Scheduler’s assumed pods have 2min instead of 30s to receive nodeName pod updates (#106633, @ahg-g) [SIG Scheduling]

    • EndpointSlice Mirroring controller now cleans up managed EndpointSlices when a Service selector is added (#106132, @robscott) [SIG Apps, Network and Testing]

    • Fix a bug that --disabled-metrics doesn’t function well. (#105793, @Huang-Wei) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Fix a panic in kubectl when creating secrets with an improper output type (#106356, @lauchokyip) [SIG CLI]

    • Fix concurrent map access causing panics when logging timed-out API calls. (#106112, @marseel) [SIG API Machinery]

    • Fix kube-proxy regression on UDP services because the logic to detect stale connections was not considering if the endpoint was ready. (#106239, @aojea) [SIG Network and Testing]

    • Fix scoring for NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. (#106081, @ahmad-diaa) [SIG Scheduling]

    • Support more than 100 disk mounts on Windows (#105673, @andyzhangx) [SIG Storage and Windows]

    • The --leader-elect* CLI args are now honored correctly in scheduler. (#106130, @Huang-Wei) [SIG Scheduling]

    • The kube-proxy sync_proxy_rules_iptables_total metric now gives the correct number of rules, rather than being off by one.

      Fixed multiple iptables proxy regressions introduced in 1.22:

      • When using Services with SessionAffinity, client affinity for an endpoint now gets broken when that endpoint becomes non-ready (rather than continuing until the endpoint is fully deleted).

      • Traffic to a service IP now starts getting rejected (as opposed to merely dropped) as soon as there are no longer any usable endpoints, rather than waiting until all of the terminating endpoints have terminated even when those terminating endpoints were not being used.

      • Chains for endpoints that won’t be used are no longer output to iptables, saving a bit of memory/time/cpu. (#106373, @aojea) [SIG Network]

    • Watch requests that are delegated to aggregated apiservers no longer reserve concurrency units (seats) in the API Priority and Fairness dispatcher for their entire duration. (#105827, @benluddy) [SIG API Machinery]

    • Fix Job tracking with finalizers for more than 500 pods, ensuring all finalizers are removed before counting the Pod. (#104876, @alculquicondor) [SIG Apps]

    • Fix: skip case sensitivity when checking Azure NSG rules. Fix: ensure InstanceShutdownByProviderID returns false for creating Azure VMs. (#104446, @feiskyer) [SIG Cloud Provider]

    • Fixed occasional pod cgroup freeze when using cgroup v1 and systemd driver. (#104529, @kolyshkin) [SIG Node]

    • Fixes a regression that could cause panics in LRU caches in controller-manager, kubelet, kube-apiserver, or client-go EventSourceObjectSpamFilter (#104469, @liggitt) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation and Storage]

    • When using kubectl replace (or the equivalent API call) on a Service, the caller no longer needs to do a read-modify-write cycle to fetch the allocated values for .spec.clusterIP and .spec.ports[].nodePort. Instead the API server will automatically carry these forward from the original object when the new object does not specify them. (#104672, @thockin) [SIG Network]

    • Fix kube-apiserver metric reporting for the deprecated watch path of /api//watch/… (#104188, @wojtek-t) [SIG API Machinery and Instrumentation]

    • Kube-proxy: delete stale conntrack UDP entries for loadbalancer ingress IP. (#104009, @aojea) [SIG Network]

    • Pass additional flags to subpath mount to avoid flakes in certain conditions (#104346, @mauriciopoppe) [SIG Storage]

    • Added jitter factor to lease controller that better smears load on kube-apiserver over time. (#101652, @marseel) [SIG API Machinery and Scalability]

    • Added privileges for EndpointSlice to the default view & edit RBAC roles. (#101203, @mtougeron)

    • After DBus restarts, make GracefulNodeShutdown work again (#100369, @wzshiming)

    • Aggregate errors when putting vmss. (#98350, @nilo19)

    • Aggregate write permissions on events to users with edit and admin role. (#102858, @tumido)

    • Aggregated roles no longer include write access to EndpointSlices. This rolls back part of a change that was introduced earlier in the Kubernetes 1.22 cycle. (#103703, @robscott)

    • Applying fix for not deleting existing public IP when a service is deleted in Azure. (#100694, @nilo19)

    • Applying fix for not tagging static public IP. (#101752, @nilo19)

    • Applying fix so that deleting non-existing disk returns success. (#102083, @andyzhangx)

    • Applying fix: cleanup outdated routes. (#102935, @nilo19)

    • Avoid caching the Azure VMSS instances whose network profile is nil (#100948, @feiskyer) [SIG Cloud Provider]

    • Azure: Avoid setting cached Sku when updating VMSS and VMSS instances. (#102005, @feiskyer)

    • Azurefile: Normalize share name to not include the capital letters (#100731, @kassarl)

    • Chain the field manager creation calls in newDefaultFieldManager to be explicit about the order of operations. (#101076, @kevindelgado)

    • Disruption controller shouldn’t error while syncing for unmanaged pods. (#103414, @ravisantoshgudimetla) [SIG Apps and Testing]

    • Ensure service is deleted when the Azure resource group has been deleted. (#100944, @feiskyer)

    • Ensures ExecProbeTimeout=false kubelet feature gate with dockershim is taken into account, when the exec probe takes longer than timeoutSeconds configuration. (#100200, @jackfrancis)

    • Expose rest_client_rate_limiter_duration_seconds metric to component-base to track client side rate limiter latency in seconds. Broken down by verb and URL. (#100311, @IonutBajescu) [SIG API Machinery, Cluster Lifecycle and Instrumentation]

    • Fire an event when failing to open NodePort. (#100599, @masap)

    • Fix Azure node public IP fetching issues from instance metadata service when the node is part of standard load balancer backend pool. (#100690, @feiskyer) [SIG Cloud Provider]

    • Fix EndpointSlice describe panic when an Endpoint doesn’t have zone. (#101025, @tnqn)

    • Fix kubectl set env or resources not working for initcontainers. (#101669, @carlory)

    • Fix kubectl alpha debug node not working on tainted (NoExecute) nodes by tolerating everything. (#98431, @wawa0210)

    • Fix a bug on the endpointslicemirroring controller where endpoint NotReadyAddresses were mirrored as Ready to the corresponding EndpointSlice. (#102683, @aojea)

    • Fix a bug that a preemptor pod may exist as a phantom in the scheduler. (#102498, @Huang-Wei)

    • Fix a number of race conditions in the kubelet when pods are starting up or shutting down that might cause pods to take a long time to shut down. (#102344, @smarterclayton) [SIG Apps, Node, Storage and Testing]

    • Fix an issue with kubectl on certain older versions of Windows, or when legacy console mode is enabled on Windows 8, which causes kubectl exec to crash. (#102825, @n4j)

    • Fix availability set cache in vmss cache (#100110, @CecileRobertMichon) [SIG Cloud Provider]

    • Fix how nulls are handled in array and objects in json patches. (#102467, @pacoxu)

    • Fix panic when kubectl create ingress has annotation flag and an empty value set. (#101377, @rikatz)

    • Fix performance regression for update and apply operations on large CRDs. (#103318, @jpbetz) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation and Storage]

    • Fix raw block mode CSI NodePublishVolume stage missing pod info. (#99069, @phantooom)

    • Fix resource enforcement when using systemd cgroup driver (#102147, @kolyshkin)

    • Fix rounding of volume storage requests. (#100100, @maxlaverse)

    • Fix runtime container status for PostStart hook error. (#100608, @pacoxu)

    • Fix scoring for NodeResourcesMostAllocated and NodeResourcesBalancedAllocation plugins when nodes have containers with no requests. This was leading to under-utilization of small nodes. (#102925, @alculquicondor)

    • Fix code that was leaking defaulting between unrelated pod instances. (#103284, @kebe7jun) [SIG CLI]

    • Fix winkernel kube-proxy to only use dual stack when host and networking supports it (#101047, @jsturtevant) [SIG Network and Windows]

    • Fix: Azure file inline volume namespace issue in CSI migration translation (#101235, @andyzhangx)

    • Fix: Bug in kube-proxy latency metrics; the latency value is now calculated only for the Endpoints that are created after kube-proxy starts running. This is needed because all the Endpoints objects are processed on restarts, regardless of when they were created. (#100861, @aojea)

    • Fix: avoid nil-pointer panic when checking the frontend IP configuration (#101739, @nilo19) [SIG Cloud Provider]

    • Fix: display of Job completion mode in kubectl describe. (#101160, @alculquicondor)

    • Fix: return empty VMAS name if using standalone VM (#103470, @nilo19) [SIG Cloud Provider]

    • Fix: treat “host is down” as a corrupted mount. Previously, when the SMB server was down, a pod using an SMB mount could not be terminated and an error was returned. (#101398, @andyzhangx)

    • Fix: using NVMe AWS EBS volumes partitions. (#100500, @jsafrane)

    • Fixed ‘kubelet’ runtime panic for timed-out portforward streams. (#102489, @saschagrunert)

    • Fixed SELinux relabeling of CSI volumes after CSI driver failure. (#103154, @jsafrane) [SIG Node and Storage]

    • Fixed garbage collection of dangling VolumeAttachments for PersistentVolumes migrated to CSI on startup of kube-controller-manager. (#102176, @timebertt)

    • Fixed port-forward memory leak for long-running and heavily used connections. (#99839, @saschagrunert)

    • Fixed a bug due to which the controller was not populating the lastSuccessfulTime field added to cronjob.status in batch/v1. (#102642, @alaypatel07)

    • Fixed a bug that kubectl create configmap always returns zero exit code when failed. (#101780, @nak3) [SIG CLI]

    • Fixed a bug that scheduler extenders are not called on preemptions. (#103019, @ordovicia)

    • Fixed a bug where startupProbe stopped working after a container’s first restart. (#101093, @wzshiming)

    • Fixed an issue that blocked Azure auth from prompting the device code authentication flow when the refresh token expires. (#102063, @tdihp)

    • Fixed false-positive uncertain volume attachments, which led to unexpected detachment of CSI migrated volumes (#101737, @Jiawei0227) [SIG Apps and Storage]

    • Fixed mounting of NFS volumes when IPv6 address is used as a server. (#101067, @Elbehery) [SIG Storage]

    • Fixed starting new pods after previous pod timed out unmounting its volumes. (#100183, @jsafrane)

    • Fixed very rare volume corruption when a pod is deleted while kubelet is offline. (#102059, @jsafrane)

    • Fixes a data race issue in the priority and fairness API server filter. (#100638, @tkashem)

    • Fixes issue with websocket-based watches of Service objects not closing correctly on timeout. (#102539, @liggitt)

    • For kubeadm: support for custom imagetags for etcd images which contain build metadata, when imagetags are in the form of version_metadata. For instance, if the etcd version is v3.4.13+patch.0, the supported imagetag would be v3.4.13_patch.0 (#100350, @jr0d)

    • For vSphere: fix regression during attach disk if datastore is within a storage folder or datastore cluster. (#102892, @gnufied)

    • GCE Windows clusters now have their TCP/IP parameters set to GCE’s recommended values. (#103057, @jeremyje) [SIG Cloud Provider and Windows]

    • GCE Windows will no longer install Docker on containerd nodes. (#101747, @jeremyje) [SIG Cloud Provider and Windows]

    • Generated OpenAPI now correctly specifies 201 as a possible response code for PATCH operations. (#100141, @brendandburns)

    • Graceful termination will now be honored when deleting a collection of pods. (#100101, @deads2k)

    • If kube-proxy mode is userspace do not enable EndpointSlices. (#100913, @JornShen)

    • Kubeadm: allow passing the flag --log-file if --config is passed. If you wish to log to a file you must also pass --logtostderr=false or --alsologtostderr=true. Alternatively you can pipe to a file using “kubeadm … | tee …”. (#101449, @CaoDonghui123)

    • Kubeadm: enable --experimental-patches flag for kubeadm join phase control-plane-join all command. (#101110, @SataQiu)

    • Kubeadm: fix a bug where kubeadm join for control plane nodes would download certificates and keys from the cluster, but would not write publicly readable certificates and public keys with mode 0644 and instead use mode 0600. (#103313, @neolit123)

    • Kubeadm: fix the bug that kubeadm only uses the first hash in caCertHashes to verify the root CA. (#101977, @SataQiu)

    • Kubeadm: remove the “ephemeral_storage” request from the etcd static pod that kubeadm deploys on stacked etcd control plane nodes. This request has caused sporadic failures on some setups due to a problem in the kubelet with cadvisor and the LocalStorageCapacityIsolation feature gate. See this issue for more details: https://github.com/kubernetes/kubernetes/issues/99305 (#102673, @jackfrancis) [SIG Cluster Lifecycle]

    • Kubeadm: when using a custom image repository for CoreDNS kubeadm now will append the coredns image name instead of coredns/coredns, thus restoring the behaviour existing before the v1.21 release. Users who rely on nested folder for the coredns image should set the clusterConfiguration.dns.imageRepository value including the nested path name (e.g using registry.company.xyz/coredns will force kubeadm to use registry.company.xyz/coredns/coredns image). No action is needed if using the default registry (k8s.gcr.io). (#102502, @ykakarap)

    • Kubelet: improve the performance when waiting for a synchronization of the node list with the kube-apiserver. (#99336, @neolit123)

    • Kubelet: the returned value for PodIPs is the same in the Downward API and in the pod.status.PodIPs field (#103307, @aojea)

    • Limit vSphere volume name to 63 characters long. (#100404, @gnufied)

    • Logging for GCE Windows clusters will be more accurate and complete when using Fluent bit. (#101271, @jeremyje)

    • Metrics Server will use Addon Manager 1.8.3 (#103541, @jbartosik) [SIG Cloud Provider and Instrumentation]

    • Output for kubectl describe podsecuritypolicy is now kind specific and cleaner (#101436, @KnVerey)

    • Parsing of cpuset information now properly detects more invalid input such as 1--3 or 10-6. (#100565, @lack)

    • Pods that are known to the kubelet to have previously been Running should not revert to Pending state; the kubelet will now infer a termination. (#102821, @ehashman)

    • Prevent Kubelet stuck in DiskPressure when imagefs.minReclaim is set (#99095, @maxlaverse)

    • Reduces the initialization delay of the docker runtime on non-AWS platforms. (#93260, @nckturner) [SIG Cloud Provider]

    • Register/Deregister Targets in chunks for AWS TargetGroup (#101592, @M00nF1sh) [SIG Cloud Provider]

    • Removed /sbin/apparmor_parser requirement for the AppArmor host validation. This allows using AppArmor on distributions which ship the binary in a different path. (#97968, @saschagrunert) [SIG Node and Testing]

    • Renames the timeout field for the DelegatingAuthenticationOptions to TokenRequestTimeout and sets the timeout only for the token review client. Previously the timeout was also applied to watches, making them reconnect every 10 seconds. (#100959, @p0lyn0mial)

    • Reorganized iptables rules to reduce rules in KUBE-SERVICES and KUBE-NODEPORTS. (#96959, @tssurya)

    • Respect annotation size limit for server-side apply updates to the client-side apply annotation. Also, fix opt-out of this behavior by setting the client-side apply annotation to the empty string. (#102105, @julianvmodesto) [SIG API Machinery]

    • Retry FibreChannel devices cleanup after error to ensure FibreChannel device is detached before it can be used on another node. (#101862, @jsafrane)

    • Support correct sorting for cpu, memory, storage, ephemeral-storage, hugepages, and attachable-volumes. (#100435, @lauchokyip)

    • Switch scheduler to generate the merge patch on pod status instead of the full pod (#103133, @marwanad) [SIG Scheduling]

    • The EndpointSlice IP validation now matches Endpoints IP validation. (#101084, @robscott)

    • The kube-apiserver now reports the synthetic verb when logging requests, better explaining the user intent and matching what is reported in the metrics. (#102934, @lavalamp)

    • The kube-controller-manager sets the upper-bound timeout limit for outgoing requests to 70s. Previously (#99358, @p0lyn0mial)

    • The kube-proxy log now shows the “Skipping topology aware endpoint filtering since no hints were provided for zone” warning under the right conditions. (#101857, @dervoeti)

    • The kubectl create service now respects the namespace flag. (#101005, @zxh326)

    • The kubectl get now truncates multi-line strings to avoid breaking printing (#103514, @soltysh)

    • The kubectl wait --for=delete command now ignores the not found error correctly. (#96702, @lingsamuel)

    • The kubelet now distinguishes log messages about certificate rotation for its client cert and server cert, to make debugging problems with one or the other easier. (#101252, @smarterclayton)

    • The serviceOwnsFrontendIP shouldn’t report error when the public IP doesn’t match. (#102516, @nilo19)

    • The system:aggregate-to-edit role no longer includes write access to the Endpoints API. In newly created Kubernetes 1.22 clusters, the edit and admin roles will no longer include that access. This will have no effect on existing clusters upgrading to Kubernetes 1.22. To retain write access to Endpoints in the aggregated edit and admin roles for newly created 1.22 clusters, refer to https://github.com/kubernetes/website/pull/29025. (#103704, @robscott) [SIG Auth and Network]

    • The conformance tests:

      • Services should serve multiport endpoints from pods
      • Services should serve a basic endpoint from pods

      were only validating the API objects, not performing any validation on the actual Services implementation. Those tests now validate that the Services under test are able to forward traffic to the endpoints. (#101709, @aojea) [SIG Network and Testing]
    • The behavior of Services with IPFamilyPolicy set to PreferDualstack changes. The current behavior when the cluster is upgraded to dual-stack is:

      • Services that have been set to IPFamilyPolicy = PreferDualstack will be upgraded when the service object is updated, e.g., when a user changes a label.

      This behavior will change to:

      • Services that have been set to IPFamilyPolicy = PreferDualstack will not be upgraded when the service object is updated. Users can still change policy, type, etc., and existing behaviors remain the same. (#102898, @khenidak) [SIG Network and Testing]
    • The reason and message fields for pod status are no longer reset unless the phase also changes. (#103785, @smarterclayton) [SIG Node]

    • Treat VSphere “File (vmdk path here) was not found” errors as success during volume deletion (#92372, @breunigs) [SIG Cloud Provider and Storage]

    • Update kube-proxy base image debian-iptables to v1.6.2 to pick up the following change: debian-iptables selects nft mode if nft lines > legacy lines, matching iptables-wrappers. (#102590, @BenTheElder)

    • Update klog v2.9.0. (#102332, @pacoxu)

    • Updated the Graceful Node Shutdown Pod termination reason and message. Updated the Graceful Node Shutdown Pod rejection reason and message. (#102840, @Kissy)

    • Updates dependency sigs.k8s.io/structured-merge-diff to v4.1.1. (#100784, @kevindelgado)

    • Updates hostprocess tests to specify user. (#102965, @jsturtevant)

    • Upgrades functionality of kubectl kustomize as described at https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv4.2.0 (#103419, @natasha41575) [SIG CLI]

    • Upgrades functionality of kubectl kustomize as described at kustomize/v4.1.2 (#101120, @monopole)

    • Upgrading etcd: kubeadm upgrade etcd to 3.4.13-3 (#100612, @pacoxu)

    • Use default timeout of 10s for Azure ACR credential provider. (#100686, @hasheddan) [SIG Cloud Provider]

    • We no longer allow the cluster operator to delete any suggested priority & fairness bootstrap configuration object. If a cluster operator removes a suggested configuration, it will be restored by the apiserver. (#102067, @tkashem)

    • When DisableAcceleratorUsageMetrics is set, do not collect accelerator metrics using cAdvisor. (#101712, @SergeyKanzhelev) [SIG Instrumentation and Node]

    • YAML document separators ("---") can now be followed by whitespace and comments ("# ...") on the same line. This fixes a bug where documents starting with a comment after the separator were ignored. Other types of content on the same line will result in an error. (#103457, @codearky) [SIG API Machinery]

    • oc describe quota now shows "used" in the same unit format as "hard". (#102177, @atiratree) [SIG CLI]
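
    As an illustration of the YAML separator change listed above, the following multi-document file is a minimal sketch of input that is now parsed in full. The ConfigMap names and data are hypothetical and only serve to make the two documents distinguishable.

    ```yaml
    --- # a comment on the separator line; the document below is no longer ignored
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: example-one   # hypothetical name
    data:
      key: value
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: example-two   # hypothetical name
    data:
      key: value
    ```

    Per the release note, anything other than whitespace or a comment after the separator on the same line is rejected with an error.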

    Other (Cleanup or Flake)

    • Kube-apiserver: sets an upper-bound on the lifetime of idle keep-alive connections and time to read the headers of incoming requests (#103958, @liggitt) [SIG API Machinery and Node]
    • After the deprecation period, the Kubelet’s --chaos-chance flag has been removed. (#101057, @wangyysde) [SIG Node]
    • Allow CSI drivers to just run offline expansion tests. (#102665, @gnufied)
    • Changed buildmode of non static Kubernetes binaries to produce position independent executables (PIE). (#102323, @saschagrunert)
    • Clarified the description of a test in the e2e suite that mentions “SCTP” but is actually intended to be testing the behavior of network plugins that don’t implement SCTP. (#102509, @danwinship)
    • Client-go: reduce verbosity of Starting/Stopping reflector messages to 3 again. (#102788, @pohly)
    • Disable log sampling when using json logging format. (#102620, @serathius)
    • Exposes WithCustomRoundTripper method for specifying a middleware function for custom HTTP behaviour for the delegated auth clients. (#99775, @p0lyn0mial)
    • Fake clients now implement a FakeClient interface (#100940, @markusthoemmes) [SIG API Machinery and Instrumentation]
    • Feature gate ServiceLoadBalancerClass graduates to Beta and is enabled by default. (#103129, @XudongLiuHarold)
    • Improve func ToSelectableFields’ performance for event. (#102461, @goodluckbot)
    • Increased CSINodeIDMaxLength from 128 bytes to 192 bytes. Prepare to increase the length limit to 256 bytes in 1.23 release. (#101256, @Jiawei0227)
    • JSON logging now supports having information about source code location in the logging format, source code information is available under the key “caller”. (#102437, @MadhavJivrajani)
    • Kubeadm: move the BootstrapToken* API and related utilities from v1beta3 to a separate API group/version - bootstraptoken/v1. (#102964, @neolit123) [SIG Cluster Lifecycle]
    • Kubeadm: the CriticalAddonsOnly toleration has been removed from kube-proxy DaemonSet (#101966, @SataQiu) [SIG Cluster Lifecycle]
    • Metrics Server updated to use 0.4.4 image that doesn’t depend on deprecated authorization.k8s.io/v1beta1 subjectaccessreviews API version. (#101477, @x13n)
    • Migrate proxy/ipvs/proxier.go logs to structured logging. (#97796, @JornShen)
    • Migrate staging/src/k8s.io/apiserver/pkg/registry logs to structured logging. (#98287, @lala123912)
    • Migrate some log messages to structured logging in pkg/volume/plugins.go. (#101510, @huchengze)
    • Migrate some log messages to structured logging in pkg/volume/volume_linux.go. (#99566, @huchengze)
    • Official binaries now include the golang generated build ID buildid instead of an empty string. (#101411, @saschagrunert)
    • Remove balanced attached node volumes feature. (#102443, @ravisantoshgudimetla)
    • Remove deprecated --generator flag from kubectl autoscale. (#99900, @MadhavJivrajani)
    • Remove the deprecated flag --generator from kubectl create deployment command. (#99915, @BLasan)
    • Remove the duplicate package import. (#101187, @chuntaochen)
    • Replace go-bindata with //go:embed. (#99829, @palnabarun)
    • The DynamicFakeClient now exposes its tracker via a Tracker() function. (#100085, @markusthoemmes)
    • The VolumeSnapshotDataSource feature gate that is GA since v1.20 is unconditionally enabled, and can no longer be specified via the --feature-gates argument. (#101531, @ialidzhikov) [SIG Storage]
    • The deprecated CRIContainerLogRotation feature-gate has been removed, since the CRIContainerLogRotation feature graduated to GA in 1.21 and was unconditionally enabled. (#101578, @carlory)
    • The deprecated RootCAConfigMap feature-gate has been removed, since the RootCAConfigMap feature graduated to GA in 1.21 and is unconditionally enabled. (#101579, @carlory)
    • The deprecated runAsGroup feature-gate has been removed, since the runAsGroup feature graduated to GA in 1.21. (#101581, @carlory)
    • The etcd client has been updated to 3.5.0; github.com/golang/protobuf, google.golang.org/protobuf, and google.golang.org/grpc have been updated to current versions. (#100488, @liggitt)
    • Update Azure Go SDK to v55.0.0. (#102441, @feiskyer)
    • Update Azure Go SDK version to v53.1.0 (#101357, @feiskyer) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle and Instrumentation]
    • Update CNI plugins to v0.9.1. (#102328, @lentzi90)
    • Update Calico to v3.19.1. (#102386, @JornShen)
    • Update cri-tools dependency to v1.21.0. (#100956, @saschagrunert)
    • Update dependencies google/gnostic and google/go-cmp to v0.5.5 and update the transitive dependency protobuf. (#102783, @mcbenjemaa)
    • Update golang.org/x/net to v0.0.0-20210520170846-37e1c6afe023 (#103176, @CaoDonghui123) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Node and Storage]
    • Updated command descriptions and examples for grammar and punctuation consistency. (#103524, @bergerhoffer) [SIG Auth and CLI]
    • Updated pause image to version 3.5, which now runs by default as pseudo user and group 65535:65535. This does not have any effect on remote container runtimes like CRI-O and containerd, which set up the pod sandbox user and group on their own. (#100292, @saschagrunert)
    • Upgrade functionality of kubectl kustomize as described at kustomize/v4.1.3. (#102193, @gautierdelorme)

    Dependencies

    Added

    • github.com/antihax/optional: v1.0.0
    • github.com/benbjohnson/clock: v1.0.3
    • github.com/bits-and-blooms/bitset: v1.2.0
    • github.com/certifi/gocertifi: 2c3bb06
    • github.com/checkpoint-restore/go-criu/v5: v5.0.0
    • github.com/cncf/udpa/go: 5459f2c
    • github.com/cockroachdb/errors: v1.2.4
    • github.com/cockroachdb/logtags: eb05cc2
    • github.com/coredns/caddy: v1.1.0
    • github.com/felixge/httpsnoop: v1.0.1
    • github.com/frankban/quicktest: v1.11.3
    • github.com/getsentry/raven-go: v0.2.0
    • github.com/go-kit/log: v0.1.0
    • github.com/gofrs/uuid: v4.0.0+incompatible
    • github.com/josharian/intern: v1.0.0
    • github.com/jpillora/backoff: v1.0.0
    • github.com/nxadm/tail: v1.4.4
    • github.com/opentracing/opentracing-go: v1.1.0
    • github.com/robfig/cron/v3: v3.0.1
    • github.com/stoewer/go-strcase: v1.2.0
    • go.etcd.io/etcd/api/v3: v3.5.0
    • go.etcd.io/etcd/client/pkg/v3: v3.5.0
    • go.etcd.io/etcd/client/v2: v2.305.0
    • go.etcd.io/etcd/client/v3: v3.5.0
    • go.etcd.io/etcd/pkg/v3: v3.5.0
    • go.etcd.io/etcd/raft/v3: v3.5.0
    • go.etcd.io/etcd/server/v3: v3.5.0
    • go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc: v0.20.0
    • go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp: v0.20.0
    • go.opentelemetry.io/contrib: v0.20.0
    • go.opentelemetry.io/otel/exporters/otlp: v0.20.0
    • go.opentelemetry.io/otel/metric: v0.20.0
    • go.opentelemetry.io/otel/oteltest: v0.20.0
    • go.opentelemetry.io/otel/sdk/export/metric: v0.20.0
    • go.opentelemetry.io/otel/sdk/metric: v0.20.0
    • go.opentelemetry.io/otel/sdk: v0.20.0
    • go.opentelemetry.io/otel/trace: v0.20.0
    • go.opentelemetry.io/otel: v0.20.0
    • go.opentelemetry.io/proto/otlp: v0.7.0
    • go.uber.org/goleak: v1.1.10

    Changed

    Removed

    • github.com/agnivade/levenshtein: v1.0.1
    • github.com/alecthomas/template: fb15b89
    • github.com/andreyvit/diff: c7f18ee
    • github.com/bifurcation/mint: 93c51c6
    • github.com/caddyserver/caddy: v1.0.3
    • github.com/cenkalti/backoff: v2.1.1+incompatible
    • github.com/checkpoint-restore/go-criu/v4: v4.1.0
    • github.com/cheekybits/genny: 9127e81
    • github.com/go-acme/lego: v2.5.0+incompatible
    • github.com/go-bindata/go-bindata: v3.1.1+incompatible
    • github.com/go-openapi/analysis: v0.19.5
    • github.com/go-openapi/errors: v0.19.2
    • github.com/go-openapi/loads: v0.19.4
    • github.com/go-openapi/runtime: v0.19.4
    • github.com/go-openapi/spec: v0.19.5
    • github.com/go-openapi/strfmt: v0.19.5
    • github.com/go-openapi/validate: v0.19.8
    • github.com/gobuffalo/here: v0.6.0
    • github.com/hpcloud/tail: v1.0.0
    • github.com/jimstudt/http-authentication: 3eca13d
    • github.com/klauspost/cpuid: v1.2.0
    • github.com/kr/logfmt: b84e30a
    • github.com/kylelemons/godebug: d65d576
    • github.com/lucas-clemente/aes12: cd47fb3
    • github.com/lucas-clemente/quic-clients: v0.1.0
    • github.com/lucas-clemente/quic-go-certificates: d2f8652
    • github.com/lucas-clemente/quic-go: v0.10.2
    • github.com/markbates/pkger: v0.17.1
    • github.com/marten-seemann/qtls: v0.2.3
    • github.com/mholt/certmagic: 6a42ef9
    • github.com/naoina/go-stringutil: v0.1.0
    • github.com/naoina/toml: v0.1.1
    • github.com/robfig/cron: v1.1.0
    • github.com/satori/go.uuid: v1.2.0
    • github.com/thecodeteam/goscaleio: v0.1.0
    • github.com/tidwall/pretty: v1.0.0
    • github.com/vektah/gqlparser: v1.1.2
    • github.com/willf/bitset: v1.1.11
    • go.etcd.io/etcd: dd1b699
    • go.mongodb.org/mongo-driver: v1.1.2
    • gopkg.in/cheggaaa/pb.v1: v1.0.25
    • gopkg.in/fsnotify.v1: v1.4.7
    • gopkg.in/mcuadros/go-syslog.v2: v2.2.1
    • gopkg.in/resty.v1: v1.12.0
    • k8s.io/heapster: v1.2.0-beta.1

    containerlinux 3033.2.2

    Breaking changes

    • CGroupsV2 are enabled by default. Applications might need to be updated if they don’t have support. There are several known issues:
      • Java applications must use JRE >= 15; see the OpenJDK upstream issue for more details.

    Security fixes

    Bug fixes

    • SDK: Fixed build error popping up in the new SDK Container because policycoreutils used the wrong ROOT to update the SELinux store (flatcar-linux/coreos-overlay#1502)
    • Fixed leak of SELinux policy store to the root filesystem top directory due to wrong store path in policycoreutils instead of /var/lib/selinux (flatcar-linux/Flatcar#596)
    • Ensured that the /run/xtables.lock coordination file exists for modifications of the xtables backend from containers (must be bind-mounted) or the iptables-legacy binaries on the host (flatcar-linux/init#57)
    • dev container: Fix github URL for coreos-overlay and portage-stable to use repos from flatcar-linux org directly instead of relying on redirects from the kinvolk org. This fixes checkouts with emerge-gitclone inside dev-container. (flatcar-linux/scripts#194)
    • arm64: the Polkit service does not crash anymore. (flatcar-linux/Flatcar#156)
    • toolbox: fixed support for multi-layered docker images (toolbox#5)
    • Run emergency.target on ignition/torcx service unit failure in dracut (bootengine#28)
    • Fix vim warnings on missing file, when built with USE="minimal" (portage-stable#260)
    • The Torcx profile docker-1.12-no was fixed to reference the current Docker version instead of 19.03, which wasn’t found on the image, causing Torcx to fail to provide Docker (PR#1456)
    • Use https protocol instead of git for Github URLs (flatcar-linux/coreos-overlay#1394)

    Changes

    • Backported elf support for iproute2 (flatcar-linux/coreos-overlay#1256)
    • Added GPIO support (coreos-overlay#1236)
    • Enabled SELinux in permissive mode on ARM64 (coreos-overlay#1245)
    • The iptables command now uses the nftables kernel backend instead of the iptables backend; you can also migrate to the nft tool instead of iptables. Containers whose iptables binaries use the iptables backend will mix both kernel backends. This is supported, but you have to look up the rules separately (on the host you can use iptables-legacy and friends).
    • Added missing SELinux rule as initial step to resolve Torcx unpacking issue (coreos-overlay#1426)

    Updates

    calico 3.21.3

    BGP Improvements

    Users of BGP can now view the status of their BGP routers, including session status, RIB / FIB contents, and agent health, via the new CalicoNodeStatus API. See the API documentation for more details.

    In addition, you can control BGP advertisement of certain prefixes using the new disableBGPExport option on each IP pool, allowing greater control of your route sharing scheme.

    Pull requests:

    • Added Calico node status resource (CalicoNodeStatus) which represents a collection of status information for a node that Calico reports back to the user for use during troubleshooting. libcalico-go #1502 (@song-jiang)
    • Report node BGP status from calico/node. node #1234 (@song-jiang)
    • Add new syncer for BGP status API. typha #662 (@song-jiang)
    • Don’t export BGP routes for IP pools that have disableBGPExport==true confd #647 (@coutinhop)
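
    As a minimal sketch of the two features above, the manifests below request a status report for one node and disable BGP export for one IP pool. The resource names, the node name, the CIDR, and the update period are illustrative assumptions, not values taken from this release.

    ```yaml
    # Ask Calico to publish agent, BGP, and route status for a single node.
    apiVersion: projectcalico.org/v3
    kind: CalicoNodeStatus
    metadata:
      name: example-node-status      # hypothetical name
    spec:
      node: example-node-1           # hypothetical node name
      classes:
        - Agent
        - BGP
        - Routes
      updatePeriodSeconds: 10        # illustrative refresh interval
    ---
    # Keep routes for this pool out of BGP advertisements.
    apiVersion: projectcalico.org/v3
    kind: IPPool
    metadata:
      name: example-pool             # hypothetical name
    spec:
      cidr: 10.1.0.0/16              # illustrative CIDR
      disableBGPExport: true
    ```

    Once the CalicoNodeStatus resource exists, the reported information appears in its status field and can be read back for troubleshooting.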

    Service-based network policy improvements

    In v3.20, we introduced egress policy rules that can match on Kubernetes services. In v3.21, we improved upon that in two ways. First, you can now use service matches in Calico NetworkPolicy and GlobalNetworkPolicy ingress rules. Second, you can now use service-based network policy rules on Windows nodes.

    Pull requests:

    • Policy ingress rules now support service selectors. felix #3024 (@mgleung)
    • Windows data plane support for Service-based network policy rules felix #2917 (@caseydavenport)
    • Allow services to be specified in the Source field of Ingress rules libcalico-go #1517 (@mgleung)
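
    A service match in an ingress rule might look like the following sketch. The policy name, namespaces, service name, and pod selector are hypothetical; only the use of services under the rule's source field reflects the change described above.

    ```yaml
    apiVersion: projectcalico.org/v3
    kind: NetworkPolicy
    metadata:
      name: allow-from-frontend-service   # hypothetical name
      namespace: backend                  # hypothetical namespace
    spec:
      selector: app == 'db'               # hypothetical pod selector
      ingress:
        - action: Allow
          source:
            services:
              name: frontend-service      # hypothetical Kubernetes Service
              namespace: frontend
    ```

    The rule allows traffic from the endpoints backing the named Service, so the policy does not need to duplicate that Service's pod selector.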

    Option to run Calico as non-privileged and non-root

    Calico can now optionally run in non-privileged and non-root mode, with some limitations. See the documentation for more information.

    Pull requests:

    • Change node and supporting binary permissions so that they can be run as a non-root user node #1224 (@mgleung)
    • CNI plugin now sets route_localnet=1 for container interfaces cni-plugin #1168 (@mgleung)
    • CNI plugins now have SUID bit set in order to run as non-root cni-plugin #1168 (@mgleung)

    IPReservations API

    You can use the new IPReservations API to reserve certain IP addresses so that they will not be used by Calico IPAM. This allows for fine-grained control of the IP space in your cluster.

    Pull requests:

    • Add support for IPReservations libcalico-go #1509 (@fasaxc)
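
    A reservation could be declared roughly as follows. The resource name and the addresses are illustrative assumptions; entries may be single addresses or CIDR ranges.

    ```yaml
    apiVersion: projectcalico.org/v3
    kind: IPReservation
    metadata:
      name: example-reservation   # hypothetical name
    spec:
      reservedCIDRs:
        - 192.168.2.3             # a single address
        - 10.0.2.3/32             # the same idea in CIDR form
        - cafe:f00d::/123         # an IPv6 range
    ```

    Calico IPAM skips the listed addresses when assigning new IPs.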

    Bug fixes

    • Fix a serious regression introduced in v3.21.0 where the datastore watcher could get stuck and report stale information in clusters with >500 policies/pods/etc. The bug was triggered by needing to do a resync (for example after an etcd compaction) when there were enough resources to trigger the list pager. calico #5332 (@robbrockbank)
    • Pass ExceptUpgradeService param to stop-calico.ps1 as well node #1372 (@lmm)
    • Restrict Typha server to FIPS compliant cipher suites. typha #696 (@caseydavenport)
    • Fix log spam from Calico upgrade service for Windows node #1343 (@song-jiang)
    • Increase timeout for setting NetworkUnavailable on shutdown node #1341 (@caseydavenport)
    • Fix potential panic and memory leak in kube-controllers caused by adding and subsequently deleting IPAM blocks kube-controllers #912 (@caseydavenport)
    • IPAM GC correctly handles multiple IP addresses allocated with the same handle ID. kube-controllers #903 (@caseydavenport)
    • Fix bug where invalid port structures were being sent to Felix, preventing pods with hostPorts specified from working. libcalico-go #1545 (@caseydavenport)
    • Downgrade repetitive info level logging in calico/node autodetection code node #1237 (@caseydavenport)
    • Updated ubi base images and CentOS repos to stop CVE false positives from being reported. node #1136 (@coutinhop)
    • Fixed typo in umount command pod2daemon #64 (@ScheererJ)
    • Fixes a bug that caused WireGuard stats to be collected even when WireGuard was disabled. Additionally, the version of the wgctrl dependency has been updated as the previous version caused thread leaks. felix #3057 (@mikestephen)
    • Fix blackhole route table interface matches to handle empty interface regexes. felix #3007 (@robbrockbank)
    • Fix slow performance when updating a Kubernetes namespace when there are many Pods (and in turn, slow startup performance when there are many namespaces). felix #2964 (@fasaxc)
    • Close race condition that could result in an extra IPAM block being allocated to a node. libcalico-go #1488 (@caseydavenport)
    • Fix that podIP annotation could be incorrectly clobbered for stateful set pods: https://github.com/projectcalico/calico/issues/4710 libcalico-go #1472 (@fasaxc)
    • Fix removal of old CNI configuration on name-change cni-plugin #1153 (@caseydavenport)
    • Readiness depends on all syncers typha #613 (@robbrockbank)
    • Exclude RR nodes from BGP full mesh confd #619 (@coutinhop)
    • Fixed a bug in ExternalTrafficPolicy=Local that led to connections stalling. felix #3015 (@tomastigera)
    • Fixed broken connections when a client used the same port to connect to the same backend via a nodeport on different nodes. felix #2983 (@tomastigera)
    • The eBPF mode implementation of DoNotTrack policy was incorrectly allowing an inbound connection through a HostEndpoint, when the HostEndpoint had DoNotTrack policy for the ingress direction but not for egress. For precise compatibility with Calico’s established DoNotTrack semantics, that connection should be disallowed, and now is. (Because of the lack of connection tracking, successful use of DoNotTrack policy to allow flows requires configuring the DoNotTrack policy symmetrically in both directions.) felix #2982 (@neiljerram)

    Other changes

    • Replace github.com/dgrijalva/jwt-go with active fork github.com/golang-jwt/jwt that resolves vulnerability flagged by scanners. libcalico-go #1554 (@lmm)
    • calico/node logs write to /var/log/calico within the container by default, in addition to stdout node #1133 (@song-jiang)
    • Read pod IP information from Amazon VPC CNI annotation, if present on the pod. libcalico-go #1523 (@caseydavenport)
    • Update etcd client version to v3.5.0 libcalico-go #1495 (@Aceralon)
    • Optimize lists and watches made against the Kubernetes API libcalico-go #1484 (@caseydavenport)
    • WorkloadEndpoints now support hostPorts libcalico-go #1471 (@AloysAugustin)
    • Include CNI plugin release v1.0.0 cni-plugin #1141 (@caseydavenport)
    • Allow configuration of num_queues for Calico created veth interfaces cni-plugin #1116 (@arikachen)
    • Typha now gives newly connected clients an extra grace period to catch up after sending the snapshot to reduce the possibility of cyclic disconnects. typha #614 (@fasaxc)
    • Add calico-node upgrade service for upgrades on Windows node #1254 (@lmm)
    • eBPF arm64/aarch64 node #1044 (@frozenprocess)
    • BPF: Endpoints in EndpointsSlices that are not ready are excluded from NAT felix #3017 (@tomastigera)
    • Calico’s eBPF dataplane now fully implements DoNotTrack policy felix #2910 (@neiljerram)
    • Add HostPort support in the gRPC dataplane cni-plugin #1119 (@AloysAugustin)

    app-operator 5.6.0

    Added

    • Support watching app CRs in organization namespace with cluster label selector.

    Changed

    • Get tarball URL for chart CRs from index.yaml for better community app catalog support.

    Fixed

    • Embed Chart CRD in app-operator to prevent hitting GitHub API rate limits.
    • When bootstrapping chart-operator the helm release should not include the cluster ID.
    • Fix getting kubeconfig in chart CR watcher.
    • Fix error handling in chart CR watcher when chart CRD not installed.

    cert-operator 1.3.0

    Changed

    • Use RenewSelf instead of LookupSelf to prevent expiration of Vault token.

    azure-operator 5.16.0

    Added

    • Add support for feature that enables forcing cgroups v1 for Flatcar version 3033.2.0 and above.

    Changed

    • Upgraded to giantswarm/exporterkit v1.0.0
    • Upgraded to giantswarm/microendpoint v1.0.0
    • Upgraded to giantswarm/microkit v1.0.0
    • Upgraded to giantswarm/micrologger v0.6.0
    • Upgraded to giantswarm/versionbundle v1.0.0
    • Upgraded to spf13/viper v1.10.0
    • Make nodepool nodes roll in case the user switches between cgroups v1 and v2
    • Drop dependency on giantswarm/apiextensions/v2
    • Bump k8scloudconfig to disable rpc-statd

    chart-operator 2.20.0

    Changed

    • Update Helm to v3.6.3.
    • Use controller-runtime client to remove CAPI dependency.

    Removed

    • Remove unused helm 2 release collector.

    external-dns 2.9.0

    This release contains some changes to mitigate rate limiting on AWS clusters. Please take note of the defaults for values aws.batchChangeInterval, aws.zonesCacheDuration, externalDNS.interval and externalDNS.minEventSyncInterval. If you already specify --aws-batch-change-interval or --aws-zones-cache-duration, please migrate to the new values aws.batchChangeInterval and aws.zonesCacheDuration.

    Added

    • Allow setting --aws-batch-change-interval through the aws.batchChangeInterval value. Default 10s.
    • Allow setting --aws-zones-cache-duration through the aws.zonesCacheDuration value. Default 3h.

    Changed

    • Set default externalDNS.interval to 5m.
    • Set default externalDNS.minEventSyncInterval to 30s.
    • Allow setting Route53 credentials (externalDNS.aws_access_key_id and externalDNS.aws_secret_access_key) independently of the aws.access value.
    • Allow setting the AWS default region (aws.region) independently of the aws.access value.
    • Allow omitting the --domain-filter flag completely by setting externalDNS.domainFilterList to null.
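
    For orientation, the defaults named in this release can be written as a values fragment like the sketch below. The nesting is inferred from the dotted value names and is an assumption about the chart layout; check it against the chart's own values file before use.

    ```yaml
    # values fragment (structure inferred from the dotted names in these notes)
    aws:
      batchChangeInterval: 10s    # replaces a manually passed --aws-batch-change-interval
      zonesCacheDuration: 3h      # replaces a manually passed --aws-zones-cache-duration
    externalDNS:
      interval: 5m
      minEventSyncInterval: 30s
      domainFilterList: null      # optional: omits the --domain-filter flag entirely
    ```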

    azure-scheduled-events 0.6.0

    Added

    • Add priorityClassName: "system-node-critical" to Daemonset to give higher priority during scheduling.

    vertical-pod-autoscaler 2.1.1

    Fixed

    • Fix naming of VPA deployments in workload clusters.
  • This release provides initial support for creating clusters with Cluster API for Azure (CAPZ).

    Warning: This is an alpha preview release intended only for testing cluster creation. Upgrading to or from this version is not supported.

    Warning: There is a breaking change if kubectl is used to manage machine pools. The MachinePool resource was moved to a new Kubernetes API Group named machinepools.cluster.x-k8s.io from machinepools.exp.cluster.x-k8s.io. The full resource path has to be specified when using kubectl-gs in order to access the machine pools created with the old API group. More details are available below.

    kubectl get machinepools.exp.cluster.x-k8s.io -A
    

    Clusters can be created with kubectl-gs by following the documentation.

    If you are interested in how Cluster API changed our structure and what is coming next, please see our blog post.

    Change details

    Clusters are created in a similar way as regular Giant Swarm clusters by using the v20.0.0-alpha1 release with the kubectl gs command:

    kubectl gs template cluster --provider azure --release v20.0.0-alpha1 --organization giantswarm --description 'test' --output cluster.yaml
    kubectl gs template nodepool --provider azure --release "v20.0.0-alpha1" --organization giantswarm --cluster-name "hc27f" --description "np1" --nodes-min 3 --nodes-max 10 --output nodepool.yaml
    

    At the moment, only MachinePool and AzureMachinePool Cluster API custom resources are supported.

    Machine pools with kubectl-gs

    There is a breaking change that you should be aware of if you use kubectl to manage MachinePools. The MachinePool and AzureMachinePool CRDs have been moved to a new Kubernetes API Group. The MachinePool object has moved from machinepools.exp.cluster.x-k8s.io to machinepools.cluster.x-k8s.io, and the AzureMachinePool from azuremachinepools.exp.infrastructure.cluster.x-k8s.io to azuremachinepools.infrastructure.cluster.x-k8s.io.

    This means that when you use kubectl get machinepools you will no longer see your MachinePools, because they were created using the old API Group and kubectl defaults to the new API Group. To get the previous behavior, specify the old API Group for MachinePools:

    kubectl get machinepools.exp.cluster.x-k8s.io -A
    

    This is not the case for AzureMachinePools. Using kubectl get azuremachinepools you will see AzureMachinePools using the old API Group.

  • This release downgrades Flatcar to version 2905.2.6 to restore version 1 of the kernel cgroups feature. Additionally, the latest Kubernetes 1.21 patch release and the latest 1.21 cluster autoscaler version are applied.

    Change details

    containerlinux 2905.2.6

    Downgraded from 2983.2.0 to restore Cgroups v1.

    kubernetes 1.21.8

    Feature

    • Kubernetes is now built with Golang 1.16.11 (#106839, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]
    • Kubernetes is now built with Golang 1.16.12 (#106983, @cpanato) [SIG Cloud Provider, Instrumentation, Release and Testing]
    • Update golang.org/x/net to v0.0.0-20211209124913-491a49abca63 (#106961, @cpanato) [SIG API Machinery, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Node and Storage]

    Bug or Regression

    • Fix: skip instance not found when decoupling vmss from lb (#105835, @nilo19) [SIG Cloud Provider]
    • Fixed SELinux relabeling of CSI volumes after CSI driver failure. (#106553, @jsafrane) [SIG Node and Storage]
    • Kubeadm: allow the “certs check-expiration” command to not require the existence of the cluster CA key (ca.key file) when checking the expiration of managed certificates in kubeconfig files. (#106929, @neolit123) [SIG Cluster Lifecycle]
    • Kubeadm: during execution of the “check expiration” command, treat the etcd CA as external if there is a missing etcd CA key file (etcd/ca.key) and perform the proper validation on certificates signed by the etcd CA. Additionally, make sure that the CA for all entries in the output table is included - for both certificates on disk and in kubeconfig files. (#106924, @neolit123) [SIG Cluster Lifecycle]
    • The scheduler’s assumed pods have 2min instead of 30s to receive nodeName pod updates (#106632, @ahg-g) [SIG Scheduling]

    Dependencies

    Added

    Nothing has changed.

    Changed

    • golang.org/x/net: 3d97a24 → 491a49a

    Removed

    Nothing has changed.

    cluster-autoscaler 1.21.2-gs1

    Changed

    • Upgraded to upstream version 1.21.2.

    cert-exporter 2.0.1

    Changed

    • Equalise labels in the helm chart.

    external-dns 2.7.0

    Changed

    • Upgrade upstream external-dns from v0.9.0 to v0.10.2. The new release brings a lot of smaller improvements and bug fixes.
    • Remove support for Kubernetes <= 1.18.

    Fixed

    • Fix dry-run option.