
OKD 4.20 Release Issues

· One min read
Jaime Magiera
Co-chair, OKD Working Group

With OKD 4.20 officially released, several issues have surfaced. Below are brief descriptions of each issue and suggested workarounds. Please reach out if you come across additional issues.

2255: Upgrade to 4.20.0-okd-scos.0 - nodes fail to start due to incorrect runtime

Affects: Upgrades from 4.18 or lower to 4.20

If a cluster was originally installed with 4.18 or earlier, there will be two machine config objects (one for worker nodes and one for master nodes) that explicitly override the container runtime to runc:

  • 00-override-master-generated-crio-default-container-runtime
  • 00-override-worker-generated-crio-default-container-runtime

These two machine config objects can simply be deleted before initiating the upgrade to 4.20. Doing so immediately changes the container runtime to crun for the current cluster version, which means the machine config operator will reconfigure and restart all nodes. Wait for this rollout to complete before initiating the upgrade to 4.20.
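As a sketch, the workaround above can be applied with `oc`; the `oc wait` check on the machine config pools is one illustrative way to confirm the rollout has finished before upgrading:

```shell
# Delete the machine configs that pin the container runtime to runc
oc delete machineconfig \
  00-override-master-generated-crio-default-container-runtime \
  00-override-worker-generated-crio-default-container-runtime

# The MCO now switches the runtime to crun and reboots nodes;
# wait for all machine config pools to finish updating
oc wait machineconfigpool --all --for=condition=Updated --timeout=60m
```

Once `oc get mcp` shows all pools updated, the upgrade to 4.20 can proceed.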

OKD 4.20 Release Notes

· 8 min read

Release Notes: 4.20.0-okd-scos.0

Introduction: Transition to CentOS Stream 10

OKD 4.20 marks a significant platform update, transitioning the underlying operating system from Fedora CoreOS (FCOS) to CentOS Stream 10. This strategic change aligns OKD with the future development of OpenShift on RHEL 10, providing early feedback and enhancing stability for the community distribution (OKD-240, OKD-241).


Installation and Platform Management

This release introduces major enhancements to platform architecture, installation flexibility, and day-to-day cluster management.

Platform and Architecture

  • Migration to Cluster API (CAPI): The Machine API (MAPI) is migrating its underlying implementation to use Cluster API (CAPI) for AWS and standalone clusters. This change is transparent to users, and the existing MAPI remains fully supported. A new CAPI operator manages components and enhances load balancer management on AWS, Azure, and GCP.
  • Selectable etcd Database Size (GA): You can now configure the etcd database size beyond its previous 8GB limit. This feature is now Generally Available and helps support large or dense clusters (ETCD-638). Liveness probes are now tuned dynamically based on the database quota to improve stability (ETCD-590).
  • Two-Node Cluster Non-Graceful Recovery: For two-node edge deployments, the cluster can now automatically recover from ungraceful shutdown events like power loss. One node will "fence" the other, restart etcd, and allow the failed node to rejoin safely, improving resilience without manual intervention (OCPEDGE-1755).
  • AutoNode for ROSA-HCP: A new node autoscaling solution named AutoNode, powered by Karpenter, is available for ROSA with Hosted Control Planes (ROSA-HCP).

Installation and Updates

  • Update Precheck Command: A new oc adm upgrade recommend command helps administrators identify potential issues before a cluster upgrade, including checks for control plane health, active alerts, and image registry access (OTA-1560).
  • Flexible Node Storage Configurations: The Machine Config Operator (MCO) can now ignore non-reconcilable storage configurations, allowing new nodes with different disk layouts to be added to machine pools without causing errors.
  • Faster Azure Installs with RHCOS Marketplace Images: The installer can now use Red Hat CoreOS (RHCOS) images directly from the Azure Marketplace, significantly reducing installation time by avoiding a custom image upload (CORS-3652).
  • Enhanced GCP Installation: Deployments on GCP Shared VPC (XPN) now support a three-project architecture, allowing DNS to be managed in a separate service project (CORS-4044). Clusters can also use custom private GCP API endpoints for stricter security (CORS-3916).
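The new precheck can be run standalone before starting an update; a minimal sketch (the `--version` flag for evaluating a specific target is an assumption based on the feature description):

```shell
# Summarize control plane health, active alerts, and update targets
oc adm upgrade recommend

# Evaluate a specific target version before updating
oc adm upgrade recommend --version 4.20.0-okd-scos.0
```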

vSphere Enhancements

  • Multi-NIC Support (GA): Support for creating vSphere virtual machines with multiple network interface controllers (NICs) is now Generally Available and enabled by default (SPLAT-2045).
  • Host Group Mapping: OpenShift zones can now be mapped to vSphere host groups for improved node distribution.

Storage

Storage performance, security, and driver capabilities have been significantly enhanced in this release.

  • Namespace-Level Storage Policies (GA): The StoragePerformantSecurityPolicy feature is now Generally Available. Administrators can define default storage security policies at the namespace level by applying the storage.openshift.io/fsgroup-change-policy and storage.openshift.io/selinux-change-policy labels. This can significantly improve pod startup time for persistent volumes.
  • AWS EFS Single-Zone Volume Support: The AWS EFS CSI driver now supports creating cost-effective, single-availability-zone volumes using the new --single-zone flag.
  • Volume Populator Data Source Validation (GA): The Volume Populators feature is now Generally Available. A new volume-data-source-validator controller is installed by default to validate the dataSourceRef field in a PersistentVolumeClaim (PVC), providing immediate feedback on invalid configurations.
  • Improved Storage Operator Resiliency: The PodDisruptionBudget for all storage operators has been updated with unhealthyEvictionPolicy: AlwaysAllow to ensure critical storage pods can be rescheduled during node maintenance or failures.
  • Manila CSI Plugin Enhancements (OpenStack): The Manila CSI plugin now supports configuring multiple share access rules for a single shared file system, allowing multiple clients to mount and access the same share simultaneously (OSPRH-18263).
  • CSI Drivers and Sidecars Updated: Multiple Container Storage Interface (CSI) drivers (AWS EBS, Azure Disk, Azure File, GCP PD, IBM VPC Block) and sidecar components have been updated to their latest upstream versions.
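For the namespace-level storage policies above, applying the labels might look like this (the namespace name and the label values `OnRootMismatch` and `MountOption` are illustrative, assumed from the upstream fsGroupChangePolicy and SELinuxChangePolicy value sets):

```shell
# Set default storage security policies for all pods in a namespace
oc label namespace my-app \
  storage.openshift.io/fsgroup-change-policy=OnRootMismatch \
  storage.openshift.io/selinux-change-policy=MountOption
```

Pods mounting persistent volumes in that namespace then inherit these defaults without per-pod configuration, which is what yields the faster startup times.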

Networking

This release introduces native BGP support, dual-stack networking on AWS, and greater configuration flexibility.

  • BGP Integration for User-Defined Networks (UDN): OVN-Kubernetes now includes native BGP support. This allows the cluster to dynamically advertise pod IP subnets to external provider networks and learn routes from them, simplifying network integration for on-premise UDNs.
  • Dual-Stack Networking for OpenShift on AWS: OpenShift clusters deployed on AWS now support dual-stack (IPv4 and IPv6) networking (CORS-4136).
  • Azure NAT Gateway for Egress Traffic (GA): Support for using Azure NAT Gateway to manage outbound cluster traffic is now Generally Available.
  • Post-Deployment Network Configuration: Disruptive network changes to the br-ex interface can now be applied automatically on node reboot by modifying the NMState configuration file, simplifying advanced network changes (OPNET-594).

Developer Experience and Console

The user experience has been improved with a unified software catalog, enhanced developer tools, and streamlined image management.

Console and User Experience

  • New Ecosystem Navigation: A new top-level Ecosystem section in the navigation centralizes software management, including a Unified Catalog that provides a single place to discover and manage all cluster extensions from OperatorHub.
  • Custom Application Icons in Topology: You can now define a custom icon for your application nodes in the Topology view by adding the app.openshift.io/custom-icon annotation to your workloads (ODC-7803).
  • YAML Editor Improvements: The YAML editor now features a full-screen mode, a "Copy to clipboard" button, and togglable "sticky scroll" for easier navigation.
  • Modernized Web Terminal: The web terminal has been updated to use standard PatternFly components, providing a more consistent UI and new features like closing tabs with a middle-click (ODC-7802).
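For the custom Topology icons mentioned above, a minimal sketch of the annotation (the deployment name and icon value are illustrative; consult the console documentation for the supported value formats):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # Custom icon rendered on this workload's node in the Topology view
    app.openshift.io/custom-icon: https://example.com/my-icon.svg
spec:
  # ... rest of the deployment spec
```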

Image Management

  • ImageStream Multi-Architecture Support: On multi-architecture clusters, ImageStreams now default to importMode: preserveOriginal, ensuring the complete manifest list is preserved when importing a multi-architecture image (MULTIARCH-4552).
  • Registry Pre-flight Checks for oc-mirror: The oc-mirror v2 tool now performs "fail-fast" pre-flight checks to validate the connection to the destination registry, preventing long waits on simple configuration errors (CLID-389).
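An ImageStream that keeps the full manifest list might look like the following sketch (the image name is illustrative; `PreserveOriginal` is the API spelling of the import mode):

```yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: my-multiarch-app
spec:
  tags:
    - name: latest
      from:
        kind: DockerImage
        name: quay.io/example/my-multiarch-app:latest
      importPolicy:
        # Default on multi-architecture clusters in 4.20: preserve the
        # full manifest list rather than importing a single architecture
        importMode: PreserveOriginal
```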

Security

Security posture is enhanced with default image signature validation, read-only filesystems, and expanded network policies.

  • Default Sigstore Image Validation (GA): The ClusterImagePolicy and ImagePolicy APIs for sigstore are now Generally Available, and the default policy to validate platform images is enabled by default. This strengthens software supply chain security out of the box (OCPNODE-3611).
  • Read-Only Root Filesystems: To enhance security, several core components now run with a read-only root filesystem by default, including pods for OLM, the integrated registry, CVO, and the openshift-kube-scheduler.
  • Network Policies for Core Components: To reduce the potential attack surface, network policies that restrict traffic have been implemented for numerous components, including storage operators and CSI drivers, OLM, Cloud Credential Operator (CCO), MAPI, and CAPI.
  • User Namespaces (GA): The user namespaces feature, which enhances security by allowing pods to run in isolated user namespaces, is now Generally Available.
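User namespaces are requested with the upstream `hostUsers` pod field; a minimal sketch (pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  # Run the pod in an isolated user namespace: UID 0 inside the
  # container maps to an unprivileged UID range on the host
  hostUsers: false
  containers:
    - name: app
      image: registry.access.redhat.com/ubi9/ubi-minimal
      command: ["sleep", "infinity"]
```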

Deprecations, Removals, and Feature Graduations

Removals and Deprecations

  • Support for Image Manifest Schema 1 Removed: To align with modern container standards, support for the deprecated image manifest schema 1 has been completely removed (WRKLDS-1599).
  • Cgroup v1 Support Removed: Support for cgroup v1 is completely removed. Clusters must be migrated to cgroup v2 before upgrading (OCPNODE-2841).
  • Service Binding Plugin Removed: The Service Binding feature has been removed from the Developer Console, aligning with the deprecation of the Service Binding Operator (ODC-7722).
  • odo CLI Download Link Removed: The download link for the deprecated odo CLI tool has been removed from the "Command Line Tools" page (ODC-7790).
  • Legacy GCE Cloud Provider Resources Removed: Obsolete RBAC resources related to the legacy GCE cloud provider have been removed (WRKLDS-954).
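Since cgroup v1 support is removed, clusters must be on cgroup v2 before upgrading. One quick way to check a node (for example via `oc debug node/<name>`) is to inspect the cgroup filesystem type:

```shell
stat -fc %T /sys/fs/cgroup
# "cgroup2fs" indicates cgroup v2; "tmpfs" indicates cgroup v1
```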

Feature Graduations to General Availability (GA)

The following features are now Generally Available and enabled by default:

  • PinnedImageSets and MachineConfigNode APIs: These MCO features are now GA and their APIs have been promoted to v1.
  • ImageVolume: Allows container images to be used as a volume source for pods (OCPNODE-3121).
  • GCP Labels and Tags: The ability to configure GCP Labels and Tags via the Infrastructure API is now a standard feature (OAPE-232).
  • vSphere Multi-Disk Support: Provides stable support for attaching multiple disks in vSphere environments (SPLAT-2346).
  • Route Advertisements: The routeAdvertisements feature for BGP is now GA (CORENET-5704).
  • Multiple feature gates have been removed as their features are now stable, including MultiArchInstallAWS, MultiArchInstallGCP, PrivateHostedZoneAWS, CloudDualStackNodeIPs, and VSphereMultiVCenters.

OKD 4.19 Release Notes

· 16 min read

Release Notes: 4.19.0-okd-scos.0

This release includes updates across various components, introducing new features, adjusting feature gates, and resolving numerous bugs to enhance stability and functionality. The information below is derived from 4.19.0-okd-scos.0.

New Features

Several new capabilities and improvements have been introduced in this release:

  • Support for ServiceAccountTokenNodeBinding has been enabled via a feature gate.
  • The OLMv1 Single/OwnNamespace feature is now available behind a feature flag.
  • MachineConfigNodes (MCN) API has been updated to V1 with corresponding CRDs deployed.
  • The CPMSMachineNamePrefix feature gate has been promoted to the default feature set.
  • The GatewayAPIController feature gate has been enabled in the Default featureset and its implementation includes Validating Admission Policy for Gateway API CRDs. GRPC conformance tests have also been added for Gateway API. This feature is NOT supported for OKD because the OpenShift Service Mesh operator, which this feature depends on, is not available as a community operator.
  • MAPI to CAPI migration has been added as a TechPreview feature.
  • DualReplica minimum counts have been added, and the feature has been dropped to DevPreview to enable separation of conflicting enum values.
  • The RouteExternalCertificate feature gate has been promoted to the default feature set with added E2E tests.
  • A Featuregate for the ConsolePlugin ContentSecurityPolicy API has been lifted.
  • MetricsCollectionProfiles has reached GA status.
  • Configuration for external OIDC now supports adding uid and extra claim mappings.
  • The OnClusterBuild featuregate has been promoted to GA.
  • Support for SEV_SNP and TDX confidential instance type selection on GCP has been added.
  • SELinuxMount and SELinuxChangePolicy have been added to DevPreview.
  • The infrastructure object now includes service endpoints and a feature flag.
  • An annotation for validated SCC type has been added.
  • Configuration for vSphere multi disk thinProvisioned has been added.
  • API Updates for GCP Custom API Endpoints have been added.
  • The MarketType field has been added to AwsMachineProviderConfig and validation for this field has been added.
  • UserDefinedNetworks (UDN) has been graduated to GA with associated test improvements.
  • The ClusterVersionOperator API and manifests have been added, including a controller.
  • The HighlyAvailableArbiter control plane topology has been added as a feature for techpreview, with support for changing the minimum for arbiter HA deployments.
  • The KMSEncryptionProvider Feature Gate has been introduced, with support for KMSv2 encryption for ARO HCP using MIv3 and related configuration options.
  • The additionalRoutingCapabilities gate has been promoted in the ClusterNetworkOperator API.
  • Support for vSphere host and vm group based zonal has been added.
  • A MachineNamePrefix field for CPMS has been feature-gated with its feature gate also added.
  • vSphere multi disk support has been added, including provisioning mode for data disks.
  • An initial Monitoring CRD api has been added.
  • The Insights runtime extractor feature has been moved to GA.
  • A new config option for storing Insights archives to persistent volume has been introduced.
  • Insight Operator entitlements for multi arch clusters have been enabled.
  • A liveness probe has been added to the Insights extractor container.
  • The LokiStack gatherer has been added to Insights.
  • CNI subdirectory chaining for composable CNI chaining is available.
  • The nodeslicecontroller has been added to the dockerfile for multus-whereabouts-ipam-cni.
  • The console has added numerous UI/UX improvements including PatternFly 6 updates, features like deleting IDPs, improved helm form in admin perspective, adding a default storage class action, guided tours in admin perspective, add-card item alignment fixes, conversion of HTML elements to PatternFly components, adding dark theme feedback graphic, adding a Getting started section to the project overview page, adding support for extensibility in SnapshotClass and StorageClass pages, adding a favoriting page in the Admin perspective, exposing Topology components to the dynamic plugin SDK, adding support for a Virtualization Engine subscription filter on OperatorHub, adding dev perspective nav options to the admin perspective, adding conditional CSP headers support, adding a Dynamic Plugins nav item, adding telemetry for OLS Import to Console, and adding a customData field to the HorizontalNav component.
  • The monitoring-plugin has been updated with PF-6 migration, improved metrics typeahead, label typeahead, plugin proxy for Perses, and the ability to embed Perses Dashboards.
  • Etcd now has a configurable option for hardware-related timeout delay.
  • GCP PD CSI Driver includes an Attach Limit for Hyperdisk + Gen4 VMs and has been rebased to upstream v1.17.4.
  • The GCP PD CSI Driver Operator can enable VolumeAttributesClass and add custom endpoint args from infrastructure.
  • HyperShift now supports adding a control plane pull secret reference, adding proxy trustedCA to ignition config, testing Azure KMS, capacity reservation in NodePool API, passing featuregates to ocm/oapi, enabling MIv3 for Ingress, configuring KAS goaway-chance, overriding the karpenter image, consuming the KubeAPIServerDNSName API, enabling ppc64le builds, syncing the OpenStack CA cert, limiting CAPI CRD installation on HO, annotating AWSEndpointServices, setting default AWS expirationDate tag, running the kas-bootstrap binary for cpov2, disabling the cluster capabilities flag, enabling MIv3 for Azure file CSI driver, enabling MIv3 for CAPZ, adding e2e tests for image registry capability, adding the konnectivity-proxy sidecar to openshift-oauth-apiserver, checking individual catalog image availability, handling multiple mirror entries, rolling out cpov2 workloads on configmap/secret changes, enabling MIv3 for CNO/CNCC on managed Azure, leveraging ORC to manage the release image on OpenStack, rootless containerized builds, enabling linters, allowing autonode to run upstream karpenter core e2e tests, adding a flag for etcd storage size, auto-approving Karpenter serving CSRs, and providing AWS permission documentation.
  • Machine API Operator supports updating GCP CredentialsRequest, e2e tests for vSphere multi network and Data Disk features, AMD SEV_SNP and TDX confidential computing machines on GCP, adding image/read permissions, adding vSphere check for max networks, adding Azure permissions.
  • vSphere Problem Detector supports host groups.
  • Various tests have been updated or added to support new features and platforms, including OLMv1 preflight permissions checks, MCN V1 API tests, OLMv1 catalogd API endpoint tests, Gateway API tests, testing ratcheting validations, detecting concurrent installer/static pods, platform type external support, and tests for the ImageStreamImportMode feature gate.

Feature Gates

  • CPMSMachineNamePrefix has been promoted to the default feature set.
  • GatewayAPIController has been enabled in the Default featureset. Its implementation includes Validating Admission Policy and is tied to the cluster-ingress-operator. (NOT applicable for OKD)
  • DualReplica minimum count has been added, separation of conflicting enum values enabled, and the feature dropped to DevPreview.
  • RouteExternalCertificate has been promoted to the default feature set.
  • ConsolePlugin ContentSecurityPolicy API feature gate has been lifted.
  • OnClusterBuild has been promoted to GA.
  • GatewayAPI has been re-enabled in the Default featureset and promoted to Tech Preview.
  • VSphereStaticIPs feature gate has been removed.
  • NewOLMPreflightPermissionCheck feature flag has been added and is watched by the cluster-olm-operator.
  • VSphereControlPlaneMachineSet feature gate has been removed.
  • KMS encryption is feature-gated, and the KMSEncryptionProvider feature gate has been added.
  • DualReplica featuregate has been added.
  • SELinuxMount and SELinuxChangePolicy have been added to DevPreview.
  • The catalogd metas web api is behind a featuregate.
  • A Feature Gate AND on NetworkLoadBalancer CEL has been added.
  • HighlyAvailableArbiter control plane topology is a feature for techpreview.
  • The Persistent IPs feature gate has graduated to GA.
  • MachineNamePrefix field for CPMS is feature-gated with its feature gate also added.
  • CSIDriverSharedResource feature gate has been removed.
  • The ShortCertRotation feature gate has been added and is used to issue short lived certificates in the cluster-kube-apiserver-operator and service-ca-operator.
  • The UserDefinedNetworks feature gate has graduated to GA.
  • The additionalRoutingCapabilities gate has been promoted.
  • The ImageRegistryCapability has been introduced behind a feature gate in HyperShift and tested.
  • The Dynamic Configuration Manager feature gate has follow-up work to be enabled.
  • The cluster-olm-operator watches for the APIV1MetasHandler feature gate.
  • The cluster-olm-operator watches for permissions preflight feature gate.
  • The service-ca-operator does not check featuregates on the operand.

Other Feature Gates Enabled by Default:

  • ConsolePluginContentSecurityPolicy: Status is Enabled in the Default set. The featuregate was lifted for this API. This gate was added to the console-operator.
  • OpenShiftPodSecurityAdmission: Status is Enabled in the Default set.
  • ClusterVersionOperatorConfiguration: Status is Enabled (New) in the Default set.
  • DyanmicServiceEndpointIBMCloud: Status is Enabled (New) in the Default set.
  • GCPCustomAPIEndpoints: Status is Enabled (New) in the Default set. There were API updates for GCP Custom API Endpoints.
  • NewOLMCatalogdAPIV1Metas: Status is Enabled (New) in the Default set. The featuregate for catalogd metas web API was added and is watched for.
  • NewOLMOwnSingleNamespace: Status is Enabled (New) in the Default set. A feature flag was added for OLMv1 Single/OwnNamespace.
  • NewOLMPreflightPermissionChecks: Status is Enabled (New) in the Default set. A feature flag for this was added and is watched for.
  • SigstoreImageVerificationPKI: Status is Enabled (New) in the Default set. A PKI field was added to the image API.
  • VSphereConfigurableMaxAllowedBlockVolumesPerNode: Status is Enabled (New) in the Default set. The MaxAllowedBlockVolumesPerNode field was added to the VSphereCSIDriverConfigSpec.
  • VSphereMultiDisk: Status is Enabled (New) in the Default set. Support for vSphere multi disk was added.
  • ClusterAPIInstallIBMCloud: Status changed from Disabled to Enabled in this set. This feature flag was added to Tech Preview.
  • MachineAPIMigration: Status changed from Disabled to Enabled in this set. MAPI to CAPI migration was added to TechPreview.

Bug Fixes

Numerous bugs have been addressed in this release across various components:

  • Validation for the marketType field in aws-cluster-api-controllers has been added.
  • Fixed issues using 127.0.0.1 for healthz HTTP endpoints, corrected ASH driver inject env config, and fixed the PodDisruptionBudget name for openstack-manila.
  • Azure Stack Hub volume detach failure has been fixed.
  • Panic issues in Azure Stack related to GetZoneByNodeName and when the informer receives cache.DeletedFinalStateUnknown have been fixed.
  • GovCloud Config has been fixed.
  • Cross-subscription snapshot deletion is now allowed in azure-file-csi-driver. CVEs related to golang.org/x/crypto and golang.org/x/net have been addressed.
  • Fixes in the CLI include addressing rpmdiff permissions, using ProxyFromEnvironment for HTTP transport, adjusting the impact summary for Failing=Unknown, populating RESTConfig, bumping glog and golang.org/x/net/crypto dependencies for fixes, ensuring monitor doesn't exit for temp API disconnect, fixing the oc adm node-image create --pxe command, parsing node logs with HTML headers, and obfuscating sensitive data in Proxy resource inspection.
  • Logo alignment in Webkit has been fixed in cluster-authentication-operator. Duplicate OAuth client creation is avoided. An issue updating the starter path for mom integration has been fixed. Etcd readiness checks are excluded from /readyz.
  • Broken ControlPlaneMachineSet integration tests have been fixed. A spelling error in the FeatureGate NewOLMCatalogdAPIV1Metas has been fixed. A typo in insightsDataGather has been fixed. A race in tests using CRD patches has been fixed. Handling of validations requiring multiple feature gates has been fixed. Missing CSP directives have been added. StaticPodOperatorStatus validation for downgrades and concurrent node rollouts has been fixed. Insights types duration validation has been fixed. An example format validation has been added. Unused MAPO fields have been deprecated. Reverted Disable ResilientWatchCacheInitialization.
  • IBM Public Cloud DNS Provider Update Logic has been fixed, along with IBMCloud DNS Propagation Issues in E2E tests. A test is skipped when a specific feature gate is enabled. Single Watch on GWAPI CRD issue has been fixed.
  • Dev cert rotation has been reverted in cluster-kube-apiserver-operator. Etcd endpoints are now checked by targetconfigcontroller. Metrics burn rate calculations and selectors have been adjusted or fixed. Skipping cert generation when networkConfig.status.ServiceNetwork is nil has been fixed. Reverted Disable ResilientWatchCacheInitialization.
  • Graceful shutdown of the KSVM pod has been fixed.
  • Error handling on port collision in CVO has been improved. A few tests failing on Non-AMD64 machines have been fixed. Unknown USC insights are dropped after a grace period. The preconditions code has been simplified.
  • Numerous console UI/UX and functional bugs have been fixed, including list header wrapping, http context/client handling, quick create button data-quickstart-id, critical alerts section collapsing, runtime errors on MachineConfigPools, switch animation regressions, ACM hiding switcher, favorites button name, listpageheader rendering, tab underline missing, notification drawer spacing, withHandlePromise HOC deprecation, quick start action spacing, operator appearing twice, breadcrumb spacing, web terminal initialize form style, quickstart highlighting, base CSS removal/conversion, VirtualizedTable and ListPageFilter deprecation, OLM CSV empty state link, helpText usage, add card item alignment, ErrorBoundary modal link, DualReplica validation hack, fetching taskRuns by UID, catalog view cleanup, PF6 bug fixes, deployment editing from private git, co-resource-icon clipping, notification drawer keyboard navigation, flaking update-modal tests, orphaned CSS class removal, PDB example YAML missing field, Error state component groups, Developer Catalog renaming, Favorites e2e tests, secret form base64 decoding, typo on tour page, helm chart repository name, SnapshotClass/StorageClass extensibility, plugin type-only warnings, react-helmet/react-measure migration, pipeline ci tests disabling, plugin-api-changed label, getting started alert, perspective merge tests, react-modal/react-tagsinput updates, init containers readiness count, notification drawer overlap, static plugin barrel file references, CaptureTelemetry hooks, flaky Loading tests, admin perspective guided tour disabling, Access review table sort, types/react update, getting started resources content, Node Logs toolbar layout, Loading replacement, favorites icon hover effect, LogViewer theme setting, namespace persistence on perspective switch, secret form drag and drop, logoutOpenShift call removal, NodeLogs Selects closing, missing patternfly styles, monaco theming/sidebar logic, Banner replacement, ODC Project details breadcrumbs, resource list page name filter alignment, VolumeSnapshots not displayed, ResourceLog checkbox replacement, ts-ignore removal, Checkbox filter replacement, monitoring topic update, original path retention on perspective detection, monaco/YAML language server update, subscription values display, Jobs createdTime, CLI links sorting, bottom notifications alignment, notification drawer close button error, Timestamp component, unused static plugin modules, edit resource limit margins, CSRs not loading without permissions, async package upgrade, bold text/link underline issues, dropdown menu overflow, contextId for plugin tabs, OLM operator uninstall message linkify, Observe section display, textarea horizontal expansion, Topology sidebar alert storage, Demo Plugin tab URL, Command Line Terminal tab background color, basic authentication secret type, runtime errors for completed version, QueryBrowser tooltip styles, edit upstream config layout, deployment pod update on imageStream change, Bootstrap radio/checkbox alignment, QuickStart layout, guided tour popover overlap, Edit button bolding, cypress config update, bridge flag for CSP features, CSV details plugin name, Pipeline Repository overview page close button, Topology component exposure, catalog card label alignment, YAMLs directory case sensitivity, Search filter dropdown label i18n, broken codeRefs, CSP headers refresh popover, dev-console cypress test update, plugin name parsing variable, dependency assets copying, ns dropdown UI with web terminal, SourceSecretForm/BasicAuthSubform tech debt, create a Project button, GQL query payload size, non-General User Preference navigation, openshift Authenticate func user token, catalog operator installation parameters, telemetry events OpenShift release version preference, web terminal test failures, errors appending via string, external link icons, BuildSpec details heading font size, capitalization fix for Lightspeed, i18n upload/download, Font Awesome icon alignment, Serverless function test no response, Post TypeScript upgrade changes, helm CI failures, TypeScript upgrade, GQL introspection disabling, code removal, axe-core/cypress-axe upgrade, search tool error, PopupKebabMenu/ClusterConfigurationDropdownField removal, operator installation with + in version name, missing PDB violated translation, Number input focus layout, AlertsRulesDetailPage usage, guessModuleFilePath warnings, channel/version dropdown collapse, webpack 5 upgrade, check-resolution parallel run, Init:0/1 pod status, window.windowErrors saving, ConsolePlugins list display, backend service details runtime error, Function Import error, default StorageClass for ServerlessFunction pipelineVolumeClaimTemplate, Save button enablement in Console plugin enablement, ImagePullSecret duplication, Shipwright build empty params filtering.
  • The managed-by-label populated with an invalid value has been fixed in external-provisioner. CVEs related to golang.org/x/net/crypto have been addressed.
  • Etcd now ensures the cluster ID changes during force-new-cluster, and a compaction-induced latency issue has been fixed.
  • Volume unpublish and attachment through reboots has been ensured for kubevirt-csi-driver.
  • A temporary pin on the FRR version has been applied in metallb-frr to a known working rpm.
  • Monitoring plugin fixes include updates to avoid overriding console routes, table scroll/column alignment, performance improvements for incidents page, resetting orthogonal selections, not breaking if cluster doesn’t exist, filtering by cluster name, showing column headings, fixing states filter in aggregated row, clearing old queries, fixing silence alerts data form, re-adding CSV button, allowing refresh interval to be off, removing deleted image dependency, Export as CSV, not showing metrics links in acm perspective, updating datasource on csrf token changes, adding mui/material dependency, fixing typo in predefined metrics, fixing virtualization perses extension point, filter dropdowns, alerts timestamps cutoff, incidents page filters, incidents page loading state, net/http vulnerability, tooltip in row details, fixing incidents filter issues with severities and long standing, incidents dark theme, syncing alert chart to main filter, hotfix for filter requirements, alerting refactor, virtualization perspective routes, potentially undefined variable access, incident chart colors, incidents filter logic/sync, syncing alerts chart/incidents table with days filter, sorting chart bars, reverting reset all filters button, fixing gap in incident charts, using pf v5 variables/table, fixing dev perspective alert URL namespace, incidents page date style, hideshow graph button update, incidents page reset filters, fixing admin console alert detail graph, fixing button spacing on silence form, fixing bounds on bar chart, fixing inverted dropdown toggle, allowing editing of the until field on the silence edit page, fixing feature flagged DX, fixing expanded row rendering, upgrading incidents dropdown, updating incidents charts cursor, removing extra copy.
  • Issues writing the network status annotation on CNI ADD are now tolerated in multus-cni. An empty CNI result is properly structured, and a GetPodContext cache miss has been fixed.
  • Entrypoint issues have been fixed for multus-whereabouts-ipam-cni, including for new SCOS builds.
  • An error event has been added for failed ingress to route conversion in route-controller-manager.
  • In telemeter, nil metrics are now dropped during the elide transform (with a metric capturing when this happens), and a nil-metric check was added to elide-label handling.
  • Numerous test fixes have been implemented, including increasing timeouts, bumping limits, skipping tests, fixing node selection in MCN tests, fixing MCN tests for two-node clusters, preventing tests using unschedulable nodes, fixing default cert issuer name in RouteExternalCertificate tests, ensuring Git Clone does not run privileged, fixing failed arbiter tests, removing skipped annotation for metal ipv6, adding limit exceptions for Istio, adding cleanup to MCN test, removing CRD schema check, fixing broken intervals charts, fixing egress firewall tests URLs, fixing CBOR data decoding in etcd tests, fixing IPsec tests, validating binary extraction, failing test when operator degrades, using payload pullspec for image info, using non-fake boot image, relying on unstructured for update status, checking load balancer healthcheck port/path, allowing overriding extension binary, re-enabling AWS for router HTTP/2 test, displaying etcd bootstrap event, fixing network name change compatibility, increasing timeouts for live migration, addressing malformed configmap post-test, increasing UDN probe timeouts, adding exceptions outside upgraded window, adding Readiness Probe to Router Status Tests, adding error check for failed cleanup, fixing live migration tests detecting dualstack, extending kubeconfig tests, fixing IPv6 handling in router tests, fixing live migration tests, UDN tests waiting for SCC annotation, fixing auditLogAnalyzer flake error, fixing nmstate deployment failures, showing resources updated too often in auditloganalyzer, skipping OperatorHubSourceError metric checking, adding test case for checking EgressFirewall DNS names, fixing network segmentation eventual consistency, increasing KAPI server timeout, using max time for netpol pods curl requests, moving initialization of OC.
  • Datastore check messages have been improved in vsphere-problem-detector.

OKD 4.19 stable and 4.20 EC have been released

· One min read

We’re excited to announce that OKD 4.19.0-scos.0 has been officially promoted to the stable release channel!

You can view the release payload here: 4.19.0-okd-scos.0 and compare the differences with the last stable 4.18 release.
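The same comparison can be done locally with `oc adm release info`; passing two release images prints the differences between them. The 4.18 tag below is an assumption for illustration, so check the okd-scos releases page for the exact last-stable tag:

```shell
# Inspect the components and versions in the new stable payload
oc adm release info quay.io/okd/scos-release:4.19.0-okd-scos.0

# Compare against the previous stable release
# (the 4.18 tag here is illustrative; use the actual last 4.18 tag)
oc adm release info \
  quay.io/okd/scos-release:4.18.0-okd-scos.0 \
  quay.io/okd/scos-release:4.19.0-okd-scos.0
```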

A few significant highlights of this release include:

  • Boot images and node images are now based on CentOS Stream CoreOS (SCOS)
  • Boot images are available publicly at: https://cloud.centos.org/centos/scos/9/prod/streams/
  • Bare-metal, assisted, and agent-based installs now work seamlessly, since boot images have been transitioned to SCOS
  • Upgrade edges have been added from the previous stable release to the new release

Alongside this stable release, we’re also publishing a development preview of the next version: 4.20.0-okd-scos.ec.0 – now available on the 4-scos-next channel for early testing and feedback.
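For a disposable test cluster, opting into the preview channel is a short sketch like the following (cluster-admin required; `oc adm upgrade channel` is available in recent `oc` versions, and the ClusterVersion patch is the equivalent low-level form):

```shell
# Switch the cluster to the development-preview channel
oc adm upgrade channel 4-scos-next

# Equivalent: patch the ClusterVersion object directly
oc patch clusterversion version --type merge \
  -p '{"spec":{"channel":"4-scos-next"}}'

# Show the update targets now offered on this channel
oc adm upgrade
```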

We encourage users and contributors to test the new releases and share feedback via the OKD community channels. Stay tuned for more updates!

Say Hello @ KubeCon EU 2025

· One min read
Zed Spencer-Milnes
Co-chair, OKD Working Group

Members of the OKD Working Group are attending KubeCon EU!

We are looking forward to meeting existing and future users of OKD and talking to other members of the ecosystem about OKD.

OKD Meetup

  • When: 3:30pm - 4:30pm - Tuesday, April 1st 2025
  • Where: Crowne Plaza London Docklands
  • What: No set agenda, just a room to talk all things OKD and meet fellow community members!
  • Who: No preregistration required! Users and contributors of OKD are encouraged to attend

This follows immediately after Red Hat OpenShift Commons, which you can find out more about here. You do not need to attend Red Hat OpenShift Commons to join the OKD Meetup.

Hotel Address: Crowne Plaza London Docklands, Royal Victoria Dock, Western Gateway, London, E16 1AL, United Kingdom

OKD 4.17 and 4.16 releases

· 3 min read
Zed Spencer-Milnes
Co-chair, OKD Working Group

We are pleased to announce the release of OKD 4.17, alongside OKD 4.16 to allow upgrades for existing 4.15 clusters.

warning

4.16 is intended only as a pass-through for existing 4.15 clusters. Upgrading an existing 4.15 cluster will require manual intervention and special care due to major changes in how OKD is built and assembled, which have introduced various side effects.

You're late, why?

Yes, we are. OKD builds became polluted with RHEL content that was included in "payload components" (e.g. the cluster infrastructure operators, images, etc. that make up OKD). This was highlighted in Summer 2023, and heading into 2024 all OKD releases were stopped until the issue was addressed.

After significant work from a few engineers at Red Hat, all components that make up OKD should now be free of RHEL artifacts. This required building new infrastructure and processes, and chasing down issues caused by discrepancies between CentOS and RHEL. Most OKD components are now based on CentOS Stream as the base image layer (the license-free upstream of RHEL).

I want to install a new cluster

New cluster installations can follow the normal process. Downloads of client tools with the latest versions of OKD 4.17 embedded can be found here.

I want to upgrade an existing cluster

We recommend attempting upgrades from the latest released version of OKD FCOS 4.15 (4.15.0-0.okd-2024-03-10-010116).

Upgrading an existing 4.15 cluster will require manual intervention and special care due to major changes in how OKD is built and assembled, which have introduced various side effects.

There is a new area for upgrade notes covering upgrades from 4.15 through 4.17.
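In outline, the hop through the pass-through release looks like the sketch below. The version strings are illustrative placeholders, and the upgrade notes describe the manual interventions required around these commands:

```shell
# Confirm the cluster is on the latest 4.15 OKD FCOS release
oc get clusterversion version

# Step through the 4.16 pass-through release first
# (version strings are illustrative placeholders)
oc adm upgrade --to=4.16.0-okd-scos.0

# After the cluster settles and is healthy, continue to 4.17
oc adm upgrade --to=4.17.0-okd-scos.0
```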

Node operating systems are now based on CentOS Stream CoreOS (SCOS)

As part of this work we have also changed the node operating system to be based on CentOS Stream CoreOS (SCOS) rather than Fedora CoreOS (FCOS). It's worth noting that this work was not part of the OKD Streams project (where we produced concurrent releases for FCOS and SCOS), which has for now been suspended.

The build process for SCOS, and its assembly into OKD in versions greater than 4.16, is vastly different from how it happened as part of OKD Streams in versions 4.15 and below.

warning

There are known issues and regressions related to the move from FCOS to SCOS that may affect new and existing clusters. Please refer to OKD Upgrade Notes: From 4.15.

Special thanks

The OKD Working Group would like to thank Prashanth Sundararaman of Red Hat for their work.

OKD Pre-Release Testing July 2024

· One min read
Jaime Magiera
Co-chair, OKD Working Group

Last month, we announced the transition of all development efforts to OKD on SCOS as part of a plan to ensure OKD's longevity. As of a few weeks ago, nightly builds of OKD SCOS have begun to appear on the OpenShift CI system. We're encouraging the community to test these nightlies in non-production environments. Please note that these nightly pre-release builds are not guaranteed an upgrade path to final releases. These are only for testing purposes.

Additionally, please note that the OKD SCOS nightly builds from January-April 2024 should not be installed. These were just tests of the CI/CD process itself. Only the builds from July 2024 onward should be installed and tested.
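One way to pick up a nightly for testing is to copy a pullspec from the OpenShift CI release page and extract matching client tools. The registry path and tag below are illustrative placeholders, not a real build:

```shell
# Pullspec copied from the CI release page (placeholder tag shown)
RELEASE=registry.ci.openshift.org/origin/release-scos:4.17.0-0.okd-scos-placeholder

# Inspect the payload before touching any cluster
oc adm release info "$RELEASE"

# Extract openshift-install and oc built for this nightly
oc adm release extract --tools --from="$RELEASE" --to=./okd-tools
```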

You can find more information about our testing needs and how to report your results on the Community Testing page.

Please reach out to us with any questions.

Many thanks,

The OKD Working Group Co-Chairs

OKD Working Group Statement

· 3 min read
Jaime Magiera
Co-chair, OKD Working Group

We would like to take a moment to outline what's been happening the past few months in terms of OKD releases and what the future holds for the project.

In Summer of 2023, it came to the attention of Red Hat that licensed content was inadvertently being included in OKD releases. This necessitated a change to the OKD release materials. At the same time, the Working Group has been striving to increase the community's direct involvement in the build and release process. To address these concerns, Red Hat and the Working Group have spent the past few months collaborating on a path forward. This work involves moving OKD builds to a new system, changing the underlying OS, and exposing the new build and release process to community members.

After careful consideration, we've settled on using CentOS Stream CoreOS (SCOS) as the underlying operating system for the new builds. We've been working with SCOS since it was first announced at KubeCon U.S. 2022. There's a great opportunity with SCOS for the larger Open Source community to participate in improving OKD and further delineating it from other Kubernetes distributions. The builds will be for x86_64 only while we get our bearings. Given that rpm-ostree is the foundation of all modern OKD releases, many existing installations will be able to switch to the SCOS distribution in-place. We're working to outline that procedure in our documentation and identify any edge cases that may require more work to transition.

The payload for OKD on SCOS is now successfully building. There are still end-to-end tests which need to complete successfully and other housekeeping tasks before pre-release nightly builds can spin up an active cluster. We anticipate this happening within the next few weeks. At that point, members of the community will be able to download these nightly builds for testing and exploration purposes.

On the community involvement and engagement side of things, we'll be relaunching our website to align with the first official release of OKD on SCOS. That site will feature much clearer paths to the information users want to get their clusters up and running. We're redoubling our efforts to help homelabs, single-node, and other similar use cases get off the ground. Likewise, the new website will provide much clearer information on how community members can contribute to the project.

We appreciate everyone's patience over the past few months while we solidified the path forward. We wanted to be confident the pieces would fit together and bring about the desired results before releasing an official statement. From here on out, there will be regular updates on our website.

We understand that there will be lots of questions as this process moves forward. Please post those questions on this discussion thread. We will organize them into this Frequently Asked Questions page.

Many thanks,

The OKD Working Group Co-Chairs

State of affairs in OKD CI/CD

· 6 min read
Jakob Meng
Red Hat

OKD is a community distribution of Kubernetes which is built from Red Hat OpenShift components on top of Fedora CoreOS (FCOS) and recently also CentOS Stream CoreOS (SCOS). The OKD variant based on Fedora CoreOS is called OKD or OKD/FCOS. The SCOS variant is often referred to as OKD/SCOS.

The previous blog posts introduced OKD Streams and its new Tekton pipelines for building OKD/FCOS and OKD/SCOS releases. This blog post gives an overview of the current build and release processes for FCOS, SCOS and OKD. It outlines OKD's dependency on OpenShift, a remnant from the past when its Origin predecessor was a downstream rebuild of OpenShift 3, and concludes with an outlook on how OKD Streams will help users, developers, and partners experiment with future OpenShift.

Fedora CoreOS and CentOS Stream CoreOS

Fedora CoreOS is built with a Jenkins pipeline running in Fedora's infrastructure and is being maintained by the Fedora CoreOS team.

CentOS Stream CoreOS is built with a Tekton pipeline running in an OpenShift cluster on MOC's infrastructure and pushed to quay.io/okd/centos-stream-coreos-9. The SCOS build pipeline is owned and maintained by the OpenShift OKD Streams team, and SCOS builds are imported from quay.io into OpenShift CI as ImageStreams.

OpenShift payload components

At the time of writing, most payload components for OKD/FCOS and OKD/SCOS are mirrored from OCP CI releases. OpenShift CI (Prow and ci-operator) periodically builds OCP images, e.g. for OVN-Kubernetes. OpenShift's release-controller detects changes to image streams caused by recently built images, then builds and tests an OCP release image. When such a release image passes all non-optional tests (also see the release gating docs), the release image and other payload components are mirrored to origin namespaces on quay.io (release gating is subject to change). For example, at most every 3 hours an OCP 4.14 release image will be deployed (and upgraded) on AWS and GCP and afterwards tested with OpenShift's conformance test suite. When it passes the non-optional tests, the release image and its dependencies are mirrored to quay.io/origin (except for rhel-coreos*, *-installer, and some other images). These OCP CI releases are listed with a ci tag at amd64.ocp.releases.ci.openshift.org. Builds and promotions of nightly and stable OCP releases are handled differently (i.e. outside of Prow) by the Automated Release Tooling (ART) team.
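The mirrored images can be browsed without cluster access, e.g. with skopeo. The OVN-Kubernetes repository used below is one of the components named above, and the tag is illustrative:

```shell
# List tags published for the mirrored OVN-Kubernetes component
skopeo list-tags docker://quay.io/origin/ovn-kubernetes

# Inspect a single mirrored image's metadata (tag is illustrative)
skopeo inspect docker://quay.io/origin/ovn-kubernetes:4.14
```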

OKD payload components

A few payload components are built specifically for OKD though, for example OKD/FCOS' okd-machine-os. Unlike RHCOS and SCOS, okd-machine-os, the operating system running on OKD/FCOS nodes, is layered on top of FCOS (also see CoreOS Layering, OpenShift Layered CoreOS).

Note, some payload components have OKD specific configuration in OpenShift CI although the resulting images are not incorporated into OKD release images. For example, OVN-Kubernetes images are built and tested in OpenShift CI to ensure OVN changes do not break OKD.

OKD releases

When OpenShift's release-controller detects changes to OKD-related image streams (due to updates of FCOS/SCOS, an OKD payload component, or OCP payload components being mirrored after an OCP CI release promotion), it builds and tests a new OKD release image. When such an OKD release image passes all non-optional tests, the image is tagged as registry.ci.openshift.org/origin/release:4.14, etc. This CI release process is similar for OKD/FCOS and OKD/SCOS; compare these examples for OKD/FCOS 4.14 and OKD/SCOS 4.14. OKD/FCOS's and OKD/SCOS's CI releases are listed at amd64.origin.releases.ci.openshift.org.

Promotions for OKD/FCOS to quay.io/openshift/okd (published at github.com/okd-project/okd) and for OKD/SCOS to quay.io/okd/scos-release (published at github.com/okd-project/okd-scos) are done roughly every 2 to 3 weeks. For OKD/SCOS, OKD's release pipeline is triggered manually once a sprint to promote CI releases to 4-scos-{next,stable}.

OKD Streams and customizable Tekton pipelines

However, the OKD project is currently shifting its focus from doing downstream rebuilds of OCP to OKD Streams. As part of this strategic repositioning, OKD offers Argo CD workflows and Tekton pipelines to build CentOS Stream CoreOS (SCOS) (with okd-coreos-pipeline), to build OKD/SCOS (with okd-payload-pipeline), and to build operators (with okd-operator-pipeline). The OKD Streams pipelines were created to improve the RHEL 9 readiness signal for Red Hat OpenShift. They allow developers to build and compose different tasks and pipelines to easily experiment with OpenShift and related technologies. Both okd-coreos-pipeline and okd-operator-pipeline are already used in OKD's CI/CD, and in the future okd-payload-pipeline might supersede OCP CI for building OKD payload components and mirroring OCP payload components.