Skip to content

Conversation

@perdasilva
Copy link
Contributor

Description

Just having some fun experimenting with Claude and sketching changes to OLM:

  • ClusterExtensionRevision status: Substitute all conditions for just one: Ready + add phase level status
  • Move ProgressionTimeout to ClusterExtension (might make sense errors like bundle image pulling issues, etc.)
  • Make RevisionStates just a list of RevisionMetadata and drive Progressing and Resolution changes from that
  • Add ClusterExtension generation to list of ClusterExtensionRevision annotations and let direct changes to the CE trigger the resolver even if there revisions rolling out (manual interventions to get CEs unstuck)

Note: All of this was generated through prompting 🤣

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

Copilot AI review requested due to automatic review settings February 11, 2026 15:06
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 11, 2026
@openshift-ci
Copy link

openshift-ci bot commented Feb 11, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign pedjak for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify
Copy link

netlify bot commented Feb 11, 2026

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit 8adcfc3
🔍 Latest deploy log https://app.netlify.com/projects/olmv1/deploys/698c9e435376580008edfc86
😎 Deploy Preview https://deploy-preview-2503--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@perdasilva
Copy link
Contributor Author

/hold not to be merged

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 11, 2026
…el status reporting

Consolidate the three separate revision conditions (Available, Progressing,
Succeeded) into a single "Ready" condition with granular reason codes
(Ready, Reconciling, ProbeFailure, ValidationFailed, ObjectCollision,
ProgressDeadlineExceeded, Transitioning, RollingOut, Archived). This
simplifies the revision status model while providing more specific failure
information.

Key changes:
- Replace Available/Progressing/Succeeded conditions with a unified Ready
  condition on ClusterExtensionRevision
- Add per-phase status reporting (phaseStatuses) with states: Applied,
  Progressing, Failed, Pending, Transitioning - including failing probe
  details per object
- Move progress deadline checking from the revision controller to the
  ClusterExtension reconcile steps, tracking deadlines via sync.Map per
  ClusterExtension UID
- Remove progressDeadlineMinutes from ClusterExtensionRevision spec (now
  driven by ClusterExtension spec only)
- Refactor RevisionStates from a struct with Installed/RollingOut fields
  to a sorted slice with Installed()/RollingOut()/Latest() accessor methods,
  adding State and ClusterExtensionGeneration fields to RevisionMetadata
- Track ClusterExtension generation in revision annotations to enable
  generation-aware resolution skipping during active rollouts
- Add ServiceAccount watch to revision controller for parent
  ClusterExtension
- Update CRD printer columns from Available/Progressing to Ready/Reason
- Add test catalog entries (test-operator 1.0.2 and 1.0.3)
- Update all tests and generated manifests/CRDs

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR is an experimental playground for significant architectural changes to OLM's ClusterExtensionRevision status model. The changes aim to simplify the status conditions and move certain configuration from the revision level to the extension level.

Changes:

  • Consolidates three ClusterExtensionRevision conditions (Available, Progressing, Succeeded) into a single Ready condition with multiple reasons
  • Moves ProgressDeadlineMinutes configuration from ClusterExtensionRevision to ClusterExtension level
  • Refactors RevisionStates from a struct with separate Installed/RollingOut pointers to a flat slice with helper methods
  • Adds ClusterExtension generation tracking on revisions via annotations to enable spec change detection
  • Introduces detailed phaseStatuses field with per-phase state, probe failures, and transition tracking

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
api/v1/clusterextensionrevision_types.go Defines new Ready condition with comprehensive reasons, adds phaseStatuses types, removes progressDeadlineMinutes field
api/v1/clusterextension_types.go Minor formatting cleanup of constant definitions
api/v1/validation_test.go Removes tests for progressDeadlineMinutes validation (moved to ClusterExtension level)
api/v1/zz_generated.deepcopy.go Adds generated DeepCopy methods for new phaseStatuses types
internal/operator-controller/labels/labels.go Adds ClusterExtensionGenerationKey constant for tracking extension generation
internal/operator-controller/controllers/clusterextension_controller.go Refactors RevisionStates to slice type with helper methods, adds generation tracking to both Helm and Boxcutter getters
internal/operator-controller/controllers/clusterextension_reconcile_steps.go Implements CheckProgressDeadline step at ClusterExtension level, adds generation-based resolution skip logic
internal/operator-controller/controllers/clusterextensionrevision_controller.go Replaces multiple condition types with single Ready condition, removes revision-level progress deadline logic, adds comprehensive phase status tracking, adds ServiceAccount watch
internal/operator-controller/controllers/boxcutter_reconcile_steps.go Updates revision state getter to use Ready condition, implements new Progressing condition logic based on revision states
internal/operator-controller/controllers/common_controller.go Updates to use new RevisionStates slice API and Ready condition
internal/operator-controller/controllers/*_test.go Updates all tests to use new condition names and RevisionStates API
internal/operator-controller/applier/boxcutter.go Removes progressDeadlineMinutes propagation, updates migration status to use Ready condition
cmd/operator-controller/main.go Adds CheckProgressDeadline step to both reconciler configurations
manifests/*.yaml Updates CRDs with new condition documentation and phaseStatuses schema
helm/olmv1/base/operator-controller/crd/experimental/*.yaml Mirrors CRD changes in Helm charts
testdata/images/catalogs/test-catalog/v1/configs/catalog.yaml Adds intermediate test bundle versions for testing
Comments suppressed due to low confidence (2)

internal/operator-controller/controllers/suite_test.go:66

  • When m.Err is not nil, returning nil as the first parameter creates a nil RevisionStates slice. However, the actual GetRevisionStates implementations (BoxcutterRevisionStatesGetter and HelmRevisionStatesGetter) return nil as the error value when successful, not as the RevisionStates value. This inconsistency could cause issues where nil RevisionStates is returned on error but the error handling code might expect an empty slice instead. Consider returning an empty RevisionStates{} instead of nil for consistency with the type being a slice.
func (m *MockRevisionStatesGetter) GetRevisionStates(ctx context.Context, ext *ocv1.ClusterExtension) (controllers.RevisionStates, error) {
	if m.Err != nil {
		return nil, m.Err
	}
	return m.RevisionStates, nil

internal/operator-controller/controllers/clusterextensionrevision_controller_test.go:885

  • The test case slice for Test_ClusterExtensionRevisionReconciler_Reconcile_ProgressDeadline is empty (line 853: }{} ). This test function was likely meant to contain test cases for progress deadline functionality that was moved from ClusterExtensionRevision to ClusterExtension level. Either remove this empty test function or add appropriate test cases for any remaining revision-level progress deadline behavior.
func Test_ClusterExtensionRevisionReconciler_Reconcile_ProgressDeadline(t *testing.T) {
	const (
		clusterExtensionRevisionName = "test-ext-1"
	)

	testScheme := newScheme(t)
	require.NoError(t, corev1.AddToScheme(testScheme))

	for _, tc := range []struct {
		name            string
		existingObjs    func() []client.Object
		revisionResult  machinery.RevisionResult
		validate        func(*testing.T, client.Client)
		reconcileErr    error
		reconcileResult ctrl.Result
	}{} {
		t.Run(tc.name, func(t *testing.T) {
			// create extension and cluster extension
			testClient := fake.NewClientBuilder().
				WithScheme(testScheme).
				WithStatusSubresource(&ocv1.ClusterExtensionRevision{}).
				WithObjects(tc.existingObjs()...).
				Build()

			// reconcile cluster extension revision
			mockEngine := &mockRevisionEngine{
				reconcile: func(ctx context.Context, rev machinerytypes.Revision, opts ...machinerytypes.RevisionReconcileOption) (machinery.RevisionResult, error) {
					return tc.revisionResult, nil
				},
			}
			result, err := (&controllers.ClusterExtensionRevisionReconciler{
				Client:                testClient,
				RevisionEngineFactory: &mockRevisionEngineFactory{engine: mockEngine},
				TrackingCache: &mockTrackingCache{
					client: testClient,
				},
			}).Reconcile(t.Context(), ctrl.Request{
				NamespacedName: types.NamespacedName{
					Name: clusterExtensionRevisionName,
				},
			})
			require.Equal(t, tc.reconcileResult, result)
			require.Equal(t, tc.reconcileErr, err)

			tc.validate(t, testClient)
		})
	}
}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 92 to 120
@@ -108,16 +111,16 @@ func Test_ClusterExtensionRevisionReconciler_Reconcile_RevisionReconciliation(t
Name: clusterExtensionRevisionName,
}, rev)
require.NoError(t, err)
cond := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterExtensionRevisionTypeAvailable)
cond := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterExtensionRevisionTypeReady)
require.NotNil(t, cond)
require.Equal(t, metav1.ConditionUnknown, cond.Status)
require.Equal(t, metav1.ConditionFalse, cond.Status)
require.Equal(t, ocv1.ClusterExtensionRevisionReasonReconciling, cond.Reason)
require.Equal(t, "some error", cond.Message)
require.Equal(t, int64(1), cond.ObservedGeneration)
},
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name says "Ready condition is updated to Unknown on error if its been already set" but the test validates that the condition status is ConditionFalse, not ConditionUnknown. This is inconsistent with the test name. Either update the test name to reflect that it's now False (not Unknown), or clarify the intended behavior in the test name.

Copilot uses AI. Check for mistakes.
Comment on lines +325 to +339
// Add the parent ClusterExtension's ServiceAccount to the watch list
ownerName, ok := rev.Labels[labels.OwnerNameKey]
if ok {
ext := &ocv1.ClusterExtension{}
if err := c.Client.Get(ctx, client.ObjectKey{Name: ownerName}, ext); err == nil {
// Watch the ServiceAccount referenced by the parent ClusterExtension
gvks.Insert(schema.GroupVersionKind{
Group: corev1.GroupName,
Version: "v1",
Kind: "ServiceAccount",
})
}
// If we can't get the ClusterExtension, continue without adding the ServiceAccount watch
// The revision will still watch its own objects
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ServiceAccount watch addition is overly broad and inefficient. This code unconditionally adds ServiceAccount to the watch list for ALL revisions that have a parent ClusterExtension, regardless of whether the actual bundle content references any ServiceAccounts. This will cause the revision controller to watch all ServiceAccount changes in the cluster, potentially leading to unnecessary reconciliations. Consider only adding the ServiceAccount watch when the bundle actually contains ServiceAccount resources, similar to how other GVKs are added from the phase objects.

Copilot uses AI. Check for mistakes.
Comment on lines +468 to +480
} else if isRollingOut {
// Rolling out - check if the latest active revision is stuck
// The latest revision is the one currently rolling out
if len(ext.Status.ActiveRevisions) > 0 {
latestRevStatus := ext.Status.ActiveRevisions[len(ext.Status.ActiveRevisions)-1]
readyCond := apimeta.FindStatusCondition(latestRevStatus.Conditions, ocv1.ClusterExtensionRevisionTypeReady)
if readyCond != nil && readyCond.Status == metav1.ConditionFalse {
// Latest revision is not ready - check how long it's been stuck
timeSinceLastTransition = time.Since(readyCond.LastTransitionTime.Time)
isStuck = true
}
}
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The progress deadline logic only checks the Ready condition's LastTransitionTime for the latest active revision, but doesn't account for revisions that might not have reached the active state yet. If a revision is stuck before becoming active (e.g., in validation or early rollout phases), the deadline check will not detect it because it only examines ext.Status.ActiveRevisions. Consider also checking revisions that are in the RollingOut state but not yet in ActiveRevisions to ensure comprehensive deadline enforcement.

Copilot uses AI. Check for mistakes.
Comment on lines +171 to +184
// Set Progressing condition based on revision states
// If there's an installed revision and no rolling revisions, we've reached the desired state
if state.revisionStates.Installed() != nil && len(state.revisionStates.RollingOut()) == 0 {
setStatusProgressing(ext, nil)
} else {
// There are rolling revisions, so we're still progressing
SetStatusCondition(&ext.Status.Conditions, metav1.Condition{
Type: ocv1.TypeProgressing,
Status: metav1.ConditionTrue,
Reason: ocv1.ReasonRollingOut,
Message: "Rolling out new revision",
ObservedGeneration: ext.GetGeneration(),
})
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Progressing condition logic may be incorrect when there are both installed and rolling out revisions. Lines 173-174 set Progressing to Succeeded when there's an installed revision and no rolling revisions. However, lines 176-184 set Progressing to RollingOut when there are rolling revisions, even if there's also an installed revision. This means the Progressing condition doesn't reflect the overall state accurately - it will show RollingOut even when there's a stable installed revision available. The logic should consider whether the rolling revision is newer than the installed one, and potentially maintain a Succeeded state if the installed revision is already at the desired state.

Copilot uses AI. Check for mistakes.
Comment on lines +142 to 164
// Mirror Ready condition from the installed revision
if i := state.revisionStates.Installed(); i != nil {
if cnd := apimeta.FindStatusCondition(i.Conditions, ocv1.ClusterExtensionRevisionTypeReady); cnd != nil {
cnd.ObservedGeneration = ext.GetGeneration()
apimeta.SetStatusCondition(&ext.Status.Conditions, *cnd)
}
ext.Status.Install = &ocv1.ClusterExtensionInstallStatus{
Bundle: i.BundleMetadata,
}
ext.Status.ActiveRevisions = []ocv1.RevisionStatus{{Name: i.RevisionName}}
}
for idx, r := range state.revisionStates.RollingOut {
for idx, r := range state.revisionStates.RollingOut() {
rs := ocv1.RevisionStatus{Name: r.RevisionName}
for _, cndType := range []string{ocv1.ClusterExtensionRevisionTypeAvailable, ocv1.ClusterExtensionRevisionTypeProgressing} {
if cnd := apimeta.FindStatusCondition(r.Conditions, cndType); cnd != nil {
cnd.ObservedGeneration = ext.GetGeneration()
apimeta.SetStatusCondition(&rs.Conditions, *cnd)
}
if cnd := apimeta.FindStatusCondition(r.Conditions, ocv1.ClusterExtensionRevisionTypeReady); cnd != nil {
cnd.ObservedGeneration = ext.GetGeneration()
apimeta.SetStatusCondition(&rs.Conditions, *cnd)
}
// Mirror Progressing condition from the latest active revision
if idx == len(state.revisionStates.RollingOut)-1 {
if pcnd := apimeta.FindStatusCondition(r.Conditions, ocv1.ClusterExtensionRevisionTypeProgressing); pcnd != nil {
pcnd.ObservedGeneration = ext.GetGeneration()
apimeta.SetStatusCondition(&ext.Status.Conditions, *pcnd)
// Mirror Ready condition from the latest active revision
if idx == len(state.revisionStates.RollingOut())-1 {
if rcnd := apimeta.FindStatusCondition(r.Conditions, ocv1.ClusterExtensionRevisionTypeReady); rcnd != nil {
rcnd.ObservedGeneration = ext.GetGeneration()
apimeta.SetStatusCondition(&ext.Status.Conditions, *rcnd)
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Ready condition is being mirrored from both the installed revision (lines 143-147) and the latest rolling revision (lines 160-164), which could lead to conflicting condition updates. If there's an installed revision with Ready=True and a rolling revision with Ready=False, the final state will depend on the order of execution. Since the rolling revision logic executes after the installed revision logic, the ClusterExtension's Ready condition will reflect the rolling revision's state, which might be confusing to users who expect Ready=True to indicate the installed bundle is ready. Consider using a separate condition type or clarifying the semantics of which revision's condition should be mirrored to the ClusterExtension.

Copilot uses AI. Check for mistakes.
Comment on lines +526 to +533
func (rs RevisionStates) Installed() *RevisionMetadata {
for _, r := range rs {
if r.State == RevisionStateInstalled {
return r
}
}
return nil
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Installed() method returns the first revision with State == RevisionStateInstalled it finds in the slice (lines 527-530), but if multiple revisions could have this state (as identified in the previous issue with BoxcutterRevisionStatesGetter), this method will return an arbitrary one rather than the correct one. Since RevisionStates is documented as sorted ascending by revision number, consider either: 1) iterating in reverse to find the latest installed revision, or 2) ensuring only one revision can be in Installed state at a time through better state determination logic.

Copilot uses AI. Check for mistakes.
// - When status is False and reason is ProbeFailure, one or more objects are failing their readiness probes.
// - When status is False and reason is ValidationFailed, the revision failed preflight validation checks.
// - When status is False and reason is ObjectCollision, the revision encountered object ownership collisions.
// - When status is False and reason is ProgressDeadlineExceeded, the revision has not completed within the specified progress deadline.
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation states "the revision has not completed within the specified progress deadline" for ProgressDeadlineExceeded reason, but the progressDeadlineMinutes field has been removed from ClusterExtensionRevision spec and moved to ClusterExtension level. This documentation should be updated to clarify that the progress deadline is now controlled at the ClusterExtension level, not the revision level.

Suggested change
// - When status is False and reason is ProgressDeadlineExceeded, the revision has not completed within the specified progress deadline.
// - When status is False and reason is ProgressDeadlineExceeded, the revision has not completed within the progress deadline configured on the parent ClusterExtension.

Copilot uses AI. Check for mistakes.
Comment on lines +66 to +69
state := RevisionStateRollingOut
if apimeta.IsStatusConditionTrue(rev.Status.Conditions, ocv1.ClusterExtensionRevisionTypeReady) {
state = RevisionStateInstalled
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revision state determination logic only checks if Ready=True to mark a revision as Installed (line 67-68). However, this doesn't account for scenarios where multiple revisions might have Ready=True simultaneously, which could happen during a rollout where an old revision is still ready while a new one becomes ready. This could result in multiple revisions being marked as "Installed" state, which violates the expectation that only one revision should be installed at a time. Consider adding logic to identify which revision should be the singular "Installed" one based on additional criteria like revision number or whether it's actively serving traffic.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant