@vr4manta (Contributor) commented Sep 18, 2025

SPLAT-2206

Changes

  • Added static Dedicated Host support for AWS machines
  • Updated feature gate owner to rvanderp3 and component to splat

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 18, 2025
openshift-ci-robot commented Sep 18, 2025

@vr4manta: This pull request references SPLAT-2206 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

SPLAT-2206

Changes

  • Added static Dedicated Host support for AWS machines
  • Updated feature gate owner to rvanderp3 and component to splat

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci bot commented Sep 18, 2025

Hello @vr4manta! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 18, 2025
@JoelSpeed
Contributor

Does this API already exist upstream in CAPA?

@openshift-ci openshift-ci bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 18, 2025
@vr4manta
Contributor Author

> Does this API already exist upstream in CAPA?

@JoelSpeed Yes, this is already merged and pulled into OpenShift. Working on just the static version since dynamic is not finished upstream.

@everettraven
Contributor

/assign

@vr4manta vr4manta changed the title SPLAT-2206: Added AWS dedicated host support [WIP] SPLAT-2206: Added AWS dedicated host support Sep 19, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 19, 2025
openshift-ci bot commented Sep 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from everettraven. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vr4manta vr4manta force-pushed the SPLAT-2206 branch 2 times, most recently from 0fcff1c to b088b27 Compare September 19, 2025 13:33
// +kubebuilder:validation:MaxLength=19
// +openshift:enable:FeatureGate=AWSDedicatedHosts
// +optional
HostID *string `json:"hostID,omitempty"`
Contributor

What is the difference between setting this to "" and omitting the field entirely?

Contributor Author

There should be no difference. I would expect this field to be left unset if the user does not intend to place instances on a dedicated host.

Contributor

If there is no difference, this should not be a pointer and should have a minimum length of 1. This is probably what the linter is complaining about.

Contributor

Since this is validated by Go based webhooks, and not openapi, the linter is wrong on this one.

If we make this not a pointer, then the Go code has no way to know if this was deliberately set to "" or not. We don't want "" to be valid, so this needs to be a pointer so that we can check that.

In this case (and future cases like this in these providerspec APIs) we will want to make exceptions to the serialization rules on the linter.

We may want to even disable the serialization rules on these particular APIs somehow 🤔
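
A minimal sketch of the pointer argument above, assuming a simplified stand-in struct (`placement`) and hypothetical helpers (`validateHostID`, `decodeAndValidate`) that are not the actual MAPI webhook code. It illustrates that only a pointer field lets Go code tell an omitted hostID apart from one explicitly set to "":

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// placement is a hypothetical stand-in for the provider spec fragment under review.
type placement struct {
	// HostID is a pointer so that "absent" (nil) and "present but empty" ("")
	// remain distinguishable after decoding.
	HostID *string `json:"hostID,omitempty"`
}

// validateHostID mimics what a Go webhook could do; it is an illustration only.
func validateHostID(p placement) error {
	switch {
	case p.HostID == nil:
		return nil // field omitted: no dedicated host requested
	case *p.HostID == "":
		return errors.New("hostID: invalid value: must not be empty when set")
	default:
		return nil
	}
}

// decodeAndValidate decodes raw JSON into a placement and validates it.
func decodeAndValidate(raw string) error {
	var p placement
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return err
	}
	return validateHostID(p)
}

func main() {
	inputs := []string{`{}`, `{"hostID":""}`, `{"hostID":"h-abcdef0123456789a"}`}
	for _, raw := range inputs {
		fmt.Printf("%s -> %v\n", raw, decodeAndValidate(raw))
	}
}
```

With a plain `string` field, the first two inputs would both decode to "" and the webhook could not reject only the second.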

Contributor

Ah, I went into standard API review mode here and forgot this API uses webhook validation 🤦

Thanks for catching that!

> We may want to even disable the serialization rules on these particular APIs somehow

Can we do this via codegen configurations?

Contributor

> Can we do this via codegen configurations?

No, but we should be able to disable them using the .golangci-lint.yaml config. Ideally we could have a different config for the APIs that behave like this; these MAPI ones aren't the only ones (e.g. the aggregated APIs we support too).

// hostAffinity specifies the dedicated host affinity setting for the instance.
// When HostAffinity is set to host, an instance started onto a specific host always restarts on the same host if stopped.
// When HostAffinity is set to default, and you stop and restart the instance, it can be restarted on any available host.
// When HostAffinity is defined, HostID is required.
Contributor

Whether it is set to host or default, hostID will be required?

Also, when referring to a field use the serialized form of the field name as that is what end-users would be familiar with:

Suggested change
// When HostAffinity is defined, HostID is required.
// When HostAffinity is defined, hostID is required.

Contributor Author (@vr4manta, Sep 19, 2025)

Yes, if hostID is set, this field could be required. The idea was that if they do not set this field when hostID is set, it will default to host behavior.

Contributor

To clarify, hostID is required when hostAffinity is set to host and should be forbidden otherwise?

Contributor Author

Correct. I'll make sure this godoc states this. I have the webhook logic doing this already.

Comment on lines 119 to 127
// hostAffinity specifies the dedicated host affinity setting for the instance.
// When HostAffinity is set to host, an instance started onto a specific host always restarts on the same host if stopped.
// When HostAffinity is set to default, and you stop and restart the instance, it can be restarted on any available host.
// When HostAffinity is defined, HostID is required.
Contributor

For enum fields we generally try to follow the pattern of:

hostAffinity ...
Allowed values are Host, Default, and omitted.
When set to Host, ...
When set to Default, ...
When omitted, ...

Contributor Author

OK, I'll make this godoc update.

@everettraven (Contributor) left a comment

A couple small comments.

May have more pending the results of discussions on what the appropriate behaviors are when set to Host and AnyAvailable.

@vr4manta vr4manta changed the title [WIP] SPLAT-2206: Added AWS dedicated host support SPLAT-2206: Added AWS dedicated host support Oct 2, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 2, 2025
Comment on lines 111 to 131
// hostID specifies the Dedicated Host on which the instance must be started.
// This field is mutually exclusive with DynamicHostAllocation.
// When set, the value must be a valid AWS Dedicated Host ID in the form
// "h-" followed by 17 lowercase hexadecimal characters.
// The maximum length is 19 characters, and the field may be omitted.
// +kubebuilder:validation:XValidation:rule="self == null || self.matches('^h-[0-9a-f]{17}$')",message="hostID must start with 'h-' and end in 17 alphanumeric characters"
// +kubebuilder:validation:MaxLength=19
// +openshift:enable:FeatureGate=AWSDedicatedHosts
// +optional
HostID *string `json:"hostID,omitempty"`

// hostAffinity specifies the dedicated host affinity setting for the instance.
// Valid values are "AnyAvailable", "Host", and omitted.
// When HostAffinity is set to "Host", an instance started onto a specific host always restarts on the same host if stopped.
// When HostAffinity is set to "AnyAvailable", and you stop and restart the instance, it can be restarted on any available host.
// When HostAffinity is omitted and HostID is defined, the instance is started onto the specified host.
// When HostAffinity is defined, HostID is required.
// +kubebuilder:validation:MaxLength=64
// +openshift:enable:FeatureGate=AWSDedicatedHosts
// +optional
HostAffinity *HostAffinity `json:"hostAffinity,omitempty"`
Contributor

The more I look at this, the more I wonder if this should be nested another level and follow the discriminated union pattern.

It might make it easier to have configuration options be like:

...
dedicatedHost:
  affinity: Host
  host:
    id: h-017afcd

and

...
dedicatedHost:
  affinity: AnyAvailable

Then we can enforce requirements like dedicatedHost.host.id being required when setting dedicatedHost.affinity to Host and forbidding it otherwise.

Contributor Author

Sure, I discussed with @rvanderp3 and we can make it this way.
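
For reference, the hostID format rule quoted at the top of this thread (`^h-[0-9a-f]{17}$`) can be exercised in plain Go. This is a sketch only: `isValidHostID` is a hypothetical helper, not code from the PR, and the sample IDs are illustrative. Note that the pattern accepts lowercase hexadecimal only, and that abbreviated IDs such as the `h-017afcd` used in the YAML sketch above would not pass it.

```go
package main

import (
	"fmt"
	"regexp"
)

// hostIDPattern is the pattern taken from the XValidation rule quoted in the review.
var hostIDPattern = regexp.MustCompile(`^h-[0-9a-f]{17}$`)

// isValidHostID is a hypothetical helper that reports whether id has the form
// "h-" followed by exactly 17 lowercase hexadecimal characters.
func isValidHostID(id string) bool {
	return hostIDPattern.MatchString(id)
}

func main() {
	samples := []string{
		"h-abcdef0123456789a", // 17 lowercase hex characters
		"h-017afcd",           // too short
		"h-ABCDEF0123456789A", // uppercase is rejected
		"",                    // empty string is rejected
	}
	for _, id := range samples {
		fmt.Printf("%q valid=%t\n", id, isValidHostID(id))
	}
}
```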

@vr4manta vr4manta force-pushed the SPLAT-2206 branch 2 times, most recently from 06d98ae to 9589e77 Compare October 7, 2025 16:16
@everettraven (Contributor) left a comment

Thanks for making the change to a discriminated union - I definitely like this direction better!

// - Host: the instance must run on a specific host; set host.hostID.
// - omitted: if host.hostID is set, the instance runs on that specific host; otherwise no host constraint is applied.
// When hostAffinity is set, host.hostID is required for "Host" and must be omitted for "AnyAvailable".
// +kubebuilder:validation:MaxLength=64
Contributor

Discriminants are enums and do not need a MaxLength. On the HostAffinity Go type alias, add +kubebuilder:validation:Enum=AnyAvailable;Host and remove the +kubebuilder:validation:MaxLength marker from this set of markers
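
A sketch of what that suggestion could look like on the type alias. The marker and the two values come from the comment above; the constant names and the `isAllowed` helper are assumptions made for illustration. Because this API is validated by Go webhooks, as noted earlier in the review, an equivalent check in Go may still be needed alongside the marker.

```go
package main

import "fmt"

// HostAffinity selects how an instance is placed on a dedicated host.
// The Enum marker replaces the MaxLength marker on the field.
// +kubebuilder:validation:Enum=AnyAvailable;Host
type HostAffinity string

const (
	// HostAffinityAnyAvailable lets the platform pick any available dedicated host.
	HostAffinityAnyAvailable HostAffinity = "AnyAvailable"
	// HostAffinityHost pins the instance to one specific dedicated host.
	HostAffinityHost HostAffinity = "Host"
)

// isAllowed is a hypothetical helper mirroring the enum check in Go code.
func isAllowed(a HostAffinity) bool {
	switch a {
	case HostAffinityAnyAvailable, HostAffinityHost:
		return true
	}
	return false
}

func main() {
	for _, a := range []HostAffinity{"AnyAvailable", "Host", "host", ""} {
		fmt.Printf("%q allowed=%t\n", a, isAllowed(a))
	}
}
```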

Contributor Author

I can, but I was hitting an error on this before where it was requiring it. I'll try again and see how it goes.

Comment on lines 370 to 374
// Valid values are "AnyAvailable", "Host", and omitted.
// - AnyAvailable: the platform selects any available dedicated host; do not set host.
// - Host: the instance must run on a specific host; set host.hostID.
// - omitted: if host.hostID is set, the instance runs on that specific host; otherwise no host constraint is applied.
// When hostAffinity is set, host.hostID is required for "Host" and must be omitted for "AnyAvailable".
Contributor

We have a general pattern we try to follow for consistency across OpenShift APIs.

Could we update this to look like this?:

hostAffinity selects how the instance is placed on a dedicated host.
Allowed values are AnyAvailable and Host.
When set to AnyAvailable, the platform selects any available dedicated host.
When set to Host, the instance must run on a specific host denoted by host.hostID.
host is required when hostAffinity is set to Host, and forbidden otherwise.

Contributor Author

Sure, I can remove the formatting and use sentences.

// hostAffinity selects how the instance is placed on a dedicated host.
// Valid values are "AnyAvailable", "Host", and omitted.
// - AnyAvailable: the platform selects any available dedicated host; do not set host.
// - Host: the instance must run on a specific host; set host.hostID.
Contributor

I'm wondering if Host makes the most sense here or if it is too stutter-y.

As an example:

dedicatedHost:
  hostAffinity: Host
  host:
    hostID: h-1326af

That has "host" 5 times in 4 lines.

Maybe something like:

dedicatedHost:
  affinity: Specific
  specific:
    id: h-1326af

is better here?

Contributor Author

Well, above we said to remove host from affinity, but I do not like "specific" being used. That feels very odd. So this goes back to not doing a discriminated union, given how much more complex this is now looking. With this, affinity states whether the instance is to be assigned to a dedicated host, and then we just need a field to specify the host ID. Currently we do not have other info to provide for the host, so it may start to feel like overkill.

Contributor

Maybe taking a step back here would be helpful.

Maybe I misunderstood, but my interpretation was that the original hostAffinity value was meant to only be used with dedicated hosts. Is that incorrect?

Contributor Author

No, you understand correctly. The complexity is coming from the discriminated union additions, but I just want to find verbiage that makes sense and is clear. So allow me to explain the history and ideas.

The AWS documentation (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/moving-instances-dedicated-hosts.html) describes the placement logic for an instance. It's confusing there too. Here is an example:

aws ec2 modify-instance-placement \
    --instance-id i-1234567890abcdef0 \
    --affinity host \
    --tenancy host \
    --host-id h-012a3456b7890cdef

So what we are adding / added in CAPA upstream was this: kubernetes-sigs/cluster-api-provider-aws#5548

So for MAPI, we are adding just enough to support the mapi2capi conversion logic so we can create these resources. Ideally I was hoping to keep our API similar for simplicity (the original concept at the beginning of the PR), but I am happy to have it follow the OCP ideals (I did similar things for CAPV for multi-disk and others). With all of this, I do like the idea of using the discriminated union pattern. Building on our ideas, and with an eye on what's coming next, we could combine them as follows:

[Any Host Placement]

hostPlacement:
  hostAffinity: AnyAvailable

[Static Placement]

hostPlacement:
  hostAffinity: DedicatedHost
  DedicatedHost:
    id: h-abcdef0123456789a

[Dynamic Host Placement]

hostPlacement:
  hostAffinity: DynamicHost
  DynamicHost:
    tags:
    - app: myApp
    - department: whatever

Maybe this is more in line with what you are thinking. I know @rvanderp3 is working on adding dynamic host support upstream at the moment and that will require some additional data or changed fields. Maybe this better sets us up for that.

What are your thoughts?

Contributor

Thanks for the clarifications here - with more context what you've shared makes a lot of sense to me. Let's go with what you've proposed (presumably dropping the DynamicHost support until that is added upstream).

Only changes I'd suggest here are:

  • hostAffinity -> affinity
  • DedicatedHost -> Dedicated
  • In the future, DynamicHost -> Dynamic

Because it is already nested under hostPlacement dropping "host" from those field names should still logically make sense and reduce the repetition of the "host" term.
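
To make the agreed shape concrete, here is a rough Go sketch of how the union might serialize. It assumes the discriminator value stays `DedicatedHost` while the member field becomes `dedicated`, which matches the later snippets in this review. The type name `DedicatedHost`, the constant names, and the `render` helper are illustrative assumptions, not the merged API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HostAffinity is the union discriminator.
type HostAffinity string

const (
	HostAffinityAnyAvailable  HostAffinity = "AnyAvailable"
	HostAffinityDedicatedHost HostAffinity = "DedicatedHost"
)

// DedicatedHost identifies one specific host (hypothetical type name).
type DedicatedHost struct {
	ID string `json:"id"`
}

// HostPlacement sketches the discriminated union: affinity selects the member,
// and dedicated is only meaningful when affinity is DedicatedHost.
type HostPlacement struct {
	Affinity  HostAffinity   `json:"affinity"`
	Dedicated *DedicatedHost `json:"dedicated,omitempty"`
}

// render is a hypothetical helper that returns the JSON form of a placement.
func render(p HostPlacement) string {
	out, err := json.Marshal(p)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(render(HostPlacement{Affinity: HostAffinityAnyAvailable}))
	fmt.Println(render(HostPlacement{
		Affinity:  HostAffinityDedicatedHost,
		Dedicated: &DedicatedHost{ID: "h-abcdef0123456789a"},
	}))
}
```

A future dynamic member could be added as another optional pointer field without changing the existing ones, which is the main appeal of the union layout here.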

Contributor Author

Sounds good, I'll mock up those changes.

Comment on lines 380 to 419
// host specifies a particular dedicated host when required by hostAffinity or when
// hostAffinity is omitted and you want to target a specific host.
// Must be omitted when hostAffinity is "AnyAvailable".
Contributor

I think we want something more like:

host specifies the exact host that an instance should run on.
host is required when hostAffinity is set to Host, and forbidden otherwise.

Contributor Author

I'll make the change.

@vr4manta vr4manta force-pushed the SPLAT-2206 branch 4 times, most recently from c36f492 to c503f95 Compare October 14, 2025 16:19
Affinity *HostAffinity `json:"affinity,omitempty"`

// dedicated specifies a particular dedicated host when required by affinity set to DedicatedHost.
// Must be omitted when hostAffinity is "AnyAvailable".
Contributor

We prefer to use the terminology:

dedicated is required when 'affinity' is set to DedicatedHost, and forbidden otherwise.

Additionally, we should add CEL validation that enforces this behavior.
As an example, see

// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'RequiredMember' ? has(self.requiredMember) : !has(self.requiredMember)",message="requiredMember is required when type is RequiredMember, and forbidden otherwise"
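
Adapted to this PR, the same rule could look roughly like the marker in the comment below, with a Go function mirroring it for the webhook path. Both are sketches: the adapted CEL expression is an untested adaptation of the example above, and `validateHostPlacement` plus the simplified types are hypothetical rather than the PR's code.

```go
package main

import (
	"errors"
	"fmt"
)

type HostAffinity string

type DedicatedHost struct {
	ID string `json:"id"`
}

// Adapted marker (assumption, not taken from the PR):
// +kubebuilder:validation:XValidation:rule="has(self.affinity) && self.affinity == 'DedicatedHost' ? has(self.dedicated) : !has(self.dedicated)",message="dedicated is required when affinity is DedicatedHost, and forbidden otherwise"
type HostPlacement struct {
	Affinity  HostAffinity   `json:"affinity"`
	Dedicated *DedicatedHost `json:"dedicated,omitempty"`
}

// validateHostPlacement mirrors the rule in Go: dedicated must be set exactly
// when affinity is DedicatedHost.
func validateHostPlacement(p HostPlacement) error {
	wantsDedicated := p.Affinity == "DedicatedHost"
	switch {
	case wantsDedicated && p.Dedicated == nil:
		return errors.New("dedicated is required when affinity is DedicatedHost")
	case !wantsDedicated && p.Dedicated != nil:
		return errors.New("dedicated is forbidden when affinity is not DedicatedHost")
	}
	return nil
}

func main() {
	host := &DedicatedHost{ID: "h-abcdef0123456789a"}
	cases := []HostPlacement{
		{Affinity: "DedicatedHost", Dedicated: host},
		{Affinity: "DedicatedHost"},
		{Affinity: "AnyAvailable"},
		{Affinity: "AnyAvailable", Dedicated: host},
	}
	for _, c := range cases {
		fmt.Printf("affinity=%s dedicatedSet=%t -> %v\n", c.Affinity, c.Dedicated != nil, validateHostPlacement(c))
	}
}
```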

Contributor Author

I'll update the parent to have this information in addition to the tags mentioned in the example for the discriminator and members.

openshift-ci bot commented Oct 15, 2025

@vr4manta: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

// +union
type HostPlacement struct {
// affinity specifies the affinity setting for the instance.
// Allowed values are AnyAvailable and DedicatedHost
Contributor

Nit: consistency for ending of a sentence

Suggested change
// Allowed values are AnyAvailable and DedicatedHost
// Allowed values are AnyAvailable and DedicatedHost.

// When Affinity is set to AnyAvailable, and you stop and restart the instance, it can be restarted on any available host.
// +required
// +unionDiscriminator
Affinity HostAffinity `json:"affinity,omitempty"`
Contributor

Because this is validated in a Go webhook, do you need this to be a pointer so you can explicitly distinguish between not set and intentionally set to the empty string value ("") and return the appropriate field error (i.e., required vs. invalid value)?
