Skip to content

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Jul 9, 2025

Optional keys of a list map (= associative lists) keys are simply left out of the set of keys, which is different from a key with an empty value like "" for a string and obviously also different from a non-empty value. The comparison of values already supported that and the comparison of list values supported lists with different number of entries.

Completely empty key field lists continue to trigger an error ("associative list with keys has an element that omits all key fields (and doesn't have default values for any key fields)".

Downgrading from a version which has support for a new optional key to a version which doesn't works as long as the optional key is not used, because the ManagedFields don't mention the new key and field and there are no list entries which have it set. It does not work when the new field and key are used because the older version doesn't know that it needs to consider the new key, as the key is not listed in the older version's OpenAPI spec.

This is considered acceptable because new fields will be alpha initially and downgrades with an alpha feature enabled are not required to work. It is worth calling out in release notes, though.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 9, 2025
@k8s-ci-robot k8s-ci-robot requested a review from cici37 July 9, 2025 10:15
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 9, 2025
@k8s-ci-robot k8s-ci-robot requested a review from yongruilin July 9, 2025 10:15
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 9, 2025
@pohly
Copy link
Contributor Author

pohly commented Jul 9, 2025

/assign @yongruilin @jpbetz

The current code is based on v4.1.0 because then I can use it in Kubernetes, see https://github.com/pohly/kubernetes/pull/new/apimachinery-list-map-keys. It conflicts with master regarding some imports in set.go and, more importantly, a version for master should use generics. Those are not possible with v4.7.0 because of the old Go version in go.mod.

I have it working like this in Kubernetes, but the TestEnsureNamedFieldsAreMembers in set_tests.go fails here. Somehow as it recursively descends, the schema and field path don't align like they do when the same is done in Kubernetes.

Note that the test is already fishy without my changes: the schema uses a key which isn't defined ("name"), and the path contains elements which don't exist in the schema ("a"). But perhaps I misunderstand something? Help definitely welcome!

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pohly
Once this PR has been reviewed and has the lgtm label, please ask for approval from jpbetz. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

fieldpath/set.go Outdated
}
atom, _ := sc.Resolve(tr)
members.members = append(members.members, s.Members.members...)
if atom.List != nil {
Copy link
Contributor Author

@pohly pohly Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some debug printf here to figure out why the code works in Kubernetes but not in the unit test:

	fmt.Printf("Set: %s\nMembers: %s\nTypeRef: %+v\nAtom: %+v\n\n", s.String(), func() string {
		var members []string
		for _, member := range s.Members.members {
			members = append(members, member.String())
		}
		return strings.Join(members, ", ")
	}(), tr, atom)

In Kubernetes with dlv test ./test/integration/apimachinery/apply/ -- -test.v on my https://github.com/pohly/kubernetes/pull/new/apimachinery-list-map-keys branch, I get when I hit the new code:

Set: [a=1,b="x"]
[a=2,b="x"]
[a=1,b="x"].a
[a=1,b="x"].b
[a=1,b="x"].data
[a=2,b="x"].a
[a=2,b="x"].b
[a=2,b="x"].data
Members: [a=1,b="x"], [a=2,b="x"]
TypeRef: {NamedType:<nil> Inlined:{Scalar:<nil> List:0xc001a8d4a0 Map:<nil>} ElementRelationship:<nil>}
Atom: {Scalar:<nil> List:0xc001a8d4a0 Map:<nil>}

Note that List and Members are both non-empty.

When I do the same with the test's NewSet(_P("values", KeyByFields("keyAStr", "a", "keyBInt", 0), "value")),, I get instead:

Set: [keyAStr="a",keyBInt=0].value
Members: 
TypeRef: {NamedType:<nil> Inlined:{Scalar:<nil> List:0xc000124cd0 Map:<nil>} ElementRelationship:<nil>}
Atom: {Scalar:<nil> List:0xc000124cd0 Map:<nil>}

Note the empty members!

Interestingly, the Set.String result (first line after Set:) looks the same.

At this point my conclusion is that _P = MakePathOrDie simply doesn't produce a Path that is structured like the ones which occur when Kubernetes calls this package. Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather, NewSet(MakePath(...)) doesn't produce the right Set.

Copy link
Contributor

@jpbetz jpbetz Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. We could switch the tests to use typed.Parse and typed.ToFieldSet to avoid any future manual path construction mistakes in this code, but I don't think we need to do that in this PR. What you have in this PR looks correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except that tests aren't passing... Are you suggesting to disable the testing of this for now to unblock the PR, or to switch to a different construction method for the set? I need to look into what typed.Parse and typed.ToFieldSet do.

Also, if NewSet(MakePath(...)) is broken, what does that mean for the rest of the code, independent of this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I figured out what you were suggesting with typed.Parse and typed.ToFieldSet:

  • convert the schema with typed.NewParser
  • create a value from YAML/JSON with parser.Type(typeName).FromYAML
  • create the input set with ToFieldSet

The test is now using this approach. I had to move it into the fieldpath_test package to avoid the import cycle.

What you have in this PR looks correct.

Looks can be deceiving 🥵

With the code as-is, I got missing keys backfilled only in some paths in the set:

        set_test.go:868: expected after EnsureNamedFieldsAreMembers:
            .values
            .values[keyAStr="a",keyBInt=null]
            .values[keyAStr="a",keyBInt=null].keyAStr
            .values[keyAStr="a",keyBInt=null].value
            
            got:
            .values
            .values[keyAStr="a",keyBInt=null]
            .values[keyAStr="a"].keyAStr
            .values[keyAStr="a"].value
            
            missing:
            .values[keyAStr="a",keyBInt=null].keyAStr
            .values[keyAStr="a",keyBInt=null].value
            
            superfluous:
            
            .values[keyAStr="a"].keyAStr
            .values[keyAStr="a"].value

The paths added by EnsureNamedFieldsAreMembers lacked the keyBInt=null. This didn't affect my integration-level tests in Kubernetes, but it didn't look right.

My solution is to also backfill in SetNodeMap.EnsureNamedFieldsAreMembers. This feels a bit redundant because missing keys get inserted multiple times (at different levels, for different children), but I don't see a good way to avoid that without making assumptions about which path keys are shared.

Unit tests are passing now. I also verified that the integration tests in https://github.com/pohly/kubernetes/pull/new/apimachinery-list-map-keys continue to pass.

As far as I am concerned, this is ready to be released, but please check that I didn't miss anything and the update describe above makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think we should backfill at all, both to simplify skew concerns with older API servers handling modified managedFields entries, and so we don't make existing data persist differently

Copy link
Contributor

@jpbetz jpbetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic looks correct. Once TODOs are cleared and linters are appeased, I'm OK to merge.

Are we okay with treating this as new functionality from a semver perspective and publishing it out as a minor version bump? I'd prefer not to have to publish a new major for this.

fieldpath/set.go Outdated
}
atom, _ := sc.Resolve(tr)
members.members = append(members.members, s.Members.members...)
if atom.List != nil {
Copy link
Contributor

@jpbetz jpbetz Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. We could switch the tests to use typed.Parse and typed.ToFieldSet to avoid any future manual path construction mistakes in this code, but I don't think we need to do that in this PR. What you have in this PR looks correct.

fieldpath/set.go Outdated
Comment on lines 139 to 149
// PathElement.Key is sorted alphabetically. We can use that for
// a fast lookup with binary search and, if not found, must insert
// at the indicated index.
//
// TODO: On master with more recent Go, switch to these generics:
// index, found := slices.BinarySearchFunc(keys, key, func(field value.Field, fieldName string) int {
// return strings.Compare(field.Name, fieldName)
// })
// if !found {
// keys = slices.Insert(keys, index, value.Field{Name: key, Value: value.NewValueInterface(nil)})
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know a scan is fine for built-in types (never more than two or three keys AFAIK). I suppose a CRD MIGHT have more in theory, but I doubt that actually happens in practice, so while I'm fine with binary search, but I don't see it as being particularly essential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. It's primarily to avoid reimplementing something that exists in the standard library, not so much about performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved while rebasing.

@jpbetz
Copy link
Contributor

jpbetz commented Jul 16, 2025

The current code is based on v4.1.0 because then I can use it in Kubernetes, see pohly/kubernetes/pull/new/apimachinery-list-map-keys. It conflicts with master regarding some imports in set.go and, more importantly, a version for master should use generics. Those are not possible with v4.7.0 because of the old Go version in go.mod.

ACK. Is there anything we need to do to this repo to unblock?

@pohly pohly force-pushed the optional-map-keys branch from 8067b74 to 87af414 Compare July 16, 2025 09:31
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 16, 2025
pohly added a commit to pohly/kubernetes that referenced this pull request Jul 16, 2025
@pohly
Copy link
Contributor Author

pohly commented Jul 16, 2025

The current code is based on v4.1.0 because then I can use it in Kubernetes, see pohly/kubernetes/pull/new/apimachinery-list-map-keys.

ACK. Is there anything we need to do to this repo to unblock?

I could rebase because Kubernetes is using v6 now, so I can test there with a PR based on this repo's master.

@pohly pohly changed the title WIP: support optional map keys support optional map keys Jul 16, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 16, 2025
typed/helpers.go Outdated
return pe, fmt.Errorf("associative list with keys has an element that omits key field %q (and doesn't have default value)", fieldName)
// The null value compares equal to null and not equal to any non-null value (see CompareUsing),
// so it is a valid key value for a key map.
keyMap = append(keyMap, value.Field{Name: fieldName, Value: value.NewValueInterface(nil)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to have a nullable: true item which is used as a key field?

I don't love that this removes the distinction between "absent" and "present and nullable and null"

I also don't like how this changes the calculated managedFields entries for existing data in ways older servers likely won't understand.

If we simply make this branch a no-op instead of an error case, and fix the keyMap comparison so that any two items that don't have identical keyMap Name entries automatically are considered to not match, does that work?

fieldpath/set.go Outdated
}

// EnsureNamedFieldsAreMembers returns a set that contains all the named fields along with the leaves.
// Missing keys in list maps are backfilled with null values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think we should add explicit null values for missing keys, but should fix comparisons instead to treat non-identical key sets as not equal. That will probably require changing the docs and maybe name of this method to something more like EnsureExistingNamedFieldsAreMembers and making sure callers are ok with subsets of the named fields

if i > 0 {
first := true
for _, field := range *pe.Key {
if field.Value.IsNull() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we stop adding synthetic nulls, I think this code goes away

fieldpath/set.go Outdated
// included. For example, a set made of "a.b.c" will end-up also owning
// "a" if it's a named fields but not "a.b" if it's a map.
//
// Missing keys in list maps are backfilled with null values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's stop adding synthetic nulls everywhere and just focus on making sure callers / comparisons correctly handle non-identical key entries in sets

fieldpath/set.go Outdated
}
atom, _ := sc.Resolve(tr)
members.members = append(members.members, s.Members.members...)
if atom.List != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think we should backfill at all, both to simplify skew concerns with older API servers handling modified managedFields entries, and so we don't make existing data persist differently

list:
elementRelationShip: associative
keys: ["name"]
elementRelationship: associative
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how much of the test changes came from fixing this typo and how many were optional additions / modifications made? ideally, we'd fix bugs in the test separately, and avoid optional modifications concurrent with changing the behavior being tested

// in the associate list.
}
}
keyMap.Sort()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be a good idea to make sure we can't end up with a completely empty keyMap by doing a check like this after the loop:

	if len(list.Keys) > 0 && len(keyMap) == 0 {
		return pe, fmt.Errorf("associative list with keys has an element that omits all key fields %q (and doesn't have default values for any key fields)", list.Keys)
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, will be in next update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed.

@jpbetz
Copy link
Contributor

jpbetz commented Jul 16, 2025

Can we also add a test like https://gist.github.com/jpbetz/87c5c9b5a244741be5100dc77bfe595f to merge/schema_change_test.go?

@pohly pohly force-pushed the optional-map-keys branch 2 times, most recently from 3940bba to cd2b883 Compare July 16, 2025 17:18
@jpbetz
Copy link
Contributor

jpbetz commented Jul 16, 2025

Can we also add a test like gist.github.com/jpbetz/87c5c9b5a244741be5100dc77bfe595f to merge/schema_change_test.go?

Nevermind, I'll merge the tests I want separately after this PR merges via: #298 (they pass when stacked on this PR)

@liggitt
Copy link
Contributor

liggitt commented Jul 16, 2025

Testing in kubernetes/kubernetes#132998 ahead of merge

Optional keys of a list map (= associative lists) keys are simply left out
of the set of keys, which is different from a
key with an empty value like "" for a string and obviously also different from
a non-empty value. The comparison of values already supported that and the
comparison of list values supported lists with different number of entries.

Completely empty key field lists continue to trigger an error ("associative
list with keys has an element that omits all key fields <quoted list of
fields> (and doesn't have default values for any key fields)".

Downgrading from a version which has support for a new optional key to a
version which doesn't works as long as the optional key is not used, because
the ManagedFields don't mention the new key and field and there are no list
entries which have it set. It does not work when the new field and key are
used because the older version doesn't know that it needs to consider the new
key, as the key is not listed in the older version's OpenAPI spec.

This is considered acceptable because new fields will be alpha initially and
downgrades with an alpha feature enabled are not required to work. It is worth
calling out in release notes, though.
@pohly pohly force-pushed the optional-map-keys branch from cd2b883 to bc4c51f Compare July 16, 2025 20:20
@liggitt
Copy link
Contributor

liggitt commented Jul 16, 2025

merged in #298

/close

@k8s-ci-robot
Copy link
Contributor

@liggitt: Closed this PR.

In response to this:

merged in #298

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants