InformerEventSource throwing NPE if the watched resource is deleted #830
Comments
This seems to be an issue with the informer in the fabric8 client. We will implement a quick fix on our side and address it there.
For now I was not able to reproduce this locally; I will try further. Putting @manusa and @shawkins on cc. What happens is that the new object is null on update. Is it expected in any situation that the new object is null?
I think the question is why |
True, delete should have been called.
Following the watcher logic (no idea if this is what happened on the Informer side), for regular deletes, the first step is to add the |
Yes, but the old and new object pair should still be there; in this case the new object will be the one that is "marked for deletion". At least this is what I saw before.
From the watcher perspective yes. But I'm not sure if the Informer might be merging events. |
ok, thank you!! |
It's not generally possible for the newObject to be null. The incoming event is processed by the reflector. There is an explicit null check there: https://github.com/fabric8io/kubernetes-client/blob/9e8ae39711143648158f555757296c613175c819/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/informers/cache/Reflector.java#L122

That same object reference will be passed all the way to your event handler with the stacktrace you are showing. The resourceVersion is read by the Watcher as well, so we know that the same call did not throw an NPE at that point.

Since the resource is added to the store prior to the handler being called, it is possible that other logic is modifying the object first. Is it possible that the operator framework is setting the metadata to null? Or are you sure that the new object is null? That would be very surprising.
Maybe we should split the line on which the NPE occurs so that we have a better idea of what is null?
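A minimal sketch of what splitting the failing line could look like. The `Resource` and `Meta` classes and the `resourceVersionOf` helper are hypothetical stand-ins for illustration, not the fabric8 model classes or the actual line from the stack trace. The idea is only that each dereference gets its own statement and its own message, so the exception says which part was null.

```java
import java.util.Objects;

// Hypothetical stand-ins for a Kubernetes resource and its metadata;
// these are NOT the fabric8 model classes.
class NullCheckSketch {
    static class Meta {
        String resourceVersion;
        Meta(String resourceVersion) { this.resourceVersion = resourceVersion; }
    }

    static class Resource {
        Meta metadata;
        Resource(Meta metadata) { this.metadata = metadata; }
    }

    // Instead of one chained expression such as
    //   newObject.metadata.resourceVersion
    // every step is checked separately and carries its own message.
    static String resourceVersionOf(Resource newObject) {
        Objects.requireNonNull(newObject, "newObject is null");
        Meta metadata = Objects.requireNonNull(newObject.metadata, "newObject.metadata is null");
        return Objects.requireNonNull(metadata.resourceVersion,
                "newObject.metadata.resourceVersion is null");
    }

    public static void main(String[] args) {
        try {
            // Metadata is present, but its resourceVersion was set to null.
            resourceVersionOf(new Resource(new Meta(null)));
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With checks like these, a log entry would distinguish a null object from null metadata or a null resourceVersion, which is the question the comments above are trying to answer.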
The issue still persists for me, but with a little more debugging I believe I found the root cause: I'm also creating or updating the Deployment in the reconciliation loop. The loop is triggered by the Deployment being deleted, the Deployment is then created again in the reconciliation loop, and so on. Probably I'm just doing something wrong. :D But it's basically a race condition that happens only sometimes, when the reconciliation is too fast.
So this is happening due to modifications to the cached resource. It might not even be the operator logic making the modification. This was touched on in #3078 but not really addressed.

Essentially the logic in createOrReplace, replace, or patch is allowed to modify the passed in object, in particular the resourceVersion. For example, using an item directly from the cache in createOrReplace will set the resourceVersion to null: https://github.com/fabric8io/kubernetes-client/blob/32c6a88f029ba64b1f4225c5f122f2247b1b74d7/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/utils/CreateOrReplaceHelper.java#L46

I'm thinking we need fabric8 to not do this. To be safe we'll need to clone those objects before modifying: fabric8io/kubernetes-client#3756

It has also been discussed whether the objects obtained from the cache should already be cloned or write protected to prevent any future modifications. Only cloning would be easy to implement. The only downside would be the general performance overhead. It's also possible to consider adding methods that would differentiate get vs getDirect.
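The effect described above can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in: `Resource`, the `CACHE` map, and this `createOrReplace` only mimic the reported behaviour (a helper that clears the resourceVersion on the object it receives) and are not the fabric8 implementation. The `get` and `getDirect` names follow the idea floated in the comment, assuming `get` returns a copy and `getDirect` returns the cached instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of an informer cache; NOT the fabric8 API.
class CacheCloneSketch {
    static class Resource {
        final String name;
        String resourceVersion;

        Resource(String name, String resourceVersion) {
            this.name = name;
            this.resourceVersion = resourceVersion;
        }

        // Defensive copy so callers can mutate the result freely.
        Resource copy() {
            return new Resource(name, resourceVersion);
        }
    }

    static final Map<String, Resource> CACHE = new ConcurrentHashMap<>();

    // Mimics the reported behaviour: the passed-in object is modified.
    static void createOrReplace(Resource resource) {
        resource.resourceVersion = null;
    }

    // Returns the cached instance itself; mutations leak into the cache.
    static Resource getDirect(String name) {
        return CACHE.get(name);
    }

    // Returns a copy; the cached instance stays untouched.
    static Resource get(String name) {
        Resource cached = CACHE.get(name);
        return cached == null ? null : cached.copy();
    }

    public static void main(String[] args) {
        CACHE.put("demo", new Resource("demo", "42"));

        createOrReplace(get("demo"));
        System.out.println("after copy:   " + getDirect("demo").resourceVersion);

        createOrReplace(getDirect("demo"));
        System.out.println("after direct: " + getDirect("demo").resourceVersion);
    }
}
```

In this model, passing a copy leaves the cached resourceVersion intact, while passing the cached instance clears it for every later reader, including an event handler that runs afterwards. The copy costs an allocation per read, which is the performance trade-off mentioned in the comment.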
Thanks for the updates! |
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. |
This issue was closed because it has been stalled for 14 days with no activity. |
Bug Report
What did you do?
What did you expect to see?
No NPE. ;)
What did you see instead? Under which circumstances?
Environment
Kubernetes cluster type:
vanilla
$ Mention java-operator-sdk version from pom.xml file
2.0.0-RC1 (commit: fd6e493)
$ java -version
openjdk version "11.0.12" 2021-07-20
OpenJDK Runtime Environment Homebrew (build 11.0.12+0)
OpenJDK 64-Bit Server VM Homebrew (build 11.0.12+0, mixed mode)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:33:37Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.3", GitCommit:"c92036820499fedefec0f847e2054d824aea6cd1", GitTreeState:"clean", BuildDate:"2021-10-27T18:35:25Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}
Possible Solution
Additional context