
Conversation

aleksandra-malinowska (Contributor):

Delta snapshot stores a chain of changes-since-last-fork (although forking more than once is illegal for now, following the basic snapshot's implementation). It's optimized for the frequent fork, add a bit of stuff, run lots of predicates, revert flow (a minimal sketch follows the list below) by:

  1. copy-on-write - Fork() and Revert() are O(1), at the cost of an O(n) Commit() and copying NodeInfo objects on some add/remove operations. In the pessimistic case where all the original nodes are modified (e.g. by AddPod()) after Fork(), the total copying cost reaches that of the basic snapshot's O(n) Fork().
  2. caching of frequently obtained lists of all nodes and pods.
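
Below is a minimal, self-contained sketch of the copy-on-write idea (illustrative only; the real snapshot tracks NodeInfos and added/modified/deleted deltas rather than these toy types):

    package main

    import "fmt"

    type nodeInfo struct {
        name string
        pods []string
    }

    // deltaSnapshot keeps an immutable base plus a small delta, so Fork() and
    // Revert() only touch the delta while Commit() pays the O(n) merge.
    type deltaSnapshot struct {
        base     map[string]*nodeInfo // shared with the base snapshot, never mutated while forked
        modified map[string]*nodeInfo // NodeInfos copied on first write since Fork()
        deleted  map[string]bool      // node names removed since Fork()
    }

    func newDeltaSnapshot() *deltaSnapshot {
        return &deltaSnapshot{
            base:     map[string]*nodeInfo{},
            modified: map[string]*nodeInfo{},
            deleted:  map[string]bool{},
        }
    }

    // Fork and Revert are O(1): they only reset the delta maps.
    func (s *deltaSnapshot) Fork()   { s.modified, s.deleted = map[string]*nodeInfo{}, map[string]bool{} }
    func (s *deltaSnapshot) Revert() { s.Fork() }

    // AddPod copies the affected NodeInfo into the delta on first modification
    // (this sketch assumes the node already exists in base).
    func (s *deltaSnapshot) AddPod(nodeName, podName string) {
        ni, ok := s.modified[nodeName]
        if !ok {
            base := s.base[nodeName]
            ni = &nodeInfo{name: base.name, pods: append([]string(nil), base.pods...)}
            s.modified[nodeName] = ni
        }
        ni.pods = append(ni.pods, podName)
    }

    // Commit folds the delta back into the base in O(n).
    func (s *deltaSnapshot) Commit() {
        for name := range s.deleted {
            delete(s.base, name)
        }
        for name, ni := range s.modified {
            s.base[name] = ni
        }
        s.Revert()
    }

    func main() {
        s := newDeltaSnapshot()
        s.base["n1"] = &nodeInfo{name: "n1"}
        s.Fork()
        s.AddPod("n1", "p1")
        fmt.Println(len(s.modified["n1"].pods), len(s.base["n1"].pods)) // 1 0 - the base is untouched
        s.Revert()                                                      // O(1), all changes dropped
    }

Supporting more than one fork would just mean a chain of such deltas, which is the "chain of changes-since-last-fork" mentioned above.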

@k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and size/XXL (denotes a PR that changes 1000+ lines, ignoring generated files) labels on Feb 5, 2020
@k8s-ci-robot requested a review from Jeffwan on February 5, 2020 14:43
@aleksandra-malinowska force-pushed the delta-snapshot-4 branch 3 times, most recently from 56a54ed to 6e82884 on February 5, 2020 17:17
@MaciekPytel (Contributor):

Can you remove the final commit (the one marked [DO NOT MERGE]) from this PR? I get what this commit is doing and why it's important for benchmarking, but we do not want to merge it :)

@aleksandra-malinowska (Contributor Author):

Can you remove the final commit (the one marked [DO NOT MERGE]) from this PR? I get what this commit is doing and why it's important for benchmarking, but we do not want to merge it :)

Done.

scheduler_listers "k8s.io/kubernetes/pkg/scheduler/listers"
scheduler_nodeinfo "k8s.io/kubernetes/pkg/scheduler/nodeinfo"
scheduler_volumebinder "k8s.io/kubernetes/pkg/scheduler/volumebinder"

Contributor:

nit: why add newline here?

Contributor Author:

Auto-formatted on save ¯\_(ツ)_/¯

GetAllNodes() ([]*apiv1.Node, error)
GetAllPods() []*apiv1.Pod
// GetAllNodes returns list of all the nodes in snapshot
GetAllNodes() []*schedulernodeinfo.NodeInfo
Contributor:

Doesn't that duplicate .NodeInfos().List(), which snapshot already exposes? I understand how version returning Nodes is different, but this one not so much.

Contributor Author:

It's mostly a relic of the original GetSchedulerListers() in the interface, which required error-checking twice before the list could be used. I think it's still nice to have a version without the error return value enforced by scheduler interfaces, but I don't feel very strongly about it.

Contributor:

Historical context makes sense, but I'd rather not have 2 getters for the same thing.
I may no longer be a python developer, but I still subscribe to 'there should be one (and preferably only one) obvious way to do it' part of zen of python :)

Contributor Author:

Removed GetAllNodes for consistency, based on 'special cases aren’t special enough to break the rules' - we'd only need this call in one place, where a more generic NodeInfo can also be used.

Contributor Author:

Also removed GetAllPods, since the same argument about two getters applies to it.

assert.NoError(b, err)
}

for _, pod := range scheduledPods {
Contributor:

It seems to me that scheduledPods is always empty?

Contributor Author:

It was :( Fixed.

b.ResetTimer()

for i := 0; i < b.N; i++ {
if stillPending, err := filterOutSchedulableByPacking(pendingPods, clusterSnapshot, predicateChecker, 10); err != nil {
Contributor:

Shouldn't you reset snapshot between runs (technically I guess fork+revert around filterOutSchedulable calls)?

Contributor Author:

I'll rename things to clarify this, but all of the pending pods are unschedulable in this scenario (too big for the existing nodes).

Fork() is problematic because, for the basic snapshot, it can dominate the cost (I've tried using b.StopTimer() and b.StartTimer(), but they seem to hang if the cost of the excluded operation is too big). In a real use case, we wouldn't be forking at this point, so including it in the measurement doesn't make much sense.
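
For illustration, roughly the loop shape the fork/revert suggestion would produce, with stand-in types (not the real snapshot, benchmark, or their signatures):

    package simulator_test

    import "testing"

    // Stubs standing in for the real snapshot and filtering code.
    type stubSnapshot struct{}

    func (s *stubSnapshot) Fork()   {} // O(n) for the basic snapshot, so it would dominate the measurement
    func (s *stubSnapshot) Revert() {}

    func filterOutSchedulableStub(s *stubSnapshot) {}

    func BenchmarkFilterWithForkRevert(b *testing.B) {
        snapshot := &stubSnapshot{}
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            snapshot.Fork() // not done per-iteration in the merged benchmark, for the reason above
            filterOutSchedulableStub(snapshot)
            snapshot.Revert()
        }
    }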

Contributor:

Makes sense if all pods are unschedulable. Maybe add a comment to make this part more obvious?

Contributor Author:

Done

assert.NoError(b, err)
} else {
if len(stillPending) < tc.pendingPods {
assert.Equal(b, len(stillPending), tc.pendingPods)
Contributor:

Is it even possible for this assertion not to fail?

Contributor Author:

See above - all pending pods in this benchmark are unschedulable (request 10x of the node type's memory).


for i := 0; i < b.N; i++ {
if stillPending, err := filterOutSchedulableByPacking(pendingPods, clusterSnapshot, predicateChecker, 10); err != nil {
assert.NoError(b, err)
Contributor:

Is this a strange way of failing the test? If so please just use b.Error()/b.Fatal() directly.

Contributor Author:

This is a very strange way of maintaining sanity checks (ignoring errors can cause the benchmark to succeed and report super-fast times if the test setup is faulty), while not defeating the purpose of the benchmarks (for some of the faster benchmarks, always doing assert.NoError seemed to increase the reported time per operation by 5x or more).

Contributor:

I don't mind explicit checks, but in this case I'd prefer calling b.Fatal() explicitly. It's much more obvious than a doomed-to-fail assertion.

Contributor Author (@aleksandra-malinowska, Feb 6, 2020):

b.Error()/b.Fatal() don't really print a nice message when passed the original error.

b.Error(fmt.Errorf("hello")):

    basic_cluster_snapshot_test.go:81: hello

assert.NoError(b, fmt.Errorf("hello")):

    basic_cluster_snapshot_test.go:81: 
        	Error Trace:	basic_cluster_snapshot_test.go:81
        	            				benchmark.go:190
        	            				benchmark.go:230
        	            				asm_amd64.s:1357
        	Error:      	Received unexpected error:
        	            	hello
        	Test:       	BenchmarkAddNodes/basic:_AddNode()_5000

Contributor Author:

I don't feel particularly strongly about this, other than that assert.NoError seems to work well for this case (and names the scenario which failed).

Contributor:

Ok, fair point about nicer stack trace. I would just include scenario name and all relevant data in Fatalf(), but this works too.
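
For reference, the shape being agreed on here, as a hypothetical helper (testify is already in use in these tests); the happy path costs only the comparison, and testify (or b.Fatalf) is reached only when something is actually wrong:

    package simulator_test

    import (
        "testing"

        "github.com/stretchr/testify/assert"
    )

    // runMeasured is illustrative only: it wraps a hot benchmark loop so the
    // error reporting cost is paid exclusively on the failure path.
    func runMeasured(b *testing.B, work func() error) {
        for i := 0; i < b.N; i++ {
            if err := work(); err != nil {
                assert.NoError(b, err) // prints the failing scenario name and a stack trace
                // or: b.Fatalf("benchmark setup is broken: %v", err)
            }
        }
    }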

// state of the cluster. Removes from the map nodes that are no longer in the
// nodes list.
func (sd *ScaleDown) updateUnremovableNodes(nodes []*apiv1.Node) {
func (sd *ScaleDown) updateUnremovableNodes(nodes []*schedulernodeinfo.NodeInfo) {
Contributor:

nit: It's probably not super important, but I think passing Node was more elegant than NodeInfo. The way I see it, NodeInfo is a Node + list of pods; if we don't care about pods we should use Node directly.

Contributor Author:

Returning list of Nodes requires snapshot to either build it on demand each time, or maintain two cached lists. It seemed simpler to just reuse NodeInfos here.

Contributor:

I don't mind building on demand - the cost should be negligible unless we're calling it in the middle of a loop. It feels like a very minor implementation inconvenience and I think using nodes is more elegant (encapsulate NodeInfos from pieces of code that don't need to know about them), but as I said I don't think it's critical. So I can let this be if you feel it's a better way.
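
The on-demand conversion being discussed is only a few lines, along these lines (hypothetical helper; import aliases follow the snippets above):

    package simulator

    import (
        apiv1 "k8s.io/api/core/v1"
        schedulernodeinfo "k8s.io/kubernetes/pkg/scheduler/nodeinfo"
    )

    // nodesFromNodeInfos derives the plain Node list from the cached NodeInfo
    // list instead of maintaining a second cached list.
    func nodesFromNodeInfos(nodeInfos []*schedulernodeinfo.NodeInfo) []*apiv1.Node {
        nodes := make([]*apiv1.Node, 0, len(nodeInfos))
        for _, nodeInfo := range nodeInfos {
            nodes = append(nodes, nodeInfo.Node())
        }
        return nodes
    }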

Contributor Author:

See above, got rid of GetAllNodes method

limitations under the License.
*/

package simulator
Contributor:

General comments related to this file:

  1. Can we rename it, given that it seems to test both basic and delta to the same extent?
  2. I'd like some more unittests (not only benchmarks).
  3. Maybe it would be worth moving benchmarks and correctness unittests to separate files?

Contributor Author:

  1. Done
  2. Not yet done
  3. Done

})
}
}
for snapshotName, snapshotFactory := range snapshots {
Contributor:

This seems to be an exact copy-paste of loop above?

Contributor Author:

It compares AddNode() vs AddNodes(). Originally I was trying to speed up adding nodes in batch, but it turned out to have negligible impact.

func BenchmarkForkAddRevert(b *testing.B) {
nodeTestCases := []int{1, 10, 100, 1000, 5000, 15000, 100000}
podTestCases := []int{0, 1, 30}
snapshots := map[string]snapshotFactory{
Contributor:

nit: Any reason why this couldn't be done once on module level?

Contributor Author:

Done

}
}

type snapshotFactory func() ClusterSnapshot
Contributor:

I'd prefer if this was defined before it's used.

Contributor Author:

Removed

_ = clusterSnapshot.AddNode(node)
}
forkNodes := clusterSnapshot.GetAllNodes()
assert.Equal(t, nodeCount+len(extraNodes), len(forkNodes))
Contributor:

It's fine as a sanity check in benchmarks, but in correctness tests I'd prefer to compare list contents too and not just check length.
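
A hypothetical helper showing one way to compare contents rather than lengths (testify's ElementsMatch ignores ordering; all names here are illustrative):

    package simulator

    import (
        "testing"

        "github.com/stretchr/testify/assert"
        apiv1 "k8s.io/api/core/v1"
        schedulernodeinfo "k8s.io/kubernetes/pkg/scheduler/nodeinfo"
    )

    // assertSnapshotNodes compares snapshot contents by node name instead of
    // only checking the slice length.
    func assertSnapshotNodes(t *testing.T, expected []*apiv1.Node, actual []*schedulernodeinfo.NodeInfo) {
        expectedNames := make([]string, 0, len(expected))
        for _, node := range expected {
            expectedNames = append(expectedNames, node.Name)
        }
        actualNames := make([]string, 0, len(actual))
        for _, nodeInfo := range actual {
            actualNames = append(actualNames, nodeInfo.Node().Name)
        }
        assert.ElementsMatch(t, expectedNames, actualNames)
    }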

t.Run(fmt.Sprintf("%s: fork should not affect base data: adding nodes", name),
func(t *testing.T) {
clusterSnapshot := snapshotFactory()
_ = clusterSnapshot.AddNodes(nodes)
Contributor:

assert.NoError() on all relevant calls.

Contributor Author:

Done.


func BenchmarkBuildNodeInfoList(b *testing.B) {
testCases := []struct {
nodeCount int
Contributor:

Why struct with a single int?

for _, tc := range testCases {
b.Run(fmt.Sprintf("fork add 1000 to %d", tc.nodeCount), func(b *testing.B) {
nodes := createTestNodes(tc.nodeCount + 1000)
snapshot := NewDeltaClusterSnapshot()
Contributor:

Why not run that for both implementations like other benchmarks?

}
})
}
for _, tc := range testCases {
Contributor:

Seems like the shared part is minimal - maybe split into two functions?

Contributor Author:

Done

// commit - O(n)
// list all pods (no filtering) - O(n), cached
// list all pods (with filtering) - O(n)
// list node infos - O(n), cached
Contributor:

Wish I'd get to write code where big-O considerations come into play :(

havePodsWithAffinity []*schedulernodeinfo.NodeInfo
}

var errNodeNotFound = fmt.Errorf("node not found")
Contributor:

nit: I tend to use errors.New() if there's no formatting (on the presumption that Somethingf() should not be used without formatting).
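
In other words, mirroring the two error sites visible in this diff:

    // Fixed message - errors.New is enough:
    var errNodeNotFound = errors.New("node not found")

    // Actual formatting - fmt.Errorf earns its keep:
    return fmt.Errorf("node %s already in snapshot", nodeInfo.Node().Name)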

var nodeInfoList []*schedulernodeinfo.NodeInfo

if len(data.deletedNodeInfos) > 0 {
nodeInfoList = make([]*schedulernodeinfo.NodeInfo, 0, totalLen+100)
Contributor:

Why +100? I don't see how the result could ever be longer than totalLen, so I'm guessing it's to optimize future addNode() calls? If so maybe make 100 a const and add a comment?

Contributor:

Ping.
Also: If I'm guessing right this feels premature - the amortized cost of add should hopefully already not be much in the general case. Especially in this particular case, as any cluster likely to have a 100-node expansion option will already have totalLen large enough that at most you should see a single re-alloc. Have you actually seen any impact of this (I don't mind keeping it as the memory cost is negligible, I'm just curious)?

Contributor Author:

Removed this for now. Will add this in another PR with benchmark, if it turns out it matters.

}
nodeInfoList = append(nodeInfoList, bni)
}
} else {
Contributor:

Is it really worth having a separate else branch here? Assuming lookups in empty map are reasonably cheap, it seems like we could get away with always using the first code branch.

Contributor Author:

With copy():
BenchmarkBuildNodeInfoList/fork_add_1000_to_1000-12 17793 33766 ns/op
BenchmarkBuildNodeInfoList/fork_add_1000_to_5000-12 13588 43650 ns/op
BenchmarkBuildNodeInfoList/fork_add_1000_to_15000-12 10000 75878 ns/op
BenchmarkBuildNodeInfoList/fork_add_1000_to_100000-12 2824 369051 ns/op

Without (always using the first branch):
BenchmarkBuildNodeInfoList/fork_add_1000_to_1000-12 8946 61269 ns/op
BenchmarkBuildNodeInfoList/fork_add_1000_to_5000-12 3625 157285 ns/op
BenchmarkBuildNodeInfoList/fork_add_1000_to_15000-12 840 686720 ns/op
BenchmarkBuildNodeInfoList/fork_add_1000_to_100000-12 54 9803347 ns/op

Multiply by number of node groups considered in scale-up simulation for real-world usage.
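
A sketch of the list building these numbers refer to, written as a standalone function (names approximate the added/modified/deleted structures discussed in this thread; not the merged implementation). The fast path is the single copy() of the base's cached slice; the slow path rebuilds the list while skipping deleted and modified base entries, which is also where the deduplication concern raised below has to be handled:

    package simulator

    import (
        schedulernodeinfo "k8s.io/kubernetes/pkg/scheduler/nodeinfo"
    )

    // buildNodeInfoListSketch merges the base snapshot's cached list with the
    // fork's added/modified/deleted deltas.
    func buildNodeInfoListSketch(
        base []*schedulernodeinfo.NodeInfo,
        added, modified map[string]*schedulernodeinfo.NodeInfo,
        deleted map[string]bool,
    ) []*schedulernodeinfo.NodeInfo {
        result := make([]*schedulernodeinfo.NodeInfo, 0, len(base)+len(added))
        if len(deleted) == 0 && len(modified) == 0 {
            // Fast path: nothing removed or touched in the fork, block-copy the cached base list.
            result = result[:len(base)]
            copy(result, base)
        } else {
            for _, ni := range base {
                name := ni.Node().Name
                if deleted[name] {
                    continue // dropped in this fork
                }
                if _, ok := modified[name]; ok {
                    continue // its copied-on-write version is appended below
                }
                result = append(result, ni)
            }
            for _, ni := range modified {
                result = append(result, ni)
            }
        }
        for _, ni := range added {
            result = append(result, ni)
        }
        return result
    }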

}

for _, dni := range data.nodeInfoMap {
nodeInfoList = append(nodeInfoList, dni)
Contributor:

Don't we need some deduplication somewhere? It seems to me that we'll end up with duplicate nodeInfo for any nodeInfo that was modified in fork (had a pod scheduled on it). Am I missing something obvious here?

If I'm right please also add a test case that would detect this.


func (data *internalDeltaSnapshotData) addNodeInfo(nodeInfo *schedulernodeinfo.NodeInfo) error {
if _, found := data.nodeInfoMap[nodeInfo.Node().Name]; found {
return fmt.Errorf("node %s already in snapshot", nodeInfo.Node().Name)
Contributor:

What if node was already present in base snapshot? Seems like it will be just silently overridden, which is inconsistent.

Contributor Author:

Fixed


if !foundInBase && !foundInDelta {
// Node not found in the chain.
return errNodeNotFound
Contributor:

I wonder if we even want to treat this as an error?
Context: #2709 (review).

Contributor Author:

I don't know. It feels like it should depend on our use of this method, but our current use of it is "never remove a node from snapshot", since scale down simulations don't yet use packing in snapshot. Can we resolve it when we migrate that part of the code?

chunkCount := len(data.nodeInfoMap) + len(basePodChunks)
podChunks := make([][]*apiv1.Pod, chunkCount, chunkCount)
copy(podChunks, basePodChunks)
for _, node := range data.nodeInfoMap {
Contributor:

Once again seems like some sort of deduplication is missing here. I think you're including pods from nodes that exist in base snapshot, but were deleted in fork. And also double-counting pods on nodeinfos that were modified in fork.

return podList
}

func (data *internalDeltaSnapshotData) getAllNodes() []*apiv1.Node {
Contributor:

I think this is no longer used?

Contributor Author:

Removed

limitations under the License.
*/

package simulator
Contributor:

This still feels like very limited coverage, especially for such a complex and easily-testable (isolated) data structure.
Some tests cases that would be nice to cover (I don't claim it's a complete list, just stuff I was able to list in 2 minutes):

  • Add pods, fork, add pods (current test cases don't mix pods added in fork and in base)
  • Add node with pods in fork (basic use-case in CA)
  • Remove nodes in fork
  • Add / remove pods outside of fork
  • Testing HavePodsWithAffinityList()
    • Including verifying that cache is correctly invalidated when needed
  • Test for Clear() (including when forked)
  • Maybe a basic test for FilteredList()
  • Modify NI and later delete it (ex. remove pod / delete node) to make sure modifiedNI is properly cleaned up when you delete node
  • Some test case which involves listing stuff in presence of non-empty addedNI, modifiedNI and deletedNI and making sure it all works together

Contributor Author (@aleksandra-malinowska, Feb 12, 2020):

Added some test cases, translating this into list so we can follow up:

  • Add pods, fork, add pods (current test cases don't mix pods added in fork and in base)
  • Add node with pods in fork (basic use-case in CA)
  • Remove nodes in fork
  • Add / remove pods outside of fork
  • Testing HavePodsWithAffinityList()
  • Including verifying that cache is correctly invalidated when needed
  • Test for Clear() (including when forked)
  • Maybe a basic test for FilteredList()
  • Modify NI and later delete it (ex. remove pod / delete node) to make sure modifiedNI is properly cleaned up when you delete node
  • Some test case which involves listing stuff in presence of non-empty addedNI, modifiedNI and deletedNI and making sure it all works together

return data.addNodeInfo(nodeInfo)
}

func (data *internalDeltaSnapshotData) addNodeInfo(nodeInfo *schedulernodeinfo.NodeInfo) error {
Contributor:

NodeInfo may contain pods (you're calling it on modified nodes in commit() and those will generally have pods). Seems like you need to clear pod caches in this case.

I don't mind having this assume no pods (and hence no cache invalidation), but in that case please state that in a comment and make sure to explicitly clear the caches in places where it's needed.

Contributor Author:

Fixed in addNodeInfo

data.removeNode(node.Node().Name)
}
if _, found := data.modifiedNodeInfoMap[node.Node().Name]; found {
data.removeNode(node.Node().Name)
Contributor:

If this node exists in base (which it does if we reach this call), you'll add it to removedNI inside removeNode(). In the follow-up addNodeInfo you will add it back to modifiedNIMap, but you won't clean it from removedNI, leading to inconsistent internal state.
Incidentally, you only ever call updateNode() on base data in commit(). You're probably better off getting rid of this method altogether, rather than trying to make it correct.

}

dni.AddPod(pod)
if data.podList != nil || data.havePodsWithAffinity != nil {
Contributor:

Why this if? Calling clearPodCaches when the condition is false does nothing, and I don't believe it carries any cost.

dni, found := data.getNodeInfoLocal(nodeName)
if !found {
bni, found := data.baseData.getNodeInfo(nodeName)
if !found {
Contributor:

This doesn't cover a node that was deleted in fork. You can add a pod to it, which will implicitly un-delete it.

Contributor Author:

Fixed

dni, found := data.getNodeInfoLocal(nodeName)
if !found {
bni, found := data.baseData.getNodeInfo(nodeName)
if !found {
Contributor:

Also not safe against a node deleted in the fork.


// Maybe consider deleting from the list in the future. Maybe not.
postAffinityPods := len(dni.PodsWithAffinity())
if preAffinityPods == 1 && postAffinityPods == 0 {
Contributor:

I wonder if we need this logic at all. I question how likely we are to call removePod() while having cached havePodsWithAffinity, but not podList.

return forkedData
}

func (data *internalDeltaSnapshotData) commit() *internalDeltaSnapshotData {
Contributor:

This is way more tricky than it seems once you start following the implementation of the helper methods. I would very much like to see a test that does all sorts of modifications on fork (add/delete pod on pre-existing nodes, add/delete pod on pre-existing node and later delete this node) and verifies that everything is applied correctly on commit.

}
for node := range data.deletedNodeInfos {
data.baseData.removeNode(node)
}
Contributor:

Invalidate caches on baseData

Contributor Author:

I'm having second thoughts about it :s If we're using the same methods as for other updates, we should be able to trust those methods to invalidate caches if needed.

Contributor:

And yet I don't :) As per my earlier comment addNodeInfo must either drop pod caches or explicitly assume nodeInfo it adds have no pods (in which case you should add a comment stating it and drop pod caches here).

Problematic sequence (maybe add some list calls to tests you already have to make them verify cache invalidation?):

snapshot.Pods().List()
snapshot.Fork()
snapshot.AddNode(node)
snapshot.AddPod(pod, node.Name)
snapshot.Commit()
snapshot.Pods().List() // my theory is that this will miss the pod we added in fork
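
Turned into a test, that sequence might look like this (a sketch only: the Fork/Commit/AddPod shapes, the Pods() lister and the BuildTestNode/BuildTestPod helpers are assumed to match what the rest of this PR uses):

    package simulator

    import (
        "testing"

        "github.com/stretchr/testify/assert"
        "k8s.io/apimachinery/pkg/labels"

        . "k8s.io/autoscaler/cluster-autoscaler/utils/test"
    )

    func TestCommitRefreshesPodList(t *testing.T) {
        snapshot := NewDeltaClusterSnapshot()
        node := BuildTestNode("node-1", 1000, 1000)
        pod := BuildTestPod("pod-1", 100, 100)

        podsBefore, err := snapshot.Pods().List(labels.Everything()) // populates the cached (empty) pod list
        assert.NoError(t, err)
        assert.Empty(t, podsBefore)

        snapshot.Fork()
        assert.NoError(t, snapshot.AddNode(node))
        assert.NoError(t, snapshot.AddPod(pod, node.Name))
        snapshot.Commit()

        podsAfter, err := snapshot.Pods().List(labels.Everything())
        assert.NoError(t, err)
        assert.Len(t, podsAfter, 1) // fails if Commit() left the stale cached list in place
    }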

assert.NoError(t, err)

err = snapshot.RemoveNode("node")
assert.NoError(t, err)
Contributor:

nit: assert empty NodeInfos().List()? No reason not to check it.

nodeCount: 15000,
},
}
for _, modifiedPodCount := range []int{0, 1, 100} {
Contributor:

nit: the pods aren't really modified; they're just added in fork

return snapshot
}

func TestForking(t *testing.T) {
Contributor:

+1 I think this is a nice test setup that makes it easy to cover a lot of cases.

}

func TestClear(t *testing.T) {
// Run with -count=1 to avoid caching.
Contributor:

Is that how we run tests normally?
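
For context, this is general go test behaviour rather than anything project-specific: successful test results are cached, and -count=1 is the documented way to bypass the cache, e.g. (path assumed):

    go test -count=1 ./simulator/...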


func TestClear(t *testing.T) {
// Run with -count=1 to avoid caching.
localRand := rand.New(rand.NewSource(time.Now().Unix()))
Contributor:

I have mixed feelings about randomized ut, but in this case I can't name a good argument against it (you log all randomly generated numbers, so failures are easily reproducible). So I guess it's fine.
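
A hypothetical helper in the spirit of this comment - derive the seed once and log it, so a failing randomized run can be replayed by pinning the printed value:

    package simulator

    import (
        "math/rand"
        "testing"
        "time"
    )

    // newLoggedRand seeds a rand.Rand from the clock and logs the seed for
    // reproducibility of randomized test failures.
    func newLoggedRand(t *testing.T) *rand.Rand {
        seed := time.Now().Unix()
        t.Logf("using random seed %d", seed)
        return rand.New(rand.NewSource(seed))
    }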

@MaciekPytel (Contributor):

/lgtm
/approve

@k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged) label on Feb 14, 2020
@k8s-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MaciekPytel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) label on Feb 14, 2020
@k8s-ci-robot merged commit af1dd84 into kubernetes:master on Feb 14, 2020