Re-design equivalence class cache to two level cache #65714
Conversation
(force-pushed from ec39e38 to 014d756)
@@ -51,7 +51,7 @@ type schedulerCache struct {
period time.Duration

// This mutex guards all fields within this cache struct.
mu sync.Mutex
mu sync.RWMutex
Comment for reviewer: this change is to improve cache.IsUpToDate
Did we forget to approve/merge the PR where someone else made this change? /=
Nope, that previous PR only improved ecache, not cache :D
So, the way I see it, we are making 2 changes as part of this PR:
- Changing scheduler cache's lock from Mutex to RWMutex
- Changing ecache to be 2 level cache.
So the flame graphs attached could be the result of both changes? Am I missing something?
The scheduler cache refactoring is just a bonus :D The performance improvement is brought by the two-level cache, which is the core idea of this patch.
cc @bsalamat @misterikkit Seems CI is happy now, please take a look :D
If I understand correctly, we want to avoid taking the top-level RLock() multiple times when running the various predicates for one node. If that's the case, I think we should structure the equivalence pkg API around that (which is basically what you've done). I just think the names of things could better reflect how we expect users to use them. For example, how about this?
eCache := equivalence.NewCache()
class := equivalence.NewClass(pod)
nodeCache := eCache.ForNode(nodeName)
fit, reasons, err := nodeCache.RunPredicate(p, pod, class, etc...)
The type NodeCache would have only one exported function and no exported fields (maybe it's an interface). AddNode is done lazily by ForNode. RemoveNode could keep its old name of InvalidateAllPredicatesOnNode.
// TopLevelEquivCache is a thread safe nodeMap
// NOTE(harry): Theoretically sync.Map has better performance in machine with 8+ CPUs, while
// the reality is lock contention in first level cache is rare.
type TopLevelEquivCache struct {
The exported symbols from this package should be "Cache" and "NewCache". The top level struct/constructor should take those names, while we rename the old ones.
Side note: To avoid "stutter" we should not put "equiv" or "equivalence" in exported symbols of this package.
And NodeCache, we need to pass this type as a parameter in generic_scheduler.
@@ -33,6 +33,152 @@ import (
"github.com/golang/glog"
)

// nodeMap is type for a map of {nodeName: *Cache}
This comment is mostly redundant. We know it's a type, a map, and that the value is *Cache. How about something like, "nodeMap stores a *Cache for each node."
Fixed, thanks!
// GetOrCreateCache returns the second level Cache for given node name.
// If the cache does not exist, create it.
// This should be called in generic scheduler before predicate phase begins for given node.
func (n *TopLevelEquivCache) GetOrCreateCache(name string) *Cache {
Does the API surface of this package need to change for this improvement? RunPredicate had the side effect of creating new cache items if needed.
I'm ok with improvements to the eCache API, but this seems like more management work will be put on generic scheduler.
As RunPredicate is now only handled by NodeCache, the first level cache has to be managed by the generic scheduler.
@@ -41,14 +187,14 @@ import (
// class". (Equivalence class is defined in the `Class` type.) Saved results
// will be reused until an appropriate invalidation function is called.
type Cache struct {
mu sync.RWMutex
cache nodeMap
sync.RWMutex
Why remove the 'mu' here? By embedding the type, people outside this package can start calling (*Cache).Lock(), which I don't think is intended.
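For illustration, here is a minimal standalone sketch (toy types, not this PR's code) of why the embedding matters: the embedded mutex's methods are promoted into the outer type's method set, so any importer of the package could lock the cache directly.

package main

import "sync"

// EmbeddedCache embeds sync.RWMutex, which promotes Lock/Unlock/RLock/RUnlock
// into EmbeddedCache's method set. Any importer of the package could call
// (*EmbeddedCache).Lock() directly.
type EmbeddedCache struct {
	sync.RWMutex
	items map[string]int
}

// NamedCache keeps the mutex in an unexported field, so locking stays an
// internal detail of the defining package.
type NamedCache struct {
	mu    sync.RWMutex
	items map[string]int
}

func main() {
	e := &EmbeddedCache{items: map[string]int{}}
	e.Lock() // promoted method: callable from outside the package
	e.items["node-1"] = 1
	e.Unlock()

	n := &NamedCache{items: map[string]int{}}
	n.mu.Lock() // only reachable where the unexported field "mu" is visible
	n.items["node-1"] = 1
	n.mu.Unlock()
}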
@@ -355,21 +355,27 @@ func (g *genericScheduler) findNodesThatFit(pod *v1.Pod, nodes []*v1.Node) ([]*v
// We can use the same metadata producer for all nodes.
meta := g.predicateMetaProducer(pod, g.cachedNodeInfoMap)

var equivClass *equivalence.Class
var (
A single var can be on a single line.
Done! Also, refactored functions to new names :-) Please review.
(force-pushed from e36cdd4 to f505798)
This looks great! I left mostly style/naming comments, and they don't all need to be addressed. There are a couple API/threading questions though.
// nodeMap stores a *Cache for each node.
type nodeMap map[string]*NodeCache

// Cache is a thread safe nodeMap
Because the Cache type is the main entry point of this package, I think we should keep a lot of the previous type comment with it, i.e. the fact that it saves and reuses predicate results, and the explanation of the node/predicate/class hierarchy. (Those comments were moved to the NodeCache type, but I think it's okay if we repeat ourselves a bit.)
Also, this is the second significant API change I've seen that is driven by performance needs. I think we should add to the comment on this type that users are expected to precompute the equivalence.Class value and pre-lookup the NodeCache within one scheduling cycle.
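For illustration, the kind of type comment being asked for might read roughly like this (a sketch, not the text that was merged):

// Cache saves and reuses the results of predicate functions. Results are
// keyed by node, then by predicate, then by the pod's equivalence class,
// and are reused until an appropriate invalidation function is called.
//
// For performance, callers are expected to compute the pod's
// equivalence.Class once per scheduling cycle and to look up each node's
// NodeCache once via ForNode, rather than going through the top-level
// Cache on every predicate call.
type Cache struct {
	// ...
}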
Good point, I will clarify these with more comments.
// ForNode returns the existing NodeCache for given node if present. Otherwise,
// it creates the NodeCache and returns it.
// The boolean flag is true if the value was loaded, false if created.
func (n *Cache) ForNode(name string) (nodeCache *NodeCache, exists bool) {
It's weird for the receiver variable to be n. Could you change it to c?
}

// removeCachedPreds deletes cached predicates by given keys.
func (c *NodeCache) removeCachedPreds(predicateKeys sets.String) {
Should this just be called invalidatePreds?
Probably we should move this function down to the other NodeCache receivers.
return &Cache{
cache: make(nodeMap),
// newCache returns an empty NodeCache.
func newCache() *NodeCache {
Let's call this newNodeCache to avoid confusion with NewCache.
@@ -97,7 +218,7 @@ type predicateResult struct {
// run and its results cached for the next call.
//
// NOTE: RunPredicate will not update the equivalence cache if the given NodeInfo is stale.
func (c *Cache) RunPredicate(
func (c *NodeCache) RunPredicate(
Similar to above, the receiver variable should probably change from c to n.
Fixed, thanks for catching these nits.
@@ -126,16 +247,15 @@ func (c *Cache) RunPredicate(
}

// updateResult updates the cached result of a predicate.
func (c *Cache) updateResult(
// This function is thread safe for second level Cache, no need to sync with top level cache.
All operations on NodeCache objects are threadsafe in the context of that NodeCache object, right? It might be worth mentioning that on the type's comment.
Done!
// ForNode returns the existing NodeCache for given node if present. Otherwise,
// it creates the NodeCache and returns it.
// The boolean flag is true if the value was loaded, false if created.
func (n *Cache) ForNode(name string) (nodeCache *NodeCache, exists bool) {
What should callers do differently depending on the boolean return?
I would like to expose an indication of the change we did in the ForNode() call, as it has mixed duties. This is only used by tests for now.
/test pull-kubernetes-e2e-kops-aws
/lgtm
(I still don't have approver powers, btw)
}

// NewCache returns an empty Cache.
// NewCache create am empty equiv class cache.
s/create am/creates an/
c.nodeToCache[name] = newNodeCache()
nodeCache = c.nodeToCache[name]
}
return
I'm still new to golang - what's the general style for naked returns? Should we be redundant and say return nodeCache, exists here?
I believe this is one of the benefits of named return values, i.e. there is no need to refactor the return line in future changes.
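A tiny standalone example of the trade-off being discussed (toy code, not this PR's): with named results, a bare return hands back whatever the named variables currently hold, whereas spelling out return nodeCache, exists is more explicit but has to be kept in sync with the signature.

package main

import "fmt"

// forNode uses named return values; the bare "return" at the end is
// equivalent to "return nodeCache, exists", so later edits to the body
// don't need to touch the return statement.
func forNode(m map[string]int, name string) (nodeCache int, exists bool) {
	nodeCache, exists = m[name]
	if !exists {
		nodeCache = 42 // pretend we lazily created an entry
		m[name] = nodeCache
	}
	return
}

func main() {
	m := map[string]int{"node-1": 1}
	fmt.Println(forNode(m, "node-1")) // 1 true
	fmt.Println(forNode(m, "node-2")) // 42 false
}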
// InvalidateCachedPredicateItemForPodAdd is a wrapper of
// InvalidateCachedPredicateItem for pod add case
// TODO: This does not belong with the equivalence cache implementatioc.
s/implementatioc/implementation/
}
}
// predKeySet returns all cached predicate keys.
func (n *NodeCache) predKeySet() sets.String {
Is this function used anywhere? Can we delete it?
Oh, a legacy one.
mockCache := &syncingMockCache{
Cache: cache,
cycleStart: make(chan struct{}),
cacheInvalidated: make(chan struct{}),
}

eCache := equivalence.NewCache()
// ForNode will lazily create NodeCache for given test node.
Is this needed as part of this test?
Oh, no need.
// it creates the NodeCache and returns it.
// The boolean flag is true if the value was loaded, false if created.
func (c *Cache) ForNode(name string) (nodeCache *NodeCache, exists bool) {
c.mu.RLock()
Shouldn't we do c.mu.Lock() and c.mu.Unlock(), since we are adding a new entry in case we don't have one?
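One common way to address this concern, sketched here with toy types rather than this PR's actual code: take the read lock for the fast path and only take the write lock when an entry has to be inserted, re-checking the map after the write lock is acquired.

package main

import "sync"

type nodeCache struct{}

type cache struct {
	mu          sync.RWMutex
	nodeToCache map[string]*nodeCache
}

// forNode uses the read lock for the common "already exists" case and only
// takes the write lock when a new entry must be added. The lookup is repeated
// under the write lock because another goroutine may have inserted the entry
// between RUnlock and Lock.
func (c *cache) forNode(name string) (*nodeCache, bool) {
	c.mu.RLock()
	nc, ok := c.nodeToCache[name]
	c.mu.RUnlock()
	if ok {
		return nc, true
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	if nc, ok = c.nodeToCache[name]; ok {
		return nc, true
	}
	nc = &nodeCache{}
	c.nodeToCache[name] = nc
	return nc, false
}

func main() {
	c := &cache{nodeToCache: make(map[string]*nodeCache)}
	_, existed := c.forNode("node-1")
	_ = existed
}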
}

// InvalidatePredicatesOnNode clears cached results for the given predicates on one node.
func (c *Cache) InvalidatePredicatesOnNode(nodeName string, predicateKeys sets.String) {
You can probably combine this function and the one above.
I will choose to keep them, as separate calls are needed.
(force-pushed from 1cb63e9 to 153045e)
@resouer Thanks. The changes LGTM. I am just curious about the RWMutex scaling issue. Can you try a couple of things:
- Schedule pods on a bigger node where GOMAXPROCS is higher (>8).
- Schedule more pods on more nodes, like 30k pods and 5k nodes.
I want to know whether we are going to hit any of the scaling issues mentioned for RWLock.
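For the GOMAXPROCS question, a quick standalone micro-benchmark (not the scheduler_perf benchmark discussed in this thread) can probe read-side scaling of sync.RWMutex across CPU counts, e.g. with go test -bench . -cpu 4,8,16,32 in a _test.go file:

package rwscale

import (
	"sync"
	"testing"
)

// BenchmarkRWMutexRead hammers the read lock from many goroutines; comparing
// results across -cpu values shows whether RLock itself becomes a bottleneck
// at higher parallelism.
func BenchmarkRWMutexRead(b *testing.B) {
	var mu sync.RWMutex
	m := map[string]int{"node-1": 1}
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			mu.RLock()
			_ = m["node-1"]
			mu.RUnlock()
		}
	})
}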
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bsalamat, misterikkit, ravisantoshgudimetla, resouer.
@resouer: The following test failed.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
Thanks Bobby, I am keeping an eye on the test grid.
@resouer I checked out master today and made sure that your changes are in, but when I run the scheduler benchmarks for 5000 nodes, scheduling 1000 pods, I still see a performance degradation when I enable the equivalence cache: 20ms/pod when ecache is enabled vs. 18ms/pod when ecache is disabled. I enable/disable eCache here. Am I missing anything in running the test?
eCache disabled:
eCache enabled:
The test method is right, but I just tested with HEAD@d2fc875489 and it seems to work well:
It's weird. I'm looking into it.
Hmmm. I don't know what is going on. I checked out the head again and used your exact configuration: {nodes: 5000, existingPods: 0, minPods: 10000}. Still, the results are in favor of disabled ecache. Just to make sure we are using a similar config, I use the following command line to run the benchmarks:
make test-integration WHAT=./test/integration/scheduler_perf GOFLAGS="-v=1" KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-run=xxx -bench BenchmarkScheduling/5000Nodes/0Pods"
Cache enabled:
Cache disabled:
Emm, when I rebased to HEAD@e4465b6e2f (should be the latest commit of yesterday), I can see your problem happen:
Seems like some recent commits "improved" the default scheduler (or the bench testing, at least) somehow, which is interesting. I am taking a look at this. [Update] After bisecting the commits, it seems related to 0bf7427 of #66061.
@bsalamat, a quick follow-up: with the test command below, everything works as expected in my env:
export KUBE_TEST_ARGS="-run=xxx -bench=BenchmarkScheduling -cpuprofile cpu-5000n-10000p.out"
make test-integration WHAT=./test/integration/scheduler_perf
But with a test command like this, the benchmark test will finish much earlier (which never happened before #66061):
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_ARGS="-run=xxx -bench=BenchmarkScheduling -cpuprofile cpu-5000n-10000p.out"
The difference seems related to env passing across multiple .sh files. [Update]: I've reported the issue and this inconsistency will be solved by #66816.
Maybe add "-count 1" just to be paranoid?
@misterikkit @bsalamat Just fired #66862 and #66816 to partially fix this weird case. The root cause I recently observed is: depending on
Thanks, @resouer! As you have noticed, vmodule skews the results significantly. It must be disabled for performance benchmarking, otherwise vmodule takes most of the time. "vmodule" is disabled in our production code, but it is enabled by default when running integration tests, including our benchmarks. I used the following command to run the benchmarks while vmodule is disabled:
KUBE_TEST_ARGS="-run=xxx -bench=BenchmarkScheduling/5000Nodes/0Pods" GOFLAGS="-v=1" KUBE_TEST_VMODULE="''" make test-integration WHAT=./test/integration/scheduler_perf
I still see a slight performance decrease when ecache is enabled. We should think about what kind of changes we should make to improve eCache's performance. CPU profiling may reveal areas of potential improvement.
Cache disabled:
Cache enabled:
I have some good news. The above results are obtained with the regular Scheduler benchmark. That benchmark tries scheduling a number of simple pods. The only scheduling requirement of those pods is the amount of resources available on nodes. Checking the amount of resources is a simple mathematical operation. That's why we don't see a performance improvement when we enable eCache. I ran the AntiAffinity benchmarks with and without eCache. eCache improves performance by over 16x! These results indicate that eCache is beneficial only for more sophisticated predicates which are computationally expensive. Examples are the inter-pod affinity and anti-affinity predicates. What I think we should do is find a list of predicates which are more computationally expensive and use eCache only for those, instead of using the eCache for all the predicates.
Cache disabled:
Cache enabled:
@bsalamat, Yes, I am aware that the current benchmark test we are playing with is too "simple" :D And another thing I'd like to amend is, in the current logic,
Actually, after #66862 is in, the only impact of ecache is map accesses in
This is exactly what's in my mind. Thanks! Will continue in this direction, though instead of bypassing specific predicates, I would prefer that we measure them in the future. And I was also hoping to invalidate
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here (/~https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md). Add misterikkit to sig-scheduling REVIEWERS. I have met the following criteria. - member for at least 3 months - primary reviewer for at least 5 PRs - #63603 - #63665 (and related PRs) - #63839 - #65714 - #66862 - reviewed or merged at least 20 PRs reviewed 13: /~https://github.com/pulls?utf8=%E2%9C%93&q=is%3Apr+archived%3Afalse+is%3Amerged+repo%3Akubernetes%2Fkubernetes+commenter%3Amisterikkit+in%3Acomment+assignee%3Amisterikkit+ merged 22: /~https://github.com/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Amisterikkit+archived%3Afalse+is%3Amerged+repo%3Akubernetes%2Fkubernetes+ **Release note**: ```release-note NONE ``` /cc @bsalamat
What this PR does / why we need it:
The current ecache introduced a global lock across all nodes, and this patch assigns an ecache per node to eliminate that global lock. The improvements in scheduling performance and throughput are both significant.
CPU Profile Result
Machine: 32-core 60GB GCE VM
1k nodes 10k pods bench test (we've highlighted the critical function):
Throughput Test Result
1k nodes 3k pods scheduler_perf test:
Current default scheduler, ecache is disabled:
Minimal observed throughput for 3k pod test: 200 PASS ok k8s.io/kubernetes/test/integration/scheduler_perf 30.091s
With this patch, ecache is enabled:
Minimal observed throughput for 3k pod test: 556 PASS ok k8s.io/kubernetes/test/integration/scheduler_perf 11.119s
Design and implementation:
The idea is: we re-designed ecache into a "two level cache".
The first level cache holds the global lock across nodes, and sync is needed only when a node is added or deleted, which happens at much lower frequency.
The second level cache is assigned per node and its lock is restricted to the node level, so there is no need to take the global lock during the whole predicate evaluation cycle. For more detail, please check the original discussion.
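A minimal, self-contained sketch of this two-level layout (toy types and names; the real cache also keys results by equivalence-class hash, which is omitted here): the first-level lock is only taken when a node's cache is found or created, while predicate results live behind each node's own lock.

package main

import "sync"

// predicateResult is a toy stand-in for a cached predicate outcome.
type predicateResult struct {
	Fit bool
}

// NodeCache is the second level: per-node results behind a per-node lock.
type NodeCache struct {
	mu    sync.RWMutex
	cache map[string]predicateResult // predicateKey -> result
}

func newNodeCache() *NodeCache {
	return &NodeCache{cache: make(map[string]predicateResult)}
}

func (n *NodeCache) lookup(predicateKey string) (predicateResult, bool) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	r, ok := n.cache[predicateKey]
	return r, ok
}

func (n *NodeCache) update(predicateKey string, r predicateResult) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.cache[predicateKey] = r
}

// Cache is the first level: its lock is only needed when a node's cache is
// created or removed, which is far less frequent than predicate lookups.
type Cache struct {
	mu          sync.RWMutex
	nodeToCache map[string]*NodeCache
}

func NewCache() *Cache {
	return &Cache{nodeToCache: make(map[string]*NodeCache)}
}

func (c *Cache) ForNode(name string) *NodeCache {
	c.mu.Lock()
	defer c.mu.Unlock()
	nc, ok := c.nodeToCache[name]
	if !ok {
		nc = newNodeCache()
		c.nodeToCache[name] = nc
	}
	return nc
}

func main() {
	eCache := NewCache()
	nodeCache := eCache.ForNode("node-1") // first-level lock, once per node per cycle
	nodeCache.update("PodFitsResources", predicateResult{Fit: true})
	if r, ok := nodeCache.lookup("PodFitsResources"); ok {
		_ = r // reuse the cached result without touching the first-level lock
	}
}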
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #63784
Special notes for your reviewer:
Tagged as WIP to make sure this does not break existing code and tests; we can start review after CI is happy.
Release note: