-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent long delays in ExpirationManager.Stop due to blanking a large pending map #23282
Conversation
CI Results: |
… doesn't cause trouble, by making expireFunc a noop.
Build Results: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good and seems safe based on the diff here. That said I don't have lots of context of this subsystem so there is a whole world of subtleties I may be missing. I'd be happy to approve though it probably makes more sense to test this out with the reproduction of bad behaviour we have in a test environment to confirm it has the performance properties we expect before a final pass/approval.
newStrategy := ExpireLeaseStrategy(expireNoop) | ||
m.expireFunc.Store(&newStrategy) | ||
oldPending := m.pending | ||
m.pending, m.nonexpiring, m.irrevocable = sync.Map{}, sync.Map{}, sync.Map{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes way more sense than iterating and deleting elements one by one!
// We need to stop the timers to prevent memory leaks. | ||
info.timer.Stop() | ||
return true | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My rationale for why this doesn't leak all these pendingInfos:
m.pending
is no longer pointing at this map- Assuming nothing else ever holds a pointer to
m.pending
without also holdingm.pendingLock
(but need to confirm that). - When this goroutine completes, nothing else in here will still be holding a reference to
oldPending
. - So the entire
oldPending
map and all thependingInfo
s should be GCed.
Did I miss anything? Did you already verify the assumption I called out Nick to your satisfaction? Happy to take a closer look if you'd like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading to the end, it seems the two places that do access this do (now) take the lock for the access but not for the iteration.
So it's possible that even after this goroutine goes away that one of the walk*
methods will still be iterating this pending map which is why you have the no-op expire func. Once they are done iterating though we should be OK to GC all the old maps and items. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you missed anything, thanks for verifying!
Co-authored-by: Paul Banks <pbanks@hashicorp.com>
No description provided.