concurrency/mutexmap Move Unlock to after operation #101
Signed-off-by: joshvanl <me@joshvanl.dev>
Can you provide more context on why this change is needed? The previous design separated the lock on the map from the lock on each mutex.
Can we have a unit test that runs multiple operations in parallel to try to trigger a deadlock scenario? I don't disagree that there is a bug, but I am also concerned about this PR bundling the two locks instead of keeping them separate.
LGTM, but I agree we should extend the existing concurrency tests to cover the delete unlocks.
Because it is a true race condition that depends on the ordering of lines of code and on execution timing, I'm not sure there is a way to write a sensible unit test for it.
It is hard to make a deterministic test for this. Is it possible to make a test that will not produce false positives? Meaning that, with the race condition present, the test will most likely fail, though not with a 100% guarantee, while without the race condition the test will pass 100% of the time. That way, we can run the test a few times to make sure. It is not ideal, but it is better than visual inspection (aka code review) IMO. Is there another layer (the runtime, maybe) where it can be tested?
I can look into doing that. This bug is currently manifesting as failing Dapr integration tests.
Approved under the assumption that integration tests will show the fix.
Here is an example of a dapr/dapr integration test surfacing a variation of this bug: https://github.com/dapr/dapr/actions/runs/10780003020/job/29895148587?pr=8066
Description
Please explain the changes you've made
Issue reference
We strive to have all PRs opened based on an issue, where the problem or feature has been discussed prior to implementation.
Please reference the issue this PR will close: #[issue number]
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list: