-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug where race condition results in incorrect fields categorization when computing particle_trajectories #4802
Conversation
…ded(?). ds_first is not (since we call all_data() on it). That was causing this if statement to fail.
The failing tests appear to be unrelated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your fix is absolutely reasonable but I don't really understand in what situation this race condition could happen ? I don't know that we enable any parallelism in tests, which may be why we never caught this in CI ?
I honestly don't know why it's happening either. All I know is my processing chain broke when I switched from a single cpu system to a multicpu system (no other changes, I didn't even explicitly allow parallelism) and diving into the code showed I'll try to look into it more over the next couple days, as I have time. If it helps, the timeseries was composed of 4 adjacent GIZMO runs with ~ 10^5 particles each and essentially all I was doing was loading the files into a |
That's weird ! anyhow I don't think we should require an in depth explanation to merge this (but it still would be nice to have). I'll leave it open in case any one wants to join the discussion but we'll want to merge it in time for yt 4.3.1 |
…incorrect fields categorization when computing particle_trajectories
…2-on-yt-4.3.x Backport PR #4802 on branch yt-4.3.x (Fix bug where race condition results in incorrect fields categorization when computing particle_trajectories)
…ition results in incorrect fields categorization when computing particle_trajectories)"
In the
_get_data
method ofparticle_trajectories
,self.data_series[0]
might still be unloaded (not sure if that's the right word?) when running in parallel. That was causing theif
statement that checks if the particle_type of missing field exists to fail, incorrectly adding fields to thegrid_fields
list. Sinceds_first
is loaded (because we called all_data() on it earlier), this is now more likely to succeed.Essentially
In my case,
self.data_series[0].particle_types
didn't even have theall
particle_type, even thoughds_first
hadall
and several others. Let me know if you need more information.PR Summary
PR Checklist
Unfortunately, I don't know why this wasn't picked up in the tests, I would expect it to be picked up by
test_default_field_tuple
.