-
-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
r.patch: Fix gathering statistics with multi-threading #2906
Conversation
I think this race condition was introduced by parallelism, but I haven't tried it yet. Looking at the code, have you tried: #pragma omp parallel private(i, row) reduction(+:computed) if (nprocs > 1) in line 208 in main.c? |
Please see the original problem in #2410 and the issue #2411 this PR is addressing. What you suggest may be a good improvement, but AFAIU it's not really related to this PR. We had an initial conversation about this on gitter. |
Oh, it was a different problem! Sorry for the noise. |
@petrasovaa First, the Second, I tried printing out the nodes of the *node for each statf object. Running nprocs=1 and nprocs=2 produce the right number of nodes and Would you be able to take a look? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job, I think this is close to finished! Please cleanup the code (address the issues, remove comments, old statf, print functions). The new code for merging should be ideally in a separate function (in support.c).
Before merging this, we should also write test and run benchmark to ensure this is correct and to see what impact this has on performance. But at this point I suggest you to finish the cleanup and then we brainstorm (on Gitter) the GSoC project and proposal if you are interested, the deadline is Apr 4th. This PR could be finalized as part of the project.
@petrasovaa
Please let me know if there's anything else I should add to this PR. Thanks! |
Looking at the code of the benchmark, there is a fallback if r.surf.fractal fails that runs r.random.surface instead, did the fallback work for you? I think the fallback is there because r.surf.fractal might require lot of memory, is it possible you are running out of memory? Try to lower the size or raster and see if that helps. |
@petrasovaa My laptop only has 4 threads, so should I ignore the time increase as threads increase? My best guess is that the program does not know how to assign 12 threads to 4 threads, for example. Other than that, this graph shows that the OpenMP does make a difference, right? As a side note, running the benchmarks takes around 30-40 minutes for each run on my laptop. Is there a way to reduce the number of threads being tested so only nprocs=1,2,4 are run? Thanks! |
FWIW, you can use a swap file (!) to "enlarge" the RAM if you have some spare GB of disk space:
|
Good to know, but I wonder if this affects the performance. |
I will try to run the benchmark myself, I have plenty of cores and RAM, so at this point don't worry about this. But it's good you were able to test this! |
I'd assume a performance hit when using swap instead of RAM of course, thus the results you get shouldn't be considered "performance" metrics, but at least you are able to run it. The performance of the algorithm you are tested will depend on what other tasks are running on the system (and their memory usage, if they swap), and the disk I/O performance of your system. If a final performance chart is wanted, maybe get someone with a better suited system to run the tests. |
Sounds good. For now I will plan to ask others to help out with the testing. One other thing I noticed was that the GitHub check is currently failing for Ubuntu on the test for
However, the Thanks! |
through the Yes, that's unrelated, sometimes it happens we are unsure why, usually we would rerun the workflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the changes have been resolved and I believe the PR is ready for a final review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the files have been reviewed, but I'm not sure how to get rid of the "merging block"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good, but I still need to run the benchmarks (I added tasks in the PR description), will get to it hopefully next week. You don't need to do anything right now.
Sounds good. Thank you so much for all your help! |
@petrasovaa |
I think it's all fine, ready to be merged. The failing mac test is unrelated. But could you please look at Huidae's suggestion above and see if that works and if it improves anything? If yes, that would be a different PR. |
@HuidaeCho @petrasovaa |
Yes, please create a new PR and I will look at it. I will merge this one after the checks are done. |
I've opened a new pull request for the reduction (#2941). However, it looks like all of my changes from this PR are still on the new PR. Would this potentially become an issue later on? |
Right, this needs to be fixed. I am not sure what is the easiest way, you can update your main, create a new branch, put your changes there and the force push (something like |
Fixes OSGeo#2411. Adds tests.
Fixes #2411. Adds tests.
Original description:
This is my current draft implementation for making an array of statf structures for each thread.
Null count is being counted correctly in each thread and the values add up to what is expected, but I am unsure how to proceed for merging the trees.
Before merging: