-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Complimentary gluon DataLoader improvements #13606
Complimentary gluon DataLoader improvements #13606
Conversation
@mxnet-label-bot add[Data-loading, pr-awaiting-review] |
src/initialize.cc
Outdated
dmlc::SetEnv("OMP_NUM_THREADS", 1); | ||
// Conservative thread management for multiprocess workers | ||
const size_t mp_worker_threads = dmlc::GetEnv("MXNET_MP_WORKER_NTHREADS", 1); | ||
const size_t mp_omp_threads = dmlc::GetEnv("MXNET_MP_OMP_NUM_THREADS", 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
libgomp doesnt play well with fork: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52303 . Which means user using libgomp and increasing the num threads can run into a deadlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, do you suggest completely turn off omp in worker process regardless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we should stick with the existing behavior which is turn off omp in worker regardless.
The alternative is to enforce MXNet users to use llvm openmp but that is a bigger discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anirudh2290 Did we really encounter this kind of deadlock? Any case to analyze?
We are looking at this issue and only using 1 thread in the worker is not very efficient.
@zhreshold Thank you! When is it going to be merged? |
@szha @eric-haibin-lin for review/merge |
* init * add tests * doc * lint * fix openmp
Is it included the |
@YutingZhang you can check the latest published file here https://pypi.org/project/mxnet-cu90mkl/#history If the build date is after the merge date, it is included |
@leezu Yes. I had been using it over the weekend :) |
* init * add tests * doc * lint * fix openmp
Description
This PR provides complimentary improvements for user to customize DataLoader in various use cases for optimal performance.
ThreadPool
version when shared memory is not prefered, or GIL is not a problem. Ref discussion in Rewrite dataloader, improves responsiveness and reliability #13447Checklist
Essentials
Please feel free to remove inapplicable items for your PR.