-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
@@ -34,4 +34,5 @@ then | |||
mkdir /work/mxnet | |||
mkdir /work/build | |||
chown -R jenkins_slave /work/ | |||
chown -R jenkins_slave /usr/local/bin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is hacky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the entire Docker container is owned by root by default. Since we run in low-privilege mode inside our container (opposed to running everything as root), we have to whitelist certain directories (/work and /usr/local/bin) that the process is allowed to write. This is basically a necessary evil of the least-priviledge-policy.
An alternative would be to set these softlinks directly during the ccache installation. The problem there would then be that CCache would always be activated and we would not be able to turn it on/off during runtime. I wouldn't mind about that, but I didn't want to change the behaviour. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What requires permissions to write in /usr/local/bin ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The symbolic link I'm creating. It's super hacky, but some buildsystems seem to directly access /usr/local/bin in order to access the compiler binary. They ignore any custom setting, so I'm using the super brutal way to just rewrite that entry by overriding it with a symlink.
This is especially the case for all the submodule dependencies we have. We can't control their compiler choices since they just hardcoded them. ps-lite and ZeroMQ do that, for example.
|
||
chmod +x cc | ||
chmod +x cxx | ||
# Add to the beginning of path to ensure this redirection is picked up instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this better than the previous solution? It looks less explicit and hacky. Can you provide some rationale?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the recommended way by CCache: https://ccache.samba.org/manual.html#_run_modes
This way ensures that ccache is always used if the symlinks are set. Otherwise, we'd have to make a possibility in all build-paths (including third party) that they use ccache properly. This caused problems in compiling ps-lite, because that project had another third party dependency (ZeroMQ) which did not pick up the propagated compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks hacky, what is the rationale of these changes?
The problem is that the current method with the ccache wrappers does not work for third party dependencies. There are some scripts (libtool which is part of ZeroMQ which is part of 3rdparty/ps-lite) which identify the compiler by its name. If we redirect the compiler using the wrapper method, the script gets confused and falls back into a backup mode, which isn't able to compile the sub-projects properly. Also, this is the recommended way by CCache: https://ccache.samba.org/manual.html#_run_modes |
I didn't know it's recommended by the authors. Thanks for the clarification. |
@@ -489,9 +464,8 @@ build_ubuntu_amalgamation_min() { | |||
build_ubuntu_gpu_cmake_mkldnn() { | |||
set -ex | |||
cd /work/build | |||
install_ccache_wrappers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should not be any need for wrappers for cmake based builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is that we rather intercept the backend calls to the compilers instead of defining them through the frontend (aka the Cmake parameters). This allows us to be flexible with whatever compilers we define without having to think about ccache because it's handled automatically. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As much as I know cmake will not like this and different problems can appear suddenly. Using cmake parameters is the recommended approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know and I only chose this approach after playing around a lot. The problem is that the underlying submodules we have don't respect that choice.
.gitignore
Outdated
cxx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you moved out the wrappers then this few lines are not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
# of the original ones. Especially CUDA/NVCC appends itself to the beginning of the | ||
# path and thus this redirect is ignored. This change fixes this problem | ||
export PATH=/usr/local/bin:$PATH | ||
ln -s ccache /usr/local/bin/gcc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this work if already set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you can define and configure the compilers ($CC, $CXX, $NVCC) as you want, we only intercept the backend calls to the actual compilers. This allows us to be independent of the configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant if ccache soft link is already there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, the softlink (or whatever was there previously) would just get replaced.
@marcoabreu are you going address the comments, if its handled in a different PR, could you please close this? |
This PR is still WIP and paused due to other priorities. It is still valid. |
Is this problem also happening with CMake builds or only in Make builds? |
@sandeep-krishnamurthy Please change the label from [pr-awaiting-response, pr-work-in-progress] to [pr-awaiting-testing, pr-work-in-progress] |
Hi @marcoabreu, Any update on this PR? Are you still working on it? |
ping @marcoabreu any updates? |
Yes, I will come back to this as soon as I got a few hours |
@marcoabreu this PR has been inactive for 3 months now. |
9977728
to
720caaa
Compare
0621193
to
5a59f07
Compare
98a2aff
to
9e1e700
Compare
Replaced by #13456 |
Description
#11516
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments