This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Improve CCache handling #13456

Merged
merged 16 commits into from
Dec 14, 2018

Contributor

@marcoabreu marcoabreu commented Nov 29, 2018

This PR improves our ccache handling. In particular, we are now able to cache NVCC calls (which are super expensive) and also add caching to the compilation of submodules (ps-lite's ZeroMQ dependency in particular likes to hardcode its compilers).

Just to give some numbers:

  • ARMv8 improved from 35s to 14s.
  • Dependency compilation on CPU and GPU got reduced from 7 minutes to 10 seconds.
  • TODO: Expand

Full NVCC Ccache support for CMake is blocked by #13459
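
For illustration only, one way to route NVCC calls through ccache is a small wrapper script that sits ahead of the real nvcc on PATH. This is a sketch under assumptions, not the code from this PR: `write_nvcc_wrapper`, the CUDA path, and the stand-in ccache script are hypothetical, and whether real nvcc invocations actually get cached depends on the installed ccache version.

```shell
#!/bin/sh
# write_nvcc_wrapper WRAPPER_PATH CCACHE_BIN NVCC_BIN
# Hypothetical helper: writes a script that runs `ccache <nvcc> <args...>`.
write_nvcc_wrapper() {
    wrapper=$1
    ccache_bin=$2
    nvcc_bin=$3
    # The two paths are baked in now; "\$@" stays literal so the generated
    # script forwards its own arguments at run time.
    cat > "$wrapper" <<EOF
#!/bin/sh
exec "$ccache_bin" "$nvcc_bin" "\$@"
EOF
    chmod +x "$wrapper"
}

# Demo with a stand-in for ccache that only echoes what it was asked to run.
demo=$(mktemp -d)
printf '#!/bin/sh\necho "ccache $*"\n' > "$demo/ccache"
chmod +x "$demo/ccache"

# /usr/local/cuda/bin/nvcc is an assumed location; adjust for your image.
write_nvcc_wrapper "$demo/nvcc" "$demo/ccache" /usr/local/cuda/bin/nvcc

wrapper_output=$("$demo/nvcc" -c kernel.cu)
echo "$wrapper_output"
```

With a real ccache in place of the stand-in, putting the wrapper's directory first on PATH would make a plain `nvcc` call go through the cache.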

@marcoabreu marcoabreu requested a review from szha as a code owner November 29, 2018 14:39
@marcoabreu
Contributor Author

CCache support for NVCC in CMake needs an upgrade. Tracked at #13459

@vandanavk
Contributor

@mxnet-label-bot add [CI, pr-awaiting-review]

@marcoabreu marcoabreu added CI pr-awaiting-review PR is waiting for code review labels Nov 29, 2018
@marcoabreu marcoabreu added pr-work-in-progress PR is still work in progress and removed pr-awaiting-review PR is waiting for code review labels Nov 29, 2018
@marcoabreu marcoabreu mentioned this pull request Nov 29, 2018
@marcoabreu
Contributor Author

Ready for review. I will remove the Jenkinsfile changes when I'm done

@marcoabreu marcoabreu added pr-awaiting-review PR is waiting for code review and removed pr-work-in-progress PR is still work in progress labels Nov 29, 2018
@marcoabreu marcoabreu changed the title from "[WIP] New ccache" to "Improve CCache handling" Nov 29, 2018
# Later on, we have to override the links because underlying build systems ignore our compiler settings. Thus,
# we have to give the process the proper permission to these files. This is hacky, but unfortunately
# there's no better way to do this without patching all our submodules.
chown -R jenkins_slave /usr/local/bin
Contributor

This isn't clear to me; even with the comment I don't understand why this is needed. Why can't we add some other folder to the PATH?

Contributor Author

We do, that's why I created /tmp/ccache-redirects. The problem is that in some places (especially submodules), the paths have been hardcoded. That's why we have to do both :(
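
As a rough sketch of the PATH-based half of this approach: ccache can be invoked through a symlink named after a compiler, in which case it uses the name it was called under to find the real compiler further down PATH. The helper name `make_redirects`, the temporary directory, and the stand-in ccache script below are assumptions for illustration, not the PR's actual code. The sketch links to an absolute target so the links resolve regardless of where they live.

```shell
#!/bin/sh
# make_redirects DIR CCACHE_BIN COMPILER...
# Hypothetical helper: creates DIR/<compiler> symlinks that point at ccache.
make_redirects() {
    dir=$1
    ccache_bin=$2
    shift 2
    mkdir -p "$dir" || return 1
    for compiler in "$@"; do
        # -f replaces a link left over from an earlier run.
        ln -sfn "$ccache_bin" "$dir/$compiler" || return 1
    done
}

# Demo in a throwaway directory, with a stand-in script instead of real ccache.
demo_dir=$(mktemp -d)
fake_ccache="$demo_dir/ccache"
printf '#!/bin/sh\necho "ccache invoked as $(basename "$0")"\n' > "$fake_ccache"
chmod +x "$fake_ccache"

make_redirects "$demo_dir/ccache-redirects" "$fake_ccache" gcc g++ clang-6.0

# Prepending the directory makes a plain `gcc` resolve to the redirect.
PATH="$demo_dir/ccache-redirects:$PATH"
resolved_gcc=$(command -v gcc)
redirect_output=$(gcc)
echo "$resolved_gcc"
echo "$redirect_output"
```

This only helps builds that look the compiler up by name; a build that spells out an absolute compiler path never consults PATH.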

ln -s ccache /tmp/ccache-redirects/clang-5.0
ln -s ccache /tmp/ccache-redirects/clang++-6.0
ln -s ccache /tmp/ccache-redirects/clang-6.0
ln -s ccache /usr/local/bin/gcc
Contributor

Why don't you add some other folder to the PATH?

Contributor Author

I did. But sometimes (in submodules) /usr/local/bin/gcc is referenced directly. In that case, my folder /tmp/ccache-redirects would be ignored.
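
The difference between the two lookups can be reproduced in isolation. The following is a toy demonstration, assuming stand-in scripts under a temporary directory rather than real compilers or the real /usr/local/bin: a call by name goes through PATH and hits the redirect, while a call by absolute path skips it.

```shell
#!/bin/sh
# Illustration only: every path lives under a temporary root, and both
# "compilers" are stand-in scripts that just print which one ran.
root=$(mktemp -d)
mkdir -p "$root/usr/local/bin" "$root/ccache-redirects"

# Stand-in for the real compiler at a fixed location.
printf '#!/bin/sh\necho "real compiler"\n' > "$root/usr/local/bin/gcc"
# Stand-in for a caching redirect that is only reachable through PATH.
printf '#!/bin/sh\necho "cached via redirect"\n' > "$root/ccache-redirects/gcc"
chmod +x "$root/usr/local/bin/gcc" "$root/ccache-redirects/gcc"

PATH="$root/ccache-redirects:$root/usr/local/bin:$PATH"

# A build that calls `gcc` by name is resolved through PATH.
by_name=$(gcc)
# A build that hardcodes the absolute path never looks at PATH.
by_path=$("$root/usr/local/bin/gcc")

echo "by name: $by_name"
echo "by path: $by_path"
```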

Contributor

Which submodules? Wouldn't it be better to fix them? This seems hacky, even though it is great that you got it to work.

Contributor Author

ps-lite and ZeroMQ are the first ones that came to mind. I didn't benchmark every single one.

In total, we spend about 5-8 minutes compiling submodules. After this PR it's 20 seconds.

Contributor

@larroy larroy Dec 3, 2018

Have you tried exporting CXX and CC? I don't see any references to /usr/local/bin/gcc in our source tree.

Could you point out where in the build files the problem is? This still seems super hacky and fragile to me.

Contributor Author

Yes, we used that method previously, but ZeroMQ, for example, ignored it. Have you searched for references to "GCC"? ZeroMQ is downloaded dynamically during the build (it's a zip file), so a grep of our source tree would not find it.

Well, that's the way ccache recommends doing it in their guide. In the end, you can't control all build systems.

larroy added 28 commits to larroy/mxnet that referenced this pull request between Apr 5, 2019 and Jun 20, 2019, each with the same message:
* Fix broken links
* Make it idempotent
fixes apache#13456
fixes apache#14117
fixes apache#11516
marcoabreu pushed a commit that referenced this pull request Jun 24, 2019
* Fix broken links
* Make it idempotent
fixes #13456
fixes #14117
fixes #11516
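
The contents of the referenced commits are not shown here, so the following is only a guess at what "broken links" and "idempotent" could mean for symlink-based redirects, with a hypothetical `link_compiler` helper. Two facts it relies on: a relative symlink target is resolved against the link's own directory, so `ln -s ccache DIR/gcc` dangles unless `DIR/ccache` exists, and plain `ln -s` fails when the link already exists.

```shell
#!/bin/sh
# link_compiler LINK TARGET
# Hypothetical helper: refuses dangling links and can be run repeatedly.
link_compiler() {
    link=$1
    target=$2
    # Avoid a broken link: the target must exist and be executable.
    [ -x "$target" ] || { echo "missing target: $target" >&2; return 1; }
    # -f replaces an existing link; -n treats an existing link to a
    # directory as a plain file instead of following it.
    ln -sfn "$target" "$link"
}

work=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$work/ccache"
chmod +x "$work/ccache"

link_compiler "$work/gcc" "$work/ccache"
link_compiler "$work/gcc" "$work/ccache"   # running it again also succeeds
second_status=$?

# A relative target in a directory that has no `ccache` entry is dangling:
# -L is true (it is a symlink) but -e is false (the target does not exist).
mkdir "$work/elsewhere"
ln -s ccache "$work/elsewhere/gcc"
if [ -L "$work/elsewhere/gcc" ] && [ ! -e "$work/elsewhere/gcc" ]; then
    dangling=yes
else
    dangling=no
fi
echo "second run status: $second_status, relative link dangling: $dangling"
```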
@larroy
Contributor

larroy commented Jan 16, 2020

@josephevans please review, you said you got ccache to work with nvcc.

Labels
CI pr-awaiting-review PR is waiting for code review
6 participants