How do I properly use the caching feature? #42

Closed
tdegeus opened this issue Mar 4, 2022 · 27 comments

Comments

@tdegeus
Contributor

tdegeus commented Mar 4, 2022

From the README it is not very clear to me how to enable the download cache.
I tried:

    - name: Set conda environment
      uses: mamba-org/provision-with-micromamba@main
      with:
        environment-file: environment.yaml
        environment-name: myenv
        cache-env: true

but this gives

Cache miss for key 'micromamba-bin https://micro.mamba.pm/api/micromamba/win-64/latest Fri Mar 04 2022'

It is also unclear to me whether I should use cache-env or cache-downloads.

@wolfv
Member

wolfv commented Mar 4, 2022

I think a cache miss is expected on the first run, and then on a daily basis (by default the cache is only valid for a day). Did you see cache misses after the first run?

Maybe @jonashaag can give some more details!

@jonashaag
Contributor

Re: what to choose, see #38 (comment), curious to hear the thoughts of both of you on that as well.

@tdegeus
Contributor Author

tdegeus commented Mar 5, 2022

This feature is actually amazing!! In hindsight the docs do make sense I guess. Maybe I was just put off by the error that appeared when I did not expect it. Or maybe the error could be formulated more gently, like 'did not find a cache yet, but it may be available on your next run'.

I am still unsure about one thing, though: what happens when I have

jobs:

  standard:

    strategy:
      fail-fast: false
      matrix:
        runs-on: [ubuntu-latest, macos-latest, windows-latest]

    defaults:
      run:
        shell: bash -l {0}

    name: ${{ matrix.runs-on }} • x64 ${{ matrix.args }}
    runs-on: ${{ matrix.runs-on }}

    steps:

    - name: Basic GitHub action setup
      uses: actions/checkout@v2

    - name: Set conda environment
      uses: mamba-org/provision-with-micromamba@main
      with:
        environment-file: environment.yaml
        environment-name: myenv
        cache-env: true

Is it actually clever enough to cache per platform?


I ask because, for example, after the last run in the PR on the main branch I still get

Cache miss for key 'micromamba-env win-x64 Sat Mar 05 2022 a3aff841031df89cce867ffae2bb89aed333bf62acf28b3d11ffdf6ddd884d7d-4f53cda18c2baa0c0354bb5f9a3ecbe5ed12ab4d8e11ba873c2f11161202b945'

Ref: https://github.com/tdegeus/GooseBib

@jonashaag
Contributor

Yes it caches by platform by default. We should add that to the docs, and also make that message a bit less scary, as you suggested.
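
To illustrate why the key is both per-platform and per-day, here is a minimal Python sketch of a key with the same layout as the log lines quoted in this thread (`micromamba-env <platform> <date> <digest>`). This is not the action's actual implementation; the function names are hypothetical and the digest is treated as an opaque string.

```python
from datetime import date

# English abbreviations are hard-coded so the result does not depend on the locale.
_DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def date_stamp(day: date) -> str:
    """Format a date like the stamps seen in the logs, e.g. 'Sat Mar 05 2022'."""
    return f"{_DAYS[day.weekday()]} {_MONTHS[day.month - 1]} {day.day:02d} {day.year}"


def env_cache_key(platform: str, day: date, digest: str) -> str:
    """Hypothetical helper: assemble a key shaped like the ones in the logs.

    `digest` stands in for the hash part of the real key; how the action
    computes it is not shown in this thread, so it is passed in as-is.
    """
    return f"micromamba-env {platform} {date_stamp(day)} {digest}"


if __name__ == "__main__":
    # Same digest, different platform or day: the keys never collide.
    print(env_cache_key("win-x64", date(2022, 3, 5), "<digest>"))
    print(env_cache_key("linux-x64", date(2022, 3, 5), "<digest>"))
    print(env_cache_key("linux-x64", date(2022, 3, 6), "<digest>"))
```

Because the platform and the date are part of the key, each runner OS gets its own entry, and the first run on a new day is expected to report a cache miss.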

@tdegeus
Contributor Author

tdegeus commented Mar 7, 2022

One more clarification on this issue. The caching works on:

  • repeated runs on a branch
  • repeated runs on the main branch

It does not, however, work on the main branch after merging a PR (whereas it could have used the last known environment).

@jonashaag
Contributor

Can you show an example?

@tdegeus
Contributor Author

tdegeus commented Mar 7, 2022

https://github.com/tdegeus/GooseMPL (last run on main) and tdegeus/GooseMPL#39

@jonashaag
Contributor

The latter one has failed CI. Can you send links to a pair of successful actions runs that show this behavior?

@tdegeus
Contributor Author

tdegeus commented Mar 8, 2022

@jonashaag
Contributor

Could it be that a former CI run wasn't finished while you made a new commit?

https://github.com/tdegeus/FrictionQPotFEM/runs/5465250701?check_suite_focus=true

Notice: Unable to reserve cache with key micromamba-env linux-x64 Tue Mar 08 2022 d13ff70ae5ac01a0d4c9712c006ad60c41e9c91b7478b553f665411f371de671-d78fcba2fc342307e203228894e1bf5711a1af5502a26d02d55cbe9a8c64ef35, another job may be creating this cache.

And here https://github.com/tdegeus/FrictionQPotFEM/runs/5465212025?check_suite_focus=true:

Cache saved with key: micromamba-env linux-x64 Tue Mar 08 2022 d13ff70ae5ac01a0d4c9712c006ad60c41e9c91b7478b553f665411f371de671-d78fcba2fc342307e203228894e1bf5711a1af5502a26d02d55cbe9a8c64ef35

So, I would expect that you now have a non-empty micromamba-env linux-x64 Tue Mar 08 2022 d13ff70ae5ac01a0d4c9712c006ad60c41e9c91b7478b553f665411f371de671-d78fcba2fc342307e203228894e1bf5711a1af5502a26d02d55cbe9a8c64ef35 cache entry and when you run CI again it should use that cache.

@jonashaag
Contributor

Testing that claim here tdegeus/FrictionQPotFEM#142

If that's indeed the case, do you have any suggestions on how to improve the messaging so that it's less confusing? The "Unable to reserve cache" message comes from GitHub Actions or the GitHub cache action that we use, not from our code.

@jonashaag
Contributor

Hmm, the timestamps are far apart, maybe what I hypothesised above isn't true:

Tue, 08 Mar 2022 13:51:23 GMT
Cache saved with key
Tue, 08 Mar 2022 13:55:28 GMT
Notice: Unable to reserve cache

@tdegeus
Contributor Author

tdegeus commented Mar 8, 2022

I was just about to comment that ;)
Also, I'm seeing this very consistently across many repositories.
(Other than this, caching is working fine, e.g. when creating a PR after a run was done on main.)

@jonashaag
Contributor

@tdegeus
Contributor Author

tdegeus commented Mar 8, 2022

Indeed. I'm really only seeing a failure to hit when (on a separate day) the CI first runs (and succeeds) in a PR, and then reruns on main after the PR is merged.
Thereafter, rerunning on main or in a new PR is successful.

@jonashaag
Contributor

on a separate day

The cache is invalidated daily by default so this is no surprise.

@tdegeus
Contributor Author

tdegeus commented Mar 8, 2022

Yes. I meant: Starting without a cache.

  1. Run CI successfully on a branch in a PR -> creates cache
  2. Rerun CI on that branch -> cache found
  3. Merge PR: triggers run on main branch -> cache not found
  4. Rerun on main branch -> cache found
  5. Run on new branch in new PR -> cache found

@jonashaag
Contributor

Aha, I see what you mean. Actually that's unexpected, but maybe not so unexpected because I just found this:

https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

A workflow can access and restore a cache created in the current branch, the base branch (including base branches of forked repositories), or the default branch (usually main).

I'm interpreting this as: cache entries created in a PR will not be available after merging. That is a bummer when you only have one PR a day or so, because you will never actually hit the cache. Maybe we should increase the default cache TTL.
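
The five-step sequence reported above, together with the access rule quoted from the GitHub docs, can be modelled with a small Python sketch. It is only a toy model under stated assumptions: `CacheStore` and `run_ci` are hypothetical names, a PR is represented simply as a branch with a base branch, and a run saves a cache entry only when it misses.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class CacheStore:
    """Toy model of branch-scoped cache entries (not GitHub's implementation)."""
    default_branch: str = "main"
    entries: Set[Tuple[str, str]] = field(default_factory=set)  # (branch, key)

    def restore(self, key: str, branch: str, base_branch: Optional[str] = None) -> bool:
        # A run may read entries created in its own branch, its base branch,
        # or the default branch, as stated in the quoted documentation.
        scopes = [branch, base_branch, self.default_branch]
        return any((scope, key) in self.entries for scope in scopes if scope)

    def save(self, key: str, branch: str) -> None:
        # An entry is stored in the scope of the branch that created it.
        self.entries.add((branch, key))


def run_ci(store: CacheStore, key: str, branch: str,
           base_branch: Optional[str] = None) -> bool:
    """Simulate one CI run; returns True on a cache hit."""
    hit = store.restore(key, branch, base_branch)
    if not hit:
        store.save(key, branch)
    return hit


if __name__ == "__main__":
    store = CacheStore()
    key = "same-day-same-env"
    steps = [
        ("1. PR branch, first run", "feature", "main"),
        ("2. PR branch, rerun", "feature", "main"),
        ("3. main after merge", "main", None),
        ("4. main, rerun", "main", None),
        ("5. new PR branch", "feature-2", "main"),
    ]
    for label, branch, base in steps:
        print(label, "->", "hit" if run_ci(store, key, branch, base) else "miss")
```

In this model, step 3 misses because the entry created in the PR's scope is not visible from main, while step 5 hits because main is the base branch of the new PR.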

@tdegeus
Contributor Author

tdegeus commented Mar 8, 2022

I see. So what I observe is indeed expected, which is sometimes a pity. Thanks for clarifying this.

Increasing the time could help, but I find it difficult to say what would be reasonable. One cannot have a bunch of data hanging around for too long either.

@SimonHeybrock

Thanks for the caching feature! It seems to work well, but for env sizes of several hundred MByte we see quite long times for packing and unpacking the cache (e.g., 2 minutes for 450 MByte). Could this be due to the compression library that is being used internally?

@jonashaag
Contributor

jonashaag commented Mar 25, 2022

Are you experiencing those long durations on Windows? I think a lot of time on Windows is spent in gunzip and simply writing the tens of thousands of files to disk, both of which we have no control over. I would be curious to see if it's slower than installing without cache, though.
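
To get a rough feel for how much of the time goes into gzip and into writing many small files, one could run a Python sketch like the following on each platform. It does not reproduce what the GitHub cache action does internally; `time_pack_unpack` is a hypothetical helper, and the file count and size are arbitrary assumptions.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def time_pack_unpack(n_files: int = 2000, size: int = 512) -> dict:
    """Pack and unpack a directory of many small files as .tar.gz and time both steps."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Create a stand-in "environment" consisting of many small files.
        src = root / "env"
        src.mkdir()
        payload = b"x" * size
        for i in range(n_files):
            (src / f"file_{i}.bin").write_bytes(payload)

        # Pack with gzip compression.
        archive = root / "env.tar.gz"
        t0 = time.perf_counter()
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(src, arcname="env")
        t1 = time.perf_counter()

        # Unpack into a fresh directory; this is where filesystem speed matters.
        out = root / "restored"
        out.mkdir()
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(out)
        t2 = time.perf_counter()

        restored = sum(1 for p in (out / "env").iterdir() if p.is_file())
        return {"pack_s": t1 - t0, "unpack_s": t2 - t1, "files": restored}


if __name__ == "__main__":
    print(time_pack_unpack())
```

Comparing the `unpack_s` values from a Windows runner and a Linux runner would indicate whether the filesystem, rather than the compression library, dominates the restore time.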

@jonashaag
Contributor

Assuming this is what you're working on atm, will have a look scipp/scipp#2512

@SimonHeybrock

Are you experiencing those long durations on Windows?

Yes, it is Windows. It is slightly faster than without cache, maybe 30-50% (there is a lot of noise in the timings, so I cannot tell for sure)?

@SimonHeybrock

Assuming this is what you're working on atm, will have a look scipp/scipp#2512

In particular scipp/scipp#2512 (comment), which lists some "benchmark" results.

@jonashaag
Contributor

I've seen much larger speedups on non-Windows

@SimonHeybrock

I'll give those a try then (Windows has always been slowing us down, so that is what I tested first). Cheers for the useful input!

@jonashaag
Contributor

Good luck! On Windows I'm pretty sure the bottleneck is just that the filesystem is SO slow. GH should maybe switch to zstd there as well, though, for faster decompression.
