Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate creating Windows installers for GAP releases with GitHub Actions #4389

Merged
merged 1 commit into from
Apr 26, 2021

Conversation

wilfwilson
Copy link
Member

@wilfwilson wilfwilson commented Apr 11, 2021

(Phew, this was a lot of work, building on what must've been a lot of work by @ChrisJefferson (who was building on a lot of work by @embray and perhaps others at Sage...) and the others involved in the creation of GAP's existing GitHub Actions release system, and I'm sure many others too. Thank you everyone! Open source software collaboration can be nice. 🙂)

This PR uses my fork* of @ChrisJefferson's fork of SageMath's sage-windows repository to create Windows installers for GAP, and in particular, when a new version of GAP is released this creates an installer from the gap-X.Y.Z.tar.gz archive produced by the current release system, and adds it to the GitHub release.

It adds a new cygwin job to the releases workflow, along with two new short Python scripts in dev/releases that it uses. The existing releases job is renamed to unix. The new workflow job is heavily commented, so if you are reviewing, please read those comments carefully, in which I describe what the steps are doing, and in which I describe several possible changes/improvements to the way that it works (which would possibly remove the need for the two new Python scripts).

In summary:

  • This new cygwin job runs after each run of the unix job (which creates all of the GAP/package tarballs and zip files).
  • In all cases except the PR, the job runs twice: once to make a 32-bit GAP installer, and once to make a 64-bit GAP installer.
  • In more detail:
    • If on: pr, it wraps the merge commit of the GAP PR, with only the required packages, into a 64-bit installer.
    • If on: push: tags, it wraps the gap-X.Y.Z.tar.gz that it gets from the just-created GitHub release, and uploads the installers to the release.
    • Otherwise, it wraps the head of the appropriate branch, with the packages given by make bootstrap-pkg-full. If the trigger is on: schedule, then it also uploads the installers as artifacts (with a retention period of 1 day)

* I have opened a PR ChrisJefferson/sage-windows#1 to merge my fork into Chris's fork, and eventually Chris's fork will be put somewhere more appropriate.

@wilfwilson wilfwilson added os: windows Issues and PRs that are (at least partially) specific to Windows do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: workflow Meta: anything related to the development workflow of GAP labels Apr 11, 2021
@wilfwilson wilfwilson changed the title Automation creating Windows installers for GAP releases with GitHub Actions Automate creating Windows installers for GAP releases with GitHub Actions Apr 11, 2021
@wilfwilson wilfwilson removed the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Apr 11, 2021
Copy link
Contributor

@ssiccha ssiccha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, this looks like it was a ton of work @wilfwilson @ChrisJefferson.

Disclaimer: I haven't looked into the sage-windows repository for more than 30 seconds.

I've left a few minor comments. Besides from those, everything looks fine to me.
I also added the label infrastructure.

dev/releases/make_archives.py Outdated Show resolved Hide resolved
dev/releases/download_release_archive.py Show resolved Hide resolved
dev/releases/upload_file_to_github_release.py Outdated Show resolved Hide resolved
.github/workflows/release.yml Outdated Show resolved Hide resolved
.github/workflows/release.yml Outdated Show resolved Hide resolved
@fingolfin
Copy link
Member

fingolfin commented Apr 12, 2021

Looking at /~https://github.com/wilfwilson/gap/actions/runs/738243905 it took about ~1 hour to produce the windows binaries. So I am wary about running this too often.

Looking at the x86 builds, it takes about 10 minutes of preparation before it gets to building GAP. I wonder if the GH Actions cache system could be used to help with that?

Compiling all packages: some of them fail (e.g. CddInterface), so once #4386 is merged, that may lead to failures here (?).

I think compilation of all packages then finishes after another 30 minutes, so roughly 40 minutes in total were spent so far. The log visible on the website then ends as is too long; it says I can download the full log, but I didn't do that yet.

So I wonder if https://ccache.dev works on Windows with Cygwin; if we could use it and use the GitHub cache system on the ccache, it might reduce the compilation time. Just a thought, though, it may be too much of a hassle to get working.

BTW, in the x86 logs, I saw this (didn't look at the x86_64 logs); not sure if it is "normal" or we should worry about it:

# Move the tmpdir into the final environment location
mkdir -p envs
mv /tmp/tmp.FemAgZRzTO envs/build-stable-cygwin-x86
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/pki/ca-trust/source/ca-bundle.legacy.crt': Permission denied
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/hosts': Permission denied
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/protocols': Permission denied
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/services': Permission denied
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/networks': Permission denied
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/alternatives/automake-doc': Permission denied
mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/alternatives/python': Permission denied

@ChrisJefferson
Copy link
Contributor

My general feeling is there shouldn't be need to run this more than once a release, and maybe also once a week, so I'm not too worried about the time, also, any type of caching is dangerous with cygwin because if you cache binaries built against a different version of the cygwin DLL, bad things happen.

I do want to look at the remaining packages which aren't building yet -- first I'd like to get testing of all packages working in Cygwin, so I could easily see what is / isn't working, as I suspect some packages might be building but not running.

.github/workflows/release.yml Outdated Show resolved Hide resolved
.github/workflows/release.yml Outdated Show resolved Hide resolved
@fingolfin
Copy link
Member

I don't think running this just once per release is enough, we must run this regularly if we want to notice regressions quickly (experience tells that it then is much easier to resolve them).

Running it on each update of master may still be too slow, though; 2x 1 hour Windows jobs (which are much more limited than Unix jobs) could be a killer during "busy days". But we could of course try it out for now... otherwise, setting a daily cron trigger would also work.

In either case, we badly need CI failure notifications again, see issue #4390.

Another question: how much work would it be to split the "Compile GAP and its packages, and create the installer" step up? E.g. into these:

  1. setup cygwin (actually... didn't we do that before? It seems to do it again)
  2. build GAP
  3. build packages
  4. create the installed

That would help with pinpointing issues and generally make the GH Actions log a bit easier to navigate.

@wilfwilson wilfwilson added the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Apr 12, 2021
@wilfwilson
Copy link
Member Author

wilfwilson commented Apr 12, 2021

I'm happy for there to be manually-dispatched workflow runs and scheduled workflow runs. That should be no problem, I'll implement those sometime, and we discuss the precise parameters that they should have.


In my opinion, the details of how the script of the sage-windows repository achieves its goals (i.e. the details of how release_gap.sh does its thing) are mostly independent of this PR.

The purpose of this PR is to implement a GitHub Actions Workflow that:

  • at an appropriate point in time (schedule? manual? release? PR? branch push?)...
  • gives the appropriate version of GAP to the installer-creation script(s)...
  • and appropriately handles the installers that it makes.

I think that the comments and details relating to the steps involved in the release_gap.sh script should be directed towards /~https://github.com/ChrisJefferson/sage-windows - in particular, we should primarily have those discussion on my PR to that repo ChrisJefferson/sage-windows#1. Note that that repository already has CI set up, so you can see over there what the release_gap.sh script is doing (it currently takes a couple of hours longer, because it also compiles GAP's manuals).

@wilfwilson
Copy link
Member Author

I will now respond on a high level to some of the discussions about resource usage and making things faster.

The summary of my initial suggestions for making this faster: Only build the 64-bit installer most of the time, and teach ChrisJefferson/sage-windows to skip installing LaTeX in certain cases (and possibly skip installing some other Cygwin components).

Other comments:

  • (I believe) that the release_gap.sh script needs to be run from within Cygwin. Therefore we need to set up Cygwin in this Workflow before calling the script. Currently, we do that with gap-actions/setup-cygwin-for-gap, which is overkill, because we're not actually going to run GAP in this Cygwin, but this step 'only' takes two minutes and I don't have the desire to work out which subset of Cygwin is actually necessary.

  • The remaining significant time taken by this Workflow is in the script release_gap.sh. As mentioned above, I think most discussions related to this script would be more appropriately held on Update to enable use in GAP release system ChrisJefferson/sage-windows#1.

  • The release_gap.sh script sets up its own new Cygwin with very specific properties (it possibly does this twice). I therefore doubt that we can re-use the Cygwin that we initially used to call the release_gap.sh script. In particular, this spends a lot of time installing TeX Live, which I believe we should be able to skip for the purposes of this Workflow, because (as far as I understand), we don't need to compile GAP's manuals in this workflow.

From #4389 (comment) (@fingolfin):

We must run this regularly if we want to notice regressions quickly...

I agree!

From #4389 (comment) (@ssiccha):

I'd say let's not run this on every pull request. It takes 1h 20mins compared to the CI tests' 50 mins.

No, under the current setup a GitHub Actions run of release.yml on a pull request takes ~42 minutes (see e.g. /~https://github.com/gap-system/gap/actions/runs/738246766), of which ~18 minutes is the already-existing release scripts, and then that is followed by the parallel Windows installer jobs, which each take about ~24 minutes. Which in my opinion is a reasonable length of time. We could further reduce our Windows usage by only building on one architecture (i.e. 64-bit) on PRs.

It takes less time on a PR because there I only do make bootstrap-pkg-minimal, and so there are no packages to compile, whereas on a branch push I do make bootstrap-pkg-full and there are lots of packages that are slow to compile.

From #4389 (comment) (@fingolfin):

Compiling all packages: some of them fail (e.g. CddInterface), so once #4386 is merged, that may lead to failures here (?).

#4386 only causes BuildPackages.sh to exit non-zero if you use the option --strict. But the release_gap.sh script uses BuildPackages.sh --parallel.

From #4389 (comment) (@ChrisJefferson):

I do want to look at the remaining packages which aren't building yet...

Sure – that's great 🙂!

I don't mind waiting for all packages to be working in Cygwin before merging this. But at the same time, I don't think broken packages should stop this being merged, because the Workflow runs successfully even with broken packages - it just means that those packages won't work properly in the resulting Windows GAP installation.

From #4389 (comment) (@fingolfin):

Another question: how much work would it be to split the "Compile GAP and its packages, and create the installer" step up?

I agree that it would be great to have this. But I think it would be a large amount of work, although I don't know. release_gap.sh consists of just a single call to make. So the Makefile would need to be restructured. I have zero desire to get involved in this in the near future, and would make further significant diffs against sagemath/sage-windows. @ChrisJefferson will know better about this, so I suggest further discussions on this topic should move to ChrisJefferson/sage-windows#1

@wilfwilson
Copy link
Member Author

Oops, forgot one. @fingolfin those lines of the form

mv: listing attributes of '/tmp/tmp.FemAgZRzTO/etc/services': Permission denied

also appear in the CI of /~https://github.com/ChrisJefferson/sage-windows, so they're not particular to this PR. I've got no idea of their importance.

@wilfwilson

This comment has been minimized.

@fingolfin

This comment has been minimized.

@wilfwilson

This comment has been minimized.

Copy link
Contributor

@ssiccha ssiccha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, under the current setup a GitHub Actions run of release.yml on a pull request takes ~42 minutes (see e.g. /~https://github.com/gap-system/gap/actions/runs/738246766), of which ~18 minutes is the already-existing release scripts, and then that is followed by the parallel Windows installer jobs, which each take about ~24 minutes. Which in my opinion is a reasonable length of time. We could further reduce our Windows usage by only building on one architecture (i.e. 64-bit) on PRs.

Ok, that sounds great! I have no idea whether our usage of windows machines may become a bottleneck during GAP days. So I'm undecided on whether to include building 32 bit releases on every PR. However, we can still make that minor change in an extra PR.

@wilfwilson

This comment has been minimized.

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for taking care of this!

@fingolfin
Copy link
Member

Since it's just 24 minutes for a PR build, I'd be fine with just enabling this on all PRs after all. Anyway, feel free to restore that, or merge this as-is.

@wilfwilson wilfwilson force-pushed the stable-cygwin branch 2 times, most recently from 6f92432 to 91a8a1e Compare April 15, 2021 21:30
@wilfwilson

This comment has been minimized.

@wilfwilson
Copy link
Member Author

I got the workflow_dispatch and schedule triggers to work fine (/~https://github.com/wilfwilson/gap/actions/workflows/release.yml?query=event%3Aschedule and /~https://github.com/wilfwilson/gap/actions/workflows/release.yml?query=event%3Aworkflow_dispatch).

The artifacts are currently only being uploaded on the daily scheduled job, and only living for one day (i.e. roughly until the next job runs). I might upload artifacts for workflow_dispatch too (schedule only builds the master branch; workflow_dispatch can build any branch that contains the workflow, I think).

I've still got PR jobs disabled for now until I work out how to best only build on architecture. I might enable them in a separate PR.

But the reason that I've still got the "DO NOT MERGE" label is that I have not actually tried running the resulting Windows installer(s). I don't currently have access to a Windows computer. But I'll work out a way, possibly ask some people to help.

@ssiccha

This comment has been minimized.

@wilfwilson

This comment has been minimized.

@ssiccha

This comment has been minimized.

@ssiccha

This comment has been minimized.

@wilfwilson

This comment has been minimized.

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure what the state of this PR is now. What is left to be done, exactly? Is a decision needed on something, or is it just a matter of finding time to implement some already made decisions?

It may also help to take this in steps: e.g. some changes in this PR could probably already go in, e.g. those to make_archives.py look to me as if they are mostly cleanup and could mostly go in right away?

Regarding the 32 vs 64 bit issue: right now, I think we only test Cygwin 64 bit; so it would be a reasonable option if we for now just built the 64bit installer. And then open a separate PR/issue regarding the 32bit one. This way, we would reap the majority of the benefit now and postpone the icky technical discussions... I hope, at least? Of course that's just a suggestion, and it might be a bad one, I do not claim to fully understand the problem scope

.github/workflows/release.yml Outdated Show resolved Hide resolved
@wilfwilson

This comment has been minimized.

@wilfwilson wilfwilson force-pushed the stable-cygwin branch 3 times, most recently from 92f222a to 75c3677 Compare April 21, 2021 14:53
@wilfwilson
Copy link
Member Author

wilfwilson commented Apr 21, 2021

I think this is very nearly ready to be merged.

The jobs run on PR (64-bit only), push branch/tag, on a daily schedule (on the master branch), and manually.
For a scheduled run, it uploads artifacts that live for a day, and for a release tag, it upload the installers to the GitHub release.

You can see an up-to-date example of the workflow in this PR running:

I have asked @ChrisJefferson to install both installers from that v4.12.9 release on his Windows computer, and to have a quick check that the GAPs load and at least superficially seem to work.

If he reports no problems, then I'll remove the 'DO NOT MERGE' label and then merge, soon after (if no one beats me to it).

@wilfwilson wilfwilson removed the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Apr 26, 2021
@wilfwilson
Copy link
Member Author

@ChrisJefferson tried this installers and (at least on a quick check!) it all seemed in order. So I'll merge!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
os: windows Issues and PRs that are (at least partially) specific to Windows release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: infrastructure topic: workflow Meta: anything related to the development workflow of GAP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants