-
-
Notifications
You must be signed in to change notification settings - Fork 346
Regroup CI doctests and adjust unit-tests
recipe
#1996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This seeks to make improvements in four overlapping areas: - The CI `test-fast` job for `windows-latest` had been taking the longest, and integrating PRs would be more efficient if it could be sped up. If it didn't have to build and run doctests, then it would run markedly faster. `test-fast` runs doctests because... - The `unit-tests` recipe in the `justfile`, which is one of the recipes the "full" CI `test` job runs via the `ci-test` recipe, runs a number of `nextest` commands on individual crates (with `-p`, except for testing the top-level `gitoxide` crate, and not with `--workspace`). It also ran doctests, but only on the `gitoxide` top-level crate. But the `gitoxide` crate currently has no doctests, while some `gix-*` crates do. - On CI, we usually prefer `--no-fail-fast`. But it is not always used, because the commands in the `unit-tests` justfile recipe do not use it. `--no-fail-fast` is not always preferred when running tests locally, but... - Both locally and on CI, in most cases that a test fails in a commmand in the `unit-tests` justfile recipe, the effect is that tests have run and results have been reported for multiple `nextest` commands, yet not all the tests specified in the most recent `nextest` command to run. Thus, omitting `--no-fail-fast` may not have the most intuitive effect in `just unit-tests`, even when run locally (even if the user would omit `--no-fail-fast` in individual `cargo nextest` runs carried out manually). This commit makes the following changes: 1. Add `--no-fail-fast` to each of the commands in the `unit-tests` recipe in the `justfile`: the numerous `cargo nextest` commands, as well as the `cargo test` command used to run doctests. 2. Add `--workspace` to the `cargo test` command used to run doctests in the `unit-tests` recipe in the `justfile`, and move it to the end of the recipe. 3. Not to be confused with that `cargo test` command, move the other `cargo test` command used to run doctests in the `ci.yml` workflow (which alredy passed `--workspace`, as its purpose was to run all doctests in all crates) from the `tests-fast` job definition into the `test-32bit-windows-size` job, and rename that latter job `test-32bit-windows-size-doc` accordingly. The rationale for (3) may not be obvious. The idea is: - Running the doctests on only one Unix-like system should be enough, so long as they are run for all crates in the workspace. So the change in the `unit-tests` recipe in the `justfile` makes it so the CI `test` job (which includes a `unit-tests` run) covers doctests sufficiently, *except* for Windows. - Although we should probably keep running doctests regularly on Windows, removing it from `test-fast`, including on Windows, is the simplest way to make the Windows `test-fast` job run faster. (It also makes the job definition clearer, since some of the other steps relate to each other more closely than they do to the step that ran the doctests.) - It should be sufficient to run the doctests in any Windows environment. And it is best to avoid adding a new Windows job just for this, since various other Windows jobs might be added sometime soon (such as for ARM64, native Windows containers, the Git for Windows SDK, MinGit, BusyBox MinGit, and possibly others; some of these may be possible to combine, but likely a few more Windows jobs may be introduced for these, so avoiding adding extra Windows jobs now may make it easier to avoid having too many Windows jobs, in terms of queuing, GHA cache usage, energy usage, and other resources). So if this can be added to another Windows job without causing problems, that is preferable. - The Windows job that took the least amount of time, usually by several minutes, was the `test-32bit-windows-size` job. It is hope that this keeps the benefits of GitoxideLabs#1556, GitoxideLabs#1559, and GitoxideLabs#1654, while improving CI testing performance most of the time.
It's hard to predict what effect caching dependencies (with |
Thanks so much! The CI run took 17 minutes, which is less than the 30 minutes I thought it would take by now. Overall wallclock time is 1h27m, which seems close to what I remember though. |
No problem!
Hopefully it will be even less in most runs. I think most runs took about that long in from the start of the first job to the end of the last job, before the changes here.
Even though Windows is currently the slowest platform for running the tests, and the overall trend is to do more things on Windows, as well as to have more test cases and more fixtures across all platforms... nonetheless I think the overall trend of total time from the first job starting to the last job ending should be a gradual decrease, because the runner machines get gradually faster, and because we try not to pack too much into an individual job. For an example of the latter that is unrelated to Windows, the So if it looks like the time is approaching 30 minutes, I would consider that to be a bug, unless there is a strong reason for it. (Please feel free to ping me anytime about performance degradations on CI, as well as anything else you think may be of interest.)
I think by this you mean the sum of the wall-clock durations of all the jobs, i.e. the time they would take if they were not run concurrently, which is longer than the time they take when measured on a single clock. If so, it makes sense to me that this has not much improved, at least not by anything in this PR. This PR redistributes the work across jobs more evenly. But it probably doesn't decrease the total work. It does stop running doctests on macOS in I think it may make sense to try to decrease this total time--especially since the likely addition of new jobs, such as Windows jobs along the "various other Windows jobs" examples given in ee55ded, may increase it if their contribution is not in some way offset by speed improvements elsewhere. But that's not part of what this PR specifically attempted. (That is also something I think should be considered together with other resources. I think the total combined running time may be less important than total energy usage, which is somewhat feasible to estimate; queuing delays and runner availability; GitHub Actions cache storage; workflow maintainability; and the ability to adapt the workflow to changes such as new runners or, in case it should ever be needed, different CI solutions.) |
Thank you, appreciated!
Yes, this is what I meant - I should have written total runtime. And even though seeing that decrease is good for the environment, it's nothing we should necessarily optimise for as wallclock time is the thing that should be reduced (just not at all cost). And it's true that the total runtime doesn't directly correlate to energy usage either. |
This distributes work across CI jobs more evenly. In particular, it speeds up
test-fast
onwindows-latest
(which had become the slowest of all CI jobs, including other Windows jobs). To do so, it leverages the idea that the doctests are important enough that we should probably continue automatically running them on Windows, but not so important that it particularly matters which of reasonably normal Windows environments and targets they run for.Those changes entail modifying the
justfile
, so I've included the further overlapping change of adding--no-fail-fast
to the commands in theunit-tests
recipe there. This benefits CI, and I think does not hurt local runs either.This also makes it so that the doctest-running command in
just unit-tests
runs more than zero doctests, which I think is the more intuitive effect. (That is, it runs all doctests in the workspace, rather than just all the doctests in thegitoxide
crate, of which there currently aren't any.)Substantial further details, including about what this is trying to improve, and rationale, are in the commit message.
The diff, as shown on GitHub, is somewhat confusing in
justfile
. Except for one line that is moved and changed, all changed lines are actually changed only by having--no-fail-fast
added to the end.