
ci: add a flaky tests failure report to our discord notification #2496

Merged · 61 commits · Jul 15, 2024

Conversation

@divagant-martian (Contributor) commented Jul 12, 2024

Description

Modifies the `flaky` workflow to generate a report with all the tests that failed per matrix combo, so that we can receive it on Discord. This requires modifying the `tests` workflow as well to generate `libtest` JSON reports.

Reports are uploaded as job artifacts with a unique name and retained for a day.
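Concretely, the two pieces could look roughly like the sketch below. Step names and the report path are illustrative rather than the exact workflow; nextest's libtest output is experimental and (as assumed here) gated behind an environment variable:

```yaml
# tests workflow: run nextest with (unstable) libtest JSON output,
# then upload the report as a short-lived artifact.
- name: Run tests
  env:
    NEXTEST_EXPERIMENTAL_LIBTEST_JSON: 1
  run: |
    mkdir -p output
    cargo nextest run --message-format libtest-json \
      > output/libtest_report.json

- name: Upload test report
  uses: actions/upload-artifact@v4
  with:
    name: libtest_run_${{ github.run_number }}-${{ github.run_attempt }}-${{ matrix.name }}.json
    path: output
    retention-days: 1
```

The flaky workflow can then download whatever report artifacts exist for the run and turn them into the Discord message.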

From my last commit at the time of writing, this is the report we get:

```md
Flaky tests failure:

- **ubuntu-latest all stable**
iroh-net::iroh_net::discovery::local_swarm_discovery::tests::test_local_swarm_discovery
ubuntu-latest default stable
iroh-cli::cli::cli_bao_store_migration
- **windows-latest all stable**
iroh-cli::cli::cli_provide_file_resume
iroh-cli::cli::cli_provide_tree_resume
- **windows-latest default stable**
iroh-cli::cli::cli_provide_file_resume
iroh-cli::cli::cli_provide_tree_resume
- **windows-latest none stable**
iroh-cli::cli::cli_provide_tree_resume
iroh-cli::cli::cli_provide_file_resume

See /~https://github.com/n0-computer/iroh/actions/workflows/flaky.yaml
```

which reads clearly enough in Discord, imo
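Since the message is plain markdown, assembling it is mostly string formatting. A minimal sketch of that step (the `render_report` name and the `failures_by_combo` mapping are hypothetical, not taken from the workflow):

```python
def render_report(failures_by_combo):
    """Render the Discord message from {matrix combo: [failed test names]}."""
    lines = ["Flaky tests failure:", ""]
    for combo, tests in failures_by_combo.items():
        # Each matrix combo becomes a bold bullet, followed by its failures.
        lines.append(f"- **{combo}**")
        lines.extend(tests)
    lines += ["", "See /~https://github.com/n0-computer/iroh/actions/workflows/flaky.yaml"]
    return "\n".join(lines)

report = render_report({
    "windows-latest all stable": [
        "iroh-cli::cli::cli_provide_file_resume",
        "iroh-cli::cli::cli_provide_tree_resume",
    ],
})
print(report)
```

One mapping entry per downloaded artifact reproduces the grouped-by-combo shape shown above.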

Breaking Changes

n/a

Notes & open questions

  • There is another, more widely used test report format for which I could not find a single decent parser: Jenkins' JUnit XML. From my searches, everyone knows how to produce these files, but not how to read them. Since XML is not exactly compatible with serde, reading these kinds of files was a task I considered not worth doing for what we actually want to achieve, which is simply more visibility over flaky tests.
  • That being said, the `libtest` format is not stable, and the `nextest` feature to obtain it is unstable as well. It does not seem to change quickly or drastically at all, so I think we will be fine for some time. Since the underlying format is JSON, it should be easy to adapt if necessary.
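For reference, libtest's JSON report is line-delimited: one JSON object per event, with `type`, `event`, and (for test events) `name` fields. Extracting the failures is only a few lines; this is a sketch against that shape, not the exact script used in the workflow:

```python
import json

def failed_tests(report_text):
    """Collect names of failed tests from a libtest JSON report (one event per line)."""
    failures = []
    for line in report_text.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only test-level "failed" events carry the names we want;
        # suite-level events and passing tests are skipped.
        if event.get("type") == "test" and event.get("event") == "failed":
            failures.append(event["name"])
    return failures

sample = """\
{"type":"suite","event":"started","test_count":2}
{"type":"test","event":"started","name":"cli::cli_provide_file_resume"}
{"type":"test","event":"failed","name":"cli::cli_provide_file_resume"}
{"type":"test","event":"ok","name":"cli::cli_bao_store_migration"}
{"type":"suite","event":"failed","passed":1,"failed":1}
"""
print(failed_tests(sample))  # ['cli::cli_provide_file_resume']
```

Because each line is an independent JSON object, this keeps working even if new event fields are added, which is why the format being unstable is a manageable risk.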

Change checklist

  - [x] Self-review.
  - [ ] ~~Documentation updates following the style guide, if relevant.~~
  - [ ] ~~Tests if relevant.~~
  - [ ] ~~All breaking changes documented.~~

.github/workflows/flaky.yaml — review thread (outdated, resolved)
@divagant-martian divagant-martian marked this pull request as ready for review July 15, 2024 15:57
@divagant-martian divagant-martian changed the title [wip] ci: flaky report ci: add a flaky tests failure report to our discord notification Jul 15, 2024
@dignifiedquire (Contributor) left a comment

the results are great; for GitHub Actions YAML code review I delegate to others 😅

@flub (Contributor) left a comment

Very nice! Love the result

.github/workflows/tests.yaml — review thread (resolved)
.github/workflows/tests.yaml — review thread (outdated, resolved)
.github/workflows/flaky.yaml — review thread (outdated, resolved)
@divagant-martian (Contributor, Author) commented
@flub from your comments I see you want to take this further by running statistics. I don't have a problem with that, but I hope we never actually need it. Our focus should be on reducing the flakes rather than improving the reports. Of course, given our history, skepticism is more than reasonable.

That being said, changing this would also require changes to the flaky workflow, since the assumption is that if we download a report, it contains at least one failure. Uploading always would produce reports with matrix names but no content (no test failures).

So I suggest we increase the retention period, provided @Arqu gives us a go-ahead on a reasonable number of days, and deal with statistics in a later PR.

@Arqu (Collaborator) left a comment

🔥

As for the stats/report retention: I don't think it's a piece of work worth our time currently, but probably a good addition somewhere down the line.

I think the max retention is 90 days. Given these are super light, I would set it to something like 45 days. That's plenty of history IMHO, and won't really clog up our storage usage.

@Arqu (Collaborator) commented Jul 15, 2024

😅

@divagant-martian divagant-martian added this pull request to the merge queue Jul 15, 2024
Merged via the queue into main with commit f84c06e Jul 15, 2024
26 checks passed
@divagant-martian divagant-martian deleted the d/flaky-test-initial-report branch July 15, 2024 22:50
```yaml
with:
  name: libtest_run_${{ github.run_number }}-${{ github.run_attempt }}-${{ matrix.name }}_${{ matrix.features }}_${{ matrix.rust }}.json
  path: output
  retention-days: 1
```
A contributor commented:

missed this one 🫤

@flub (Contributor) commented Jul 16, 2024

> @flub from your comments I see you want to take this further by running statistics. I don't have a problem with that, but I hope we never actually need it. Our focus should be on reducing the flakes rather than improving the reports. Of course, given our history, skepticism is more than reasonable.
>
> That being said, changing this would also require changes to the flaky workflow, since the assumption is that if we download a report, it contains at least one failure. Uploading always would produce reports with matrix names but no content (no test failures).
>
> So I suggest we increase the retention period, provided @Arqu gives us a go-ahead on a reasonable number of days, and deal with statistics in a later PR.

I think the approach you took of keeping things a bit longer, just in case we want to do anything with them, is fine for now. This is a great improvement, and I'm all for small improvements as we need them. So we can do statistics when we want to handle statistics.

But your point that we should actually strive for zero flakes is also very good.

Thanks for doing all this!

matheus23 pushed a commit that referenced this pull request Nov 14, 2024