
ci: add a flaky tests failure report to our discord notification #2496

Merged · 61 commits · Jul 15, 2024

Conversation

@divagant-martian (Contributor) commented Jul 12, 2024

Description

Modifies the `flaky` workflow to generate a report with all the tests that failed per matrix combo, so that we can receive it on Discord. This requires modifying the `tests` workflow as well to generate `libtest` JSON reports.

Reports are uploaded as job artifacts with a unique name and retained for a day.
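Concretely, the two pieces could look roughly like the sketch below. Step names and the report path are illustrative rather than the exact workflow; nextest's libtest output is experimental and (as assumed here) gated behind an environment variable:

```yaml
# tests workflow: run nextest with (unstable) libtest JSON output,
# then upload the report as a short-lived artifact.
- name: Run tests
  env:
    NEXTEST_EXPERIMENTAL_LIBTEST_JSON: 1
  run: |
    mkdir -p output
    cargo nextest run --message-format libtest-json \
      > output/libtest_report.json

- name: Upload test report
  uses: actions/upload-artifact@v4
  with:
    name: libtest_run_${{ github.run_number }}-${{ github.run_attempt }}-${{ matrix.name }}.json
    path: output
    retention-days: 1
```

The flaky workflow can then download whatever report artifacts exist for the run and turn them into the Discord message.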

From my last commit at the time of writing, this is the report we get:

```md
Flaky tests failure:

- **ubuntu-latest all stable**
iroh-net::iroh_net::discovery::local_swarm_discovery::tests::test_local_swarm_discovery
ubuntu-latest default stable
iroh-cli::cli::cli_bao_store_migration
- **windows-latest all stable**
iroh-cli::cli::cli_provide_file_resume
iroh-cli::cli::cli_provide_tree_resume
- **windows-latest default stable**
iroh-cli::cli::cli_provide_file_resume
iroh-cli::cli::cli_provide_tree_resume
- **windows-latest none stable**
iroh-cli::cli::cli_provide_tree_resume
iroh-cli::cli::cli_provide_file_resume

See /~https://github.com/n0-computer/iroh/actions/workflows/flaky.yaml
```

which reads clearly enough in Discord, imo
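Since the message is plain markdown, assembling it is mostly string formatting. A minimal sketch of that step (the `render_report` name and the `failures_by_combo` mapping are hypothetical, not taken from the workflow):

```python
def render_report(failures_by_combo):
    """Render the Discord message from {matrix combo: [failed test names]}."""
    lines = ["Flaky tests failure:", ""]
    for combo, tests in failures_by_combo.items():
        # Each matrix combo becomes a bold bullet, followed by its failures.
        lines.append(f"- **{combo}**")
        lines.extend(tests)
    lines += ["", "See /~https://github.com/n0-computer/iroh/actions/workflows/flaky.yaml"]
    return "\n".join(lines)

report = render_report({
    "windows-latest all stable": [
        "iroh-cli::cli::cli_provide_file_resume",
        "iroh-cli::cli::cli_provide_tree_resume",
    ],
})
print(report)
```

One mapping entry per downloaded artifact reproduces the grouped-by-combo shape shown above.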

Breaking Changes

n/a

Notes & open questions

  • There is another, more widely used test report format for which I could not find a single decent parser: Jenkins' JUnit XML. From my searches, everyone knows how to produce these files, but not how to read them. Since XML is not exactly compatible with serde, reading these kinds of files was a task I considered not worth doing for what we actually want to achieve, which is simply more visibility over flaky tests.
  • That being said, the `libtest` format is not stable, and the `nextest` feature to obtain it is unstable as well. It does not seem to change quickly or drastically at all, so I think we will be fine for some time. Since the underlying format is JSON, it should be easy to adapt if necessary.
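For reference, libtest's JSON report is line-delimited: one JSON object per event, with `type`, `event`, and (for test events) `name` fields. Extracting the failures is only a few lines; this is a sketch against that shape, not the exact script used in the workflow:

```python
import json

def failed_tests(report_text):
    """Collect names of failed tests from a libtest JSON report (one event per line)."""
    failures = []
    for line in report_text.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only test-level "failed" events carry the names we want;
        # suite-level events and passing tests are skipped.
        if event.get("type") == "test" and event.get("event") == "failed":
            failures.append(event["name"])
    return failures

sample = """\
{"type":"suite","event":"started","test_count":2}
{"type":"test","event":"started","name":"cli::cli_provide_file_resume"}
{"type":"test","event":"failed","name":"cli::cli_provide_file_resume"}
{"type":"test","event":"ok","name":"cli::cli_bao_store_migration"}
{"type":"suite","event":"failed","passed":1,"failed":1}
"""
print(failed_tests(sample))  # ['cli::cli_provide_file_resume']
```

Because each line is an independent JSON object, this keeps working even if new event fields are added, which is why the format being unstable is a manageable risk.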

Change checklist

  - [x] Self-review.
  - [ ] ~~Documentation updates following the style guide, if relevant.~~
  - [ ] ~~Tests if relevant.~~
  - [ ] ~~All breaking changes documented.~~

.github/workflows/flaky.yaml — review thread (outdated, resolved)
@divagant-martian divagant-martian marked this pull request as ready for review July 15, 2024 15:57
@divagant-martian divagant-martian changed the title [wip] ci: flaky report ci: add a flaky tests failure report to our discord notification Jul 15, 2024
@dignifiedquire (Contributor) left a comment

the results are great; for GitHub Actions YAML code review I delegate to others 😅

@flub (Contributor) left a comment

Very nice! Love the result

.github/workflows/tests.yaml — review thread (resolved)
.github/workflows/tests.yaml — review thread (outdated, resolved)
.github/workflows/flaky.yaml — review thread (outdated, resolved)
@divagant-martian (Contributor, Author) commented
@flub from your comments I see you want to take this further by running statistics. I don't have a problem with that, but I hope we never actually need it. Our focus should be on reducing the flakes rather than improving the reports. Of course, given our history, skepticism is more than reasonable.

That being said, changing this would also require changes to the flaky workflow, since the assumption is that if we download a report, it contains at least one failure. Uploading always would produce reports with matrix names but no content (no test failures).

So I suggest we increase the retention period, provided @Arqu gives us a go-ahead on a reasonable number of days, and deal with statistics in a later PR.

@Arqu (Collaborator) left a comment

🔥

As for the stats/report retention: I don't think it's a piece of work worth our time currently, but probably a good addition somewhere down the line.

I think the max retention is 90 days. Given these are super light, I would set it to something like 45 days. That's plenty of history IMHO, and won't really clog up our storage usage.

@Arqu (Collaborator) commented Jul 15, 2024

😅

@divagant-martian divagant-martian added this pull request to the merge queue Jul 15, 2024
Merged via the queue into main with commit f84c06e Jul 15, 2024
26 checks passed
@divagant-martian divagant-martian deleted the d/flaky-test-initial-report branch July 15, 2024 22:50
```yaml
with:
  name: libtest_run_${{ github.run_number }}-${{ github.run_attempt }}-${{ matrix.name }}_${{ matrix.features }}_${{ matrix.rust }}.json
  path: output
  retention-days: 1
```
A contributor commented:

missed this one 🫤

@flub (Contributor) commented Jul 16, 2024

> @flub from your comments I see you want to take this further by running statistics. I don't have a problem with that, but I hope we never actually need it. Our focus should be on reducing the flakes rather than improving the reports. Of course, given our history, skepticism is more than reasonable.
>
> That being said, changing this would also require changes to the flaky workflow, since the assumption is that if we download a report, it contains at least one failure. Uploading always would produce reports with matrix names but no content (no test failures).
>
> So I suggest we increase the retention period, provided @Arqu gives us a go-ahead on a reasonable number of days, and deal with statistics in a later PR.

I think the approach you took of keeping things a bit longer, just in case we want to do anything with them, is fine for now. This is a great improvement, and I'm all for small improvements as we need them. So we can do statistics when we want to handle statistics.

But your point that we should actually strive for zero flakes is also very good.

Thanks for doing all this!

matheus23 pushed a commit that referenced this pull request Nov 14, 2024