Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run performance tests on all PR against base branch #1630

Open
wants to merge 17 commits into
base: dev
Choose a base branch
from

Conversation

peaBerberian
Copy link
Collaborator

@peaBerberian peaBerberian commented Jan 14, 2025

This PR does multiple updates with our performance tests (which test for performance regressions):

  1. I simplified performance tests writing by adding a lib.js file handling test management - and I added some multithread tests.

  2. I now choose to run performance tests in CI for all PR

  3. Performance tests in CI are now run against the base branch (instead of against the last RxPlayer release )

  4. A comment is left on the GitHub PR if performance issues are detected

@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 6 times, most recently from 38fc9eb to 4fef17b Compare January 14, 2025 14:01
@peaBerberian peaBerberian force-pushed the misc/performance-tests-libs branch 2 times, most recently from 1f99e84 to b5f4e63 Compare January 14, 2025 14:19
@peaBerberian peaBerberian changed the base branch from misc/performance-tests-libs to dev January 14, 2025 14:20
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 9 times, most recently from 963e4cc to fa4dc9e Compare January 17, 2025 09:48
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 6 times, most recently from 4e13626 to fdb9406 Compare January 17, 2025 14:13
@canalplus canalplus deleted a comment from github-actions bot Jan 17, 2025
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch 4 times, most recently from 171f783 to d330f9d Compare January 17, 2025 14:53
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from 830c0d6 to 4896468 Compare January 21, 2025 15:27
Copy link

Automated performance checks have found a sensible difference on commit 4896468988a863f1dc290bc065d5f9f3377e2c64 with the base branch dev:

Performance tests 1st run output:

No significative change in performance for tests:

  • loading mean: 19.28ms -> 19.41ms (-0.129ms, -0.666%, z: 1.47799) / median: 26.70ms -> 26.70ms
  • seeking mean: 12.04ms -> 14.77ms (-2.738ms, -18.532%, z: 1.16664) / median: 11.25ms -> 11.40ms
  • audio-track-reload mean: 25.96ms -> 26.08ms (-0.125ms, -0.478%, z: 1.08775) / median: 38.10ms -> 38.10ms
  • cold loading multithread mean: 48.07ms -> 47.49ms (0.589ms, 1.241%, z: 9.61539) / median: 70.20ms -> 69.15ms
  • seeking multithread mean: 12.62ms -> 23.30ms (-10.675ms, -45.815%, z: 0.25505) / median: 10.50ms -> 10.35ms
  • audio-track-reload multithread mean: 25.98ms -> 25.79ms (0.190ms, 0.736%, z: 2.12765) / median: 38.25ms -> 38.10ms
  • hot loading multithread mean: 15.30ms -> 15.24ms (0.059ms, 0.387%, z: 3.20104) / median: 22.20ms -> 22.05ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from d3c562b to 773e28d Compare January 21, 2025 16:15
@Florent-Bouisset
Copy link
Collaborator

Automated performance checks have found a sensible difference on commit 4896468988a863f1dc290bc065d5f9f3377e2c64 with the base branch dev:

Performance tests 1st run output:

No significative change in performance for tests:

  • loading mean: 19.28ms -> 19.41ms (-0.129ms, -0.666%, z: 1.47799) / median: 26.70ms -> 26.70ms
  • seeking mean: 12.04ms -> 14.77ms (-2.738ms, -18.532%, z: 1.16664) / median: 11.25ms -> 11.40ms
  • audio-track-reload mean: 25.96ms -> 26.08ms (-0.125ms, -0.478%, z: 1.08775) / median: 38.10ms -> 38.10ms
  • cold loading multithread mean: 48.07ms -> 47.49ms (0.589ms, 1.241%, z: 9.61539) / median: 70.20ms -> 69.15ms
  • seeking multithread mean: 12.62ms -> 23.30ms (-10.675ms, -45.815%, z: 0.25505) / median: 10.50ms -> 10.35ms
  • audio-track-reload multithread mean: 25.98ms -> 25.79ms (0.190ms, 0.736%, z: 2.12765) / median: 38.25ms -> 38.10ms
  • hot loading multithread mean: 15.30ms -> 15.24ms (0.059ms, 0.387%, z: 3.20104) / median: 22.20ms -> 22.05ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

Maybe we can have it displayed as a table so it's more readable:

Action Mean Median
Loading 19.28ms → 19.41ms (-0.129ms, -0.666%n, z: 1.455) 26.70ms → 26.70ms
Seeking 19.28ms → 19.41ms (-0.129ms, -0.666%, z: 1.455) 26.70ms → 26.70ms

@peaBerberian
Copy link
Collaborator Author

peaBerberian commented Jan 21, 2025

OK though to make it also readable as a text file (the intent is that the outputed report markdown file is both intended for humans with a text editor and to scripts outputing optionally richer text formats - like as a Github comment) is more complex when doing a table vs a list.

But i'll try

@canalplus canalplus deleted a comment from github-actions bot Jan 21, 2025
@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from 66d9814 to 23edde8 Compare January 21, 2025 17:42
@canalplus canalplus deleted a comment from github-actions bot Jan 21, 2025
Copy link

Automated performance checks have been performed on commit 5798686388a3dec9eb469f64e6b386b0d4358828 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 20.26ms -> 20.60ms (-0.337ms, z: 0.96412) 27.00ms -> 27.15ms
seeking 17.56ms -> 14.42ms (3.141ms, z: 2.34821) 11.10ms -> 11.25ms
audio-track-reload 26.22ms -> 26.43ms (-0.215ms, z: 0.20465) 37.65ms -> 37.65ms
cold loading multithread 48.59ms -> 47.85ms (0.736ms, z: 8.78710) 69.75ms -> 68.55ms
seeking multithread 20.70ms -> 17.48ms (3.221ms, z: 0.57881) 10.20ms -> 10.20ms
audio-track-reload multithread 26.30ms -> 26.02ms (0.280ms, z: 0.52428) 37.65ms -> 37.65ms
hot loading multithread 15.80ms -> 15.42ms (0.377ms, z: 3.90596) 22.05ms -> 21.75ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

@canalplus canalplus deleted a comment from github-actions bot Jan 21, 2025
Copy link

Automated performance checks have been performed on commit 628dc3be94709c1c96840e8d4b8a3d486ac58fd5 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.22ms -> 19.30ms (-0.079ms, z: 1.62917) 26.55ms -> 26.55ms
seeking 13.31ms -> 10.65ms (2.653ms, z: 0.09964) 11.25ms -> 11.25ms
audio-track-reload 25.84ms -> 25.88ms (-0.042ms, z: 0.13650) 37.65ms -> 37.65ms
cold loading multithread 47.96ms -> 47.10ms (0.856ms, z: 12.99353) 70.20ms -> 68.85ms
seeking multithread 16.55ms -> 18.57ms (-2.022ms, z: 0.72020) 10.50ms -> 10.35ms
audio-track-reload multithread 25.68ms -> 25.73ms (-0.050ms, z: 0.30880) 37.65ms -> 37.80ms
hot loading multithread 15.20ms -> 15.19ms (0.016ms, z: 3.84216) 22.10ms -> 21.90ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

@peaBerberian peaBerberian force-pushed the misc/performance-tests-all-pr branch from 628dc3b to 7458f02 Compare January 22, 2025 13:31
Copy link

Automated performance checks have been performed on commit 7458f0240d24c9dfacff9836a9e0dc7adeec0f4a with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.71ms -> 19.41ms (0.302ms, z: 4.48128) 27.45ms -> 26.85ms
seeking 12.70ms -> 17.37ms (-4.663ms, z: 0.69236) 11.25ms -> 11.25ms
audio-track-reload 26.02ms -> 26.18ms (-0.161ms, z: 1.42568) 37.95ms -> 38.10ms
cold loading multithread 48.19ms -> 47.66ms (0.529ms, z: 9.50722) 70.50ms -> 69.60ms
seeking multithread 18.64ms -> 17.93ms (0.703ms, z: 1.10124) 10.35ms -> 10.35ms
audio-track-reload multithread 26.04ms -> 25.80ms (0.242ms, z: 3.10859) 38.25ms -> 37.95ms
hot loading multithread 15.47ms -> 15.40ms (0.065ms, z: 3.70142) 22.50ms -> 22.20ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

*/
export function declareTestGroup(name, code, timeout) {
if (areTestsAlreadyRunning) {
error(`"declareTestGroup" function call not performed at top level.`);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the error message correct with what is being checked?

Copy link
Collaborator Author

@peaBerberian peaBerberian Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the idea is declareTestGroup is kind of like testing frameworks' describe functions.
Excepted that for now, I don't think it's possible to nest them in one another.

It is supposed to be called at top level as the library relies on them being all called when its scheduled test-running task 😉 begins (after a low timeout)

Technically you could have a call not at the top-level but still executed on file evaluation (e.g. inside a if in the top level) and that would be valid but the actual descriptive error message to throw here is hard to come up with so I chose "top level" as a concept that can easily be grasped without needing to have the whole architecture in mind.

} else {
testNumber = Number(location.hash.substring(1));
}
if (testNumber < 100) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what 100 is, is it the maximum number of tests that can be run?

Copy link
Collaborator Author

@peaBerberian peaBerberian Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to give some context:

  • The nodejs script in tests/perfomance.run.js is the test runner executed server-side (similar to vitest's globalSetup): here it runs one of the test page multiple time on the local Chrome binary

  • This script is the test library that will run in the browser alongside written tests (like vitest's JS lib when we do import { describe } from "vitest" ), it does the client-side test management logic (including sending HTTP requests to the nodejs script to report results) so test files can just be about the test scenarios

This testNumber is the number of time we will reload the page in the current browser. What we actually do here is to switch between the current.html and previous.html pages each time the tests for the current page are all finished.

Because we're reloading the page, we cannot rely on variables declared in that script to track that count. Here we use the "fragment" part of an URL to store the current iteration (.e.g. /current.html#34).
Once we've reached #100, we tell the nodejs script that we're done. It then kills the browser and either re-run another one or finish the tests.

I don't know if that's clear, it's a little complex as we retry the same tests a LOT of time in different processes at different time just to be sure we've some performance issue and it's not just a false positive due to temporary high CPU usage, some browser optimizations when similar code is repeated, or random noise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it interact with TEST_ITERATIONS = 30 from performance/run.mjs ?
Does it means the browser test is started 30 times and each test does 100 reloads ? So it's actually 30x100 reloads. Isn't it too many?

Copy link
Collaborator Author

@peaBerberian peaBerberian Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that must be the right count (technically 1500 with the previous and 1500 with the new one).

Too many for who?

That's to give more confidence into the final result.
When I looked at other benchmarking projects, 1500 attempts is in the very low tier.

The real issue I can see is that tests may run for 1h but that's not dramatic to me.

} else {
testNumber = Number(location.hash.substring(1));
}
if (testNumber < 100) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it interact with TEST_ITERATIONS = 30 from performance/run.mjs ?
Does it means the browser test is started 30 times and each test does 100 reloads ? So it's actually 30x100 reloads. Isn't it too many?

Comment on lines +224 to +229
if (results.better.length > 0) {
console.error(
"\nBetter performance at first attempt for tests:\n\n" +
formatResultInHumanReadableWay(results.better),
);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why logging again the results of the first attempt?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you run the tests in the console, we output a lot of intermediary logs indicating current progress (and current results) because the tests take a lot of time.

At that point the first iteration's results would be lost far in the terminal scrollback buffer, so we just repeat it because we're nice guys.

staticServer = launchStaticServer(currentDirectory, {
httpPort: PERF_TESTS_PORT,
});
resultServer = createResultServer(onFinished, onError);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createResultServer could also take port in argument, and the port can be defined as a constant similarly with CONTENT_SERVER_PORT or PERF_TESTS_PORT

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but then we would need to communicate that port to the webpage (we could do it through the URL like through the fragment or a query string) so the lib would know where to send back the results and I was too lazy.

At least hardcoded you know you have to check everywhere it is used :p

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try something

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added both port and retry attempt to the page's fragment now

Copy link

Automated performance checks have been performed on commit b75076e4b7f12b3ad60400d151b906f4235dc500 with the base branch dev.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.98ms -> 19.90ms (0.084ms, z: 3.58443) 27.60ms -> 27.45ms
seeking 18.81ms -> 12.25ms (6.551ms, z: 1.22956) 11.40ms -> 11.55ms
audio-track-reload 26.39ms -> 26.44ms (-0.046ms, z: 0.36668) 38.70ms -> 38.70ms
cold loading multithread 48.52ms -> 48.03ms (0.490ms, z: 9.43020) 71.25ms -> 70.05ms
seeking multithread 15.36ms -> 15.38ms (-0.029ms, z: 0.31255) 10.50ms -> 10.50ms
audio-track-reload multithread 26.22ms -> 26.12ms (0.101ms, z: 2.32838) 38.55ms -> 38.40ms
hot loading multithread 15.65ms -> 15.57ms (0.078ms, z: 3.17081) 22.65ms -> 22.35ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants