
Implemented a priority executor for bevy_task #2167

Closed
wants to merge 25 commits into from

Conversation

IGreyGooI

A PriorityExecutor is implemented on top of async_executor::Executor, and bevy_task is modified to use PriorityExecutor instead. None of the TaskPool API is modified (but an API change should be expected). IoTaskPool and AsyncComputeTaskPool across all crates are replaced by ComputeTaskPool for testing purposes.

Summary of the behavior of PriorityExecutor:

  1. PriorityExecutor.run() has two states:
    a. Running Priority::FinishWithinFrame tasks
    b. Running Priority::AcrossFrame and Priority::IO tasks
  2. State (a) transitions to (b) automatically after all dispatched Priority::FinishWithinFrame tasks have finished.
  3. State (b) transitions to (a) after an event is received and the worker thread is not occupied by a task.
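
The two-state behavior above can be sketched as a minimal std-only model. This is illustrative only: ExecutorState, the field names, and the single-threaded tick()/on_event() shape are assumptions for exposition, not the PR's actual API.

```rust
use std::collections::VecDeque;

// Illustrative priorities from the description above; names are hypothetical.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Priority {
    FinishWithinFrame, // served in state (a)
    AcrossFrame,       // served in state (b)
    Io,                // served in state (b)
}

#[derive(Debug, PartialEq)]
enum ExecutorState {
    RunningFrameTasks,      // state (a)
    RunningBackgroundTasks, // state (b)
}

struct PriorityExecutor {
    state: ExecutorState,
    frame_tasks: VecDeque<&'static str>,      // Priority::FinishWithinFrame
    background_tasks: VecDeque<&'static str>, // Priority::AcrossFrame / Priority::IO
}

impl PriorityExecutor {
    fn tick(&mut self) -> Option<&'static str> {
        match self.state {
            ExecutorState::RunningFrameTasks => match self.frame_tasks.pop_front() {
                Some(task) => Some(task),
                None => {
                    // Rule 2: all FinishWithinFrame tasks done -> move to (b).
                    self.state = ExecutorState::RunningBackgroundTasks;
                    self.background_tasks.pop_front()
                }
            },
            ExecutorState::RunningBackgroundTasks => self.background_tasks.pop_front(),
        }
    }

    // Rule 3: on an event (e.g. a new frame), an idle worker returns to (a).
    fn on_event(&mut self, worker_idle: bool) {
        if worker_idle {
            self.state = ExecutorState::RunningFrameTasks;
        }
    }
}
```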

This is meant to be a solution to #1907. However, more tests are needed to ensure that:

  1. PriorityExecutor executes all tasks of the highest priority before starting any lower-priority task.
  2. PriorityExecutor exits the runtime properly.

Passed:

  1. cargo run -p ci
  2. cargo test --all-targets --workspace

IGreyGooI added 20 commits May 12, 2021 18:07
…xecutor

# Conflicts:
#	crates/bevy_tasks/Cargo.toml
@IGreyGooI IGreyGooI changed the title Implemented a Priority executor for bevy_task Implemented a priority executor for bevy_task May 14, 2021
@alice-i-cecile alice-i-cecile added core C-Feature A new feature, making something new possible labels May 14, 2021
@NathanSWard
Contributor

It would be nice to see benchmarks on the previous implementation vs this new PriorityExecutor.

@NathanSWard
Contributor

NathanSWard commented May 14, 2021

Also would love to see some PriorityExecutor specific tests in a #[cfg(test)] module :)
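
For illustration, such a test could assert the ordering guarantee against a small model. PriorityQueueModel below is a hypothetical stand-in (lower key = higher priority), not the PR's PriorityExecutor API.

```rust
use std::collections::BTreeMap;

// Hypothetical model of priority dispatch: lower key = higher priority.
struct PriorityQueueModel {
    queues: BTreeMap<u8, Vec<&'static str>>,
}

impl PriorityQueueModel {
    fn spawn(&mut self, priority: u8, name: &'static str) {
        self.queues.entry(priority).or_default().push(name);
    }

    // Drain everything, highest priority first (BTreeMap iterates by key order).
    fn drain_in_order(&mut self) -> Vec<&'static str> {
        let mut out = Vec::new();
        for queue in self.queues.values_mut() {
            out.append(queue);
        }
        out
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn highest_priority_runs_first() {
        let mut model = PriorityQueueModel { queues: BTreeMap::new() };
        model.spawn(2, "io");
        model.spawn(0, "frame");
        model.spawn(1, "across_frame");
        assert_eq!(model.drain_in_order(), vec!["frame", "across_frame", "io"]);
    }
}
```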

GreyGoo and others added 2 commits May 15, 2021 19:39
Co-authored-by: Nathan Ward <43621845+NathanSWard@users.noreply.github.com>
Co-authored-by: Nathan Ward <43621845+NathanSWard@users.noreply.github.com>
@IGreyGooI
Author

IGreyGooI commented May 15, 2021

@NathanSWard Regarding the benchmark:
I ran some benchmarks. Benching taskpool.scope(), it appears that performance with fewer workers (4 or 8) has regressed by 5% to 20%, while with a larger worker pool (16 or 32 workers) performance is unchanged. I suspect the main difference is that async_executor::Executor::run() uses a Runner with a local cache as well as a variety of methods for finding work, whereas PriorityExecutor::run() only uses async_executor::Executor::try_tick(), which finds a task by popping the global queue.
I will try to replace async_executor::Executor::try_tick() with a Runner to see whether the situation improves with fewer worker threads.

The benchmark is uploaded at /~https://github.com/IGreyGooI/bevy_task_pool_branchmark
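
The contention difference described above can be illustrated with a std-only sketch; neither function here is the async_executor API, and the batch size of 4 is an arbitrary assumption.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

type Queue = Arc<Mutex<VecDeque<u32>>>;

// try_tick-style: every single task pop takes the shared-queue lock.
fn try_tick_style(global: &Queue) -> Option<u32> {
    global.lock().unwrap().pop_front()
}

// Runner-style: refill a thread-local batch under one lock, then drain it
// locally with no further contention on the global queue.
fn runner_style(local: &mut VecDeque<u32>, global: &Queue) -> Option<u32> {
    if local.is_empty() {
        let mut global = global.lock().unwrap();
        for _ in 0..4 {
            match global.pop_front() {
                Some(task) => local.push_back(task),
                None => break,
            }
        }
    }
    local.pop_front()
}
```

With few workers, the per-pop locking of the try_tick-style loop plausibly dominates, which would match the regression seen at 4 and 8 workers.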

@IGreyGooI
Author


Runner is not exposed by async_executor, and no other cheap workaround could be found.
Until we have our own executor, this could still be a net performance improvement, on the assumption that the additional threads utilized by solving #1907 gain more performance than is lost by using async_executor::Executor::try_tick() instead of async_executor::Executor::run().

Further improvement could be made by implementing a modified version of async_executor, or via another workaround.

@mtsr
Contributor

mtsr commented May 15, 2021

You could test the effect by using a patched async_executor with Runner exposed. Just to see if it's worth bringing it up with the author.

@NathanSWard
Contributor

Until we have our own executor, this could still be a net performance improvement, on the assumption that the additional threads utilized by solving #1907 gain more performance than is lost by using async_executor::Executor::try_tick() instead of async_executor::Executor::run().

Yep, that makes sense. It would be nice to see some diagnostics showing that all available cores are being used based on the config (and that this is actually a performance improvement).
However, I don't know if we should bring in a change that fixes some of the cases but regresses performance in others.
It may be worth creating a separate issue about bevy having its own executor that directly supports priority scheduling, and then pursuing that path instead.
However, I'm also open to other solutions 😄

@IGreyGooI
Author

I am referring to the benchmarks under bevy/benches/bevy_tasks. For the benchmarks under overhead_par_iter/threads, which also evaluate taskpool.scope(), I checked my commit against current main, and no significant improvement or regression is noticeable (the difference is close to the noise of running the same benchmark twice).

Could someone else run cargo bench iter inside bevy/benches with the two different commits to confirm?

GreyGoo and others added 2 commits May 16, 2021 17:50
Co-authored-by: Nathan Ward <43621845+NathanSWard@users.noreply.github.com>
@IGreyGooI
Author

Here is the benchmark result from running cd ./benches && cargo bench iter on 73f4a9d:
benchmark_bevy_main_73f4a9d.log

@IGreyGooI
Author

Here is the benchmark result from running cd ./benches && cargo bench iter on 01b46f3:
benchmark_bevy_priority_executor_01b46f3.log, compared against benchmark_bevy_main_73f4a9d.log

@mockersf mockersf added the S-Pre-Relicense This PR was made before Bevy added the Apache license. Cannot be merged or used for other work label Jul 16, 2021
@alice-i-cecile alice-i-cecile added A-Tasks Tools for parallel and async work S-Needs-Review and removed A-Core labels Sep 22, 2021
@alice-i-cecile
Member

@IGreyGooI, can you comment in #2373? We've since relicensed to MIT + Apache, and I'd like to either ensure that this can be picked up or close it out.

@alice-i-cecile alice-i-cecile added the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label May 16, 2022
@hymm
Contributor

hymm commented May 16, 2022

This is potentially superseded by #4740 too.

Labels
A-Tasks: Tools for parallel and async work
C-Feature: A new feature, making something new possible
S-Adopt-Me: The original PR author has no intent to complete this work. Pick me up!
S-Pre-Relicense: This PR was made before Bevy added the Apache license. Cannot be merged or used for other work

7 participants