Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wgpu-core] Return submission index for map_async and on_submitted_work_done to track down completion of async callbacks #6360

Merged
merged 24 commits into from
Nov 4, 2024

Conversation

eliemichel
Copy link
Contributor

Problem
wgpu-native is upgrading to latest versions of webgpu-headers (see gfx-rs/wgpu-native#427), and a notable change since the last sync is the introduction of WGPUFuture. Such structure holds a u64, from which we must be able to tell whether the async operation has completed.

A good candidate for the payload of WGPUFuture is the submission index already used in some places (and u64::MAX to mean "a future that is already resolved, probably because submission failed"). The only issue is that submit operations do not necessarily return a submission index.

Description
This PR makes functions buffer_map_async, add_work_done_closure, map_async, queue_on_submitted_work_done return the submission index that can later be used to check that the operation completed.

Testing
This is a WIP, I need feedback on the use of last_successful_submission_index in queue_on_submitted_work_done (is this the right fallback?), feedback about how to check that an operation completed from its submission index, and there is something to discuss about map_async:

  • What I do here is to increment the active_submission_index to make sure the operation is not considered complete too early, but maybe this can have the callback be invoked with some delay if the buffer mapping operation was not the last in queue?
  • The problem is that triage comes later, so when map_async returns, we do not know yet the true submission index of the map operation (it may be lower than the active_submission_index, right? If not then the proposed solution is ok, otherwise should we trigger triage right away? I guess there is a good reason why it is done separately though.

@eliemichel eliemichel requested a review from a team as a code owner October 3, 2024 07:11
@eliemichel eliemichel changed the title Eliemichel/future Return submission index to track down completion of async callbacks Oct 3, 2024
@eliemichel eliemichel marked this pull request as draft October 3, 2024 07:15
@eliemichel eliemichel marked this pull request as ready for review October 5, 2024 10:08
@Wumpf
Copy link
Member

Wumpf commented Oct 20, 2024

I looked a tiny little bit into this and I think the approach with just looking at whatever the active or last successful submission index is not going to work well:
The WebGPU spec formulates the conditions for when a buffer mapping promise should resolve quite well: It's either when the device is lost or:

The [device timeline](https://www.w3.org/TR/webgpu/#device-timeline) becomes informed of the completion of an unspecified [queue timeline](https://www.w3.org/TR/webgpu/#queue-timeline) point:
* after the completion of currently-enqueued operations that use this
* and no later than the completion of all currently-enqueued operations (regardless of whether they use this).

The second bullet points means that the mapping has to be available at the very latest point when all active submissions at the time of calling map_async are done. This means that we can be certain that things are done if you're waiting for last_successful_submission_index (no plus one, not quite following why you think that's necessary, maybe I'm missing something?)
However, as the first bullet point indicates the mapping may succeed much much earlier! Namely if all submissions are that are using the buffer are completed. Imagine you issue 4 submissions of which submission 0 and 2 use the buffer. Then call map_async. It should now finish once submission 2 is done, before submission 3 and not even considering that the next submission index would be 4.
In wgpu this is handled by LifetimeTracker and it should be possible to prod it such to determine the point in time a buffer becomes available, but naturally this will be a bit tricker than what you've attempted so far.

Another high level issue that hasn't been adressed here so far (and wtf why is the CI not tripping on this 😱 ) is that by exposing this as part of the wgpu interface, you'll have to come up with a way to express this in WebGPU. My kneejerk reaction would be to just not to expose it in the first place (wgpu-native is using wgpu-core only, right?). But there has been some interest in this recently, see #6426. But still not sure I understand the need for as of now (see #6426 (comment), that said, the issue & discussion already highlighted at least major docs issues)

@eliemichel
Copy link
Contributor Author

eliemichel commented Oct 22, 2024

Thanks for taking the time to have a look! 🙏

no plus one, not quite following why you think that's necessary, maybe I'm missing something?

You're not missing anything, this +1 was added out of my doubt about how submission indices must be interpreted, I wanted to avoid reading the mapped buffer too early so conservativatively that I end up doing it too late!

I've read the code of the LifetimeTracker, but it is unclear to me how it determines when an async map operation terminates. It seems to be only about submitting things to the queue, rather than reading things back.

By exposing this as part of the wgpu interface, you'll have to come up with a way to express this in WebGPU.

If by "WebGPU" you mean the JavaScript API, this is supposedly exposed as a Promise object. I do not know how Firefox exactly uses wgpu to implement this API, I just modified wgpu to please the compiler as I am working on the wgpu-core (which indeed is all what wgpu-native needs). I replied with more details in #6426 because if I understand correctly this is all very remated!

@Wumpf
Copy link
Member

Wumpf commented Oct 22, 2024

but it is unclear to me how it determines when an async map operation

I'd need to dig in deeper again, but it tracks which buffers are waiting for mapping. triage_submissions is deciding given a submission id which buffer is ready to be mapped. So I believe it should be possible to figure out that id ahead of time.
If not we should document why not and use a different mechanism for the wgpu-native future ids

If by "WebGPU" you mean the JavaScript API, this is supposedly exposed as a Promise object.

No, that's not what I meant: The wgpu crate is an interface to either a host implementing WebGPU or an implementation of WebGPU (confusingly, one of them based on WebGL!). The former is found here, the later (that's what you adjusted so far) is here. Since wgpu is supposed to have a consistent api across both, we make those promises be a callback under all circumstances.
But so far you touched wgpu-core in such a way that compiling with the webgpu backend should fail - which is why I'm a bit aghast that the ci didn't catch that.

@eliemichel
Copy link
Contributor Author

The wgpu crate is an interface to either a host implementing WebGPU or an implementation of WebGPU

Ow, I did not remember that! I get it now. And that explains part of my confusion. I'll try and rework this PR in the light of this, thx!

@eliemichel
Copy link
Contributor Author

Okey I've tried adding a WgpuFuture type in the Context interface, do you think it's heading in the right direction?

@Wumpf
Copy link
Member

Wumpf commented Oct 23, 2024

do you think it's heading in the right direction?

frankly at this point I don't know.
I think we need to look at a possible promise-like type in a more holistic way across the wgpu surface and explore how this can be used in a sane cross platform manner, including a sample demonstrating that.
It's one thing to add more (hopefully 🤞 ) readily available information to wgpu-core and using that then in wgpu-native and another one to change the way readiness of asynchronous events is advertised in the wgpu interface accross native & web. All those considerations should really go to a separate iteration. Let's focus for now just on the native-only problem of whether wgpu-core can advertise/predict the submission index at which a buffer mapping becomes available!
@teoxoy told me he might be able to give a hand as well on that issue - he's been recently dealing with some aspects of queue workload as part of another work stream.

wgpu-core/src/device/queue.rs Outdated Show resolved Hide resolved
wgpu-core/src/resource.rs Outdated Show resolved Hide resolved
wgpu-core/src/device/life.rs Outdated Show resolved Hide resolved
@eliemichel
Copy link
Contributor Author

Thanks for the feedback!

@Wumpf agreed, I removed everything that was about the wgpu API.

@teoxoy I removed triage_mapped and replaced with '0' where appropriate!

Copy link
Member

@teoxoy teoxoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@teoxoy teoxoy changed the title Return submission index to track down completion of async callbacks [wgpu-core] Return submission index for map_async and on_submitted_work_done to track down completion of async callbacks Nov 4, 2024
@teoxoy teoxoy enabled auto-merge (squash) November 4, 2024 14:12
@teoxoy teoxoy merged commit 6a75a73 into gfx-rs:trunk Nov 4, 2024
27 checks passed
@eliemichel eliemichel deleted the eliemichel/future branch November 16, 2024 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants