Support for Vector Envs #220
Comments
@cpnota So I worked on this quite a bit, but after almost finishing I figured out that the Gym vector API is completely incompatible with ALL. The problem is what happens to the observation during reset. ALL's parallel experiment (and any sane system) expects this:
Unfortunately, the gym vector API is not sane, so it does this:
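A minimal sketch of the two conventions, based on my reading of this thread rather than on the actual ALL or Gym code. `CountingEnv`, `rollout_terminal_obs`, and `rollout_autoreset` are hypothetical names made up for illustration, and the toy environment stands in for a single sub-environment of a vector env:

```python
class CountingEnv:
    """Hypothetical toy env: observations count 0, 1, 2, ... and the
    episode ends once the counter reaches `length`."""

    def __init__(self, length=2):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self):
        self.t += 1
        done = self.t >= self.length
        reward = 1.0 if done else 0.0
        return self.t, reward, done


def rollout_terminal_obs(env, n_steps):
    """Assumed "expected" convention: the terminal observation is returned
    together with done=True, and the reset happens on the following step."""
    out = [(env.reset(), 0.0, False)]
    done = False
    for _ in range(n_steps):
        if done:
            # The step after a terminal state yields the fresh first observation.
            out.append((env.reset(), 0.0, False))
            done = False
        else:
            obs, reward, done = env.step()
            out.append((obs, reward, done))
    return out


def rollout_autoreset(env, n_steps):
    """Autoreset convention as described in this thread: on done, the env is
    reset immediately, so the final reward and done=True arrive paired with
    the first observation of the next episode."""
    out = [(env.reset(), 0.0, False)]
    for _ in range(n_steps):
        obs, reward, done = env.step()
        if done:
            obs = env.reset()  # the terminal observation is discarded here
        out.append((obs, reward, done))
    return out


if __name__ == "__main__":
    print(rollout_terminal_obs(CountingEnv(), 4))
    print(rollout_autoreset(CountingEnv(), 4))
```

In the first rollout the terminal observation `2` shows up with `done=True`; in the second it never appears, and the final reward is attached to observation `0` of the following episode.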
Since the environments are interweaved and reset at different times (they autoreset when done), this does not work with ALL's notion of a
There are a few options:
Thoughts?
Oh I see, instead of giving a terminal "observation", it provides the final reward etc. together with the first timestep of the next episode. In their defense, I see why they did it that way (to avoid masking). But that is sort of frustrating. Are 1. and 2. incompatible? For most of the current environments, I'm guessing that it won't really make any difference, but it could be nice to have the option. I would probably go with the minimum viable fix for now.
No, the two options are perfectly compatible. I realized that after sending the last message.
Closed by #240