Feature/cpu replay buffer #187
Conversation
Looks really good overall, this should be very helpful on many machines! The design seems correct. I left a few comments inline, and also have a few high-level comments:
- Could you extend the unit tests to cover some of this functionality? In particular, `state_test.py` and `replay_buffer_test.py`. `FrameStack` also really should have unit tests, but I won't hold you to that, as they were already missing.
- It might be useful to add docstrings in certain places in order to explain the usage.
- As you commented, for this to be useful, the presets need to be modified. My thought is that a hyperparameter, something like `cpu_replay_buffer=False`, could be added; a rough sketch of how that might look follows this list. One of the recent PRs added the ability to set hyperparameters from the command line.
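For concreteness, a minimal sketch of how such a flag might select the storage device (the helper name and import path are illustrative, not the library's actual preset API; `ExperienceReplayBuffer`'s signature is taken from the diff below):

```python
from all.memory import ExperienceReplayBuffer  # assumed import path

def make_replay_buffer(size, device='cuda', cpu_replay_buffer=False):
    # Hypothetical helper: pick the storage device from the flag while
    # input and output stay on the training device.
    store_device = 'cpu' if cpu_replay_buffer else device
    return ExperienceReplayBuffer(size, device=device, store_device=store_device)
```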
all/bodies/vision.py (Outdated)
```python
self.from_device = value.device
if self.to_device is None:
    self.to_device = device
if self.from_device != value.device or self.to_device != device:
```
I find the usage a little confusing... `device` only has one valid value if `to_device` is set, but can be set to anything if `to_device` is `None`. Is there any reason not to choose one usage or the other?
Ok, I greatly simplified this logic.
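For illustration, one way such device bookkeeping can be pinned down on first use (a hypothetical sketch with an assumed method name, not necessarily the change made in this PR):

```python
def convert(self, value, device):
    # Hypothetical simplification: fix both devices the first time a value
    # is seen, instead of re-deriving and re-validating them on every call.
    if self.to_device is None:
        self.from_device = value.device
        self.to_device = device
    return value.to(device)
```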
all/core/state.py (Outdated)
```diff
@@ -291,7 +304,7 @@ def as_output(self, tensor):
         return tensor.view((*self.shape, *tensor.shape[1:]))

     def apply_mask(self, tensor):
-        return tensor * self.mask.unsqueeze(-1)
+        return tensor * self.mask.unsqueeze(-1)  # pylint: disable=no-member
```
Don't need the pylint disable because we changed the linter
Removed
all/memory/replay_buffer.py (Outdated)
```diff
@@ -23,14 +23,17 @@ def update_priorities(self, indexes, td_errors):
 # Adapted from:
 # /~https://github.com/Shmuma/ptan/blob/master/ptan/experience.py
 class ExperienceReplayBuffer(ReplayBuffer):
-    def __init__(self, size, device=torch.device('cpu')):
+    def __init__(self, size, device='cpu', store_device='cpu'):
```
Perhaps `store_device` could default to `None`, in which case `store_device` would just be set to `device`?
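A minimal sketch of that default (the signature follows the diff above; the body is illustrative):

```python
def __init__(self, size, device='cpu', store_device=None):
    # Fall back to the I/O device when no separate storage device is given.
    if store_device is None:
        store_device = device
    self.device = device
    self.store_device = store_device
```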
Changed
all/bodies/vision.py (Outdated)
```python
if isinstance(state, StateArray):
    return state.update('observation', torch.cat(self._frames, dim=1))
return state.update('observation', torch.cat(self._frames, dim=0))


class ToCache:
```
Perhaps it would be good to add a docstring explaining the purpose of this? It probably would not be immediately obvious to somebody reading the code for the first time.
Added a docstring
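For readers skimming the thread, a docstring along these lines would capture the intent implied by the device-conversion logic above (illustrative wording only, not necessarily what was added):

```python
class ToCache:
    """Move tensors between a storage device and a compute device.

    Illustrative wording: the class remembers a (from_device, to_device)
    pair so that values kept on, e.g., the CPU can be converted back to
    the GPU when they are read out again.
    """
```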
Thanks
This gives the replay buffers a `store_device` parameter, which allows them to store their contents on a different device than the one they expect for input and output. The main use case is storing the replay buffer in CPU memory while the network and training live in GPU memory. I measured the performance hit of doing this to be about 10-15% on DQN.

Right now there is no parameter for `store_device` on the presets, though perhaps there should be.
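Under that description, usage would look something like the following (assumed import path; the constructor signature matches the diff above):

```python
from all.memory import ExperienceReplayBuffer  # assumed import path

# Keep up to one million transitions in CPU RAM while training on the GPU.
buffer = ExperienceReplayBuffer(
    1000000,
    device='cuda',       # tensors are accepted and returned on the GPU
    store_device='cpu',  # but live in CPU memory in between
)
```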