
Introduce smaller and more efficient NN architecture #56

Merged: 4 commits merged into master from smallnn on Feb 2, 2022

Conversation

@vwxyzjn (Collaborator) commented Jan 31, 2022

This PR introduces a smaller and more efficient NN architecture. Namely, it replaces the existing

    def __init__(self, envs, mapsize=16 * 16):
        super(Agent, self).__init__()
        self.mapsize = mapsize
        h, w, c = envs.observation_space.shape
        # Four conv + pool stages; each MaxPool2d (stride 2) halves the spatial
        # size, so a 16x16 map shrinks 16 -> 8 -> 4 -> 2 -> 1 with 256 channels.
        self.encoder = nn.Sequential(
            Transpose((0, 3, 1, 2)),  # HWC -> CHW for Conv2d
            layer_init(nn.Conv2d(c, 32, kernel_size=3, padding=1)),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.ReLU(),
            layer_init(nn.Conv2d(32, 64, kernel_size=3, padding=1)),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.ReLU(),
            layer_init(nn.Conv2d(64, 128, kernel_size=3, padding=1)),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.ReLU(),
            layer_init(nn.Conv2d(128, 256, kernel_size=3, padding=1)),
            nn.MaxPool2d(3, stride=2, padding=1),
        )

        # Four transposed-conv stages upsample 1x1 back to 16x16, yielding
        # 78 action logits per map cell.
        self.actor = nn.Sequential(
            layer_init(nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1)),
            nn.ReLU(),
            layer_init(nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1)),
            nn.ReLU(),
            layer_init(nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1)),
            nn.ReLU(),
            layer_init(nn.ConvTranspose2d(32, 78, 3, stride=2, padding=1, output_padding=1)),
            Transpose((0, 2, 3, 1)),  # CHW -> HWC
        )
        # Flattened encoder output: 256 channels * 1 * 1 = 256 features.
        self.critic = nn.Sequential(
            nn.Flatten(),
            layer_init(nn.Linear(256, 128)),
            nn.ReLU(),
            layer_init(nn.Linear(128, 1), std=1),
        )
        self.register_buffer("mask_value", torch.tensor(-1e8))

with the following

    def __init__(self, envs, mapsize=16 * 16):
        super(Agent, self).__init__()
        self.mapsize = mapsize
        h, w, c = envs.observation_space.shape
        # Only two conv + pool stages: 16 -> 8 -> 4, leaving a 64x4x4 feature map.
        self.encoder = nn.Sequential(
            Transpose((0, 3, 1, 2)),  # HWC -> CHW for Conv2d
            layer_init(nn.Conv2d(c, 32, kernel_size=3, padding=1)),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.ReLU(),
            layer_init(nn.Conv2d(32, 64, kernel_size=3, padding=1)),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.ReLU(),
        )

        # Two transposed-conv stages upsample 4x4 back to 16x16.
        self.actor = nn.Sequential(
            layer_init(nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1)),
            nn.ReLU(),
            layer_init(nn.ConvTranspose2d(32, 78, 3, stride=2, padding=1, output_padding=1)),
            Transpose((0, 2, 3, 1)),  # CHW -> HWC
        )
        # Flattened encoder output: 64 channels * 4 * 4 = 1024 features.
        self.critic = nn.Sequential(
            nn.Flatten(),
            layer_init(nn.Linear(64 * 4 * 4, 128)),
            nn.ReLU(),
            layer_init(nn.Linear(128, 1), std=1),
        )
        self.register_buffer("mask_value", torch.tensor(-1e8))
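
Both snippets rely on two small helpers, Transpose and layer_init, defined elsewhere in the training script. A minimal sketch of what they look like, following the usual CleanRL conventions:

    import numpy as np
    import torch
    import torch.nn as nn

    class Transpose(nn.Module):
        """Permutes tensor dimensions, e.g. NHWC <-> NCHW."""
        def __init__(self, permutation):
            super().__init__()
            self.permutation = permutation

        def forward(self, x):
            return x.permute(self.permutation)

    def layer_init(layer, std=np.sqrt(2), bias_const=0.0):
        # Orthogonal weight init with the given gain; constant bias init.
        torch.nn.init.orthogonal_(layer.weight, std)
        torch.nn.init.constant_(layer.bias, bias_const)
        return layer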

A preliminary experiment shows the smaller architecture can also produce a SOTA model while taking only about 16 hours to train (50M steps) and about 36 hours in total for all evaluations to finish.


@cpuheater

In contrast, the previous SOTA model (using the larger architecture) attains a slightly higher TrueSkill but takes about 109 hours.


Given this evidence, this PR makes the smaller model the default in the code base to save compute.
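
For a rough sense of the size difference, a quick sketch that counts encoder parameters. The channel count c = 27 is an assumption (the number of observation planes in gym-microrts), and count_params is a hypothetical helper:

    import torch.nn as nn

    def count_params(m: nn.Module) -> int:
        return sum(p.numel() for p in m.parameters())

    c = 27  # assumed number of observation planes

    large = nn.Sequential(
        nn.Conv2d(c, 32, 3, padding=1), nn.MaxPool2d(3, 2, 1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, padding=1), nn.MaxPool2d(3, 2, 1), nn.ReLU(),
        nn.Conv2d(64, 128, 3, padding=1), nn.MaxPool2d(3, 2, 1), nn.ReLU(),
        nn.Conv2d(128, 256, 3, padding=1), nn.MaxPool2d(3, 2, 1),
    )
    small = nn.Sequential(
        nn.Conv2d(c, 32, 3, padding=1), nn.MaxPool2d(3, 2, 1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, padding=1), nn.MaxPool2d(3, 2, 1), nn.ReLU(),
    )
    # With c=27: ~395k vs ~26k parameters; the small encoder drops the
    # two largest conv layers.
    print(count_params(large), count_params(small))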

@vwxyzjn vwxyzjn requested a review from kachayev January 31, 2022 20:02
@vwxyzjn vwxyzjn mentioned this pull request Jan 31, 2022
@kachayev (Contributor) left a comment

This makes a lot of sense to me; it is exactly the architecture I'm using for almost all of my experiments.

It would be interesting to see whether we can speed up (or stabilize) learning either by adding a reconstruction loss for the encoder or by forcing encoder outputs for subsequent steps to stay close to the current one (e.g., with InfoNCE). But that is a completely separate topic.
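
For illustration only (not part of this PR), a minimal sketch of such an InfoNCE-style objective, treating consecutive encoder outputs as positive pairs; info_nce is a hypothetical helper:

    import torch
    import torch.nn.functional as F

    def info_nce(z_t, z_next, temperature=0.1):
        # z_t, z_next: (B, D) encoder outputs for steps t and t+1.
        # Row i of z_t should match row i of z_next against all other rows.
        z_t = F.normalize(z_t, dim=1)
        z_next = F.normalize(z_next, dim=1)
        logits = z_t @ z_next.T / temperature  # (B, B) cosine similarities
        labels = torch.arange(z_t.shape[0], device=z_t.device)
        return F.cross_entropy(logits, labels)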

@vwxyzjn (Collaborator, Author) commented Feb 2, 2022

Awesome. Merging the PR now.

@vwxyzjn vwxyzjn merged commit 5e7be25 into master Feb 2, 2022
@vwxyzjn vwxyzjn deleted the smallnn branch February 2, 2022 04:07