
model vs model2 #12

Open

mirandrom opened this issue Dec 4, 2019 · 2 comments

Comments

@mirandrom

Hello! We are trying to reproduce your paper for the NeurIPS Reproducibility Challenge. First off, thanks for taking the time to put together a clean codebase and provide a clear way to reproduce your experiments!

However, there are a few things we are uncertain about and want to clarify.
It seems that the methodology described in the paper uses features found only in model2.py (e.g. the GRU over the intermediate encoder states). However, all of your main.py scripts use model.py, and some of the code in model2.py seems incomplete, with various lines commented out, leading to undeclared variables.

For example, from method/mymodel-amazon/model2.py, in class EncoderDecoder (line 287):

    def forward(self, src, tgt, src_mask, tgt_mask):
        """
        Take in and process masked src and target sequences.
        """
        memory = self.encode(src, src_mask)  # (batch_size, max_src_seq, d_model)
        # attented_mem=self.attention(memory,memory,memory,src_mask)
        # memory=attented_mem
        score = self.attention(memory, memory, src_mask)
        attent_memory = score.bmm(memory)
        # memory=self.linear(torch.cat([memory,attent_memory],dim=-1))

        # `attented_mem` is only assigned in the commented-out lines above;
        # the active code assigns `attent_memory`, so this raises a NameError.
        memory, _ = self.gru(attented_mem)
        '''
        score=torch.sigmoid(self.linear(memory))
        memory=memory*score
        '''
        latent = torch.sum(memory, dim=1)  # (batch_size, d_model)
        logit = self.decode(latent.unsqueeze(1), tgt, tgt_mask)  # (batch_size, max_tgt_seq, d_model)
        # logit,_=self.gru_decoder(logit)
        prob = self.generator(logit)  # (batch_size, max_seq, vocab_size)
        return latent, prob
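
For what it's worth, our best guess at the minimal change that would make this forward pass run is to feed `attent_memory` (the variable that is actually assigned) into the GRU. Below is a sketch of that guess with the dead code removed — we are not sure it matches your intended architecture:

    def forward(self, src, tgt, src_mask, tgt_mask):
        """
        Sketch of a runnable forward pass. Assumes `attent_memory` was the
        intended GRU input; this is our guess, not a confirmed fix.
        """
        memory = self.encode(src, src_mask)  # (batch_size, max_src_seq, d_model)
        # Attention weights over the encoder states, then their weighted sum.
        score = self.attention(memory, memory, src_mask)
        attent_memory = score.bmm(memory)  # (batch_size, max_src_seq, d_model)
        # GRU over the attended encoder states, as described in the paper.
        memory, _ = self.gru(attent_memory)
        latent = torch.sum(memory, dim=1)  # (batch_size, d_model)
        logit = self.decode(latent.unsqueeze(1), tgt, tgt_mask)  # (batch_size, max_tgt_seq, d_model)
        prob = self.generator(logit)  # (batch_size, max_tgt_seq, vocab_size)
        return latent, prob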

Can you please advise us on the best way to reproduce your experiments? Should we run the code as-is with model.py, or should we update it to use model2.py (and if so, what modifications need to be made)?

Thank you!

@Nrgeup
Owner

Nrgeup commented Dec 12, 2019

Sorry for the late reply. You can use model.py to reproduce the results. We will update model2.py; its results are not much improved compared to model.py's.

@Diego999

@Nrgeup what about the results in https://arxiv.org/pdf/1905.12926.pdf? In the paper you clearly mention the GRU. Are the results reproducible without it?

Thank you for your answer.
