Fix LanguageModel API to easily handle bidirectional contextualizers. #2373
Comments
fyi @nelson-liu
Quick note: you'd definitely need two
@matt-gardner, sorry I was unclear. What I meant was two types of encoders vs. one type of encoder. We very definitely need to instantiate two different instances. :)
Updated description: "It would accept an underlying `Seq2SeqEncoder` that would be duplicated to prevent parameter sharing or perhaps two entirely separate encoders."
Given the discussion in #2414, it seems like a solution would be to have the user instantiate two forward-direction contextualizers. The LM would then use one for the forward direction and the other for the backward direction. With this implementation, we would raise an error when a bidirectional contextualizer is used.
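As a rough sketch of that idea (the class name and constructor arguments below are hypothetical, not the actual AllenNLP API), the model could take the two unidirectional encoders explicitly and refuse a bidirectional one up front:

```python
from typing import Optional

import torch
from allennlp.common.checks import ConfigurationError
from allennlp.modules import Seq2SeqEncoder


class TwoDirectionLanguageModel(torch.nn.Module):
    """Hypothetical sketch: accept two unidirectional contextualizers and run the
    second one over a time-reversed copy of the sequence."""

    def __init__(self,
                 forward_contextualizer: Seq2SeqEncoder,
                 backward_contextualizer: Optional[Seq2SeqEncoder] = None) -> None:
        super().__init__()
        for encoder in (forward_contextualizer, backward_contextualizer):
            if encoder is not None and encoder.is_bidirectional():
                # Fail loudly instead of training a silently broken model.
                raise ConfigurationError(
                    "Pass unidirectional contextualizers; bidirectionality is handled here.")
        self._forward_contextualizer = forward_contextualizer
        self._backward_contextualizer = backward_contextualizer

    def forward(self, embeddings: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        forward_states = self._forward_contextualizer(embeddings, mask)
        if self._backward_contextualizer is None:
            return forward_states
        # Run the backward encoder over the reversed sequence, then flip its output
        # back so both halves line up per position. (Padding is ignored for brevity.)
        backward_states = self._backward_contextualizer(embeddings.flip(1), mask.flip(1))
        return torch.cat([forward_states, backward_states.flip(1)], dim=-1)
```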
For the record, @nelson-liu and I chatted offline about this in some detail. His solution is definitely necessary, given that contiguous language models need to provide distinct targets for the forward and backward contextualizers. Sounds like a PR is incoming... :)
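To make the point about targets concrete: for a language model trained on contiguous text, position t's forward target is token t+1 while its backward target is token t-1, so the two contextualizers cannot share a single shifted-target tensor. A minimal illustration:

```python
import torch

# Forward and backward directions need *different* shifted copies of the tokens.
token_ids = torch.tensor([[31, 7, 99, 4, 2]])       # (batch=1, seq_len=5)

forward_targets = torch.zeros_like(token_ids)
forward_targets[:, :-1] = token_ids[:, 1:]          # position t is trained on token t+1

backward_targets = torch.zeros_like(token_ids)
backward_targets[:, 1:] = token_ids[:, :-1]         # position t is trained on token t-1
```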
@nelson-liu this is just a friendly ping to make sure you haven't forgotten about this issue 😜
I've got to admit that I've lost almost all context on this issue, and it feels like we've decided not to proceed with it. Closing this for now; I'm unsure how big of an issue this really is these days.
**Is your feature request related to a problem? Please describe.**
Currently `LanguageModel` takes a single contextualizer. All our normal `Seq2SeqEncoder`s work here in the unidirectional case. In the bidirectional case, however, our normal `Seq2SeqEncoder`s are broken when they contain more than one layer. See /~https://github.com/allenai/allennlp/blob/master/allennlp/models/language_model.py#L61. Special handling is needed, like that contained in the `BidirectionalLanguageModelTransformer`. See /~https://github.com/allenai/allennlp/blob/master/allennlp/modules/seq2seq_encoders/bidirectional_language_model_transformer.py#L262. This is an issue because a model can be silently broken. Further, a user can't simply specify `lstm`, `gru`, etc. in their config and get a working LM in general.
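The breakage is easy to see with a stock PyTorch module (this is a standalone illustration, not AllenNLP code): in a stacked bidirectional LSTM, layer two's forward direction consumes layer one's backward states, so even the "forward" half of the output depends on future tokens.

```python
import torch

torch.manual_seed(0)

# Between layers, PyTorch feeds the concatenated (forward, backward) states of
# layer 1 into layer 2, so layer 2's "forward" half has already seen the future.
lstm = torch.nn.LSTM(input_size=8, hidden_size=8, num_layers=2,
                     bidirectional=True, batch_first=True)

inputs = torch.randn(1, 5, 8)
perturbed = inputs.clone()
perturbed[0, 4] += 1.0                       # change only the *last* timestep

out_a, _ = lstm(inputs)
out_b, _ = lstm(perturbed)

# If the forward states were causal, the forward half at timestep 0 would be
# unaffected by a change at timestep 4. With num_layers > 1 it is not:
print(torch.allclose(out_a[0, 0, :8], out_b[0, 0, :8]))   # False -> future leakage
```

With `num_layers=1` the forward half stays causal, which is why single-layer bidirectional encoders can still be handled by splitting the output into its two halves.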
**Describe the solution you'd like**
Let's provide a wrapper `Seq2SeqEncoder`, maybe a `BidirectionalLanguageModelObjectiveEncoder`, that handles the masking and direction changes transparently. It would accept an underlying `Seq2SeqEncoder` that would be duplicated to prevent parameter sharing, or perhaps two entirely separate encoders.
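One possible shape for that wrapper (the class name comes from this issue, but everything else here is an assumption rather than a committed design):

```python
import copy

import torch
from allennlp.common.checks import ConfigurationError
from allennlp.modules import Seq2SeqEncoder


class BidirectionalLanguageModelObjectiveEncoder(Seq2SeqEncoder):
    """Sketch: deep-copy a unidirectional encoder so the two directions share an
    architecture but not parameters, run the copy over the reversed sequence, and
    concatenate the results."""

    def __init__(self, encoder: Seq2SeqEncoder) -> None:
        super().__init__()
        if encoder.is_bidirectional():
            raise ConfigurationError("Wrap a unidirectional encoder; the wrapper "
                                     "provides the bidirectionality itself.")
        self._forward_encoder = encoder
        # Separate (though identically initialized) parameters, no sharing.
        self._backward_encoder = copy.deepcopy(encoder)

    def get_input_dim(self) -> int:
        return self._forward_encoder.get_input_dim()

    def get_output_dim(self) -> int:
        return 2 * self._forward_encoder.get_output_dim()

    def is_bidirectional(self) -> bool:
        return True

    def forward(self, inputs: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        forward_out = self._forward_encoder(inputs, mask)
        # Reverse time for the backward pass, then flip its output back so the two
        # halves are aligned per position. (Padding handling is elided for brevity.)
        backward_out = self._backward_encoder(inputs.flip(1), mask.flip(1)).flip(1)
        return torch.cat([forward_out, backward_out], dim=-1)
```

The "two entirely separate encoders" variant would simply replace the `deepcopy` with a second constructor argument.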
**Describe alternatives you've considered**
In the short term we could at least check that users aren't using our default encoders with `bidirectional=True` and `num_layers > 1`. (Annoyingly, we'd need a class blacklist rather than checking that condition directly, since the transformer encoder does the correct thing despite satisfying the condition.)
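A sketch of that short-term guard. Here the blacklist is expressed by inspecting the wrapped PyTorch recurrent modules, which is an assumption about how the default encoders are built, and `check_contextualizer` is a hypothetical helper; `BidirectionalLanguageModelTransformer` passes because it contains no stacked bidirectional recurrent module.

```python
import torch
from allennlp.common.checks import ConfigurationError
from allennlp.modules import Seq2SeqEncoder

# Recurrent module types that are unsafe for the LM objective when stacked
# and bidirectional.
_UNSAFE_RECURRENT_TYPES = (torch.nn.LSTM, torch.nn.GRU, torch.nn.RNN)


def check_contextualizer(contextualizer: Seq2SeqEncoder) -> None:
    """Raise if the encoder wraps a stacked bidirectional recurrent module."""
    for module in contextualizer.modules():
        if (isinstance(module, _UNSAFE_RECURRENT_TYPES)
                and module.bidirectional and module.num_layers > 1):
            raise ConfigurationError(
                "Stacked bidirectional recurrent contextualizers leak future tokens "
                "under the language-modelling objective; use a unidirectional encoder "
                "or a bidirectional encoder that handles the masking itself.")
```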
**Additional context**
LSTM encoder from Calypso: /~https://github.com/allenai/calypso/blob/master/calypso/lstm_encoder.py#L22
Note that it's hardwired to be bidirectional and to have only one layer.