This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Pad coreference model input to 5 #2933

Merged
merged 2 commits into allenai:master on Jun 10, 2019

Conversation

@darabos (Contributor) commented Jun 8, 2019

Fixes #2930.

I've tested it manually with this code that failed before:

In [1]: import allennlp.pretrained as pt
In [2]: c = pt.neural_coreference_resolution_lee_2017()
In [3]: c.predict('I am Joe.')
Out[3]:
{'top_spans': [[0, 0]],
 'antecedent_indices': [0],
 'predicted_antecedents': [-1],
 'document': ['I', 'am', 'Joe', '.'],
 'clusters': []}

The model has a 5-wide Conv1D layer, so inputs shorter than five tokens need padding.
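
For reference, a rough sketch of the padding idea (not the exact change in this PR; the helper name and shapes are made up): before a 5-wide Conv1D sees the token sequence, right-pad the sequence dimension so it is at least 5 timesteps long.

import torch
import torch.nn.functional as F

def pad_to_min_length(embeddings: torch.Tensor, min_length: int = 5) -> torch.Tensor:
    # embeddings: (batch, seq_len, embedding_dim). A Conv1D with kernel size 5
    # needs at least 5 timesteps, so right-pad the sequence dimension with
    # zeros whenever the input is shorter than that.
    seq_len = embeddings.size(1)
    if seq_len >= min_length:
        return embeddings
    # F.pad pads from the last dimension backwards: (0, 0) leaves the embedding
    # dimension alone, (0, min_length - seq_len) pads the sequence dimension
    # on the right.
    return F.pad(embeddings, (0, 0, 0, min_length - seq_len))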
@darabos (Contributor, Author) commented Jun 8, 2019

Even with this change, I still get errors for shorter inputs.

For the input Hello.:

~/allennlp/allennlp/models/coreference_resolution/coref.py in forward(self, text, spans, span_labels, metadata)
    186          top_span_indices, top_span_mention_scores) = self._mention_pruner(span_embeddings,
    187                                                                            span_mask,
--> 188                                                                            num_spans_to_keep)
    189         top_span_mask = top_span_mask.unsqueeze(-1)
    190         # Shape: (batch_size * num_spans_to_keep)

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~/allennlp/allennlp/modules/pruner.py in forward(self, embeddings, mask, num_items_to_keep)
    100         # indices will be sorted to the end.
    101         # Shape: (batch_size, 1)
--> 102         fill_value, _ = top_indices.max(dim=1)
    103         fill_value = fill_value.unsqueeze(-1)
    104         # Shape: (batch_size, max_num_items_to_keep)

RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity
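
The error itself reproduces in isolation: presumably the pruner is asked to keep zero spans for such a short document, so top_indices has no elements along dim 1 and max has no identity value to return. The shape below is hypothetical, just to trigger the same PyTorch behavior:

import torch

top_indices = torch.empty(1, 0, dtype=torch.long)  # hypothetical: batch of 1, zero items kept
top_indices.max(dim=1)  # RuntimeError: cannot perform reduction function max ...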

For the input x:

~/allennlp/allennlp/modules/pruner.py in forward(self, embeddings, mask, num_items_to_keep)
     84         # Make sure that we don't select any masked items by setting their scores to be very
     85         # negative.  These are logits, typically, so -1e20 should be plenty negative.
---> 86         scores = util.replace_masked_values(scores, mask, -1e20)
     87 
     88         # Shape: (batch_size, max_num_items_to_keep, 1)

~/allennlp/allennlp/nn/util.py in replace_masked_values(tensor, mask, replace_with)
    664     """
    665     if tensor.dim() != mask.dim():
--> 666         raise ConfigurationError("tensor.dim() (%d) != mask.dim() (%d)" % (tensor.dim(), mask.dim()))
    667     return tensor.masked_fill((1 - mask).byte(), replace_with)
    668 

ConfigurationError: 'tensor.dim() (3) != mask.dim() (2)'

I'll try to figure these out...
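
(For what it's worth, the dim check in replace_masked_values can be tripped in isolation with the same dimensionalities as in the traceback; the shapes below are made up:)

import torch
from allennlp.nn import util

scores = torch.zeros(1, 3, 1)  # (batch, num_items, 1), as the pruner produces
mask = torch.ones(1, 3)        # (batch, num_items) -- one dimension short of scores
util.replace_masked_values(scores, mask, -1e20)
# ConfigurationError: 'tensor.dim() (3) != mask.dim() (2)'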

@darabos (Contributor, Author) commented Jun 8, 2019

And with an empty string as the input:

~/allennlp/allennlp/predictors/coref.py in _json_to_instance(self, json_dict)
    159         sentences = [[token.text for token in sentence] for sentence in spacy_document.sents]
    160         print(sentences)
--> 161         instance = self._dataset_reader.text_to_instance(sentences)
    162         return instance

~/allennlp/allennlp/data/dataset_readers/coreference_resolution/conll.py in text_to_instance(self, sentences, gold_clusters)
    165             sentence_offset += len(sentence)
    166 
--> 167         span_field = ListField(spans)
    168         metadata_field = MetadataField(metadata)
    169 

~/allennlp/allennlp/data/fields/list_field.py in __init__(self, field_list)
     28         field_class_set = set([field.__class__ for field in field_list])
     29         assert len(field_class_set) == 1, "ListFields must contain a single field type, found " +\
---> 30                                           str(field_class_set)
     31         # Not sure why mypy has a hard time with this type...
     32         self.field_list: List[Field] = field_list

AssertionError: ListFields must contain a single field type, found set()
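
An isolated repro of this last one: presumably an empty document yields no sentences and therefore no spans, so the reader ends up constructing a ListField from an empty list and the class-set assertion has nothing to check:

from allennlp.data.fields import ListField

ListField([])  # AssertionError: ListFields must contain a single field type, found set()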

@matt-gardner (Contributor) commented:
Thanks @darabos! I'm going to merge this when the build passes, as it fixes one known issue, and we can open new issues for the other cases you found. I suspect they're related to #2890.

@matt-gardner matt-gardner merged commit 8e180cb into allenai:master Jun 10, 2019
reiyw pushed a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019
* Pad coreference model input to 5.

It has a 5-wide Conv1D layer.

* mypy
TalSchuster pushed a commit to TalSchuster/allennlp-MultiLang that referenced this pull request Feb 20, 2020
* Pad coreference model input to 5.

It has a 5-wide Conv1D layer.

* mypy
Development

Successfully merging this pull request may close these issues.

Coreference resolution fails on short texts