Commit

Merge branch 'master' into improved-dehypenisation
kermitt2 committed Sep 28, 2019
2 parents bf7e1de + 24e6f0e commit 5f4af22
Showing 6 changed files with 415 additions and 44 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -69,3 +69,4 @@ grobid-home/models/dictionaries*
grobid-home/models/software*
grobid-home/models/superconductors*
grobid-home/models/values
+grobid-home/models/dataseer
2 changes: 1 addition & 1 deletion doc/Consolidation.md
@@ -4,7 +4,7 @@ In GROBID, we call __consolidation__ the usage of an external bibliographical se

Consolidation has two main interests:

-* The consolidation service improves very significantly the retrieval of header information (+.12 to .13 in f-score, e.g. from 74.59 f-score in average for all fields with Ratcliff/Obershelp similarity at 0.95, to 86.62 f-score, using biblio-glutton and GROBID version 0.5.5 for the PMC 1942 dataset, see the [benchmarking documentation](https://grobid.readthedocs.io/en/latest/End-to-end-evaluation/) and [reports](/~https://github.com/kermitt2/grobid/tree/master/grobid-trainer/doc)).
+* The consolidation service improves very significantly the retrieval of header information (+.12 to .13 in f-score, e.g. from 74.59 f-score in average for all fields with Ratcliff/Obershelp similarity at 0.95, to 88.89 f-score, using biblio-glutton and GROBID version 0.5.6-SNAPSHOT for the PMC 1942 dataset, see the [benchmarking documentation](https://grobid.readthedocs.io/en/latest/End-to-end-evaluation/) and [reports](/~https://github.com/kermitt2/grobid/tree/master/grobid-trainer/doc)).

* The consolidation service matches the extracted bibliographical references with known publications, and complement the parsed bibliographical references with various metadata, in particular DOI, making possible the creation of a citation graph and to link the extracted references to external services.
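
Reviewer note on the metric named in this hunk: the f-scores are reported "with Ratcliff/Obershelp similarity at 0.95", meaning an extracted field counts as correct when its similarity to the expected value reaches 0.95. The commit does not show GROBID's evaluation code, so the following is only a minimal sketch of the general Ratcliff/Obershelp measure (twice the number of matching characters divided by the total length of both strings). The class name `RatcliffObershelp` and the methods `matches`, `similarity` and `isMatch` are hypothetical and chosen for illustration.

```java
// Illustrative sketch only; not taken from the GROBID code base.
class RatcliffObershelp {

    // Count matching characters: find the longest common substring,
    // then recurse on the unmatched parts to its left and to its right.
    static int matches(String a, String b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0;
        }
        int best = 0, endA = 0, endB = 0;
        // len[i][j] = length of the common substring ending at a[i-1] and b[j-1]
        int[][] len = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    len[i][j] = len[i - 1][j - 1] + 1;
                    if (len[i][j] > best) {
                        best = len[i][j];
                        endA = i;
                        endB = j;
                    }
                }
            }
        }
        if (best == 0) {
            return 0;
        }
        return best
            + matches(a.substring(0, endA - best), b.substring(0, endB - best))
            + matches(a.substring(endA), b.substring(endB));
    }

    // Similarity in [0, 1]: 2 * matching characters / total length.
    static double similarity(String a, String b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        return 2.0 * matches(a, b) / (a.length() + b.length());
    }

    // Assumed matching rule: accept when the similarity reaches the threshold.
    static boolean isMatch(String extracted, String expected, double threshold) {
        return similarity(extracted, expected) >= threshold;
    }
}
```

Under this sketch, `RatcliffObershelp.isMatch(extracted, expected, 0.95)` would accept near-identical strings and reject anything that differs by more than a few characters in a short field.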

@@ -988,7 +988,7 @@ private StringBuilder toTEINote(String noteType,
StringBuilder tei,
Document doc,
GrobidAnalysisConfig config) throws Exception {
-List<String> allNotes = new ArrayList<String>();
+List<String> allNotes = new ArrayList<>();
for (DocumentPiece docPiece : documentNoteParts) {

List<LayoutToken> noteTokens = doc.getDocumentPieceTokenization(docPiece);