-
Notifications
You must be signed in to change notification settings - Fork 17
Positional Index vs Inverted Index
Le Thu Nguyen edited this page Apr 19, 2018
·
6 revisions
•Each term in the vocabulary, store postings of the form docID (position1, position2,..) •Store each term in the form of
<number of docs containing term;
Doc1: position1, position2…;
Doc2: position1, position2…;
etc.>
To, 993:
(1,6:(7,18,33,72,..))
The word to has a document frequency 993 and occurs 6 times in document 1 at positions 7,18,33, 72 and etc.
- Phrase queries, use a merge algorithm recursively at the document level.
Fast indexing
Less efficient query's
- A data structure that maps terms back to the parts of a document in which they occur
- For each term T, store a list of all documents containing term T. It is a list of all unique word that appears in any document.
Document 1: The quick brown fox jumped
Document 2: Quick brown foxes
Term Doc1 Doc2
The x
quick x
brown x x
fox x
jumped x
Quick x
foxes x
Fast query
Slower indexing