A script that automatically infers the topics discussed in a collection of documents.
Updated May 29, 2017 - Python
Python scripts that calculate three basic similarity measures suitable for ad hoc information retrieval systems: Levenshtein edit distance, Jaccard similarity, and a term-document matrix.
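As a rough illustration of the three measures named above, here is a minimal standard-library sketch. It is not taken from the listed repository: the function names (`levenshtein`, `jaccard`, `term_document_matrix`), the lowercase whitespace tokenization, and the choice to compute Jaccard over word sets are all assumptions made for this example.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the word sets of two texts (assumed tokenization)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # convention chosen here: two empty texts are identical
    return len(sa & sb) / len(sa | sb)


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[i] in docs[j]."""
    counts = [Counter(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))        # 3
    print(jaccard("the cat sat", "the cat ran"))   # 0.5
    print(term_document_matrix(["a b a", "b c"]))
```

Rows of the matrix are terms and columns are documents, matching the "term-document" orientation; transposing it gives the document-term matrix that the topic name refers to.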