MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
-
Updated
Jun 4, 2024 - Python
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015
Near-duplicate image detection using Locality Sensitive Hashing
Implementation of vector quantization algorithms, codes for Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search.
Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash。基于 Hash值的图片相似度、文本相似度
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Serverless, lightweight, and fast vector database on top of DynamoDB
Locality Sensitive Hashing for semantic similarity (Python 3.x)
Fast fuzzy text search
Hashing-based network discovery from time series
cross-architecture binary comparison database
An implementation of Locality sensitive hashing
Select a set of pairs to check in link prediction from the vast, sparse space of possible pairs
Recommending movies using Collaborative Filtering and Locality Sensitive Hashing in PySpark
Quora Questions Pair detection with Semantic Similarity
Hashing-based network alignment based on structural features
First story detection using shingling, LSH and graphical methods
Locally Sensitive Hashing based embedding for High Dimensional Multivariate Time Series
Add a description, image, and links to the lsh topic page so that developers can more easily learn about it.
To associate your repository with the lsh topic, visit your repo's landing page and select "manage topics."