GitHub - suamin/CAL: Continuous Active Learning (CAL) for Predictive Coding: A Demonstration with Enron Spam-Ham Classification.

Active Learning - A Case Study with Enron Spam-Ham

In this study we compare and contrast active learning with supervised learning on a toy dataset of Enron spam classification, within the framework of predictive coding. The goal of predictive coding is to find relevant documents in a large pool of unlabeled and mostly irrelevant documents by learning from a human-in-the-loop who acts as an oracle. The case study is followed by a literature review of the major active learning techniques (see the notebook).
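As a rough illustration of the pool-based loop described above, here is a minimal sketch in plain Python. It is not the code from this repository: the toy pool, the `oracle` lookup, the tiny Naive Bayes scorer, and the choice of uncertainty sampling as the query strategy are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy pool; 1 = spam (relevant), 0 = ham.
POOL = ["win free money now", "meeting schedule attached",
        "free prize claim now", "quarterly report draft",
        "claim free money", "lunch meeting tomorrow"]
TRUTH = [1, 0, 1, 0, 1, 0]


def oracle(text):
    """Stand-in for the human reviewer: looks up the true label."""
    return TRUTH[POOL.index(text)]


def train_nb(docs, labels):
    """Collect per-class word counts for a tiny multinomial Naive Bayes."""
    counts = {0: Counter(), 1: Counter()}
    n_docs = {0: 0, 1: 0}
    for text, y in zip(docs, labels):
        counts[y].update(text.split())
        n_docs[y] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, n_docs, vocab


def prob_spam(model, text):
    """Estimate P(spam | text) using add-one smoothing."""
    counts, n_docs, vocab = model
    total = n_docs[0] + n_docs[1]
    log_odds = math.log((n_docs[1] + 1) / (total + 2)) \
        - math.log((n_docs[0] + 1) / (total + 2))
    size = {c: sum(counts[c].values()) + len(vocab) for c in (0, 1)}
    for word in text.split():
        if word in vocab:  # words never seen in training are ignored
            log_odds += math.log((counts[1][word] + 1) / size[1])
            log_odds -= math.log((counts[0][word] + 1) / size[0])
    return 1.0 / (1.0 + math.exp(-log_odds))


def active_learning(pool, ask, seed_idx, budget):
    """Label the seed set, then query the most uncertain document each round."""
    labeled = {i: ask(pool[i]) for i in seed_idx}
    for _ in range(budget):
        model = train_nb([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        # Uncertainty sampling: pick the score closest to 0.5.
        query = min(unlabeled,
                    key=lambda i: abs(prob_spam(model, pool[i]) - 0.5))
        labeled[query] = ask(pool[query])
    return labeled


labeled = active_learning(POOL, oracle, seed_idx=[0, 1], budget=2)
print(sorted(labeled.items()))
```

A supervised baseline would label a random sample of the same size up front; the active loop instead spends each label on the document the current model is least sure about.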

References

The query strategies used here are cloned from libact.

Folders and files

al
imgs
scripts
.gitignore
Active Learning and Predictive Coding - A Case Study.html
Active Learning and Predictive Coding - A Case Study.ipynb
Active Learning and Predictive Coding - A Case Study.pdf
README.md
__init__.py
corpus.py
data.tar.gz
demo_enron_spam.py
features.py
ldds.py
metrics.py
models.py
plots.py
preprocess.py
requirements.txt
samples.py
simulation.py
stopwords.py
utils.py
vocab.py
