This study compares active learning with supervised learning on a toy Enron spam classification dataset, framed as predictive coding: the goal is to find relevant documents in a large pool of unlabeled, mostly irrelevant documents by learning from a human in the loop who acts as an oracle. The case study is followed by a literature review of the major active learning techniques (see the notebook).
The query strategies used here are cloned from libact.
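To make the idea of a query strategy concrete, here is a minimal sketch of pool-based active learning with uncertainty sampling (the least-confident variant). It is not libact's API and not the code in this repository: the names `least_confident_query`, `fit_threshold`, `predict_proba`, and `run_loop`, the 1-D toy pool, and the threshold "model" are all assumptions made for illustration. The oracle is simulated by a function that returns the true label.

```python
import math


def least_confident_query(probs):
    """Uncertainty sampling: pick the item whose P(relevant) is closest to 0.5.

    `probs` maps an unlabeled pool index to the model's predicted probability.
    """
    return min(probs, key=lambda i: abs(probs[i] - 0.5))


def fit_threshold(xs, ys):
    """Hypothetical toy model: a boundary halfway between the two labeled classes.

    Assumes at least one labeled example of each class.
    """
    highest_negative = max(x for x, y in zip(xs, ys) if y == 0)
    lowest_positive = min(x for x, y in zip(xs, ys) if y == 1)
    return (highest_negative + lowest_positive) / 2.0


def predict_proba(threshold, x):
    """Logistic score of x relative to the learned threshold."""
    return 1.0 / (1.0 + math.exp(-(x - threshold)))


def run_loop(pool, oracle, seed_indices, budget, query=least_confident_query):
    """Ask the oracle for `budget` labels, one query at a time."""
    labeled = {i: oracle(i) for i in seed_indices}
    for _ in range(budget):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        # Retrain on everything labeled so far, then score the rest of the pool.
        model = fit_threshold(
            [pool[i] for i in labeled], [labeled[i] for i in labeled]
        )
        probs = {i: predict_proba(model, pool[i]) for i in unlabeled}
        chosen = query(probs)
        labeled[chosen] = oracle(chosen)
    return labeled


# Toy pool: 21 "documents" on a line; only those at 14.0 or above are relevant.
pool = [float(x) for x in range(21)]


def oracle(i):
    """Simulated human reviewer: returns the true label for pool item i."""
    return int(pool[i] >= 14.0)


labeled = run_loop(pool, oracle, seed_indices=[0, 20], budget=5)
print(sorted(labeled))
```

With two seed labels and five queries, the loop in this toy setting spends its budget on items near the class boundary rather than on the many clearly irrelevant ones. Passing a different function as `query` (for example one that picks an unlabeled index at random) gives a passive baseline to compare against.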