In recent years, transformer-based language models such as BERT have become among the most popular architectures in natural language processing research. Deploying these models to production under constrained resources, such as edge computing, requires smaller variants of BERT like DistilBERT. However, there is concern that these models lack robustness against adversarial examples and attacks. This paper evaluates the performance of several models built on the ideas of adversarial training and GAN-BERT, fine-tuned on the SST-2 dataset. Furthermore, the experiments in this paper seek evidence on whether knowledge distillation preserves robustness in the student models.
Vijay Kalmath Department of Data Science vsk2123@columbia.edu
Amrutha Varshini Sundar Department of Data Science as6431@columbia.edu
Sean Chen Department of Computer Science sean.chen@columbia.edu
Google Drive link to models: https://drive.google.com/drive/folders/1q1TGTOl6BftZzn1ZidzvN8GKVRbTdrjM?usp=sharing