
This repository contains the annotated dataset, the unigrams, bigrams and trigrams referenced in the paper "Unmasking People’s Opinions behind Mask Wearing during COVID-19 Pandemic – a Twitter Stance Analysis" submitted to the Symmetry journal.


liviucotfas/covid-19-mask-stance-detection


Overview

This repository contains the annotated dataset and the unigrams, bigrams and trigrams referenced in the paper "Unmasking People’s Opinions behind Mask Wearing during COVID-19 Pandemic – a Twitter Stance Analysis", submitted to the journal Symmetry.

The tweets were collected over a one-year period, between January 9, 2020 and January 8, 2021, as described in our paper. January 9, 2020 is the date on which the first tweet to mention mask wearing in the context of the COVID-19 pandemic was published, marking the beginning of the ongoing debate surrounding mask wearing.

Note: The tweets were minimally pre-processed before the n-grams were extracted: stop words and duplicated white spaces were removed.
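The pre-processing and counting described in the note can be sketched roughly as follows. This is only an illustration, not the code used for the paper: the stop-word list `STOP_WORDS`, the white-space tokenization, and the helper names `preprocess`, `ngrams` and `count_ngrams` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only; the list
# actually used for the paper is not given in this README.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "and", "i"}


def preprocess(text):
    """Collapse duplicated white space, then drop stop words."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    tokens = collapsed.split(" ")
    return [t for t in tokens if t and t.lower() not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(tweets, n):
    """Count n-grams over all tweets, sorted by number of appearances."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(preprocess(tweet), n))
    return counts.most_common()
```

Calling `count_ngrams(tweets, 1)`, `count_ngrams(tweets, 2)` and `count_ngrams(tweets, 3)` would give unigram, bigram and trigram lists ordered by frequency, similar in shape to the files in the n-grams folders.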

Repository structure:

  • dataset: a balanced dataset of 9426 tweets, each annotated with one of the categories "against" (-1), "neutral" (0) or "in favor" (1);
  • n-grams-daily: daily unigrams, bigrams and trigrams extracted from the cleaned dataset (excludes retweets), sorted by the number of appearances;
  • n-grams-monthly: monthly unigrams, bigrams and trigrams extracted from the cleaned dataset (excludes retweets), sorted by the number of appearances;
  • n-grams-one-year: unigrams, bigrams and trigrams extracted from the cleaned dataset (excludes retweets) for the considered one-year period, sorted by the number of appearances.
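As a rough illustration of the label encoding listed above, the following sketch maps the numeric labels to their category names and counts the tweets per category. The file layout is an assumption: the example supposes a CSV file with hypothetical column names `id` and `label`, which may not match the actual files in the dataset folder.

```python
import csv
import io
from collections import Counter

# Label encoding as described in the repository structure.
STANCE_LABELS = {-1: "against", 0: "neutral", 1: "in favor"}


def stance_counts(csv_file):
    """Count tweets per stance category.

    Assumes (hypothetically) a CSV with a header row and a `label`
    column holding -1, 0 or 1.
    """
    reader = csv.DictReader(csv_file)
    return Counter(STANCE_LABELS[int(row["label"])] for row in reader)


# Small in-memory sample standing in for the real dataset file.
sample = io.StringIO("id,label\n1,-1\n2,0\n3,1\n4,1\n")
sample_counts = stance_counts(sample)
```

On a balanced dataset, the three counts returned by `stance_counts` should be approximately equal.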

Usage

In accordance with Twitter's policy, the annotated dataset provides only the tweet IDs. The tweets can be hydrated (retrieved in full from their IDs) using a tool such as Twarc (https://github.com/DocNow/twarc) or Hydrator (https://github.com/DocNow/hydrator).
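Hydration tools generally take a plain text file with one tweet ID per line as input. The sketch below shows one way such a file might be produced from the annotated dataset. It assumes, hypothetically, that the dataset is a CSV file with an `id` column; the function name `write_id_file` is also invented for this example.

```python
import csv
import io


def write_id_file(csv_file, out_file):
    """Write one tweet ID per line and return the number of IDs written.

    Assumes (hypothetically) a CSV with a header row and an `id` column.
    """
    written = 0
    for row in csv.DictReader(csv_file):
        out_file.write(row["id"].strip() + "\n")
        written += 1
    return written


# Small in-memory sample standing in for the real dataset file.
source = io.StringIO("id,label\n1215000000000000001,-1\n1215000000000000002,1\n")
target = io.StringIO()
n_written = write_id_file(source, target)
```

With real files, the output (for example `ids.txt`) could then be passed to Twarc's `hydrate` command, such as `twarc2 hydrate ids.txt tweets.jsonl`. The exact invocation depends on the installed Twarc version and requires API credentials, so the Twarc documentation should be consulted.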
