Project Overview

In this project I:

Cleaned and preprocessed the data from the Health Information National Trends Survey (HINTS), which "regularly collects nationally representative data about the American public’s knowledge of, attitudes toward, and use of cancer- and health-related information."
Tested classification algorithms (Logistic Regression, Random Forest, XGBoost) to see how well they predict whether a respondent holds fatalistic views about cancer.
Used a mixed-data-type clustering algorithm to segment HINTS respondents.
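
To illustrate the clustering step: survey data mixes numeric answers (such as age) with categorical ones (such as education level), so a plain Euclidean distance does not apply directly. This overview does not say which algorithm was used, so the sketch below is only an assumption about the general idea. It computes a Gower-style dissimilarity, one common building block for mixed-data clustering. The function name `gower_distance` and the respondent columns (age, education, smoker) are hypothetical and are not actual HINTS variables.

```python
from typing import List, Optional, Sequence, Union

Value = Union[float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_ranges: Sequence[Optional[float]],
) -> float:
    """Gower-style dissimilarity between two mixed-type records.

    numeric_ranges[i] holds (max - min) of column i when that column is
    numeric, and None when the column is categorical.
    """
    total = 0.0
    for x, y, rng in zip(a, b, numeric_ranges):
        if rng is None:
            # Categorical column: 0 when the values match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif rng > 0:
            # Numeric column: absolute difference scaled by the column range.
            total += abs(x - y) / rng
        # A numeric column with zero range is constant and contributes 0.
    return total / len(a)


# Hypothetical respondents (age, education, smoker); not real HINTS records.
respondents: List[tuple] = [
    (30, "college", "no"),
    (60, "high school", "yes"),
    (45, "college", "no"),
]

# Age spans 30..60 in this toy sample; the other two columns are categorical.
ranges = [30.0, None, None]

d01 = gower_distance(respondents[0], respondents[1], ranges)
d02 = gower_distance(respondents[0], respondents[2], ranges)
print(d01, d02)
```

In this toy sample the first and third respondents come out much closer to each other than the first and second. A pairwise matrix of such dissimilarities could then be passed to any clustering method that accepts precomputed distances.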

More info regarding HINTS: https://hints.cancer.gov/
