Project Overview

In this project I:

Cleaned and preprocessed the data from the Health Information National Trends Survey (HINTS), which "regularly collects nationally representative data about the American public’s knowledge of, attitudes toward, and use of cancer- and health-related information."
Tested classification algorithms (Logistic Regression, Random Forest, XGBoost) to see how well they predict whether respondents hold fatalistic views about cancer.
Used a mixed-data-type clustering algorithm to segment HINTS respondents.
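
To illustrate what clustering on mixed data types involves, here is a minimal sketch of Gower dissimilarity, one common way to compare survey rows that mix numeric and categorical answers. This is an assumption for illustration only and not necessarily the method used in the notebook; the function name `gower_distance` and the example values (age, smoking status, education) are hypothetical and are not actual HINTS variable names.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_ranges: Sequence[Optional[float]],
) -> float:
    """Gower dissimilarity between two rows with mixed column types.

    numeric_ranges[i] is (max - min) of column i across the data set
    for a numeric column, or None for a categorical column.
    """
    total = 0.0
    for x, y, rng in zip(a, b, numeric_ranges):
        if rng is None:
            # Categorical column: 0 if the answers match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif rng == 0:
            # Constant numeric column: contributes no dissimilarity.
            total += 0.0
        else:
            # Numeric column: absolute difference scaled to [0, 1].
            total += abs(x - y) / rng
    return total / len(numeric_ranges)


# Hypothetical respondents: (age, smoking status, education).
respondent_1 = (30.0, "never", "college")
respondent_2 = (50.0, "never", "high school")
# Assumed age range of 40 years; the other two columns are categorical.
ranges = [40.0, None, None]

d = gower_distance(respondent_1, respondent_2, ranges)
```

A pairwise matrix of such distances can then be passed to any clustering method that accepts precomputed dissimilarities.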

More info regarding HINTS: https://hints.cancer.gov/
