This program uses PySpark to build a model that predicts how much a house sells for.
The dataset is a sample of homes that were sold in the St. Paul, MN area over the course of 2017. Using this sample, we are to provide a quick proof of concept showing whether it is worth investing in more data for the 5.5 million homes that were sold in the US in 2017.