Benchmark of several regression models
The dataset is taken from Kaggle: https://www.kaggle.com/elikplim/eergy-efficiency-dataset
Step 1 - The data had no column names, so I added them ("Relative Compactness", "Surface Area", "Wall Area", "Roof Area", "Overall Height", "Orientation", "Glazing Area", "Glazing Area Distribution", "Heating Load", "Cooling Load")
Step 2 - I checked for NaN values; the result is shown in the image
Step 3 - Splitting the dataframe into train and test sets with test_size=0.33
Step 4 - Trying linear regression, polynomial regression, Decision Tree Regressor, Random Forest Regressor, Gaussian Process Regressor, PLS Regression and XGBoost
Step 5 - Comparing the metrics, which are R2 and MSE (Mean Squared Error)
The results are shown in the image
XGBoost gives the best result on this data, and the Gaussian Process result was unexpected for me