ridhachahed/-Applied-ML-and-Scaling-Up

02 - Applied ML and Scaling Up

In this homework, you will work with two popular frameworks: (1) scikit-learn (imported as sklearn) and (2) PySpark (the Python API for Apache Spark). In addition, you will create some basic visualizations to explain your findings. Applied ML and scaling up are two core skills for a data scientist, so this homework is good preparation for real-world work.

The homework consists of two tasks which are described in the hw2.ipynb notebook.

For each task, please provide both a written explanation of the steps you followed, and the corresponding code. Keep in mind that writing the explanation can help you in two ways:

  1. Clarifying the steps in your mind before writing the actual code
  2. Earning you points if the description is correct, even if your code has issues

Submission Guidelines

You are expected to solve the homework as a team of four, as specified in the project registration form. By the homework submission deadline, each team should have a single shared private GitHub repo containing the Jupyter Notebook with the solution. Please follow the instructions below to create your team repo and start working on the homework:

  1. One team member should follow this legendary link and create a team by adding the prefix final_ to the exact team name specified in the project registration form.
  2. Creating the team will automatically create a dedicated private repo. At this point, the remaining three team members should follow the same link and join the team. Make sure you are joining the correct team by checking your team members' GitHub accounts: there might be teams with similar names.
  3. There is no simple automated way to transfer the materials for Homework 2 from the public course repository into your private team repository. To get started, we suggest that you manually pull the homework materials from the course repository to your local machine, copy them into your local team repository, and push to the remote.
  4. Afterwards, keep collaborating on the homework as a team in your shared private repository!
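
The manual copy in step 3 can also be scripted. The following is a minimal sketch, not an official course tool: the function name `copy_homework` and both directory arguments are hypothetical, and it assumes you have already cloned the course repository and your team repository locally (Python 3.8 or later is needed for `dirs_exist_ok`).

```python
import shutil
from pathlib import Path


def copy_homework(course_hw_dir, team_repo_dir):
    """Copy homework materials from a local course-repo clone into a team-repo clone.

    Both paths are assumptions for illustration; adjust them to your own layout.
    """
    src = Path(course_hw_dir)
    dst = Path(team_repo_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"homework folder not found: {src}")
    # Skip Git metadata and notebook checkpoints so that only the
    # homework materials end up in the team repository.
    shutil.copytree(
        src,
        dst,
        ignore=shutil.ignore_patterns(".git", ".ipynb_checkpoints"),
        dirs_exist_ok=True,  # the team repo folder already exists
    )
    return dst
```

After the copy, commit and push from inside the team repository with the usual `git add`, `git commit`, and `git push` commands.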

Deliverables

hw2.ipynb notebook with the output of every cell visible. Do not submit your data folder.
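
Before submitting, you may want to confirm that the notebook was actually run from top to bottom. The sketch below is one possible check, not part of the official grading: the helper name `unexecuted_code_cells` is hypothetical, and it assumes the notebook is stored in the standard JSON notebook format, where each code cell carries an `execution_count` field. It only detects cells that were never run; a cell that ran but legitimately printed nothing is not flagged.

```python
import json
from pathlib import Path


def unexecuted_code_cells(notebook_path):
    """Return the indices of non-empty code cells that were never executed."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    missing = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no output
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        if not source.strip():
            continue  # ignore empty code cells
        if cell.get("execution_count") is None:
            missing.append(index)
    return missing
```

Calling `unexecuted_code_cells("hw2.ipynb")` returns an empty list when every non-empty code cell has been executed.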

Deadline: November 20th, 2019 (23:59 hours / 11:59 PM)
