Important Links

Executive Summary

The Six Muskatweeters were interested in exploring whether Elon Musk’s prolific Twitter presence had meaningful impacts on the stock price of his electric car company and brainchild, Tesla. To do this, we scraped Mr. Musk’s Twitter profile for Tesla-related tweets and Yahoo Finance for daily Tesla stock data. We combined our data in Jupyter notebooks and loaded it into PostgreSQL for further analysis.

Extract

The Six Muskatweeters extracted daily Tesla stock (NASDAQ: TSLA) data from Yahoo Finance and combined that data with tweets from Elon Musk’s Twitter profile (@elonmusk). Yahoo Finance and Twitter required custom BeautifulSoup/Splinter web scrapers. After extracting the stock and social media data for the past 5 years (7-28-2015 – 7-28-2020), we were able to begin our data transformation.

Transform

After importing our stock and tweet data into Jupyter notebooks, we began analyzing and cleansing the data for eventual loading into PostgreSQL. We began by loading the stock and tweet data into Pandas Dataframes and excluding extraneous columns from the data. For the project, we required only the Date, High, Low, Open, and Close prices from the Tesla stock data, together with the Date and Tweet fields from the Twitter data. We performed a str.contains(“esla”) to limit our dataset to Tweets specifically related to Tesla. To calculate fluctuations in stock price, we had to strip out non-numeric characters (i.e. “,”). We then cast the High/Low/Open/Close columns as floats and created new columns to store the calculated values. Differences between open and close prices were stored in Open_Close, while differences in high and low prices were stored in High_Low. To merge the stock and tweet DataFrames, we standardized the date field formats in each DataFrame by casting them each as datetime64[ns]. Upon merging the dataframes, we were ready to load our data into PostgreSQL.

Load

Due to the highly structured nature of the data, we decided to load our data into a relational, PostgreSQL database. To prepare our DataFrame for loading, we first converted it to CSV using the Pandas to_csv command. After that, we connected to our local database, created a new database called tesla_final_db, and loaded the data into a new table called tesla using the Pandas to_sql command.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
Data Extraction & Transformation Notebooks		Data Extraction & Transformation Notebooks
Output Data		Output Data
WebVisualizations		WebVisualizations
.DS_Store		.DS_Store
README.md		README.md
index.html		index.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Important Links

Executive Summary

Extract

Transform

Load

About

Releases

Packages

Languages

Umarfoo/Tesla_Stock_ETL_Project

Folders and files

Latest commit

History

Repository files navigation

Important Links

Executive Summary

Extract

Transform

Load

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages