{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ba251f66",
   "metadata": {
    "papermill": {
     "duration": 0.011825,
     "end_time": "2021-12-14T18:23:27.026491",
     "exception": false,
     "start_time": "2021-12-14T18:23:27.014666",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "In this notebook, we're going to work with dates.\n",
    "\n",
    "Let's get started!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b174b798",
   "metadata": {
    "papermill": {
     "duration": 0.009886,
     "end_time": "2021-12-14T18:23:27.046876",
     "exception": false,
     "start_time": "2021-12-14T18:23:27.036990",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Get our environment set up\n",
    "\n",
    "The first thing we'll need to do is load in the libraries and dataset we'll be using. We'll be working with a dataset that contains information on landslides that occured between 2007 and 2016.  In the [**following exercise**](https://www.kaggle.com/kernels/fork/10824403), you'll apply your new skills to a dataset of worldwide earthquakes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e77eb61e",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:27.072817Z",
     "iopub.status.busy": "2021-12-14T18:23:27.071220Z",
     "iopub.status.idle": "2021-12-14T18:23:28.096759Z",
     "shell.execute_reply": "2021-12-14T18:23:28.096113Z"
    },
    "papermill": {
     "duration": 1.039951,
     "end_time": "2021-12-14T18:23:28.096920",
     "exception": false,
     "start_time": "2021-12-14T18:23:27.056969",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# modules we'll use\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import seaborn as sns\n",
    "import datetime\n",
    "\n",
    "# read in our data\n",
    "landslides = pd.read_csv(\"../input/landslide-events/catalog.csv\")\n",
    "\n",
    "# set seed for reproducibility\n",
    "np.random.seed(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2c6d76e",
   "metadata": {
    "papermill": {
     "duration": 0.010572,
     "end_time": "2021-12-14T18:23:28.117984",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.107412",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "Now we're ready to look at some dates!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "79463dfa",
   "metadata": {
    "papermill": {
     "duration": 0.010058,
     "end_time": "2021-12-14T18:23:28.138449",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.128391",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Check the data type of our date column\n",
    "\n",
    "We begin by taking a look at the first five rows of the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "5f70f1ea",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.161611Z",
     "iopub.status.busy": "2021-12-14T18:23:28.161074Z",
     "iopub.status.idle": "2021-12-14T18:23:28.192867Z",
     "shell.execute_reply": "2021-12-14T18:23:28.193320Z"
    },
    "papermill": {
     "duration": 0.044865,
     "end_time": "2021-12-14T18:23:28.193493",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.148628",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>date</th>\n",
       "      <th>time</th>\n",
       "      <th>continent_code</th>\n",
       "      <th>country_name</th>\n",
       "      <th>country_code</th>\n",
       "      <th>state/province</th>\n",
       "      <th>population</th>\n",
       "      <th>city/town</th>\n",
       "      <th>distance</th>\n",
       "      <th>...</th>\n",
       "      <th>geolocation</th>\n",
       "      <th>hazard_type</th>\n",
       "      <th>landslide_type</th>\n",
       "      <th>landslide_size</th>\n",
       "      <th>trigger</th>\n",
       "      <th>storm_name</th>\n",
       "      <th>injuries</th>\n",
       "      <th>fatalities</th>\n",
       "      <th>source_name</th>\n",
       "      <th>source_link</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34</td>\n",
       "      <td>3/2/07</td>\n",
       "      <td>Night</td>\n",
       "      <td>NaN</td>\n",
       "      <td>United States</td>\n",
       "      <td>US</td>\n",
       "      <td>Virginia</td>\n",
       "      <td>16000</td>\n",
       "      <td>Cherry Hill</td>\n",
       "      <td>3.40765</td>\n",
       "      <td>...</td>\n",
       "      <td>(38.600900000000003, -77.268199999999993)</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Small</td>\n",
       "      <td>Rain</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NBC 4 news</td>\n",
       "      <td>http://www.nbc4.com/news/11186871/detail.html</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>42</td>\n",
       "      <td>3/22/07</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>United States</td>\n",
       "      <td>US</td>\n",
       "      <td>Ohio</td>\n",
       "      <td>17288</td>\n",
       "      <td>New Philadelphia</td>\n",
       "      <td>3.33522</td>\n",
       "      <td>...</td>\n",
       "      <td>(40.517499999999998, -81.430499999999995)</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Small</td>\n",
       "      <td>Rain</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Canton Rep.com</td>\n",
       "      <td>http://www.cantonrep.com/index.php?ID=345054&amp;C...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>56</td>\n",
       "      <td>4/6/07</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>United States</td>\n",
       "      <td>US</td>\n",
       "      <td>Pennsylvania</td>\n",
       "      <td>15930</td>\n",
       "      <td>Wilkinsburg</td>\n",
       "      <td>2.91977</td>\n",
       "      <td>...</td>\n",
       "      <td>(40.4377, -79.915999999999997)</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Small</td>\n",
       "      <td>Rain</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>The Pittsburgh Channel.com</td>\n",
       "      <td>https://web.archive.org/web/20080423132842/htt...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>59</td>\n",
       "      <td>4/14/07</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Canada</td>\n",
       "      <td>CA</td>\n",
       "      <td>Quebec</td>\n",
       "      <td>42786</td>\n",
       "      <td>Châteauguay</td>\n",
       "      <td>2.98682</td>\n",
       "      <td>...</td>\n",
       "      <td>(45.322600000000001, -73.777100000000004)</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Riverbank collapse</td>\n",
       "      <td>Small</td>\n",
       "      <td>Rain</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Le Soleil</td>\n",
       "      <td>http://www.hebdos.net/lsc/edition162007/articl...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>61</td>\n",
       "      <td>4/15/07</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>United States</td>\n",
       "      <td>US</td>\n",
       "      <td>Kentucky</td>\n",
       "      <td>6903</td>\n",
       "      <td>Pikeville</td>\n",
       "      <td>5.66542</td>\n",
       "      <td>...</td>\n",
       "      <td>(37.432499999999997, -82.493099999999998)</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Landslide</td>\n",
       "      <td>Small</td>\n",
       "      <td>Downpour</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Matthew Crawford (KGS)</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 23 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   id     date   time continent_code   country_name country_code  \\\n",
       "0  34   3/2/07  Night            NaN  United States           US   \n",
       "1  42  3/22/07    NaN            NaN  United States           US   \n",
       "2  56   4/6/07    NaN            NaN  United States           US   \n",
       "3  59  4/14/07    NaN            NaN         Canada           CA   \n",
       "4  61  4/15/07    NaN            NaN  United States           US   \n",
       "\n",
       "  state/province  population         city/town  distance  ...  \\\n",
       "0       Virginia       16000       Cherry Hill   3.40765  ...   \n",
       "1           Ohio       17288  New Philadelphia   3.33522  ...   \n",
       "2   Pennsylvania       15930       Wilkinsburg   2.91977  ...   \n",
       "3         Quebec       42786       Châteauguay   2.98682  ...   \n",
       "4       Kentucky        6903         Pikeville   5.66542  ...   \n",
       "\n",
       "                                 geolocation  hazard_type      landslide_type  \\\n",
       "0  (38.600900000000003, -77.268199999999993)    Landslide           Landslide   \n",
       "1  (40.517499999999998, -81.430499999999995)    Landslide           Landslide   \n",
       "2             (40.4377, -79.915999999999997)    Landslide           Landslide   \n",
       "3  (45.322600000000001, -73.777100000000004)    Landslide  Riverbank collapse   \n",
       "4  (37.432499999999997, -82.493099999999998)    Landslide           Landslide   \n",
       "\n",
       "  landslide_size   trigger storm_name injuries fatalities  \\\n",
       "0          Small      Rain        NaN      NaN        NaN   \n",
       "1          Small      Rain        NaN      NaN        NaN   \n",
       "2          Small      Rain        NaN      NaN        NaN   \n",
       "3          Small      Rain        NaN      NaN        NaN   \n",
       "4          Small  Downpour        NaN      NaN        0.0   \n",
       "\n",
       "                  source_name  \\\n",
       "0                  NBC 4 news   \n",
       "1              Canton Rep.com   \n",
       "2  The Pittsburgh Channel.com   \n",
       "3                   Le Soleil   \n",
       "4      Matthew Crawford (KGS)   \n",
       "\n",
       "                                         source_link  \n",
       "0      http://www.nbc4.com/news/11186871/detail.html  \n",
       "1  http://www.cantonrep.com/index.php?ID=345054&C...  \n",
       "2  https://web.archive.org/web/20080423132842/htt...  \n",
       "3  http://www.hebdos.net/lsc/edition162007/articl...  \n",
       "4                                                NaN  \n",
       "\n",
       "[5 rows x 23 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "landslides.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b125c71",
   "metadata": {
    "papermill": {
     "duration": 0.01081,
     "end_time": "2021-12-14T18:23:28.215621",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.204811",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "We'll be working with the \"date\" column from the `landslides` dataframe. Let's make sure it actually looks like it contains dates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "a1fb2ece",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.241344Z",
     "iopub.status.busy": "2021-12-14T18:23:28.240732Z",
     "iopub.status.idle": "2021-12-14T18:23:28.248869Z",
     "shell.execute_reply": "2021-12-14T18:23:28.249294Z"
    },
    "papermill": {
     "duration": 0.02292,
     "end_time": "2021-12-14T18:23:28.249461",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.226541",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0     3/2/07\n",
      "1    3/22/07\n",
      "2     4/6/07\n",
      "3    4/14/07\n",
      "4    4/15/07\n",
      "Name: date, dtype: object\n"
     ]
    }
   ],
   "source": [
    "# print the first few rows of the date column\n",
    "print(landslides['date'].head())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f901759",
   "metadata": {
    "papermill": {
     "duration": 0.01164,
     "end_time": "2021-12-14T18:23:28.272900",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.261260",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "Yep, those are dates! But just because I, a human, can tell that these are dates doesn't mean that Python knows that they're dates. Notice that at the bottom of the output of `head()`, you can see that it says that the data type of this  column is \"object\". \n",
    "\n",
    "> Pandas uses the \"object\" dtype for storing various types of data types, but most often when you see a column with the dtype \"object\" it will have strings in it. \n",
    "\n",
    "If you check the pandas dtype documentation [here](http://pandas.pydata.org/pandas-docs/stable/basics.html#dtypes), you'll notice that there's also a specific `datetime64` dtypes. Because the dtype of our column is `object` rather than `datetime64`, we can tell that Python doesn't know that this column contains dates.\n",
    "\n",
    "We can also look at just the dtype of a column without printing the first few rows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "43138dd5",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.301199Z",
     "iopub.status.busy": "2021-12-14T18:23:28.300554Z",
     "iopub.status.idle": "2021-12-14T18:23:28.302901Z",
     "shell.execute_reply": "2021-12-14T18:23:28.303328Z"
    },
    "papermill": {
     "duration": 0.018653,
     "end_time": "2021-12-14T18:23:28.303494",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.284841",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('O')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# check the data type of our date column\n",
    "landslides['date'].dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f9c49ee",
   "metadata": {
    "papermill": {
     "duration": 0.011298,
     "end_time": "2021-12-14T18:23:28.326309",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.315011",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "You may have to check the [numpy documentation](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.dtype.kind.html#numpy.dtype.kind) to match the letter code to the dtype of the object. \"O\" is the code for \"object\", so we can see that these two methods give us the same information."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "641e3c3b",
   "metadata": {
    "papermill": {
     "duration": 0.011354,
     "end_time": "2021-12-14T18:23:28.349169",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.337815",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Convert our date columns to datetime\n",
    "\n",
    "Now that we know that our date column isn't being recognized as a date, it's time to convert it so that it *is* recognized as a date. This is called \"parsing dates\" because we're taking in a string and identifying its component parts.\n",
    "\n",
    "We can determine what the format of our dates are with a guide called [\"strftime directive\", which you can find more information on at this link](http://strftime.org/). The basic idea is that you need to point out which parts of the date are where and what punctuation is between them. There are [lots of possible parts of a date](http://strftime.org/), but the most common are `%d` for day, `%m` for month, `%y` for a two-digit year and `%Y` for a four digit year.\n",
    "\n",
    "Some examples:\n",
    "\n",
    " * 1/17/07 has the format \"%m/%d/%y\"\n",
    " * 17-1-2007 has the format \"%d-%m-%Y\"\n",
    " \n",
    "Looking back up at the head of the \"date\" column in the landslides dataset, we can see that it's in the format \"month/day/two-digit year\", so we can use the same syntax as the first example to parse in our dates: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "9496181b",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.375484Z",
     "iopub.status.busy": "2021-12-14T18:23:28.374813Z",
     "iopub.status.idle": "2021-12-14T18:23:28.385928Z",
     "shell.execute_reply": "2021-12-14T18:23:28.386453Z"
    },
    "papermill": {
     "duration": 0.025978,
     "end_time": "2021-12-14T18:23:28.386629",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.360651",
     "status": "completed"
    },
    "scrolled": false,
    "tags": []
   },
   "outputs": [],
   "source": [
    "# create a new column, date_parsed, with the parsed dates\n",
    "landslides['date_parsed'] = pd.to_datetime(landslides['date'], format=\"%m/%d/%y\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee0d1dfe",
   "metadata": {
    "papermill": {
     "duration": 0.011971,
     "end_time": "2021-12-14T18:23:28.410601",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.398630",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "Now when I check the first few rows of the new column, I can see that the dtype is `datetime64`. I can also see that my dates have been slightly rearranged so that they fit the default order datetime objects (year-month-day)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "9ec556f4",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.437224Z",
     "iopub.status.busy": "2021-12-14T18:23:28.436647Z",
     "iopub.status.idle": "2021-12-14T18:23:28.442115Z",
     "shell.execute_reply": "2021-12-14T18:23:28.442597Z"
    },
    "papermill": {
     "duration": 0.020269,
     "end_time": "2021-12-14T18:23:28.442764",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.422495",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0   2007-03-02\n",
       "1   2007-03-22\n",
       "2   2007-04-06\n",
       "3   2007-04-14\n",
       "4   2007-04-15\n",
       "Name: date_parsed, dtype: datetime64[ns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# print the first few rows\n",
    "landslides['date_parsed'].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "196ea595",
   "metadata": {
    "papermill": {
     "duration": 0.011772,
     "end_time": "2021-12-14T18:23:28.466691",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.454919",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "Now that our dates are parsed correctly, we can interact with them in useful ways.\n",
    "\n",
    "___\n",
    "* **What if I run into an error with multiple date formats?** While we're specifying the date format here, sometimes you'll run into an error when there are multiple date formats in a single column. If that happens, you can have pandas try to infer what the right date format should be. You can do that like so:\n",
    "\n",
    "`landslides['date_parsed'] = pd.to_datetime(landslides['Date'], infer_datetime_format=True)`\n",
    "\n",
    "* **Why don't you always use `infer_datetime_format = True?`** There are two big reasons not to always have pandas guess the time format. The first is that pandas won't always been able to figure out the correct date format, especially if someone has gotten creative with data entry. The second is that it's much slower than specifying the exact format of the dates."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04985566",
   "metadata": {
    "papermill": {
     "duration": 0.011758,
     "end_time": "2021-12-14T18:23:28.490562",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.478804",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Select the day of the month\n",
    "\n",
    "Now that we have a column of parsed dates, we can extract information like the day of the month that a landslide occurred."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "ae8e9316",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.517959Z",
     "iopub.status.busy": "2021-12-14T18:23:28.517409Z",
     "iopub.status.idle": "2021-12-14T18:23:28.524344Z",
     "shell.execute_reply": "2021-12-14T18:23:28.523824Z"
    },
    "papermill": {
     "duration": 0.021694,
     "end_time": "2021-12-14T18:23:28.524480",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.502786",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     2.0\n",
       "1    22.0\n",
       "2     6.0\n",
       "3    14.0\n",
       "4    15.0\n",
       "Name: date_parsed, dtype: float64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# get the day of the month from the date_parsed column\n",
    "day_of_month_landslides = landslides['date_parsed'].dt.day\n",
    "day_of_month_landslides.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ffde1895",
   "metadata": {
    "papermill": {
     "duration": 0.012128,
     "end_time": "2021-12-14T18:23:28.549257",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.537129",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "If we tried to get the same information from the original \"date\" column, we would get an error: `AttributeError: Can only use .dt accessor with datetimelike values`.  This is because `dt.day` doesn't know how to deal with a column with the dtype \"object\". Even though our dataframe has dates in it, we have to parse them before we can interact with them in a useful way."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "078d42c0",
   "metadata": {
    "papermill": {
     "duration": 0.011862,
     "end_time": "2021-12-14T18:23:28.573591",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.561729",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Plot the day of the month to check the date parsing\n",
    "\n",
    "One of the biggest dangers in parsing dates is mixing up the months and days. The `to_datetime()` function does have very helpful error messages, but it doesn't hurt to double-check that the days of the month we've extracted make sense. \n",
    "\n",
    "To do this, let's plot a histogram of the days of the month. We expect it to have values between 1 and 31 and, since there's no reason to suppose the landslides are more common on some days of the month than others, a relatively even distribution. (With a dip on 31 because not all months have 31 days.) Let's see if that's the case:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "53665da1",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2021-12-14T18:23:28.601431Z",
     "iopub.status.busy": "2021-12-14T18:23:28.600514Z",
     "iopub.status.idle": "2021-12-14T18:23:28.882377Z",
     "shell.execute_reply": "2021-12-14T18:23:28.882830Z"
    },
    "papermill": {
     "duration": 0.297338,
     "end_time": "2021-12-14T18:23:28.882986",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.585648",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/opt/conda/lib/python3.7/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).\n",
      "  warnings.warn(msg, FutureWarning)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='date_parsed'>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAEHCAYAAAC3Ph1GAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAAAS8UlEQVR4nO3df4xlZ33f8ffHv2psSNaG6Wpjs12HOCDHLQ6MTAwUURtHhKZ4oxoHh0ZLZHVbiVAITWuHqCKJXMk0PwhSKGgTU29bwHaM3bWIAlgbp0Bbrdn1D/wLsDG249V6dwmxsKFJZPLtH/dZPB3P7JyZuXdmntn3S7q65+fc75ljf/aZ557znFQVkqT+HLfaBUiSlsYAl6ROGeCS1CkDXJI6ZYBLUqdOWMkPe8lLXlJbtmxZyY+UpO7t27fvW1U1NXv5igb4li1b2Lt370p+pCR1L8ljcy23C0WSOmWAS1KnDHBJ6pQBLkmdMsAlqVMGuCR1alCAJ/mVJPcnuS/Jp5KcnOSsJHuSPJzkhiQnTbpYSdJzFgzwJGcA/waYrqpzgeOBtwMfBD5UVT8G/BVwxSQLlST9/4Z2oZwAvCDJCcApwAHgQuCmtn4nsHXs1UmS5rXgnZhVtT/J7wCPA/8X+DywD3iqqp5tmz0BnDHX/km2A9sBNm/ePI6a1YFP7nl80Ha/8Br/m5CWakgXymnAJcBZwI8ApwJvHvoBVbWjqqaranpq6nm38kuSlmjIWChvAr5ZVYcBktwMvA7YkOSE1go/E9g/uTKltcu/NrRahvSBPw78VJJTkgS4CHgAuB24tG2zDdg1mRIlSXNZMMCrag+jLyvvBO5t++wArgTel+Rh4MXAtROsU5I0y6DhZKvqA8AHZi1+BDh/7BVJkgbxTkxJ6pQBLkmdWtEn8kjSYnmVz/xsgUtSpwxwSeqUAS5JnTLAJalTBrgkdcoAl6ROGeCS1CkDXJI6ZYBLUqcMcEnqlAEuSZ0ywCWpUwa4JHVqyEONX57k7hmv7yR5b5LTk9yW5KH2ftpKFCxJGhnySLWvVdV5VXUe8Grge8AtwFXA7qo6G9jd5iVJK2SxXSgXAd+oqseAS4CdbflOYOsY65IkLWCxAf524FNtemNVHWjTTwIbx1aVJGlBg5/Ik+Qk4K3Ar81eV1WVpObZbzuwHWDz5r6emOGTQCStZYtpgf8McGdVHWzzB5NsAmjvh+baqap2VNV0VU1PTU0tr1pJ0g8sJsAv57nuE4BbgW1tehuwa1xFSZIWNijAk5wKXAzcPGPxNcDFSR4C3tTmJUkrZFAfeFV9F3jxrGV/yeiqFEnSKhj8JeZ6M/QLSklaq7yVXpI6ZYBLUqcMcEnqlAEuSZ06Zr/E1LHLO2y1XtgCl6ROGeCS1CkDXJI6ZYBLUqcMcEnqlAEuSZ0ywCWpUwa4JHXKAJekThngktQpA1ySOjX0kWobktyU5KtJHkxyQZLTk9yW5KH2ftqki5UkPWfoYFYfBj5bVZcmOQk4BXg/sLuqrklyFXAVcOWE6pQ0w7gH5HKArz4t2AJP8sPAG4BrAarqb6vqKeASYGfbbCewdTIlSpLmMqQL5SzgMPBfktyV5I/aU+o3VtWBts2TwMZJFSlJer4hXSgnAK8C3l1Ve5J8mFF3yQ9UVSWpuXZOsh3YDrB5s39+6dhlN4XGbUgL/Angiara0+ZvYhToB5NsAmjvh+bauap2VNV0VU1PTU2No2ZJEgMCvKqeBP4iycvboouAB4BbgW1t2TZg10QqlCTNaehVKO8GPtGuQHkE+CVG4X9jkiuAx4DLJlOiJGkugwK8qu4GpudYddFYq5EkDeadmJLUKQNckjo1tA9cK+hYutzsWDpWadxsgUtSpwxwSeqUAS5JnTLAJalTBrgkdcqrUCQdU9bTlU+2wCWpU+uuBT70X1f1xfMqPZ8tcEnqlAEuSZ1ad10okjQOPXzZaQtckjplC1yL4peJkzfO3/GxdL6OpWM9wha4JHVqUAs8yaPA08D3gWerajrJ6cANwBbgUeCyqvqryZQpSZptMV0o/6SqvjVj/ipgd1Vdk+SqNn/lWKuT1KUh3Rk93Om41i2nC+USYGeb3glsXXY1kqTBhgZ4AZ9Psi/J9rZsY1UdaNNPAhvn2jHJ9iR7k+w9fPjwMsuVJB0xtAvl9VW1P8nfB25L8tWZK6uqktRcO1bVDmAHwPT09JzbSJIWb1ALvKr2t/dDwC3A+cDBJJsA2vuhSRUpSXq+BVvgSU4Fjquqp9v0TwO/BdwKbAOuae+7JlnosXiNpyQdzZAulI3ALUmObP/Jqvpski8DNya5AngMuGxyZUqSZlswwKvqEeCVcyz/S+CiSRQlSVqYt9KPQQ+D3mjx7LZ7Pn8na4u30ktSp2yBryBbL5LGyRa4JHXKAJekThngktQpA1ySOmWAS1KnDHBJ6pQBLkmdMsAlqVMGuCR1ygCXpE55K/0xwAfMSuuTLXBJ6pQBLkmdMsAlqVODAzzJ8UnuSvKZNn9Wkj1JHk5yQ5KTJlemJGm2xXyJ+R7gQeCH2vwHgQ9V1fVJPgZcAXx0zPXpKBxfXDq2DWqBJzkT+KfAH7X5ABcCN7VNdgJbJ1CfJGkeQ1vgvw/8e+BFbf7FwFNV9WybfwI4Y64dk2wHtgNs3uylamuVrXmpPwu2wJP8LHCoqvYt5QOqakdVTVfV9NTU1FJ+hCRpDkNa4K8D3prkLcDJjPrAPwxsSHJCa4WfCeyfXJmSpNkWbIFX1a9V1ZlVtQV4O/BnVfUO4Hbg0rbZNmDXxKqUJD3Pcq4DvxJ4X5KHGfWJXzuekiRJQyxqLJSq+nPgz9v0I8D54y9JkjSEd2JKUqcMcEnqlAEuSZ0ywCWpUwa4JHXKAJekThngktQpA1ySOmWAS1KnDHBJ6pQBLkmdMsAlqVMGuCR1ygCXpE4Z4JLUKQNckjplgEtSp4Y8lf7kJHckuSfJ/Ul+sy0/K8meJA8nuSHJSZMvV5J0xJBHqv0NcGFVPZPkROBLSf4UeB/woaq6PsnHgCuAj06wVklacz655/EFt/mF12yeyGcPeSp9VdUzbfbE9irgQuCmtnwnsHUSBUqS5jboocZJjgf2AT8GfAT4BvBUVT3bNnkCOGOefbcD2wE2b57Mv0KS+jOk5aqjG/QlZlV9v6rOA85k9CT6Vwz9gKraUVXTVTU9NTW1tColSc+zqKtQquop4HbgAmBDkiMt+DOB/eMtTZJ0NEOuQplKsqFNvwC4GHiQUZBf2jbbBuyaUI2SpDkM6QPfBOxs/eDHATdW1WeSPABcn+Rq4C7g2gnWKUmaZcEAr6qvAD85x/JHGPWHS5JWgXdiSlKnDHBJ6pQBLkmdMsAlqVMGuCR1ygCXpE4Z4JLUKQNckjplgEtSpwxwSeqUAS5JnTLAJalTBrgkdcoAl6ROGeCS1CkDXJI6ZYBLUqeGPBPzpUluT/JAkvuTvKctPz3JbUkeau+nTb5cSdIRQ1rgzwL/tqrOAX4KeFeSc4CrgN1VdTawu81LklbIggFeVQeq6s42/TSjJ9KfAVwC7Gyb7QS2TqhGSdIcFtUHnmQLowcc7wE2VtWBtupJYOM8+2xPsjfJ3sOHDy+nVknSDIMDPMkLgU8D762q78xcV1UF1Fz7VdWOqpququmpqallFStJes6gAE9yIqPw/kRV3dwWH0yyqa3fBByaTImSpLkMuQolwLXAg1X1ezNW3Qpsa9PbgF3jL0+SNJ8TBmzzOuAXgXuT3N2WvR+4BrgxyRXAY8BlE6lQkjSnBQO8qr4EZJ7VF423HEnSUN6JKUmdMsAlqVMGuCR1ygCXpE4Z4JLUKQNckjplgEtSpwxwSeqUAS5JnTLAJalTBrgkdcoAl6ROGeCS1CkDXJI6ZYBLUqcMcEnqlAEuSZ0a8kzMjyc5lOS+GctOT3Jbkofa+2mTLVOSNNuQFvh1wJtnLbsK2F1VZwO727wkaQUtGOBV9QXg27MWXwLsbNM7ga3jLUuStJCl9oFvrKoDbfpJYON8GybZnmRvkr2HDx9e4sdJkmZb9peYVVVAHWX9jqqarqrpqamp5X6cJKlZaoAfTLIJoL0fGl9JkqQhlhrgtwLb2vQ2YNd4ypEkDTXkMsJPAf8HeHmSJ5JcAVwDXJzkIeBNbV6StIJOWGiDqrp8nlUXjbkWSdIieCemJHXKAJekThngktQpA1ySOmWAS1KnDHBJ6pQBLkmdMsAlqVMGuCR1ygCXpE4Z4JLUKQNckjplgEtSpwxwSeqUAS5JnTLAJalTBrgkdWpZAZ7kzUm+luThJFeNqyhJ0sKWHOBJjgc+AvwMcA5weZJzxlWYJOnoltMCPx94uKoeqaq/Ba4HLhlPWZKkhSz4UOOjOAP4ixnzTwCvmb1Rku3A9jb7TJKvzdrkJcC3llHHWrJejmW9HAd4LGvVejmWQcfxjuV/zj+Ya+FyAnyQqtoB7JhvfZK9VTU96TpWwno5lvVyHOCxrFXr5VhW+ziW04WyH3jpjPkz2zJJ0gpYToB/GTg7yVlJTgLeDtw6nrIkSQtZchdKVT2b5JeBzwHHAx+vqvuX8KPm7V7p0Ho5lvVyHOCxrFXr5VhW9ThSVav5+ZKkJfJOTEnqlAEuSZ1atQBfT7fhJ3k0yb1J7k6yd7XrWYwkH09yKMl9M5adnuS2JA+199NWs8ah5jmW30iyv52bu5O8ZTVrHCLJS5PcnuSBJPcneU9b3t15Ocqx9HheTk5yR5J72rH8Zlt+VpI9LctuaBd1rExNq9EH3m7D/zpwMaMbgL4MXF5VD6x4MWOQ5FFguqq6uzEhyRuAZ4D/WlXntmX/Cfh2VV3T/nE9raquXM06h5jnWH4DeKaqfmc1a1uMJJuATVV1Z5IXAfuArcA76ey8HOVYLqO/8xLg1Kp6JsmJwJeA9wDvA26uquuTfAy4p6o+uhI1rVYL3Nvw14iq+gLw7VmLLwF2tumdjP6HW/PmOZbuVNWBqrqzTT8NPMjozufuzstRjqU7NfJMmz2xvQq4ELipLV/R87JaAT7XbfhdntSmgM8n2deGDujdxqo60KafBDauZjFj8MtJvtK6WNZ8t8NMSbYAPwnsofPzMutYoMPzkuT4JHcDh4DbgG8AT1XVs22TFc0yv8Qcj9dX1asYjcz4rvan/LpQoz62nq81/SjwMuA84ADwu6tazSIkeSHwaeC9VfWdmet6Oy9zHEuX56Wqvl9V5zG68/x84BWrWc9qBfi6ug2/qva390PALYxObM8Otr7LI32Yh1a5niWrqoPtf7q/A/6QTs5N62P9NPCJqrq5Le7yvMx1LL2elyOq6ingduACYEOSIzdFrmiWrVaAr5vb8JOc2r6cIcmpwE8D9x19rzXvVmBbm94G7FrFWpblSOA1P0cH56Z9WXYt8GBV/d6MVd2dl/mOpdPzMpVkQ5t+AaOLMB5kFOSXts1W9Lys2p2Y7bKh3+e52/D/46oUskxJfpRRqxtGQxN8sqdjSfIp4I2MhsU8CHwA+B/AjcBm4DHgsqpa818OznMsb2T0Z3oBjwL/akY/8pqU5PXAF4F7gb9ri9/PqO+4q/NylGO5nP7Oyz9i9CXl8YwavzdW1W+1DLgeOB24C/gXVfU3K1KTt9JLUp/8ElOSOmWAS1KnDHBJ6pQBLkmdMsAlqVMGuCR1ygDXmtaGHf3Vo6zfmuSclaxpHJJsmTnsrbQUBrh6txVY8QCfceu0tGoMcK05SX49ydeTfAl4eVv2L5N8uQ2m/+kkpyR5LfBW4LfbQwFe1l6fbSNDfjHJvIMNJbkuyceS7G2f97Nt+Za2753t9dq2/I1t+a3AA20YhT9pNd2X5Ofbdq9O8j9bDZ+bMX7Jq9u29wDvmugvUceGqvLla828gFczuu36FOCHgIeBXwVePGObq4F3t+nrgEtnrNsNnN2mXwP82VE+6zrgs4waMmczGgr05PbZJ7dtzgb2tuk3At8Fzmrz/xz4wxk/74cZjRH9v4GptuznGQ0VAfAV4A1t+reB+1b79+2r75d/Bmqt+cfALVX1PYDW2gU4N8nVwAbghcDnZu/Yhix9LfDHozGUAPh7C3zejTUaEe+hJI8wGh70m8AfJDkP+D7w4zO2v6Oqvtmm7wV+N8kHgc9U1ReTnAucC9zWajgeONAGQdpQo4dOAPw3RsMPS0tmgKsX1wFbq+qeJO9k1Bqe7ThGg+uft4ifO3swoAJ+hdFgWK9sP/OvZ6z/7g82rPp6klcBbwGuTrKb0cBm91fVBTN/6JFR7KRxsg9ca80XgK1JXtCG6f1nbfmLGLVkTwTeMWP7p9s6avSggG8meRuMhjJN8soFPu9tSY5L8jLgR4GvMeoKOdBa5r/IqBX9PEl+BPheVf13Rl0ir2r7TyW5oG1zYpKfqNH40U+10fmYdQzSkhjgWlNq9PzEG4B7gD9lNHY8wH9gNJzq/wK+OmOX64F/l+SuFsLvAK5oXxTez8LPWn0cuKN91r+uqr8G/jOwrf2MVzCj1T3LPwTuaI/Y+gBwdY2e8Xop8MG2/92MunUAfgn4SNs+z/tp0iI5nKyOWUmuY9R3fdNC20prkS1wSeqUX2Jq3Uvy68DbZi3+46p65yqUI42NXSiS1Cm7UCSpUwa4JHXKAJekThngktSp/wf6oidBFKn33AAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# remove na's\n",
    "day_of_month_landslides = day_of_month_landslides.dropna()\n",
    "\n",
    "# plot the day of the month\n",
    "sns.distplot(day_of_month_landslides, kde=False, bins=31)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53c32600",
   "metadata": {
    "papermill": {
     "duration": 0.013285,
     "end_time": "2021-12-14T18:23:28.910234",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.896949",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "Yep, it looks like we did parse our dates correctly & this graph makes good sense to me.\n",
    "\n",
    "# Your turn\n",
    "\n",
    "Write code to [**parse the dates**](https://www.kaggle.com/kernels/fork/10824403) in a dataset of worldwide earthquakes."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39293a02",
   "metadata": {
    "papermill": {
     "duration": 0.013108,
     "end_time": "2021-12-14T18:23:28.936775",
     "exception": false,
     "start_time": "2021-12-14T18:23:28.923667",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "---\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "*Have questions or comments? Visit the [course discussion forum](https://www.kaggle.com/learn/data-cleaning/discussion) to chat with other learners.*"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.12"
  },
  "papermill": {
   "default_parameters": {},
   "duration": 11.210793,
   "end_time": "2021-12-14T18:23:29.662548",
   "environment_variables": {},
   "exception": null,
   "input_path": "__notebook__.ipynb",
   "output_path": "__notebook__.ipynb",
   "parameters": {},
   "start_time": "2021-12-14T18:23:18.451755",
   "version": "2.3.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}