# Data {#sec-data}
```{r}
#| label: setup
#| message: false
#| warning: false
#| include: false
library(tidyverse)
library(lubridate)
library(scales)
library(knitr)
library(kableExtra)
library(colorblindr)
library(downlit)
# _common.R ----
source("_common.R")
# use font
showtext::showtext_auto()
# set theme
ggplot2::theme_set(theme_ggp2g(
base_size = 15))
```
```{r}
#| label: co_box_dev
#| echo: false
#| results: asis
#| eval: true
co_box(
color = "o", look = "minimal",
header = "Caution",
contents = "This section is still being developed. The contents are subject to change.",
fold = FALSE
)
library(dplyr)
```
The graphs in this manual use data from the packages below. Install them with the following code, then expand each section to preview the data.
```{r}
#| eval: false
#| echo: true
#| code-fold: show
data_pkgs <- c("palmerpenguins",
"fivethirtyeight",
"ggplot2movies",
"babynames")
install.packages(data_pkgs)
```
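If some of these packages are already installed, reinstalling all of them is unnecessary. The sketch below is one possible way to install only what is missing. The helper `missing_pkgs()` is a hypothetical name introduced here for illustration and is not part of `_common.R` or any package. The `install.packages()` call is left commented out so the block has no side effects.

```r
# Packages used in this manual (same vector as above)
data_pkgs <- c("palmerpenguins",
               "fivethirtyeight",
               "ggplot2movies",
               "babynames")

# Hypothetical helper: return the packages in `pkgs` that are not installed
missing_pkgs <- function(pkgs) {
  # requireNamespace() returns TRUE when a package can be loaded
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(data_pkgs)
to_install

# Uncomment to install only the missing packages
# if (length(to_install) > 0) install.packages(to_install)
```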
### `palmerpenguins::penguins`
The majority of the graphs in the manual are built using the `palmerpenguins::penguins` data.
::: column-margin
::: {style="font-size: 0.95em; color: #282b2d;"}
***...so...many...PENGUINS!***
:::
![Artwork by allison_horst](www/lter_penguins.png){fig-align="center" width="50%" height="50%"}
:::
::: {.callout-note collapse="true" icon="false"}
## Expand to view the data in `palmerpenguins::penguins`
::: {style="font-size: 0.85em;"}
```{r}
#| label: paged_penguins
#| eval: true
#| echo: false
rmarkdown::paged_table(palmerpenguins::penguins)
```
:::
:::
Source: <https://github.com/allisonhorst/palmerpenguins/>
### `fivethirtyeight`
Use the table below to view the datasets in this package.
::: column-margin
![](www/538.png){fig-align="center" width="30%" height="30%"}
:::
::: {.callout-note collapse="true" icon="false"}
## Expand to view the data in the `fivethirtyeight` package
::: {style="font-size: 0.85em;"}
*To see the datasets available in the `fivethirtyeight` package, browse the `Data Frame Name` and `Article Title` columns of the `datasets_master` table.*
:::
::: {style="font-size: 0.80em;"}
```{r}
#| label: fivethirtyeight_pkg
#| eval: true
#| echo: false
#| message: false
#| warning: false
library(fivethirtyeight)
rmarkdown::paged_table(fivethirtyeight::datasets_master |>
select(`Data Frame Name`, `Article Title`))
```
:::
:::
Source: <https://github.com/fivethirtyeight/data>
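The datasets in any installed package can also be listed from the console with base R's `utils::data()`. The sketch below wraps that call in a small helper. The name `list_datasets()` is hypothetical and introduced only for this example, and the `fivethirtyeight` call is guarded in case the package is not installed.

```r
# Hypothetical helper: list the datasets shipped with an installed package
list_datasets <- function(pkg) {
  stopifnot(requireNamespace(pkg, quietly = TRUE))
  # data(package = ) returns a list; `results` is a character matrix
  # with the columns "Package", "LibPath", "Item", and "Title"
  res <- utils::data(package = pkg)$results
  data.frame(name = res[, "Item"], title = res[, "Title"])
}

# Preview the first few datasets in fivethirtyeight (if installed)
if (requireNamespace("fivethirtyeight", quietly = TRUE)) {
  print(head(list_datasets("fivethirtyeight")))
}
```

The same helper should work for the other data packages listed above, for example `list_datasets("babynames")`.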
### `ggplot2movies::movies`
::: column-margin
![](www/imdb.png){fig-align="center" width="30%" height="30%"}
:::
::: {.callout-note collapse="true" icon="false"}
## Expand to view *a sample* of the data in `ggplot2movies::movies`
::: {style="font-size: 0.85em;"}
```{r}
#| label: paged_ggplot2movies
#| eval: true
#| echo: false
rmarkdown::paged_table(x =
dplyr::slice_sample(ggplot2movies::movies,
n = 1000, replace = FALSE))
```
:::
:::
Source: <https://www.imdb.com/>
### `babynames::babynames`
::: column-margin
![](www/ssa.png){fig-align="center" width="40%" height="40%"}
:::
::: {.callout-note collapse="true" icon="false"}
## Expand to view *a sample* of the data in `babynames::babynames`
::: {style="font-size: 0.85em;"}
```{r}
#| label: paged_babynames
#| eval: true
#| echo: false
rmarkdown::paged_table(x =
dplyr::slice_sample(babynames::babynames,
n = 1000, replace = FALSE))
```
:::
:::
Source: <http://www.ssa.gov/oact/babynames/limits.html>
***Why not manually create the graph datasets with `data.frame()` or `tibble()`/`tribble()`?***
In my opinion, manually generated data are great for reproducible examples, but they rarely look like data 'caught in the wild.' The data packages above are also well maintained and offer enough variety to support many kinds of examples.
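As a rough illustration of the difference, the sketch below builds a small dataset by hand with `data.frame()` and compares it with `palmerpenguins::penguins`. The object name `toy_penguins` and its values are made up for this example. The comparison is guarded in case `palmerpenguins` is not installed.

```r
# A hand-built dataset (values are invented for illustration)
toy_penguins <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  bill_length_mm = c(38.8, 48.8, 47.5),
  body_mass_g = c(3700, 3733, 5076)
)

# Manually created data are usually complete and tidy
anyNA(toy_penguins)

# Real data tend to be messier: count missing values per column
if (requireNamespace("palmerpenguins", quietly = TRUE)) {
  print(colSums(is.na(palmerpenguins::penguins)))
}
```

The hand-built version is convenient for a quick reproducible example, while the packaged data contain missing values and the kind of variation a graph has to handle in practice.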