The Portal Project is a long-term study of desert ecology that monitors, among other species, the banner-tailed kangaroo rat. Part of the data can be found in the folder data as a tab-separated values file. We are going to practice our data wrangling skills using these data. We will use the tidyverse packages for this.
We load the data and display the first 20 rows. Alternatively, you can open the file in Excel or a text editor.
library(tidyverse) # loads several packages from the tidyverse "ecosystem"

d <- read_tsv('data/portal_project.tsv') # read_tsv() is from the readr package

d |>
  slice_head(n = 20) |> # select first 20 rows
  knitr::kable() |> # the kable() function outputs html tables
  kableExtra::kable_classic(font_size = 12) # nicer formatting
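| record_id | month | day | year | plot_id | species_id | sex | hindfoot_length | weight | genus | species | taxa |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 16 | 1977 | 2 | NL | M | 32 | NA | Neotoma | albigula | Rodent |
| 2 | 7 | 16 | 1977 | 3 | NL | M | 33 | NA | Neotoma | albigula | Rodent |
| 3 | 7 | 16 | 1977 | 2 | DM | F | 37 | NA | Dipodomys | merriami | Rodent |
| 4 | 7 | 16 | 1977 | 7 | DM | M | 36 | NA | Dipodomys | merriami | Rodent |
| 5 | 7 | 16 | 1977 | 3 | DM | M | 35 | NA | Dipodomys | merriami | Rodent |
| 6 | 7 | 16 | 1977 | 1 | PF | M | 14 | NA | Perognathus | flavus | Rodent |
| 7 | 7 | 16 | 1977 | 2 | PE | F | NA | NA | Peromyscus | eremicus | Rodent |
| 8 | 7 | 16 | 1977 | 1 | DM | M | 37 | NA | Dipodomys | merriami | Rodent |
| 9 | 7 | 16 | 1977 | 1 | DM | F | 34 | NA | Dipodomys | merriami | Rodent |
| 10 | 7 | 16 | 1977 | 6 | PF | F | 20 | NA | Perognathus | flavus | Rodent |
| 11 | 7 | 16 | 1977 | 5 | DS | F | 53 | NA | Dipodomys | spectabilis | Rodent |
| 12 | 7 | 16 | 1977 | 7 | DM | M | 38 | NA | Dipodomys | merriami | Rodent |
| 13 | 7 | 16 | 1977 | 3 | DM | M | 35 | NA | Dipodomys | merriami | Rodent |
| 14 | 7 | 16 | 1977 | 8 | DM | NA | NA | NA | Dipodomys | merriami | Rodent |
| 15 | 7 | 16 | 1977 | 6 | DM | F | 36 | NA | Dipodomys | merriami | Rodent |
| 16 | 7 | 16 | 1977 | 4 | DM | F | 36 | NA | Dipodomys | merriami | Rodent |
| 17 | 7 | 16 | 1977 | 3 | DS | F | 48 | NA | Dipodomys | spectabilis | Rodent |
| 18 | 7 | 16 | 1977 | 2 | PP | M | 22 | NA | Chaetodipus | penicillatus | Rodent |
| 19 | 7 | 16 | 1977 | 4 | PF | NA | NA | NA | Perognathus | flavus | Rodent |
| 20 | 7 | 17 | 1977 | 11 | DS | F | 48 | NA | Dipodomys | spectabilis | Rodent |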
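To see what read_tsv() does with such a file, it can help to parse a couple of records written out as literal text. The sketch below assumes readr 2.0 or later (for I() and show_col_types). The column subset and the names tsv_text and demo are chosen for illustration only; the values are copied from records 1 and 7 in the table above.

```r
library(readr)

# Two records written as literal tab-separated text (illustrative column subset)
tsv_text <- paste(
  "record_id\tspecies_id\tsex\thindfoot_length\tweight",
  "1\tNL\tM\t32\tNA",
  "7\tPE\tF\tNA\tNA",
  sep = "\n"
)

# I() tells readr to treat the string as data rather than as a file path
demo <- read_tsv(I(tsv_text), show_col_types = FALSE)

demo                        # a tibble with 2 rows and 5 columns
is.na(demo$hindfoot_length) # the literal text NA is read as a missing value
```

The text NA is converted to a missing value on import. A column that contains only NA in the rows used for type guessing, such as weight here, is read as logical, so it is worth checking column types after loading.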
Split the table
There is a lot of redundancy in the data. In particular, the genus, species and taxa values are repeated for every record with the same species_id. We'll split the original table into two, species and observations, which are linked via the species identifier species_id.
species <- select(d, species_id, species, genus, taxa) |>
  distinct() |>
  mutate(species_name = paste(genus, species)) # add a proper name for the species

observations <- select(d, -c(species, genus, taxa))

observations |>
  slice_head(n = 10) |>
  knitr::kable() |>
  kableExtra::kable_classic(font_size = 12, full_width = FALSE)
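| record_id | month | day | year | plot_id | species_id | sex | hindfoot_length | weight |
|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 16 | 1977 | 2 | NL | M | 32 | NA |
| 2 | 7 | 16 | 1977 | 3 | NL | M | 33 | NA |
| 3 | 7 | 16 | 1977 | 2 | DM | F | 37 | NA |
| 4 | 7 | 16 | 1977 | 7 | DM | M | 36 | NA |
| 5 | 7 | 16 | 1977 | 3 | DM | M | 35 | NA |
| 6 | 7 | 16 | 1977 | 1 | PF | M | 14 | NA |
| 7 | 7 | 16 | 1977 | 2 | PE | F | NA | NA |
| 8 | 7 | 16 | 1977 | 1 | DM | M | 37 | NA |
| 9 | 7 | 16 | 1977 | 1 | DM | F | 34 | NA |
| 10 | 7 | 16 | 1977 | 6 | PF | F | 20 | NA |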
species |> knitr::kable() |> kableExtra::kable_classic(font_size=12, full_width=FALSE)
| species_id | species | genus | taxa | species_name |
|---|---|---|---|---|
| NL | albigula | Neotoma | Rodent | Neotoma albigula |
| DM | merriami | Dipodomys | Rodent | Dipodomys merriami |
| PF | flavus | Perognathus | Rodent | Perognathus flavus |
| PE | eremicus | Peromyscus | Rodent | Peromyscus eremicus |
| DS | spectabilis | Dipodomys | Rodent | Dipodomys spectabilis |
| PP | penicillatus | Chaetodipus | Rodent | Chaetodipus penicillatus |
| SH | hispidus | Sigmodon | Rodent | Sigmodon hispidus |
| OT | torridus | Onychomys | Rodent | Onychomys torridus |
| DO | ordii | Dipodomys | Rodent | Dipodomys ordii |
| OX | sp. | Onychomys | Rodent | Onychomys sp. |
| SS | spilosoma | Spermophilus | Rodent | Spermophilus spilosoma |
| OL | leucogaster | Onychomys | Rodent | Onychomys leucogaster |
| RM | megalotis | Reithrodontomys | Rodent | Reithrodontomys megalotis |
| SA | audubonii | Sylvilagus | Rabbit | Sylvilagus audubonii |
| PM | maniculatus | Peromyscus | Rodent | Peromyscus maniculatus |
| AH | harrisi | Ammospermophilus | Rodent | Ammospermophilus harrisi |
| DX | sp. | Dipodomys | Rodent | Dipodomys sp. |
| AB | bilineata | Amphispiza | Bird | Amphispiza bilineata |
| CB | brunneicapillus | Campylorhynchus | Bird | Campylorhynchus brunneicapillus |
| CM | melanocorys | Calamospiza | Bird | Calamospiza melanocorys |
| CQ | squamata | Callipepla | Bird | Callipepla squamata |
| RF | fulvescens | Reithrodontomys | Rodent | Reithrodontomys fulvescens |
| PC | chlorurus | Pipilo | Bird | Pipilo chlorurus |
| PG | gramineus | Pooecetes | Bird | Pooecetes gramineus |
| PH | hispidus | Perognathus | Rodent | Perognathus hispidus |
| PU | fuscus | Pipilo | Bird | Pipilo fuscus |
| CV | viridis | Crotalus | Reptile | Crotalus viridis |
| UR | sp. | Rodent | Rodent | Rodent sp. |
| UP | sp. | Pipilo | Bird | Pipilo sp. |
| ZL | leucophrys | Zonotrichia | Bird | Zonotrichia leucophrys |
| UL | sp. | Lizard | Reptile | Lizard sp. |
| CS | scutalatus | Crotalus | Reptile | Crotalus scutalatus |
| SC | clarki | Sceloporus | Reptile | Sceloporus clarki |
| BA | taylori | Baiomys | Rodent | Baiomys taylori |
| SF | fulviventer | Sigmodon | Rodent | Sigmodon fulviventer |
| RO | montanus | Reithrodontomys | Rodent | Reithrodontomys montanus |
| AS | savannarum | Ammodramus | Bird | Ammodramus savannarum |
| SO | ochrognathus | Sigmodon | Rodent | Sigmodon ochrognathus |
| PI | intermedius | Chaetodipus | Rodent | Chaetodipus intermedius |
| ST | tereticaudus | Spermophilus | Rodent | Spermophilus tereticaudus |
| CU | uniparens | Cnemidophorus | Reptile | Cnemidophorus uniparens |
| SU | undulatus | Sceloporus | Reptile | Sceloporus undulatus |
| RX | sp. | Reithrodontomys | Rodent | Reithrodontomys sp. |
| PB | baileyi | Chaetodipus | Rodent | Chaetodipus baileyi |
| PL | leucopus | Peromyscus | Rodent | Peromyscus leucopus |
| PX | sp. | Chaetodipus | Rodent | Chaetodipus sp. |
| CT | tigris | Cnemidophorus | Reptile | Cnemidophorus tigris |
| US | sp. | Sparrow | Bird | Sparrow sp. |
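Because the two tables share species_id, nothing is lost by the split: joining them again restores the dropped columns. The following is a minimal sketch on a toy stand-in for d. The names d_demo, species_demo, observations_demo and rejoined are hypothetical, and only four records and a subset of columns (copied from the first table) are used.

```r
library(dplyr)

# Toy stand-in for `d`: four records, subset of columns
d_demo <- tibble::tibble(
  record_id       = 1:4,
  species_id      = c("NL", "NL", "DM", "DM"),
  hindfoot_length = c(32, 33, 37, 36),
  genus           = c("Neotoma", "Neotoma", "Dipodomys", "Dipodomys"),
  species         = c("albigula", "albigula", "merriami", "merriami"),
  taxa            = "Rodent"
)

# Same split as in the text: one row per species, and the remaining columns
species_demo <- select(d_demo, species_id, species, genus, taxa) |>
  distinct()
observations_demo <- select(d_demo, -c(species, genus, taxa))

# Joining on the shared key brings the species columns back
rejoined <- left_join(observations_demo, species_demo, by = "species_id")

# Column order differs after the join, so align it before comparing
same <- all.equal(
  as.data.frame(rejoined[names(d_demo)]),
  as.data.frame(d_demo)
)
same
```

The join only reproduces the original cleanly if species_id identifies each row of the species table uniquely; a duplicated identifier would duplicate observations in the result.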
What do we want to know about these data?
Suggestions by audience?
My suggestions + code:
Mean length of hindfoot per species
# overall mean length
mean_length <- mean(observations$hindfoot_length, na.rm = TRUE) # have to remove NA's
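For the mean per species, the usual dplyr pattern is to group by species_id, summarize, and then join the species names back on. The sketch below shows one way a table shaped like the mean_lenght_ps used in the formatting step could be built. It runs on toy data copied from the first observations, and the names obs_demo, species_demo and mean_hf_demo are hypothetical, not the original definitions.

```r
library(dplyr)

# Toy observations (hindfoot lengths copied from the first records)
obs_demo <- tibble::tibble(
  species_id      = c("NL", "NL", "DM", "DM", "PF", "PE"),
  hindfoot_length = c(32, 33, 37, 36, 14, NA)
)

# Matching rows of the species table
species_demo <- tibble::tibble(
  species_id   = c("NL", "DM", "PF", "PE"),
  species_name = c("Neotoma albigula", "Dipodomys merriami",
                   "Perognathus flavus", "Peromyscus eremicus")
)

# One mean per species, then attach the readable species name
mean_hf_demo <- obs_demo |>
  group_by(species_id) |>
  summarize(mean_hf_length = mean(hindfoot_length, na.rm = TRUE)) |>
  left_join(species_demo, by = "species_id")

mean_hf_demo
```

A species whose measurements are all NA ends up with NaN, because mean() with na.rm = TRUE has nothing left to average. This is why filtering on is.finite() before displaying the table is useful.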
# Nicer formatting
mean_lenght_ps |>
  select(species_name, mean_hf_length) |> # only display name and hindfoot length
  filter(is.finite(mean_hf_length)) |> # only display numbers for observed species
  knitr::kable() |>
  kableExtra::kable_classic(font_size = 12, full_width = FALSE)