README.Rmd

---
output:
  github_document
---
# biotic
[![Build Status](https://travis-ci.org/robbriers/biotic.svg?branch=master)](https://travis-ci.org/robbriers/biotic) [![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/robbriers/biotic?branch=master&svg=true)](https://ci.appveyor.com/project/robbriers/biotic/branch/master) [![DOI](https://zenodo.org/badge/21485/robbriers/biotic.svg)](https://zenodo.org/badge/latestdoi/21485/robbriers/biotic) [![CRAN Version](http://www.r-pkg.org/badges/version/biotic)](http://www.r-pkg.org/pkg/biotic) [![Coverage Status](https://coveralls.io/repos/github/robbriers/biotic/badge.svg?branch=master)](https://coveralls.io/github/robbriers/biotic?branch=master) [![](http://cranlogs.r-pkg.org/badges/biotic)](http://cran.rstudio.com/web/packages/biotic/index.html) [![](https://cranlogs.r-pkg.org/badges/grand-total/biotic)](http://cran.rstudio.com/web/packages/biotic/index.html)

### Introduction
The biotic package calculates a range of commonly used UK freshwater invertebrate biotic indices, based on family level identification.  Indices that can be calculated are BMWP (and associated N-Taxa and ASPT), Whalley revised BWMP, habitat-specific BMWP, WHPT (presence-only and abundance-weighted), LIFE, PSI and AWIC. Links to the definitions of these indices are given at the end of the text.

### Installation
The current release (0.1.2) of the package can be installed from CRAN via the normal means.
```{r, eval=FALSE}
install.packages("biotic")
```
The current development version (0.1.3) can be obtained from GitHub. The homepage of the package is [/~https://github.com/robbriers/biotic](/~https://github.com/robbriers/biotic). In order to install the development version, you need to first install the 'devtools' package.
```{r, eval=FALSE}
install.packages("devtools")
```
Then use the 'install_github' function within this package to load the development version from GitHub.
```{r, eval=FALSE}
devtools::install_github('robbriers/biotic')
```

### Data required
The package assumes that the data are derived from standard three-minute kick sampling (with additional one minute hand search) as per [standard RIVPACS/RICT methods](http://www.eu-star.at/pdf/RivpacsMacroinvertebrateSamplingProtocol.pdf). Identification should be taken to family level. BMWP composite families are combined automatically, as well as oligochaete families combined to Class level where required.

### Input format
The data are assumed to be stored in a dataframe with the first column containing a list of taxon names and subsequent columns containing sample headers and individual taxon abundances. Absences should be recorded as 'NA' rather than zero (so NA or blank cells in a csv file imported via read.csv), although the calculations will work with zero abundances as well, but this is not recommended. In addition to actual abundances, log categories (1-5) or alphabetic log categories (A-E) can also be processed, through specification of the data type (see below). 

If you have data that are stored in a 'tidy' format [sensu Wickham](https://www.jstatsoft.org/article/view/v059i10), then these can be converted to the correct format for this package using the [reshape2](https://cran.r-project.org/package=reshape2) package, or the [tidyr](https://cran.r-project.org/package=tidyr) package prior to calculation.

An example of the default layout of data can be seen by accessing the built-in 'almond' dataset.

```{r}
# load library
library(biotic)

# show the format of the built-in almond dataset
head(almond)
```

### Functions
The workhorse function of the package is 'calcindex'. The allows the user to specify which index is to be calculated as well as the type of data being processed through the 'index' and 'type' arguments. The first relates to the choice of index.  Possible values for index are: "BMWP", "Whalley", "Riffle", "Pool", "RiffPool", "WHPT", "WHPT_AB", "LIFE", and "PSI".  The second argument relates to the format of the dataset.The options are "num" for numeric abundances, "log" for integer log categories (1-5) or "alpha" for alphabetic log abundance categories (A-E).  If the data are in the default format (actual integer abundances) then it can be omitted. The use of these is best illustrated through examples using the built-in datasets ('almond', 'braidburn' and 'greenburn').

```{r}
# calculate the BMWP index for the River Almond dataset
# 'index' and 'type' do not have to specified as defaults are used
# ("BMWP" and "num")

calcindex(almond)

# calculate the PSI index for the almond samples
# 'type' argument again not needed as the data are numeric abundances
calcindex(almond, "PSI")
```

To process data in either integer or alphabetic log categories, the 'type' argument should specify either "log" or "alpha".  Examples are shown below.

```{r}
# example of processing data in alphabetic log abundance categories
# using the 'type' argument

# 'braidburn' dataset contains alphabetic log category data
# see ?braidburn for details

# calculate the Whalley revised BMWP index (including N-taxa and ASPT)

calcindex(braidburn, "Whalley", "alpha")

# example of processing data in numeric log abundance categories
# using the 'type' argument

# 'greenburn' dataset contains numeric log category data
# see ?greenburn for details

# calculate the LIFE index for this dataset

calcindex(greenburn, "LIFE", "log")
```
### Individual index functions
In order to make calculating individual indices more straightforward, wrapper functions are also provided for each index. The specification of each index follows the string provided in the calcindex function (see above).  Again these can process data in either integer abundance, integer log category or alphabetic log category format, through the type argument, which defaults to "num" if not specified as per the main function. Full details of these are provided in the help file (e.g ?calcBMWP), but examples are given below.
```{r}
# calculate the BMWP index for almond samples

calcBMWP(almond)

# calculate the AWIC index for almond samples

calcAWIC(almond)

# calculate the WHPT abundance-weighted index for Green Burn samples
# (numeric log abundance categories)

calcWHPT_AB(greenburn, "log")

# calculate LIFE index for Braid Burn samples (alphabetic log categories)

calcLIFE(braidburn, type="alpha")
```
### Reporting problems
The package has been tested fairly extensively using different test datasets, but if you come across an error or bug, then please submit an [issue](/~https://github.com/robbriers/biotic/issues) or [email me](mailto:r.briers@napier.ac.uk), giving as much information as possible, and ideally the dataset that caused the problem.

### Sources of information about the indices calculated

BMWP, Whalley Revised BMWP and Habitat-specific scores were taken from the table given [here](http://www.cies.staffs.ac.uk/bmwptabl.htm).

The derivation and history of the BMWP score is described in [Hawkes, H.A. (1998)](http://dx.doi.org/10.1016/S0043-1354(97)00275-3) Origin and development of the Biological Monitoring Working Party (BMWP) score system. *Water Research* **32**, 964-968. Details of the revised scores are given in [Walley W.J. &  Hawkes H.A. (1997)](http://dx.doi.org/10.1016/S0043-1354(96)00249-7) A computer-based development of the Biological Monitoring Working Party score system incorporating abundance rating, biotope type and indicator value. *Water Research*, **31**, 201-210.

Details of the family level scores for LIFE, WHPT (both versions) and PSI were taken from the [FBA RICT Development report for SEPA](http://www.fba.org.uk/sites/default/files/SEPARICTWorkstream1(WHPTandotherAbundance-WeightedIndices)Final%20Report.pdf).

The original derivation of the LIFE family level score is described in [Extence et al. (1999)](http://dx.doi.org/10.1002/(SICI)1099-1646(199911/12)15:6<545::AID-RRR561>3.0.CO;2-W) River flow indexing using British benthic macroinvertebrates: a framework for setting hydroecological objectives. *Regulated Rivers: Research and Management* **15**, 543-574 and the PSI index in [Extence et al. (2013)](http://dx.doi.org/10.1002/rra.1569) The assessment of fine sediment accumulation in rivers using macro-invertebrate community response. *Regulated Rivers: Research and Management* **29**, 17-55.

There is no published reference for the WHPT index beyond the report given above.

AWIC Family level scores were taken from the [DEFRA commissioned report](https://www.gov.uk/government/publications/development-of-the-acid-water-indicator-community-awic-macroinvertebrate-family-and-species-level).

The original derivation of the AWIC family level score is described in [Davey-Bowker et al. (2005)](http://dx.doi.org/10.1127/0003-9136/2005/0163-0383) The development and testing of a macroinvertebrate biotic index for detecting the impact of acidity on streams. *Archiv fur Hydrobiologie* **163**, 383-403.