---
title: "SMUFASA Workflow Example"
author: "Jennifer McNichol"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{SMUFASA Workflow Example}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
# Load Packages
```{r message=FALSE, warning=FALSE}
library(QFASA)
library(dplyr)
library(compositions)
```
# Modeling Inputs
Prior to starting make sure that:
- Fatty acid names in all files are the same (contain the exact same numbers/characters and punctuation)
- There are no fatty acids in the prey file that do not appear in the predator file and visa versa
## Fatty Acid Set
- This is the list of FAs to be used in the modelling.
- The simplest alternative is to load a `.csv` file that contains a single column with a header row with the names of thee fatty acids listed below (see example file **"FAset.csv"**).
- A more complicated alternative is to load a `.csv` file with the full set of FAs and then add code to subset the FAs you wish to use from that set -> *this alternative is useful if you are planning to test multiple sets*.
```{r}
data(FAset)
fa.set = as.vector(unlist(FAset))
```
## Matrix of Predator FA Signatures
- The FA signatures in the originating `.csv` file should be proportions summing to 1.
- Each predator signature is a row with the FAs in columns (see example file **"predatorFAs.csv"**).
The FA signatures are subsetted for the chosen FA set (created above) and renormalized during the modelling so there is no need to subset and/or renormalize prior to loading the .csv file or running `p.SMUFASA` BUT make sure that the same FAs appear in the predator and prey files.
- Your predator FA `.csv` file can contain as much tombstone data columns as you like, you must extract the predator FA signatures as separate input in order to run in `p.SMUFASA`. For example: in the code below, the predator `.csv` file ("`predatorFAs.csv`") has 4 tombstone columns (SampleCode, AnimalCode, SampleGroup, Biopsy). Prior to running `p.SMUFASA`, the tombstone (columns 1-4) and FA data (columns 5 onward) are each extracted from the original data frame. The FA data becomes the `predator.matrix` (which is passed to `p.SMUFASA`) and the tombstone data is retained so that it can be recombined with the model output later on.
```{r}
data(predatorFAs)
tombstone.info = predatorFAs[,1:4]
predator.matrix = predatorFAs[,5:(ncol(predatorFAs))]
npredators = nrow(predator.matrix)
```
## Matrix of Prey FA Signatures
- The FA signatures in the `.csv` file should be proportions summing to 1.
- The prey file should contain all of the individual FA signatures of the prey and their lipid contents (where appropriate).
- If you want to only include a subset of prey species, you must extract it prior to input (see code below).
```{r}
data(preyFAs)
prey.matrix = preyFAs[,-c(1,3)]
# Selecting 5 prey species to include
spec.red <-c("capelin", "herring", "mackerel", "pilchard", "sandlance")
spec.red <- sort(spec.red)
prey.red <- prey.matrix %>%
filter(Species %in% spec.red)
```
## Prey Lipid Content
- Mean lipid content by species group is calculated from the full prey file using the species group as a summary variable (see code below).
- **Note**: If no lipid content correction is going to be applied, then a vector of 1s of length equal to the number of species groups is used as the vector instead. I.e. `FC - rep(1,nrow(prey.matrix))`.
- If you've decided on a subset of species, you must extract them from the mean lipid content vector as well.
```{r}
FC = preyFAs[,c(2,3)]
FC = FC %>%
arrange(Species)
FC.vec = tapply(FC$lipid,FC$Species,mean,na.rm=TRUE)
FC.red <- FC.vec[spec.red]
```
# Running SMUFASA
```{r eval=FALSE}
smufasa.est <- p.SMUFASA(predator.matrix, prey.red, FC.red, fa.set)
```
### p.SMUFASA Output
The MUFASA output is a list with 4 components:
- Cal_Estimates
- Diet_Estimates
- nll
- Var_Epsilon
## Calibration Coefficient Estimates
A vector of length equal to the number of FAs used and whose sum is the total number of FAs used. Thos is a set of calibration coefficients common to all predators used.
````{r eval=FALSE}
CalEst <- smufasa.est$Cal_Estimates
````
## Diet Estimates
The diet estimate vector returned by `p.SMUFASA` represents an overall common diet for all predators in `predator.matrix` . **Note**: If you wish to obtain diet estimates for each individual predator see the steps below.
````{r eval=FALSE}
DietEst <- smufasa.est$Diet_Estimates
````
## nll
This is a vector of the negative log likelihood values at each iteration of the optimizer.
````{r eval=FALSE}
nll <- smufasa.est$nll
````
## Var_Epsilon
This is the optimized diagonal values of the variance-covariance matrix of the errors. See reference in help file for details.
```{r eval=FALSE}
VarEps <- smufasa.est$Var_Epsilon
```
# Obtaining Diet Estimates
Once a vector of calibration coefficients is obtained via `p.SMUFASA` you can pass this vector to `p.QFASA` or `p.MUFASA` to obtain individual diet estimates.
## QFASA
```{r eval=FALSE}
Q = p.QFASA(predator.matrix=predator.matrix, prey.matrix=prey.matrix,
cal.mat=CalEst, dist.meas=2, gamma=1, FC=FC.red,
start.val=rep(1,nrow(prey.red)), ext.fa=fa.set)
```
QFASA Diet Estimates:
```{r, eval=FALSE}
DietEst.Q <- Q$'Diet Estimates'
```
*See* `p.QFASA` *documnetation for further details.*
## MUFASA
```{r,eval=FALSE}
M <- p.SMUFASA(predator.matrix, prey.red, cal.est, FC.red, fa.set)
DietEst.M <- M$Diet_Estimates
```
*See* `p.MUFASA` *documnetation for further details.*