| Version: | 3.0.0 | 
| Date: | 2022-08-12 | 
| Title: | R Code for Mark Analysis | 
| Author: | Jeff Laake <jefflaake@gmail.com> with code contributions from Eldar Rakhimberdiev, Ben Augustine, Daniel Turek, Brett McClintock, and Jim Hines and example data and analysis from Bret Collier, Jay Rotella, David Pavlacky, Andrew Paul, Luke Eberhart- Phillips, Jake Ivan, and Connor Wood. | 
| Description: | An interface to the software package MARK that constructs input files for MARK and extracts the output. MARK was developed by Gary White and is freely available at http://www.phidot.org/software/mark/downloads/ but is not open source. | 
| SystemRequirements: | notepad.exe, mark.exe (>= 8.0) (or mark32.exe and mark64.exe) and rel_32.exe (see README.txt) | 
| Depends: | R (≥ 2.13.0), | 
| Imports: | parallel, matrixcalc, msm, coda | 
| Suggests: | lattice, splines, nlme, plotrix | 
| LazyLoad: | yes | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| Maintainer: | Jeff Laake <jefflaake@gmail.com> | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.2.1 | 
| NeedsCompilation: | no | 
| Packaged: | 2022-08-13 23:09:13 UTC; jefflaake | 
| Repository: | CRAN | 
| Date/Publication: | 2022-08-13 23:40:02 UTC | 
A beginners introduction and guide to RMark
Description
The RMark package is a collection of R functions that can be used as an interface to MARK for analysis of capture-recapture data. Some initial ideas are described here but you are strongly encouraged to read use the help files with this package and to read the documentation at http://www.phidot.org/software/mark/rmark/RMarkDocumentation.zip The above link appears the first time you enter library(RMark) in an R session.
Details
The RMark package contains various functions that import/export capture data, build capture-recapture models, run the FORTRAN program MARK.EXE, and extract and display output. Program MARK has its own user interface; however, model development can be rather tedious and error-prone because the parameter structure and design matrix are created by hand. This interface in R was created to use the formula and design matrix functions in R to ease model development and reduce errors. This R interface has the following advantages: 1) Uses model notation to create design matrices rather than designing them by hand in MARK or in EXCEL, which makes model development faster and more reliable. All-different PIMS are automatically created for each group (if any). 2) Allows models based on group (factor variables) and individual covariates with groups created on the fly. Age, cohort, group and time variables are pre-defined for use in formulas. 3) Both real and beta labels are automatically added for easy output interpretation. 4) Input, output and specific results (eg parameter estimates, AICc etc) are stored in an R object where they can be manipulated as deemed useful (eg plotting, further calculations, simulation etc). 5) Parameter estimates can be displayed in triangular PIM format (if appropriate) for ease of interpretation. 6) Easy setup of batch jobs and the calls to the R functions document the model specifications and allow models to be easily reproduced or re-run if data are changed. 7) Covariate-specific estimates of real parameters can be computed within R without re-running the analysis.
The MARK capture-recapture models that are currently
supported are provided in MarkModels.pdf which is installed in the RMark directory of your R library.
You can also find a list in MARK under Help/Data Types.
' 
There is one limitation of this interface. All models
in this interface are developed via a design matrix approach rather than
coding the model structure via parameter index matrices (PIMS). In most
cases, a logit or other link is used by default which has implications for
ability of MARK to count the number of identifiable parameters (see
dipper for an example).  However, beginning with v1.7.6 the
sin link is now supported if the formula specifies an identity design matrix
for the parameter.
Before you begin, you must have installed MARK
(http://www.phidot.org/software/mark/) on your computer
or at least have a current copy of MARK.EXE. As long as you selected the
default location for the MARK install (c:/Program Files/Mark), the
RMark library will be able to find it.  If for some reason, you choose
to install it in a different location, see the note section in
mark for instructions on setting the variable MarkPath to
specify the path.  In addition to installing MARK, you must have installed
the RMark library into the R library directory.  Once done with those
tasks, run R and enter library(RMark) (or put it in your .First function) to
attach the library of functions.
The following is a categorical listing of the functions in the package with
a link to the help for each function. To start, read the help for functions
import.chdata and mark to learn how to import
your data and fit a simple model.  The text files for the examples shown in
import.chdata are in the subdirectory data within the R Library
directory in RMark. Next look at the example data sets and analyses
dipper, edwards.eberhardt, and
example.data. After you see the structure of the examples and
the use of functions to fit a series of analyses, explore the remaining
functions under Model Fitting, Batch Analyses, Model Selection and Summary
and Display.  If your data and models contain individual covariates, read
the section on Real Parameter Computation to learn how to compute estimates
of real parameters at various covariate values.
Input/Output data & results
import.chdata,read.mark.binary,
extract.mark.output
Exporting Models to MARK interface
Model Fitting
mark, process.data,
make.design.data, add.design.data,
make.mark.model, run.mark.model
merge_design.covariates
Batch analyses with functions
run.models, collect.models,
create.model.list, mark.wrapper
Summary and display
summary.mark, print.mark,
print.marklist, get.real,
compute.real, print.summary.mark
Model Selection/Goodness of fit
adjust.chat, adjust.parameter.count,
model.table , release.gof,
model.average
Real Parameter computation
find.covariates, fill.covariates,
compute.real , covariate.predictions
Utility and internal functions
mark.wrapper,collect.model.names, compute.design.data,
extract.mark.output, inverse.link,
deriv.inverse.link, setup.model,
setup.parameters, valid.parameters,
cleanup,TransitionMatrix,
For examples, see dipper for CJS and POPAN, see
example.data for CJS with multiple grouping variables, see
edwards.eberhardt for various closed-capture models, see
mstrata for Multistrata, and see Blackduck for
known fate. The latter two are examples of the use of
mark.wrapper for a shortcut approach to creating a series of
models. Other examples have been added for the various other models. In MarkModels.pdf it also
lists the name of examples that are provided for each model.
Author(s)
Jeff Laake
References
MARK: Dr. Gary White, Department of Fishery and Wildlife Biology, Colorado State University, Fort Collins, Colorado, USA http://www.phidot.org/software/mark/
Black duck known fate data
Description
A known fate data set on Black ducks that accompanies MARK as an example analysis using the Known model.
Format
A data frame with 48 observations on the following 5 variables.
- ch
- a character vector containing the encounter history of each bird 
- BirdAge
- the age of the bird: a factor with levels - 0- 1for young and adult
- Weight
- the weight of the bird at time of marking 
- Wing_Len
- the wing-length of the bird at time of marking 
- condix
- the condition index of the bird at time of marking 
Details
This is a data set that accompanies program MARK as an example for Known
fate. The data can be stratified using BirdAge as a grouping variable.  The
function run.Blackduck defined below in the examples creates some of
the models used in the dbf file that accompanies MARK.
Note that in the MARK example the variable is named Age. In the R code, the fields "age" and "Age" have specific meanings in the design data related to time since release. These will override the use of a field with the same name in the individual covariate data, so the names "time", "Time", "cohort", "Cohort", "age", and "Age" should not be used in the individual covariate data with possibly the exception of "cohort" which is not defined for models with "Square" PIMS such as POPAN and other Jolly-Seber type models.
Examples
data(Blackduck)
# Change BirdAge to numeric; starting with version 1.6.3 factor variables are
# no longer allowed.  They can work as in this example but they can be misleading
# and fail if the levels are non-numeric.  The real parameters will remain 
# unchanged but the betas will be different.
Blackduck$BirdAge=as.numeric(Blackduck$BirdAge)-1
run.Blackduck=function()
{
#
# Process data
#
bduck.processed=process.data(Blackduck,model="Known")
#
# Create default design data
#
bduck.ddl=make.design.data(bduck.processed)
#
#  Add occasion specific data min < 0; I have no idea what it is
#
bduck.ddl$S$min=c(4,6,7,7,7,6,5,5)
#
#  Define range of models for S
#
S.dot=list(formula=~1)
S.time=list(formula=~time)
S.min=list(formula=~min)
S.BirdAge=list(formula=~BirdAge)
#
# Note that in the following model in the MARK example, the covariates
# have been standardized.  That means that the beta parameters will be different
# for BirdAge, Weight and their interaction but the likelihood and real parameter
# estimates are the same.
#
S.BirdAgexWeight.min=list(formula=~min+BirdAge*Weight)
S.BirdAge.Weight=list(formula=~BirdAge+Weight)
#
# Create model list and run assortment of models
#
model.list=create.model.list("Known")
bduck.results=mark.wrapper(model.list,data=bduck.processed,ddl=bduck.ddl,
               invisible=FALSE,threads=1,delete=TRUE)
#
# Return model table and list of models
#
return(bduck.results)
}
bduck.results=run.Blackduck()
bduck.results
Burnham Live-Dead Model
Description
An example of the Burnham live-dead model using simulated data LD1.inp from Chapter 9 of Cooch and White
Author(s)
Luke Eberhart-Phillips<luke.eberhart at gmail.com>
Examples
###############################################################################
#### RMARK script for conducting the Burnham model tutorial in Chapter 9.3 ####
#################### the of the Cooch and White MARK book #####################
###############################################################################
##################### Code by: Luke Eberhart-Phillips #########################
###### Dept. Animal Behaviour, Bielefeld University, Bielefeld, Germany #######
##################### email: luke.eberhart at gmail.com ##########################
###############################################################################
# import/convert the simulated "LD1.inp" MARK capture history into an RMARK 
# dataframe, while defining the two groups as "Y" for individuals marked as 
# young, and "A" for individuals marked as adults
# NOTE: the "LD1.inp" file is found in the zipped folder downloaded when you
# click on "Example data files" in the drop-down menu of the MARK book webpage
#  \url{http://www.phidot.org/software/mark/docs/book/}
pathtodata=paste(path.package("RMark"),"extdata",sep="/")
LD=convert.inp(paste(pathtodata,"ld1",sep="/"),
           group.df=data.frame(age_marked=c("Y","A")))
# process the data by defining the model type as "Burnham" and the groups in
# the data.  In this case the only group is the age at which individuals were 
# marked
LD.proc=process.data(data = LD, 
	model = "Burnham",
	groups=c("age_marked"),
	age.var=1,
	initial.age=c(1,0))
# make the design data from the process data above
LD.ddl=make.design.data(LD.proc)
# add the correct binning to the design data so that individuals that were
# marked as young are adults in their second year of life, where as those
# marked as adults are adults for their entire life.
LD.ddl=add.design.data(data = LD.proc,
	ddl = LD.ddl,
	parameter="S", 
	type = "age",
	bins = c(0,1,8),
	right = FALSE,
	name = "age",
	replace = TRUE)
# do the same to the F parameter
LD.ddl=add.design.data(data = LD.proc,
	ddl = LD.ddl,
	parameter="F", 
	type = "age",
	bins = c(0,1,8),
	right = FALSE,
	name = "age",
	replace = TRUE)
# check parameter matrix to see if groups were binned correctly in the S matrix
PIMS(mark(data = LD.proc,
			ddl = LD.ddl,
			model.parameters=list(S=list(formula=~age)),
			delete=TRUE,
			model = "Burnham"),
	"S")
# Create the formulas that describe variation in the parameter we want to test.
# In this case we want to test for an age effect on survival and fidelity,
# while keeping recapture and recovery probabilities constant.
S.age=list(formula=~age) # S(age)
p.dot=list(formula=~1) # p(.)
F.age=list(formula=~age) # F(age)
r.dot=list(formula=~1) # r(.)
# Run the model
LD.model.age.F.S=mark(data = LD.proc,
	ddl = LD.ddl, 
	model.parameters = list(S = S.age, p = p.dot, 
			F =F.age, r = r.dot), 
	invisible = FALSE, 
	model = "Burnham",delete=TRUE)
# Check the parameter estimates, they should be the same as those generated
# when doing the tutorial in chapter 9.3 of the in MARK Book (table on pg 9-8)
LD.model.age.F.S$results$real
# Clean your working directory
cleanup(ask=FALSE)
Density Estimation with Telemetry
Description
The annotated code below is a companion to A Gentle Introduction to Program MARK, Chapter 20: Density Estimation (http://www.phidot.org/software/mark/docs/book/pdf/chap20.pdf). It requires the file "Density.txt", which is the RMark analog to the example Density Estimation input file distributed with MARK. These are simulated data intended to mimic a study of small mammals, such as deer mice, sampled at 2 sites (habitat types), A and B. Each habitat type was sampled with a (10 x 10) live-trapping grid (10m trap spacing). There are 5 occasions. In addition to marking each mouse with an individually identifiable ear tag, 50 percent of the individuals captured were fitted with a small VHF transmitter. These radio-tagged individuals were located once during the day and once at night for 5 days immediately after mark-recapture sampling (n = 10 locations total per animal) and each location was recorded as in or out of the study site. The single covariate we recorded is the distance to the edge (DTE) of the of the site from the mean trap location of each individual (i.e., compute the mean trap location for each individual captured >1 time, then compute the minimum distance from this mean location to the edge of the site).
Format
A data frame with 32 observations of 5 variables
- ch
- a character vector containing the capture history for 5 occasions 
- TotalLocations
- The total number of telemetry locations if telemetered; otherwise a . 
- TotalIn
- total number of locations in the original site if telemetered; otherwise a . 
- Site
- the original site A or B 
- DTE
- distance to the edge of the site when originally caught 
Author(s)
Jake Ivan
Examples
#Read in Density Estimation input file specific to RMark
#Add 2 covariates that will be used for threshhold models - see below & p. 20-14, 20-15
#Specify data type - use "Densitypc" for this example, which is "Density Estimation with
# Huggins p and c". Could also use "DensityRanpc" (Huggins p and c with random effects),
#"DensityHet" (Huggins heterogeniety with pi and p), "DensityFHet" (Huggins full
#heterogeneity with pi and p) and DensityFHet (Huggins Full heterogeniety with pi, p, and c).
#Be sure to specify areas argument in process.data for this model. It will not run if you don't
# give it the area of each study site
data(Density)
#Create variables for threshhold model - see below & p. 20-14, 20-15
Density$Thresh15 <- ifelse(Density$DTE<15, Density$DTE, 15)
#Create variables for threshhold model - see below & p. 20-14, 20-15
Density$Thresh25 <- ifelse(Density$DTE<25, Density$DTE, 25)
data_proc <- process.data(Density, model="Densitypc", groups = c("Site"), areas=rep(0.81,2))
data_ddl <- make.design.data(data_proc)
#Run model p(.)p~(.) from p. 20-9, 20-10. View results.
p_dot <- list(formula = ~1, share=TRUE)
ptilde_dot <- list(formula = ~1)
model1 <- mark(data_proc,data_ddl,model.parameters=list(p=p_dot,ptilde=ptilde_dot),delete=TRUE)
#Run models p(site)p~(.) and p(.)p~(site) as indicated on p. 20-11. View results.
p_site <- list(formula = ~1 + Site, share=TRUE)
ptilde_site <- list(formula = ~1 + Site)
model2 <- mark(data_proc, data_ddl, model.parameters=list(p=p_dot, ptilde=ptilde_site),
                  delete=TRUE)
model3 <- mark(data_proc, data_ddl, model.parameters=list(p=p_site, ptilde=ptilde_dot),
                    delete=TRUE)
#Run model p(DTE)p~(DTE) as indicated on p. 20-12. View results.
p_DTE <- list(formula = ~1 + DTE, share=TRUE)
ptilde_DTE <- list(formula= ~1 + DTE)
model4 <- mark(data_proc, data_ddl, model.parameters = list(p=p_DTE, ptilde=ptilde_DTE),
                  delete=TRUE)
#Compute Model Selection Table that appears on p. 20-12. View results.
ModSelTable <- collect.models(type="Densitypc")
ModSelTable
#Run Threshhold models from p. 20-15.
p_DTE_Thresh15 <- list(formula = ~1 + Thresh15, share=TRUE)
p_DTE_Thresh25 <- list(formula = ~1 + Thresh25, share=TRUE)
model5 <- mark(data_proc, data_ddl, model.parameters = list(p=p_DTE_Thresh15, ptilde=ptilde_DTE)
               ,delete=TRUE)
model6 <- mark(data_proc, data_ddl, model.parameters = list(p=p_DTE_Thresh25, ptilde=ptilde_DTE)
               ,delete=TRUE)
#Re-compute Model Selection Table that appears on p. 20-16
ModSelTable <- collect.models(type="Densitypc")
ModSelTable
Exercise 7 example data
Description
An example occupancy data set used as exercise 7 in the occupancy website developed by Donovan and Hines.
Format
A data frame with 20 observations (sites) on the following 2 variables.
- ch
- a character vector containing the presence (1) and absence (0) for each of 5 visits to the site 
- freq
- frequency of sites (always 1) 
Details
This is a data set from exercise 7 of Donovan and Hines occupancy web site (https://blog.uvm.edu/tdonovan-vtcfwru/occupancy-estimation-and-modeling-ebook/).
Examples
# Donovan.7 can be created with
# Donovan.7=convert.inp("Donovan.7.inp")
do.exercise.7=function()
{
  data(Donovan.7)
# Estimates from following agree with estimates on website but the
# log-likelihood values do not agree.  Maybe a difference in whether the
# constant binomial coefficients are included.
  Donovan.7.poisson=mark(Donovan.7,model="OccupRNPoisson",invisible=FALSE,threads=1,delete=TRUE)
# THe following model was not in exercise 7.
  Donovan.7.negbin=mark(Donovan.7,model="OccupRNNegBin",invisible=FALSE,threads=1,delete=TRUE)
  return(collect.models())
}
exercise.7=do.exercise.7()
# Remove # to see output
# print(exercise.7)
Exercise 8 example data
Description
An example occupancy data set used as exercise 8 in the occupancy website developed by Donovan and Hines.
Format
A data frame with 20 observations (sites) on the following 2 variables.
- ch
- a character vector containing the counts for each of 5 visits to the site 
- freq
- frequency of sites (always 1) 
Details
This is a data set from exercise 8 of Donovan and Hines occupancy web site (https://blog.uvm.edu/tdonovan-vtcfwru/occupancy-estimation-and-modeling-ebook/). In MARK, it uses 2 digits to allow a count of 0 to 99 at each site, so the history has 10 digits for 5 visits (occasions).
Examples
# This example is excluded from testing to reduce package check time
# Donovan.8 can be created with
# Donovan.8=convert.inp("Donovan.8.inp")
do.exercise.8=function()
{
  data(Donovan.8)
# Results agree with the values on the website.
  Donovan.8.poisson=mark(Donovan.8,model="OccupRPoisson",invisible=FALSE,threads=2,delete=TRUE)
# The following model was not in exercise 8. The NegBin model does 
# better if it is initialized with the r and lambda from the poisson.
  Donovan.8.negbin=mark(Donovan.8,model="OccupRNegBin",
    initial=Donovan.8.poisson,invisible=FALSE,threads=2,delete=TRUE)
  return(collect.models())
}
exercise.8=do.exercise.8()
# Remove # to see output
# print(exercise.8)
Example of Immigration-Emigration LogitNormal Mark-Resight model
Description
Data and example illustrating Immigration-Emigration LogitNormal Mark-Resight model
Format
A data frame with 34 observations on the following variable.
- ch
- a character vector 
Examples
# This example is excluded from testing to reduce package check time
data(IELogitNormalMR)
IElogitNor.proc=process.data(IELogitNormalMR,model="IELogitNormalMR",
	counts=list("Marked Superpopulation"=c(28, 29, 30, 30, 30, 33, 33, 33, 33, 34, 34, 34),
	"Unmarked Seen"=c(264, 161, 152, 217, 217, 160, 195, 159, 166, 152, 175, 190),
	"Marked Unidentified"=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)),
		         time.intervals=c(0,0,0,1,0,0,0,1,0,0,0))
IElogitNor.ddl=make.design.data(IElogitNor.proc)
mod1=mark(IElogitNor.proc,IElogitNor.ddl,
		model.parameters=list(p=list(formula=~-1+session),
				              sigma=list(formula=~session),
							  alpha=list(formula=~-1+session:time),
							  Nstar=list(formula=~session),
							  Nbar=list(formula=~session)),delete=TRUE)
summary(mod1)
# You can use the initial value to get a better estimate.
 mod2=mark(IElogitNor.proc,IElogitNor.ddl,
		  model.parameters=list(p=list(formula=~-1+session:time),
							  sigma=list(formula=~session),
							  alpha=list(formula=~-1+session:time),
							  Nstar=list(formula=~session),
							  Nbar=list(formula=~session)),
							  initial=mod1,delete=TRUE)
 summary(mod2)			  
Example of LogitNormal Mark-Resight model
Description
Data and example illustrating LogitNormal Mark-Resight model.
Format
A data frame with 35 observations on the following variable.
- ch
- a character vector 
Examples
data(LogitNormalMR)
logitNor.proc=process.data(LogitNormalMR,model="LogitNormalMR",
		counts=list("Unmarked seen"=c(96,68,59),
				    "Marked Unidentified"=c(0,0,0,0,1,1,1,0,0,3,0,1)),
			         time.intervals=c(0,0,0,1,0,0,0,1,0,0,0))
logitNor.ddl=make.design.data(logitNor.proc)
# Note that to get good starting values yo should specify a formula that allows use of the sin link
# MARK will ignore the use of the sin link and use log for parameters N and sigma
# after fitting this intial model you can use it for starting values with other model that
# do not require the sin link but always check to make sure the model is converging to 
# reasonable values.
mod=mark(logitNor.proc,logitNor.ddl,
 model.parameters=list(N=list(formula=~-1+session,link="sin"),
 sigma=list(formula=~-1+session,link="sin"),p=list(formula=~1,link="sin")),delete=TRUE)
summary(mod)
Convert Multistate data for POPAN-style abundance estimation
Description
Converts data and optionally creates and structures design data list such that population size can be derived with multistate data. Variance estimate is questionable.
Usage
MS_popan(
  x,
  augment_num = 100,
  augment_stratum = "A",
  enter_stratum = "B",
  strata = NULL,
  begin.time = 1,
  groups = NULL,
  ddl = FALSE,
  time.intervals = NULL
)
Arguments
| x | an RMark dataframe | 
| augment_num | the number to add with a capture history of all 0s; this is the expected number that were in the population and not ever seen | 
| augment_stratum | the single character to represent outside of the population; use a value not used in the data capture history | 
| enter_stratum | the single character to represent inside of the population but not yet entered; use a value not used in the data capture history | 
| strata | vector of single characters for observed and unobserved states | 
| begin.time | beginning time of observed occasions; two occasions are added to the fron of the capture history at times begin.time-1 and begin.time-2 | 
| groups | vector of character variable names of factor variables to use for grouping | 
| ddl | if TRUE, will return processed data and a design data list with the appropriate fixed parameters. | 
| time.intervals | intervals of time between observed occasions | 
Author(s)
Jeff Laake
Examples
data(dipper)
popan_N=summary(mark(dipper,model="POPAN",
        model.parameters=list(pent=list(formula=~time)),delete=TRUE),se=TRUE)$reals$N
data.list=MS_popan(dipper,ddl=TRUE,augment_num=30)
modMS=mark(data.list$data,data.list$ddl,
        model.parameters=list(Psi=list(formula=~B:toB:time)),brief=TRUE,delete=TRUE)
Psi_estimates=summary(modMS,se=TRUE)$reals$Psi
Nhat_MS=Psi_estimates$estimate[1]*sum(abs(data.list$data$data$freq))
se_Nhat_MS=Psi_estimates$se[1]*Nhat_MS
cat("Popan N = ",popan_N$estimate," (se = ",popan_N$se,")\n")
cat("MS N = ",Nhat_MS," (se = ",se_Nhat_MS,")\n")
Truncate capture histories for multi-state models
Description
Decompose full capture history to releases followed by k recapture occasions. If a recapture occasion occurs before k occasions, the capture history is finished at the first recapture and right padded with "." which effectively acts like a loss on capture. The recapture is then a new release and new capture history. If there are no recaptures within k occasions, it has a release followed by k 0's. If the release is such that adding k occasions is greater than the length of the original capture history, then the new history is left padded with 0's. Capture histories that end with a capture on the last occasion do not generate a new capture history because there are no possible recaptures and thus contain no information in a CJS format MS model. All freq and covariates are copied with newly generated truncated capture histories.
Usage
MStruncate(data, k = 5)
Arguments
| data | dataframe containing at least one character field named ch; can also contain frequency in numeric field freq and any other covariates. | 
| k | number of recapture occasions after release; new capture histories are of length k+1 | 
Value
dataframe with field ch and freq (default to 1) and any covariates included in argument data; it also contains a factor variable release which is the first occasion which should be used as a group variable so the begin.time can be set for each release cohort to maintain the original times, as shown in the example.
Examples
data(mstrata)
df=MStruncate(mstrata,k=2)
dp=process.data(df,model="Multistrata",groups=c("release"),begin.time=1:max(as.numeric(df$release)))
ddl=make.design.data(dp)
table(ddl$S$release,ddl$S$time)
table(ddl$p$release,ddl$p$time)
Multiple Species Occupancy
Description
Example expected value data generated by Gary White for testing.
Format
A data frame with 32 observations of 2 variables
- ch
- a character vector containing the capture history (each is 2 character positions) for 4 occasions with 5 species 
- freq
- capture history frequency 
Author(s)
Jeff Laake
Examples
# read in expected value data from Gary White
data("NSpeciesOcc")
# process data and specify number of species=5 in the mixtures argument
dp=process.data(NSpeciesOcc,model="NSpeciesOcc",mixtures=5)
# make design data and fit model used to generate the data
ddl=make.design.data(dp)
model=mark(dp,ddl,model.parameters=list(f=list(formula=~-1+mixture,link="sin"),
p=list(formula=~1,link="sin")),delete=TRUE)
Multi-state occupancy example data
Description
An occupancy data set for modelling multi-state data (0,1,2).
Format
A data frame with 40 records for 54 observations (sites) on the following 2 variables.
- ch
- a character vector containing the presence (state 1), presence (state 2), and absence (0) for each visit to the site, and a "." if the site was not visited 
- freq
- frequency of sites with that history 
Details
This is a data set from Nichols et al (2007).
References
Nichols, J. D., J. E. Hines, D. I. MacKenzie, M. E. Seamans, and R. J. Gutierrez. 2007. Occupancy estimation and modeling with multiple states and state uncertainty. Ecology 88:1395-1400.
Examples
# This example is excluded from testing to reduce package check time
# To create the data file use:
# NicholsMSOccupancy=convert.inp("NicholsMSOccupancy.inp")
#
# Create a function to fit the 12 models in Nichols et al (2007).
do.MSOccupancy=function()
{
#  Get the data
   data(NicholsMSOccupancy)
# Define the models; default of Psi1=~1 and Psi2=~1 is assumed
   # p varies by time but p1t=p2t
   p1.p2equal.by.time=list(formula=~time,share=TRUE)  
   # time-invariant p p1t=p2t=p1=p2
   p1.p2equal.dot=list(formula=~1,share=TRUE)    
   #time-invariant p1 not = p2
   p1.p2.different.dot=list(p1=list(formula=~1,share=FALSE),p2=list(formula=~1))  
   # time-varying p1t and p2t
   p1.p2.different.time=list(p1=list(formula=~time,share=FALSE),p2=list(formula=~time)) 
   # delta2 model with one rate for times 1-2 and another for times 3-5; 
   #delta2 defined below
   Delta.delta2=list(formula=~delta2) 
   Delta.dot=list(formula=~1)  # constant delta
   Delta.time=list(formula=~time) # time-varying delta
# Process the data for the MSOccupancy model
   NicholsMS.proc=process.data(NicholsMSOccupancy,model="MSOccupancy")
# Create the default design data
   NicholsMS.ddl=make.design.data(NicholsMS.proc)
# Add a field for the Delta design data called delta2.  It is a factor variable
# with 2 levels: times 1-2, and times 3-5.
   NicholsMS.ddl=add.design.data(NicholsMS.proc,NicholsMS.ddl,"Delta",
     type="time",bins=c(0,2,5),name="delta2")
# Create a list using the 4 p modls and 3 delta models (12 models total)
   cml=create.model.list("MSOccupancy")
# Fit each model in the list and return the results
   return(mark.wrapper(cml,data=NicholsMS.proc,ddl=NicholsMS.ddl,delete=TRUE))
}
# Call the function to fit the models and store it in MSOccupancy.results
MSOccupancy.results=do.MSOccupancy()
# Print the model table for the results
print(MSOccupancy.results)
# Adjust model selection by setting chat=1.74
MSOccupancy.results=adjust.chat(chat=1.74,MSOccupancy.results)
# Print the adjusted model selection results table
print(MSOccupancy.results)
#
# To fit an additive model whereby p1 and p2 differ by time and p2 differs from
# p1 a constant amount on the logit scale, use
#
# p varies by time logit(p1t)=logit(p2t)+constant
p1.plust.p2.by.time=list(formula=~time+p2,share=TRUE) 
Display PIM for a parameter
Description
Extract PIMS for a particular parameter and display either the full PIM structure or the simplified PIM structure.
Usage
PIMS(model, parameter, simplified = TRUE, use.labels = TRUE)
Arguments
| model | mark model object | 
| parameter | character string of a particular type of parameter in the model (eg "p","Phi","pent","S") | 
| simplified | if TRUE show simplified PIM structure; otherwise show full structure | 
| use.labels | if TRUE, uses time and cohort labels for columns and rows respectively | 
Value
None
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
results=mark(dipper,delete=TRUE)
PIMS(results,"Phi")
PIMS(results,"Phi",simplified=FALSE)
 
Mulstistate Live-Dead Paradise Shelduck Data
Description
Paradise shelduck recapture and recovery data in multistrata provided by Richard Barker and Gary White.
Format
A data frame with 1704 observations of 3 variables
- ch
- a character vector containing the capture history (each is 2 character positions LD) for 6 occasions 
- freq
- capture history frequency 
- sex
- Male or Female 
Author(s)
Jeff Laake
References
Barker, R.J, White,G.C, and M. McDougall. 2005. MOVEMENT OF PARADISE SHELDUCK BETWEEN MOLT SITES: A JOINT MULTISTATE-DEAD RECOVERY MARK–RECAPTURE MODEL. JOURNAL OF WILDLIFE MANAGEMENT 69(3):1194–1201.
Examples
# In the referenced article, there are 3 observable strata (A,B,C) and 
# 3 unobservable strata (D,E,F). This example is setup by default to use only  
# the 3 observable strata to avoid problems with multiple modes in the likelihood.  
# Code that uses all 6 strata are provided but commented out. With unobservable strata,
# simulated annealing should be used (options="SIMANNEAL")
data("Paradise_shelduck")
# change sex reference level to Male to match design matrix used in MARK
ps$sex=relevel(ps$sex,"Male")
# Process data with MSLiveDead model using sex groups and specify only observable strata
ps_dp=process.data(ps,model="MSLiveDead",groups="sex",strata.labels=c("A","B","C"))
# Process data with MSLiveDead model using sex groups and
# specify observable and unboservable strata
# ps_dp=process.data(ps,model="MSLiveDead",groups="sex",
#                        strata.labels=c("A","B","C","D","E","F"))
# Make design data and specify constant PIM for Psi to reduce parameter space. No time variation
# allowed in Psi in the article.
ddl=make.design.data(ps_dp,parameters=list(Psi=list(pim.type="constant")))
# Fix p to 0 for unobservable strata (only needed if they are included)
ddl$p$fix=NA
ddl$p$fix[ddl$p$stratum%in%c("D","E","F")]=0
# Fix p to 0 for last occasion
ddl$p$fix[ddl$p$time==6]=0.0
# Fix survival to 0.5 for last interval to match MARK file (to avoid confounding)
ddl$S$fix=NA
ddl$S$fix[ddl$S$time==6]=0.5
# create site variable for survival which matches A with D, B with E and C with F 
ddl$S$site="A"
ddl$S$site[ddl$S$stratum%in%c("B","C")]=as.character(ddl$S$stratum[ddl$S$stratum%in%c("B","C")])
ddl$S$site[ddl$S$stratum%in%c("E")]="B"
ddl$S$site[ddl$S$stratum%in%c("F")]="C"
ddl$S$site=as.factor(ddl$S$site)
# create same site variable for recovery probability (r)
ddl$r$site="A"
ddl$r$site[ddl$r$stratum%in%c("B","C")]=as.character(ddl$r$stratum[ddl$r$stratum%in%c("B","C")])
ddl$r$site[ddl$r$stratum%in%c("E")]="B"
ddl$r$site[ddl$r$stratum%in%c("F")]="C"
ddl$r$site=as.factor(ddl$r$site)
# Specify formula used in MARK model
S.1=list(formula=~-1+sex+time+site)
p.1=list(formula=~-1+stratum:time)
r.1=list(formula=~-1+time+sex+site)
Psi.1=list(formula=~-1+stratum:tostratum)
# Run top model from paper but only for observable strata
top_model=mark(ps_dp,ddl,model.parameters=list(S=S.1,p=p.1,r=r.1,Psi=Psi.1),delete=TRUE)
# Run top model from paper for all strata using simulated annealing (commented out)
#top_model=mark(ps_dp,ddl,model.parameters=list(S=S.1,p=p.1,r=r.1,Psi=Psi.1),
#      options="SIMANNEAL",delete=TRUE)
Example of Poisson Mark-Resight model
Description
Data and example illustrating Poisson Mark-Resight model.
Format
A data frame with 68 observations on the following 1 variables.
- ch
- a character vector 
Examples
# This example is excluded from testing to reduce package check time
data(PoissonMR)
pois.proc=process.data(PoissonMR,model="PoissonMR",
		counts=list("Unmarked Seen"=c(1380, 1120, 1041, 948),
				    "Marked Unidentified"=c(8,10,9,11),
					"Known Marks"=c(45,67,0,0)))
pois.ddl=make.design.data(pois.proc)
mod=mark(pois.proc,pois.ddl,
		model.parameters=list(Phi=list(formula=~1,link="sin"),
			      GammaDoublePrime=list(formula=~1,share=TRUE,link="sin"),
			      alpha=list(formula=~-1+time,link="log"),
				  U=list(formula=~-1+time,link="log"),
				  sigma=list(formula=~-1+time,link="log")),
		          initial=c(1,1,1,1,-1.4,-.8,-.9,-.6,6,6,6,6,2,-1),threads=2,delete=TRUE)
summary(mod)
Example of Poisson Mark-Resight model
Description
Data and example illustrating Poisson Mark-Resight model with 2 groups and one occasion.
Format
A data frame with 93 observations on the following 2 variables.
- ch
- a character vector 
- pg
- a factor with levels - group1- group2
Examples
 
# This example is excluded from testing to reduce package check time
data(Poisson_twoMR)
pois.proc=process.data(Poisson_twoMR,model="PoissonMR",groups="pg",
		counts=list("Unmarked Seen"=matrix(c(1237,588),nrow=2,ncol=1),
				    "Marked Unidentified"=matrix(c(10,5),nrow=2,ncol=1),
					"Known Marks"=matrix(c(60,0),nrow=2,ncol=1)))
pois.ddl=make.design.data(pois.proc)
mod=mark(pois.proc,pois.ddl,
		model.parameters=list(alpha=list(formula=~1),
				              U=list(formula=~-1+group),
				              sigma=list(formula=~1,fixed=0)),
	 	                      initial=c(0.9741405 ,0.0000000 ,6., 5.),threads=2,delete=TRUE)
summary(mod)	 	                 
Multi-scale dynamic occupancy models in RMark
Description
Multi-scale dynamic occupancy models in RMark
Author(s)
Connor M. Wood, University of Wisconsin-Madison <cwood9 at wisc.edu>
Examples
## Multi-scale dynamic occupancy models in RMark: a worked example ##
# Study design and data structure:
# two sessions (i.e., seasons) and 346 sampling sites
# up to three secondary sampling periods per season
# up to three survey devices per sampling site
# import the sample data, RDMultScalOcc.sampledata.csv
pathtodata=paste(path.package("RMark"),"extdata",sep="/")
dt=read.csv(paste(pathtodata,"RDMultScalOcc.sampledata.csv",sep="/"))
dt[is.na(dt)]=0 # replace NAs with 0
dt$ch=as.character(dt$ch) # encounter histories (dt$ch) must be characters
# note: habitat variables (amount of open forest and average slope) were collected at 
# two spatial scales
# 'sOpen' and 'sSlope' represent the entire sampling site
# 'pOpen##' and 'pSlope##' represent conditions relevant to individual devices at each 
#  sampling period
#  the p-scale variables are coded [name][session][primary]; 
#  dt.ddl$p$primary indicates how this should be entered
#  in this case the values for p$primary varied among devices and between sessions, 
#  but were constant between secondary sampling periods
# create the Process Data MARK object
dt.pr=process.data(dt,model="RDMultScalOcc",
    time.intervals=c(0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0),mixtures=3)
# note: time.intervals refers to seasons, not secondary sampling periods
# note: mixtures refers to the number of devices
# create the design data object
dt.ddl=make.design.data(dt.pr)
# Examine the built-in covariates for each parameter
str(dt.ddl)
### Approach 1: Manually create (a somewhat arbitrary) set of models #######################
# Name variables for simplicity 
fit.models=function()
{
Open=list(formula=~sOpen)
pOpen=list(formula=~pOpen)
Slope=list(formula=~sSlope)
pSlope=list(formula=~pSlope)
Time=list(formula=~Time) #use carefully because different covariates use 'Time' in different ways
null.model=mark(dt.pr,dt.ddl,delete=TRUE)
mod1=mark(dt.pr,dt.ddl,model.parameters=list(p=Open),delete=TRUE)
#'mod2=mark(dt.pr,dt.ddl,model.parameters=list(p=pOpen),delete=TRUE)
mod3=mark(dt.pr,dt.ddl,model.parameters=list(p=Slope),delete=TRUE)
mod4=mark(dt.pr,dt.ddl,model.parameters=list(p=pSlope),delete=TRUE)
mod5=mark(dt.pr,dt.ddl,model.parameters=list(Psi=Open),delete=TRUE)
mod6=mark(dt.pr,dt.ddl,model.parameters=list(Psi=Slope),delete=TRUE)
mod7=mark(dt.pr,dt.ddl,model.parameters=list(Psi=Time),delete=TRUE)
mod8=mark(dt.pr,dt.ddl,model.parameters=list(Gamma=Open),delete=TRUE)
mod9=mark(dt.pr,dt.ddl,model.parameters=list(Epsilon=Slope),delete=TRUE)
mod10=mark(dt.pr,dt.ddl,model.parameters=list(Theta=list(formula=~time)),delete=TRUE) 
return(collect.models()) # View results sorted by AICc
}
results=fit.models()
###########################################################################################
### Approach 2: use mark.wrapper to create your models ####################################
fit.models=function()
{
 # Models for p 
 p.null=list(formula=~1)
 p.open=list(formula=~sOpen)
 p.popen=list(formula=~pOpen)
 p.slope=list(formula=~sSlope)
 p.pslope=list(formula=~pSlope)
 # Models for Psi
 Psi.null=list(formula=~1)
 Psi.open=list(formula=~sOpen)
 Psi.Slope=list(formula=~sSlope)
 #Model for Gamma
 Gamma.null=list(formula=~1)
 Gamma.open=list(formula=~sOpen)
 # Model for Epsilon 
 Epsilon.null=list(formula=~1)
 Epsilon.slope=list(formula=~sSlope)
 # Model for Theta
 # note: 'time' is defined in dt.ddl$Theta (use str(dt) to see all predefined variables)
 Theta.null=list(formula=~1)
 Theta.time=list(formula=~time)
 # create all combinations of these sub-models
 cml=create.model.list("RDMultScalOcc")
 results=mark.wrapper(cml,data=dt.pr,ddl=dt.ddl,delete=TRUE) 
 return(results)
}
fit.models() # creates all combinations of variables and prints AICc table
results
Robust Design occupancy example data
Description
A simulated data set on a breeding bird as an example of robust design occupancy modeling.
Format
A data frame with 35 observations on the following 12 variables
- ch
- A character vector containing the presence (1) and absence (0) or (.) not visited for each of 3 visits (secondary occasions) over 3 years (primary occasions) 
- cover
- percentage canopy cover at each sampled habitat 
- occ11
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ12
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ13
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ21
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ22
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ23
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ31
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ32
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- occ33
- one of 9 session-dependent variables occ11 to occ33 containing the week the survey was conducted; p is the primary session number and s is the secondary session number 
- samplearea
- continuous variable indicating area size (ha) of the sampled habitat 
Details
These are simulated data for an imaginary situation with 35 independent
'sites' on which presence/absence of a breeding bird is recorded 3 times
annually for 3 years. Potential variables influencing site occupancy are
the size of the site in hectares (samplearea) and canopy cover percentage
(cover). The timing of the surveys within the year is thought to influence
the detection of occupancy, so the week the survey was conducted is included
in 9 variables that are named as occps where p is the primary session (year)
number and s is the secondary session (visit) number. Using
data(RDOccupancy) will retrieve the completed dataframe and using
example(RDOccpancy) will run the example code. However, in this
example we also show how to import the raw data and how they were modified
to construct the RDOccupancy dataframe.
For this example, the raw data are shown below and the code below assumes
the file is named RD_example.txt.
ch samplearea cover occ11 occ12 occ13 occ21 occ22 occ23 occ31 occ32 occ33 11011.100 12 0.99 1 5 6 2 4 . 1 5 8 000110100 9 0.64 4 5 8 1 2 7 2 5 9 10.100110 9 0.21 1 2 . 1 5 8 2 3 6 110000100 8 0.54 2 5 9 5 8 11 2 5 8 111101100 15 0.37 1 3 5 6 8 9 5 7 12 11..11100 10 0.04 1 2 . . 2 3 5 8 14 100000100 17 0.58 2 3 8 5 6 7 2 . 9 100110000 9 0.38 5 8 14 1 2 8 5 8 16 1001.0100 6 0.25 4 6 8 1 . 3 1 5 6 1.110000. 17 0.34 1 . 4 3 5 9 4 5 . 111100000 3 0.23 1 2 3 4 5 6 7 8 9 000000000 15 0.87 1 2 8 2 5 6 3 7 11 1111.0010 8 0.18 1 2 4 1 . 3 2 3 . 10011011 . 7 0.72 2 4 5 2 6 7 1 2 . 110001010 14 0.49 2 5 6 4 8 9 11 12 13 101.10100 13 0.31 1 2 3 . 2 5 1 4 6 100000010 10 0.6 1 5 7 8 9 10 5 8 9 010100010 12 0.67 1 4 5 2 6 8 3 4 7 110.01110 11 0.71 1 2 3 . 4 6 1 2 7 10.11.100 10 0.26 1 2 . 1 2 . 1 5 6 110100.10 9 0.56 1 4 7 2 3 4 . 2 7 010000000 10 0.16 1 5 7 8 9 11 6 7 8 000000.00 10 0.46 1 2 5 2 5 8 . 3 4 1.0000100 12 0.69 2 . 4 5 7 9 1 2 4 100010000 11 0.42 1 2 3 4 5 6 7 8 9 000000000 12 0.42 2 5 6 5 8 9 1 3 4 0.1100110 8 0.72 1 . 5 2 5 8 1 5 7 11.100100 11 0.51 1 5 . 1 2 4 4 5 6 000000000 11 0.37 1 2 3 4 5 6 7 8 9 001100111 12 0.54 1 2 3 1 2 3 1 2 3 10.1.1100 9 0.37 1 2 . 3 . 5 1 6 8 000000000 7 0.38 1 5 7 6 8 11 1 9 14 1011.0100 8 0.35 1 5 7 2 . 5 1 3 4 100110000 9 0.86 1 2 4 2 3 6 1 2 4 11.100111 8 0.57 1 5 . 2 6 7 1 3 5
The data could be read into a dataframe with code as follows:
RDOccupancy<-read.table("RD_example.txt",
colClasses=c("character", rep("numeric",2), rep("character", 9)),
header=TRUE)
Note that if the file was not in the same working directory as your
workspace (.RData) then you can set the working directory to the directory
containing the file by using the following command before the
read.table.
setwd(your working directory location here)
In the data file "." represents a site that was not visited on an occasion.
Those "." values are read in fine because ch is read in as a
character string. However, "." has also been used in the file in place of
numeric values of the occ variable. Because "." is not numeric, R
will coerce the input value to an NA value for each "." and will treat
the column they are in as a factor.  Thus, the "NA" will not be a
valid numeric value for MARK, so we need to change it to a number. To avoid
the coercion, the occ values were read in as characters and the
following code changes all "." to "0" and then coverts the fields to numeric
values:
 for (i in 4:12) { RDOccupancy[RDOccupancy[,i]==".",i]="0"
RDOccupancy[,i]=as.numeric(RDOccupancy[,i]) } 
It is fine to use zero (or any numeric value) in place of missing values for session-dependent covariates as the "0's" provide no information for modeling as they are tied to un-sampled occasions. However, all values of a site-specific covariate (e.g., cover) are used, so there cannot be any missing values. Note, however that use of "0's" in the time-dependent covariates will influence predictions output by MARK for that parameter, as they will be biased low due to the zero's being included in estimating the mean for that parameter.
The code below and associated comments provide a self contained example for
importing, setting up, and evaluating the any of the general robust design
type models (RDOccupEG, RDOccupPE, RDOccupPG) using RMARK. Unlike standard
occupancy designs, robust designs require the user to designate primary and
secondary occasions using the argument time.intervals. For this
example, we have 3 primary occasions (year) with 3 secondary sampling
occasions within each year, thus, we would set our time.intervals as
follows to represent 0 interval between secondary occasions and interval of
1 (years in this case) between primary occasions:
time.intervals=c(0,0,1,0,0,1,0,0)
The first 0 designates the interval between the first and second sampling
occasion in year 1, the second 0 designates the interval between the second
and third sampling occasion in year 1, and the 1 indicated the change from
primary period 1 to primary period 2. See process.data for
more information on the use of time.intervals.
Author(s)
Bret Collier
Examples
# This example is excluded from testing to reduce package check time
data(RDOccupancy)
#
# Example of epsilon=1-gamma
test_proc=process.data(RDOccupancy,model="RDOccupEG",time.intervals=c(0,0,1,0,0,1,0,0))
test_ddl=make.design.data(test_proc)
test_ddl$Epsilon$eps=-1
test_ddl$Gamma$eps=1
p.dot=list(formula=~1)
Epsilon.random.shared=list(formula=~-1+eps, share=TRUE)
model=mark(test_proc,test_ddl,model.parameters=list(Epsilon=Epsilon.random.shared, p=p.dot),
            delete=TRUE)
#
# A self-contained function for evaluating a set of user-defined candidate models
run.RDExample=function()
{
# Creating list of potential predictor variables for Psi
Psi.area=list(formula=~samplearea)
Psi.cover=list(formula=~cover)
Psi.areabycover=list(formula=~samplearea*cover)
Psi.dot=list(formula=~1)
Psi.time=list(formula=~time)
# Creating list of potential predictor variables for p
# When coding formula with session-dependent (primary or secondary)
# covariates, you do NOT have to include the session identifiers (
# the ps of occps) in the model formula. You only need to specify ~occ.
# The variable suffix can be primary occasion numbers or
# primary and secondary occasion numbers.
p.dot=list(formula=~1)
p.occ=list(formula=~occ)
p.area=list(formula=~sample.area)
p.coverbyocc=list(formula=~occ*cover)
# Creating list of potential predictor variables for Gamma
# and/or Epsilon (depending on which RDOccupXX Parameterization is used)
gam.area=list(formula=~samplearea)
epsilon.area=list(formula=~samplearea)
gam.dot=list(formula=~1)
epsilon.dot=list(formula=~1)
# setting time intervals for 3 primary sessions with
# secondary session length of 3,3,3
time_intervals=c(0,0,1,0,0,1,0,0)
# Initial data processing for RMARK RDOccupPG
# (see RMARK appendix C-3 for list of RDOccupXX model paramterizations)
RD_process=process.data(RDOccupancy, model="RDOccupPG",
time.intervals=time_intervals)
RD_ddl=make.design.data(RD_process)
# Candidate model list
# 1. Occupancy, detection, and colonization are constant
model.p.dot.Psi.dot.gam.dot<-mark(RD_process, RD_ddl,
model.parameters=list(p=p.dot, Psi=Psi.dot, Gamma=gam.dot),
invisible=TRUE,delete=TRUE)
# 2. Occupancy varies by time, detection is constant,
# colonization is constant
model.p.dot.Psi.time.gam.dot<-mark(RD_process, RD_ddl,
model.parameters=list(p=p.dot, Psi=Psi.time, Gamma=gam.dot),
invisible=TRUE,delete=TRUE)
# 3. Occupancy varies by area, detection is constant,
# colonization varies by area
model.p.dot.Psi.area.gam.area<-mark(RD_process,
RD_ddl, model.parameters=list(p=p.dot, Psi=Psi.area,
Gamma=gam.area), invisible=TRUE,delete=TRUE)
# 4. Occupancy varies by cover, detection is constant,
# colonization varies by area
model.p.dot.Psi.cover.gam.area<-mark(RD_process, RD_ddl,
model.parameters=list(p=p.dot, Psi=Psi.cover, Gamma=gam.area),
invisible=TRUE,delete=TRUE)
# 5. Occupancy is constant, detection is session dependent,
# colonization is constant
model.p.occ.Psi.dot.gam.dot<-mark(RD_process, RD_ddl,
model.parameters=list(p=p.occ, Psi=Psi.dot, Gamma=gam.dot),
invisible=TRUE,delete=TRUE)
# 6. Occupancy varied by area, detection is session
# dependent, colonization is constant
model.p.occ.Psi.area.gam.dot<-mark(RD_process, RD_ddl,
model.parameters=list(p=p.occ, Psi=Psi.area, Gamma=gam.dot),
invisible=TRUE,delete=TRUE)
#
# Return model table and list of models
#
return(collect.models())
}
# This runs the 6 models above-Note that if you use
# invisible=FALSE in the above model calls
# then the mark.exe prompt screen will show as each model is run.
robustexample<-run.RDExample() #This runs the 6 models above
# Outputting model selection results
robustexample 	# This will print selection results
#options(width=150)	# Sets page width to 100 characters
#sink("results.table.txt") # Captures screen output to file
# Remove comment to see output
#print(robustexample) # Sends output to file
#sink() # Returns output to screen
#
# Allows you to view results in notepad;remove # to see output
# system("notepad results.table.txt", invisible=FALSE, wait=FALSE)
# Examine the output for Model 1: Psi(.), p(.), Gamma(.)
# Opens MARK results file in text editor
#robustexample$model.p.dot.Psi.dot.gam.dot
# View beta estimates for specified model in R
robustexample$model.p.dot.Psi.dot.gam.dot$results$beta
# View real estimates for specified model in R
robustexample$model.p.dot.Psi.dot.gam.dot$results$real
# Examine the best fitting model which has a time-dependent
# effect on detection
# (Model 5: Psi(.), p(occ), Gamma(.))
# View beta estimates for specified model in R
robustexample$model.p.occ.Psi.dot.gam.dot$results$beta
# View real estimates for specified model in R
robustexample$model.p.occ.Psi.dot.gam.dot$results$real
# View estimated variance/covariance matrix in R
robustexample$model.p.occ.Psi.dot.gam.dot$results$beta.vcv
# View model averages estimates for session-dependent
# detection probabilities
model.average(robustexample, "p", vcv=TRUE)
# View model averaged estimate for Psi (Occupancy)
model.average(robustexample, "Psi", vcv=TRUE)
# View model averaged estimate for Gamma (Colonization)
model.average(robustexample, "Gamma", vcv=TRUE)
#
# Compute real estimates across the range of covariates
# for a specific model parameter using Model 6
#
# Identify indices we are interested in predicting
# see covariate.predictions for information on
# index relationship to real parameters
summary.mark(robustexample$model.p.occ.Psi.area.gam.dot, se=TRUE)
# Define data frame of covariates to be used for analysis
ha<-sort(RDOccupancy$samplearea)
# Predict parameter of interest (Psi) across the
# range of covariate data of interest
Psi.by.Area<-covariate.predictions(robustexample,
data=data.frame(samplearea=ha), indices=c(1))
# View dataframe of real parameter estimates without var-cov
# matrix printing (use str(Psi.by.Area) to evaluate structure))
Psi.by.Area[1]
#Create a simple plot using plot() and lines()
plot(Psi.by.Area$estimates$covdata, Psi.by.Area$estimates$estimate,
type="l", xlab="Patch Area", ylab="Occupancy", ylim=c(0,1))
lines(Psi.by.Area$estimates$covdata, Psi.by.Area$estimates$lcl, lty=2)
lines(Psi.by.Area$estimates$covdata, Psi.by.Area$estimates$ucl, lty=2)
# For porting graphics directly to file, see pdf() or png(),
 
Robust design salamander occupancy data
Description
A robust design occupancy data set for modelling presence/absence data for salamanders.
Format
A data frame with 40 observations (sites) on the following 2 variables.
- ch
- a character vector containing the presence (1) and absence (0) with 2 primary occasions with 48 and 31 visits to the site 
- freq
- frequency of sites (always 1) 
Details
This is a data set that I got from Gary White which is suppose to be salamander data collected with a robust design.
Examples
# This example is excluded from testing to reduce package check time
fit.RDOccupancy=function()
{
   data(RDSalamander)
   occ.p.time.eg=mark(RDSalamander,model="RDOccupEG",
      time.intervals=c(rep(0,47),1,rep(0,30)),
      model.parameters=list(p=list(formula=~session)),threads=2,delete=TRUE)
   occ.p.time.pg=mark(RDSalamander,model="RDOccupPG",
      time.intervals=c(rep(0,47),1,rep(0,30)),
      model.parameters=list(Psi=list(formula=~time),
      p=list(formula=~session)),threads=2,delete=TRUE)
   occ.p.time.pe=mark(RDSalamander,model="RDOccupPE",
      time.intervals=c(rep(0,47),1,rep(0,30)),
      model.parameters=list(Psi=list(formula=~time),
      p=list(formula=~session)),threads=2,delete=TRUE)
return(collect.models())
}
RDOcc=fit.RDOccupancy()
print(RDOcc)
Multi-state Transition Functions
Description
TransitionMatrix: Creates a transition matrix of movement parameters for a multi-state(strata) model. It computes all Psi values for a multi-strata mark model and constructs a transition matrix. Standard errors and confidence intervals can also be obtained.
Usage
TransitionMatrix(x,vcv.real=NULL)
        find.possible.transitions(ch)
        transition.pairs(ch)
Arguments
| x | Estimate table from  | 
| vcv.real | optional variance-covariance matrix from the call to
 | 
| ch | vector of capture history strings for a multi-state analysis | 
Details
find.possible.transitions: Finds possible transitions; essentially it identifies where stratum label A and B are in the same ch for all labels but the the transition could be from A to B or B to A or even ACB which is really an A to C and C to B transition.
transition.pairs: Computes counts of transition pairs. The rows are the "from stratum" and the columns are the "to stratum". So AB would be in the first row second column and BA would be in the second row first column. All intervening 0s are ignored. These are transition pairs so AB0C is A to B and B to C but not A to C.
Value
TransitionMatrix: returns either a transition matrix (vcv.real=NULL) or a list of matrices (vcv.real specified) named TransitionMat (transition matrix), se.TransitionMat (se of each transition), lcl.TransitionMat (lower confidence interval limit for each transition), and ucl.TransitionMat (upper confidence interval limit for each transition). find.possible.transitions returns a 0/1 table where 1 means that t both values are in one or more ch strings and transition.pairs returns a table of counts of transition pairs.
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
data(mstrata)
# Show possible transitions in first 15 ch values
find.possible.transitions(mstrata$ch[1:15])
# Show transtion pairs for same data
transition.pairs(mstrata$ch[1:15])
#limit transtions to 2 and 3 character values for first 30 ch
transition.pairs(substr(mstrata$ch[1:30],2,3))
# fit the sequence of multistrata models as shown for ?mstrata
run.mstrata=function()
{
#
# Process data
#
mstrata.processed=process.data(mstrata,model="Multistrata")
#
# Create default design data
#
mstrata.ddl=make.design.data(mstrata.processed)
#
#  Define range of models for S; note that the betas will differ from the output
#  in MARK for the ~stratum = S(s) because the design matrix is defined using
#  treatment contrasts for factors so the intercept is stratum A and the other
#  two estimates represent the amount that survival for B abd C differ from A.
#  You can use force the approach used in MARK with the formula ~-1+stratum which
#  creates 3 separate Betas - one for A,B and C.
#
S.stratum=list(formula=~stratum)
S.stratumxtime=list(formula=~stratum*time)
#
#  Define range of models for p
#
p.stratum=list(formula=~stratum)
#
#  Define range of models for Psi; what is denoted as s for Psi
#  in the Mark example for Psi is accomplished by -1+stratum:tostratum which
#  nests tostratum within stratum.  Likewise, to get s*t as noted in MARK you
#  want ~-1+stratum:tostratum:time with time nested in tostratum nested in
#  stratum.
#
Psi.s=list(formula=~-1+stratum:tostratum)
#
# Create model list and run assortment of models
#
model.list=create.model.list("Multistrata")
#
# Add on a specific model that is paired with fixed p's to remove confounding
#
p.stratumxtime=list(formula=~stratum*time)
p.stratumxtime.fixed=list(formula=~stratum*time,fixed=list(time=4,value=1))
model.list=rbind(model.list,c(S="S.stratumxtime",p="p.stratumxtime.fixed",
               Psi="Psi.s"))
#
# Run the list of models
#
mstrata.results=mark.wrapper(model.list,data=mstrata.processed,ddl=mstrata.ddl,delete=TRUE)
#
# Return model table and list of models
#
return(mstrata.results)
}
mstrata.results=run.mstrata()
mstrata.results
# for the best model, get.real to get a list containing all Psi estimates
#  and the v-c matrix
Psilist=get.real(mstrata.results[[1]],"Psi",vcv=TRUE)
Psivalues=Psilist$estimates
# call Transition matrix using values from time==1; the call to the function
# must only contain one record for each possible transition. An error message is
# given if not the case
TransitionMatrix(Psivalues[Psivalues$time==1,])
# call it again but specify the vc matrix to get se and conf interval
TransitionMatrix(Psivalues[Psivalues$time==1,],vcv.real=Psilist$vcv.real)
Summary of changes by version
Description
A good place to look for changes. Often I'll add changes here but don't always get to it in the documentation for awhile. They are ordered from newest to oldest.
Details
Version 2.0.9 (1 Dec 2011)
- Patch was made to - make.mark.modelto fix bug in PIM creation for a multi-session model and there was just 1 session. Thanks to Erin Roche for helping to identify this bug.
- Patch was made to - make.mark.modelto fix bug in handling of mlogits for pi and Omega parameters with more than one group. These parameters were introduced with the RDMSMisClass and other new models that were recently added.
- Additional changes were made to - export.MARKto re-fix changes for robust and nest survival model export to MARK.
- A function - mark.wrapper.parallelwritten by Eldar Rakhimberdiev provides a parallel processing version of mark.wrapper. See the example in the help for the function. The parallel version is functionally the same and can be used in place of- mark.wrapperto run sequentially or in parallel. It does not include the run argument however.
- A bug was fixed in - get.realwhich caused an R error for a triangular PIM that had only a single entry. Thanks to Amanda Goldberg for discovering and reporting this error.
Version 2.0.8 (7 Oct 2011)
- Both - setup.modeland- setup.parameterswere re-written to use data files models.txt and parameters.txt to define models and parameters which should make it easier to add new models. The latter function is now much simpler and smaller.
- Model RDOccupEG now allows sharing Epsilon and Gamma parameters. Epsilon is the dominant parameter which gets the share=TRUE argument. See - RDOccupancyfor an example. Thanks to Jake Ivan for an example of what was needed.
- To avoid confusion, the arguments - componentand- component.namewere removed from the parameter specification because these have not been required since v1.3 when full support for individual covariates were included. Likewise the argument- covariatesin- markand- make.mark.modelwas removed because it was only needed to support the component approach.
-   export.MARKwas modified so that if all individual covariates are output, it excludes factor covariates.
- Many more models were added to those supported in RMark. Now 92 of the 137 models in MARK are supported. See MarkModels.pdf in the RMark directory of your R library to see which models are supported (in red). Most of the remaining unsupported models are versions with mis-identification error and they are not shown in MarkModels.pdf. 
- Previously RMark stored the input file in a temporary file Markxxx.tmp. Using a common filename caused problems when more than one model were spawned to different CPUs, so now it uses a random temporary file name. It also no longer uses common file markxxx.vcv. Thanks to Glenn Stauffer for testing these changes. 
- Sessions are now labelled using value of session time rather than numerically from 1 to largest session number in robust designs. Thanks to Tommy Garrison for this suggestion. 
- A bug was fixed in - get.realwhich caused incorrect assignments of fixed parameter values in the unusual case where a fixed parameter had a non-zero design matrix row.
-  mark.wrapperwas modified so it returns a list of the models that were constructed if run==FALSE. Thanks to Eldar Rakhimberdiev for the suggestion and code.
- Code was added to - extract.mark.outputto extract deviance degrees of freedom. Thanks again to Eldar for contributing this code.
- A bug was fixed in - make.mark.modelthat prevented use of sin link on within session parameters in a robust design model. Thanks to Tommy Garrison for reporting this bug.
- A bug was fixed in - make.mark.modelwhich prevented the use of time varying covariates with shared parameters. Thanks to Andre Breton for reporting this bug.
- simplify argument was removed from functions because I have not found a reason not to simplify and I have not been testing code with simplify=FALSE. 
Version 2.0.7 (25 August 2011)
- Change to - make.mark.modelto fix bug in which mlogits were incorrectly assigned in ORDMS model when both Psi and pent used mlogit links. Thanks to Glenn Stauffer for identifying and tracking down this error.
- Fixed - export.chdataand- export.MARKso Nest survival models can be exported. Also changed default of argument ind.covariates to "all" which will use all individual covariates in the data in the file sent to MARK. Thanks to Jay Rotella for his help.
- Made change to - process.dataand- markto include a new argument reverse, which if set to TRUE with model="Multistratum" will reverse the timing of transition and survival. See- mstratafor an example of a reverse multistratum model.
- Made change to - make.design.datato allow for zero time intervals in non-robust design model. This was needed to allow use of the reverse time structure in multi-state models. In addition, for the reverse multistate model the function now adds an occasion (occ) field to the design data because the time field will be constant when with a 0 time interval. The row names of the design matrix and real parameters was extended for this model to include occasion (o) and occasion cohort (oc) to create unique labels because cohort and time are not unique with 0 time intervals.
Version 2.0.6 (1 July 2011)
- Change to MR resight examples so the output does not appear in notepad which was causing problem with check on CRAN submission 
Version 2.0.5 (29 June 2011)
- Order of arguments for - model.average.marklistwere switched incorrectly such that ... was the second argument. This has been fixed to the original format where ... is at the end. This resulted in the value of any arguments other than the first to be ignored unless specifically named e.g.,- parameter="Phi". Thanks to Rod Towell for reporting this error.
- 
Additional changes were made to .First.lib to 1) examine any MarkPath setting, 2) look in C:/program files/mark or c:/program files (x86)/mark, 3) or to search the path. If MARK executable cannot be found in any of those ways then a warning is issued that the user needs to set MarkPath to the path specification for the location of the MARK executable. Thanks to Bryan Wright for help with tracking down a bug. 
- A bug in - add.design.datawas fixed where the function gave incorrect results in pim.type was anything other than "all". Thanks to Jeff Hostetler for reporting this error.
- The mark-resight models PoissonMR, LogitNormalMR, and IELogitNormalMR were added. This required some changes to - make.mark.model,- make.design.dataand- compute.design.dataand the addition of an argument- countsto- process.datato provide mark-resight count data that are not in the capture history format. The format for- countsis a named list of vectors (one group) or matrices (each group is a row) where the names of the list elements are specific to the model. Currently there is no checking to make sure these are named correctly. Some of these models are very sensitive to starting values; thus the use of initial values in the examples. Thanks to Brett McClintock for his help incorporating these models.
Version 2.0.4 (1 June 2011)
- 
Change made to .First.lib to check for availability of mark software that depends on operating system. It now provides a warning rather than stopping package attachment. This allows the user to set MarkPath to some location outside of default location Program Files or Program Files (x86) without changing path which requires administrator privilege on Windows. 
- 
Change made to run.mark.modelto use shQuote because link to mark.exe was not working in some cases.
Version 2.0.3 (17 May 2011)
- Now requires R 2.13 to use path.package function 
- 
Change made to .First.lib to check for availability of mark software that depends on operating system. 
Version 2.0.2 (16 May 2011)
- 
MarkPath no longer needs to be set if mark.exe is no longer in the default location (c:/Program Files/mark) but is specified in the path. The code now uses the R function Sys.which to find the correct location for mark.exe, if it is contained in the Path. 
- Code for crm models was removed from RMark and moved to a different R package called marked that is under-development. This removes the FORTRAN code and accompanying dll and some functions and help files that were extraneous to RMark capabilities. There is still some code in some functions for crm that could be removed at some point. None of this matters to those who use RMark for its original purpose as an interface to mark.exe. 
- Many superficial changes were made to code so it could be posted on CRAN. Three changes that may be noticed by users involved renaming deriv.inverse.link, summary.ch and merge.design.covariates to deriv_inverse.link, summary_ch, and merge_design.covariates. These names conflicted with the generic functions deriv, summary and merge. It is only the last 2 that you may have in your scripts and you will have to rename them. The deprecated function merge.occasion.data was removed. Sorry for any inconvenience. 
- The dependency on Hmisc for the examples was removed by replacing errbar with plotCI. 
Version 2.0.1 (21 Feb 2011)
- Made a change to - run.mark.modelto handle output filenames that exceeded mark9999.
- Added an argument - prefixin- mark,- run.mark.modeland- cleanup. Like other parameters for- mark, it can also be used in- mark.wrapperas one of the ... arguments. Previously the mark files have always been named "marknnn.*". By specifying- prefixyou can now create sets of models with different prefixes. For example,- prefix="cu"would result in cu001.*(cu001.out, cu001.inp, etc), cu002.* etc. This provides the ability to name files to do things like naming them based on the species being analyzed. In general, there is no need to work with these files directly because the filename.* is stored with each- markobject and the various R functions use that link to provide the information from the files. If you use prefixes other than "mark", you'll need to call call- cleanupwith each prefix to remove unused files. See- run.mark.modelfor an example that shows use of the prefix argument to split the dipper data into separate analyses for each sex. Note that use of prefix was not mandatory here to separate the analyses but it provided a useful example.
- Additional usefulness has been coded for argument - initialfor assigning initial values to beta parameters. Previously, the options were either a vector of the same length as the new model to be run or a previously run model in a- markobject from which equivalent betas are extracted based on their names in the design matrix. Now if the vector contains names for the elements they will be matched with the new model like with the model option for initial. If any betas in the new model are not matched, they are assigned 0 as their initial value, so the length of the- initialvector no longer needs to match the number of parameters in the new model as long as the elements are named. The names can be retrieved either from the column names of the design matrix or from- rownames(x$results$beta)where- xis the name of the- markobject.
- Using the feature above, I added a new argument - use.initialto- mark.wrapper. If- use.initial=TRUE, prior to running a model it looks for the first model that has already been run (if any) for each parameter formula and constructs an- initialvector from that previous run. For example, if you provided 5 models for p and 3 for Phi in a CJS model, as soon as the first model for p is run, in the subsequent 2 models with different Phi models, the initial values for p are assigned based on the run with the first Phi model. At the outset this seemed like a good idea to speed up execution times, but from the one set of examples I ran where several parameters were at boundaries, the results were discouraging because the models converged to a sub-optimal likelihood value than the runs using the default initial values. I've left this option in but set its default value to FALSE.
- A possibly more useful argument and feature was added to - mark.wrapperin the argument- initial. Previously, you could use- initial=modeland it would use the estimates from that model to assign initial values for any model in the set defined in- mark.wrapper. Now I've defined- initialas a specific argument and it can be used as above but you can also use it to specify a- marklistof previously run models. When you do that, the code will lookup each new model to be run in the set of models specified by- initialand if it finds one with the matching name then it will use the estimates for any matching parameters as initial values in the same way as- initial=modeldoes. The model name is based on concatenating the names of each of the parameter specification objects. To make this useful, you'll want to adapt to an approach that I've started to use of naming the objects something like p.1,p.2 etc rather than naming them something like p.dot, p.time as done in many of the examples. I've found that using numeric approach is much less typing and cumbersome rather than trying to reflect the formula in the name. By default, the formula is shown in the model selection results table, so it was a bit redundant. Now where I see this being the most benefit. Individual covariate models tend to run rather slowly. So one approach is to run the sequence of models (eg results stored in initial_marklist), including the set of formulas with all of the variables other than individual covariates. Then run another set with the same numbering scheme, but adding the individual covariates to the formula and using- initial=initial_marklistThat will work if each parameter specification has the same name (eg., p.1=list(formula=~time) and then p.1=list(formula=~time+an_indiv_covariate)). All of the initial values will be assigned for the previous run except for any added parameters (eg. an_indiv_covariate) which will start with a 0 initial value.
- I added a new function - search.output.filesto the set of utility functions. This can be useful to search all of the output files in a- marklistfor a specific string like "numerical convergence suspect" or just "WARNING". The function returns the model numbers in the- marklistthat contain that string in the output file.
- Further changes were needed to - popan.derivedto handle data that are summarized (i.e. frequency of capture history >1). Thanks to Carl Schwarz for reporting and finding the change needed.
- A bug was fixed in - create.dmwas fixed so it would return a matrix instead of a vector with the formula ~1
Version 2.0.0 (14 Jan 2011)
- The packages msm, Hmisc, nlme, plotrix are now explicitly required to install RMark. These were used in examples or for specialty functions, but to avoid problems these must be installed as well. 
- Added example for "CRDMS" model that was created by Andrew Paul. See - crdms.
- Change was made for gamma link in v1.9.3 was only made to Pradel model and not Pradsen like it stated. Both now correctly use the logit link as the default for gamma. Thanks to Gina Barton for bringing this to my attention. 
- Change was made in - make.mark.modelthat prevented use of groups with Nest survival models. Thanks to Jeff Warren for bringing this to my attention.
- Change was made in - make.design.databecause re-ordering of parameters caused issues with the CRDMS model because the- subtract.stratumwere not being set for- Psi.
Version 1.9.9 (2 Nov 2010)
- This version was built with R 2.12 and will not work with earlier versions of R. It contains both 32 and 64 bit versions and R will automatically ascertain which to use. 
- Parameter ordering for some models (RDHet, RDFullHet, RDHHet, RDHFHet, OccupHet, RDOccupHetPE, RDOccupHetPG, RDOccupHetEG, MSOccupancy, ORDMS, and CRDMS) had to be changed such that the models could be imported into the MARK interface. This change can influence any code you have written for those models if you specified parameter indices because the ordering of the parameters were changed. For example, see change in example for - RDOccupancyto use indices=c(1) from c(10). Thanks to Gary White for helping me work this out.
- Modified code in - extract.mark.outputto handle cases with more than 9999 real parameters because MARK outputs **** when it exceeds 9999.
Version 1.9.8 (15 Sept 2010)
- Added - model="CRDMS"with parameters S, Psi, N, p, and c for closed robust design multi-state models
- Patched - popan.derivedwhich produced incorrect abundance estimates with unequal time intervals. Thanks to Andy Paul for finding this error and testing for me.
- Patched - export.MARKwhich failed for robust design models. Thanks to Dave Hewitt and Gary White for discovering and isolating the problem.
Version 1.9.7 (14 April 2010)
- 
Added model="Brownie"with parameters S and f which is the Brownie et al. parameterization of the recovery model. "Recovery only" (in RMark)model="Recovery"which is also encounter type "dead" in MARK uses the Seber parameterization with parameters S and r which is also used in the models for live and dead encounter models.
- Added - model="MSLiveDead"with parameters S, r, Psi and p. It is the multistate version of the Burnham model in which F=1.
- Added - compute.Snto utility functions for computation of natural survival from total survival when all harvest is reported. Patched- nat.survwhich was incorrectly rejecting based on model type.
- Added - var.components.remlto provide an alternate variance components estimation using REML or maximum likelihood. It allows a random component that is not iid which is all that- var.componentscan do.
- Replaced all T/F values with TRUE/FALSE to avoid conflicts with objects named TRUE or FALSE. 
Version 1.9.6 (1 February 2010)
- Writing the - popan.derivedfunction has led me down all sorts of paths. I had to make one small change to this function to handle externally saved models. However, various changes listed below were brought on by using this function with a relatively large POPAN model.
- Most importantly I discovered an error in computations of real parameters with an mlogit link in which some of the real parameters involved in the mlogit link were fixed. This is NOT a problem if you simply used the real parameter values extracted from MARK; however, if you were using either - compute.real(model.average uses this function) or- covariate.predictionsto compute those real parameters (with an mlogit link) then they may be incorrect. This would have been apparent because their value would have changed relative to the original values extracted from the MARK output. Correcting this error involved changes in- compute.real,- convert.link.to.realand- covariate.predictionsto correct the real parameter estimates and their standard errors. I've not found it in the MARK documentation but deduced that if you use the mlogit link and fix real parameters it uses those fixed real parameter values in the calculation. A simple example will make it clear. Consider pent for 5 occasions where the first is computed by subtraction and you then have 4 real parameters. Let's assume that the 3rd and fourth parameters were fixed to 0. Then the real parameters are calculated as follows: pent2=exp(beta2)/(1+ exp(beta2)+exp(beta3)+exp(0)+exp(0)), pent3=exp(beta3)/(1+ exp(beta2)+exp(beta3)+exp(0)+exp(0)), pent4=0, pent5=0 and pent1=1-pent2-pent3-0-0 (in this case). Obviously you would not want to fix any real parameters to be >1, <0 or to have the sum to be <0 or >1. This structure also had implications on how the standard error was calculated.
- In addition the coding was made more efficient in - covariate.predictionsfor the case where- data(index=somevector)is used without any data entries for covariate values in the design matrix. While the task performed with that use of the function could be done with- compute.real, it is useful to have the capability in- covariate.predictionsas well because it will then model average over the listed set of parameters. The previous approach to coding was inefficient and led to very large matrices that were unnecessary and could cause failure with insufficient memory for large analyses.
- Calculation of NGross was added to - popan.derivedand logical arguments- Nand- NGrosswere added to control what was computed in the call. In addition, argument- dropwas added which is passed to- covariate.predictionsto control whether models are dropped when variance of betas are not all positive.
-  var.componentswas modified to use qr matrix inversion. The returned value for beta is now a dataframe that includes the std errors which are extracted from the vcv matrix. Also, if the design matrix only uses a portion of the vcv matrix, the appropriate rows and columns are now extracted. Prior to this change, the standard errors would have been unreliable if the design matrix didn't use the entire set of thetas and vcv matrix.
Version 1.9.5 (4 December 2009)
- A bug in - covariate.predictionswas fixed that would assign fixed values incorrectly if the parameter indices were specified in anything but ascending order. The error would have been obvious to anyone that may have encountered it because estimated parameters would likely have been assigned a fixed value. In most cases indices would be passed in order if they were selected from the design data unless indices were chosen from more than one parameter type. I discovered it using the- popan.derivedfunction I added in v1.9.4 because it requests indices for multiple parameters in a single function call.
Version 1.9.4 (6 November 2009)
- Note that this version was built with R 2.10 which no longer supports compiled help files (chtml). If you were using compiled help files to get help with RMark, you'll need to switch to regular html files by using options(help_type="html") in R. You can put this command in your RProfile.site file so it is set up that way each time you start R. The functionality is the same but it is not as pretty. You can find the index (what used to be in a window on the left) as a link at the bottom of each help page. 
- Changed - export.MARKso it will not allow selection of a project name that would over-write an existing .inp file.
- Added the function - popan.derivedwhich for POPAN models computes derived abundance estimates by group and occasion and sum of group abundances for each occasion. For some reason RMark is unable to extract all of the derived parameters from the MARK binary file for POPAN models. This function provides the derived abundance estimates and their var-cov matrix and adds the abundance estimate sum across groups for each occasion which is not provided by MARK. Note that by default confidence intervals are based on a normal distribution to match the output of MARK, but if you want log-normal intervals use- normal=FALSE.
- The - model.average.listand- model.average.marklistfunctions were modified to use revised estimator for the unconditional standard error (eq 6.12 of Burnham and Anderson (2002)) which is now the default in MARK. To use eq 4.9 (the prior formula) set the argument- revised=FALSE.
- Fixed bug in - model.average.listwhich in some cases failed when the list of var-cov matrices were specified.
- Code in - model.average.marklistwas changed to set standard error to 0 if the variance is negative. The same is done in the var-cov matrix for the variance and any corresponding covariances. The results from RMark will now match the model average results from MARK for this case. It is not entirely clear that this is the best approach when ill-fitted models are included.
- Changed code in - cleanupto handle case in which a model did not run.
- Changed use of - grepand- regexprin- convert.inpand- extract.mark.outputto accomodate change in R.2.10. The code should work in earlier versions of R but if not update to R2.10.
- 
Created a function adjust.valueand kept special case ofadjust.chat. For any field other thanchatit will adjust the value inmodel$results. As an example, to adjust the effective sample size (ESS) usemodel.list=adjust.value("n",value,model.list)where value is replaced with the ESS you want to use. As part of the changemodel.tablewas changed to recompute AICc.
Version 1.9.3 (24 September 2009)
- Default link function for Gamma in the Pradel seniority model had been incorrectly set to log and has now been changed to logit to restrict it to be a probability. 
- A bug was fixed in - compute.realand- covariate.predictionsin which confidence intervals were being incorrectly scaled by c (chat adjustment) instead of sqrt(c). The reported standard errors were correctly using sqrt(c) and only the confidence intervals for the real predictions were too large. Simply re-running the prediction computations for a model will provide the correct results. There is no reason to re-run the models. Also this in no way affects model selection.
- A related bug was fixed in - compute.realand- covariate.predictionswhich created invalid confidence intervals for real parameters if a probability link other than logit was used and a single type of link was used for all real parameters (e.g., sin).
Version 1.9.2 (10 August 2009)
- Added the function - export.MARKwhich creates a .Rinp, .inp and optionally renames one or more output files for import to MARK. The July 2009 version of MARK now contains a File/RMARK Import menu item which will automatically create the MARK project using the information in these files. This prevents problems that have been encountered in creating MARK projects with RMark output because the data/group structures are setup exactly in MARK as they were in RMark. See- export.MARKfor an example and instructions.
- Fixed a problem in - make.design.datawhich prevented use of- remove.unusedwith unequal time intervals and more than one group.
- At least one person has encountered a problem with a very large number of parameters in which RMark created the input file with PIMs written in exponential notation for the larger indices. MARK will not accept that format and it will fail. The solution to this is to set the R option scipen to a positive number. Start with options(scipen=1) and increase if necessary. 
Version 1.9.1 (2 June 2009)
- Fixed a problem - make.design.datawhich was not using- begin.timeto label the session values
- Made a change in - export.chdatalike the change in- make.mark.modelto accomodate change with release of version R2.9.0.
- Made a change in - process.dataso that- strata.labelscan be specified for Multistrata designs like with ORDMS so an unobserved strata can be included.
- A warning was added to the help file for - export.chdataand- export.modelso it is clear that the MARK database must be created correctly and with the .inp file created by- export.chdatafrom the processed data that was used to create the models that are being exported. This is to ensure that the group structure is setup such that the assumed model structure for groups matches the model structure setup in the .inp file.
Version 1.9.0 (30 April 2009)
- Fixed a bug in - summary.markwhich occasionally produced erroneous results with- showall=FALSE.
- Made a change in - make.mark.modelto accomodate change with release of version R2.9.0.
- RMark now requires R version 2.8.1 or higher. 
Version 1.8.9 (9 March 2009)
- Changed - model.average.marklistand- covariate.predictionsto set NaN or Inf results in v-c matrix to 0 to cope with poorly determined models. Also, for each function the dropping of models is now restricted to cases in which there are negative variances for the betas being used in the averaged parameter estimates. Unused betas are ignored. For example, if- model.averageis called with- parameter="Phi", then the model will only be dropped if there is a negative variance for one of the betas associated with "Phi".
- In - var.componentsthe tolerance value (- tol) in the call to- unirootwas reduced to 1e-15 which should provide better estimates of the process variance when it is small. Previously a process variance less than 1e-5 would be treated as 0.
- Made changes to - cleanup,- coef.mark, and- make.mark.modelto accomodate externally saved model objects (- external=TRUE).
Version 1.8.8 (5 December 2008)
- An error was fixed in - make.time.factorwhich created incorrect assignments when only some of the time dependent variables contained a "." for occasions with no data.
- Patched - compute.design.datawhich was not creating the design data in the same order as the PIM construction for the newly added- ORDMSmodel.
- Generalized section of code in - make.mark.modelto handle- mlogitstructure for- ORDMSmodel.
- Fixed a bug in - process.datain which the initial ages were not correctly assigned in some situations with multiple grouping variables. Note that it is always a good idea to examine the design data after it is created to make sure it is structured properly because it relates the data and model structure via the grouping variables and the pre-defined variables (ie age, time etc), While I've done a lot of testing, I have certainly not tried every possible example and there is always the potential for an error to occur in a circumstance that I've not encountered.
Version 1.8.7 (13 November 2008)
- An argument - common.zerowas added to function- make.design.dataand- compute.design.data. It can be set to TRUE to make the- Timevariable have a common time origin of- begin.timewhich is useful for shared parameters like- pand- cin closed capture and similar models.
- The function - read.mark.binarywas patched to work with the newer versions of MARK.EXE since 1 Oct 2008.
- 
The model type ORDMSfor open robust design multi-state models was added. An example data set will be added at a later date after further testing has been completed.
- Some patches were made to fix some aspects of profile intervals and to fix adjustment by chat in - summary.markwhen- showall=F. The notation for profile intervals is now included in the field- model$results$real$notewhere- modelis the name of a mark model. Previously an incomplete notation was kept in- model$results$real$fixedbut that field is now used exclusively to denote fixed parameters. It is important to realize that profile intervals computed by MARK are only found in- model$results$real$noteand are not changed by a- chatadjustment unless the model is re-run. None of the intervals computed by- RMarkand displayed by- summary.markare profile intervals.
Version 1.8.6 (28 October 2008)
- A bug in an error message for - initial.agesin- process.datawas fixed.
- A new function - var.componentswas added to provide variance components capability as in the MARK interface except that shrinkage estimators are not computed currently.
- Fixed parameter values are now being reported correctly by - covariate.predictions. Also over-dispersion (c>1) was not being included in the variances for parameters except those using the mlogit link.
- Some utility functions were added including - pop.est,- nat.surv, and- extract.indices.
- The function - model.averagehas been changed to a generic function. Currently it supports 2 classes: 1) list, and 2) marklist. The latter was the original- model.averagewhich has been renamed- model.average.marklistand the first argument has been renamed- xinstead of- model.listto match the standard generic function approach. The previous syntax- model.average(...)will work as long as the usage does not name the first agument as in the example- model.average(model.list=dipper.results,...). The list formulation (- model.average.list) was created to enable a generic model averaging of estimates instead of just real parameter estimates from a- markmodel. It could be used with any set of estimates, model weights and estimates of precision.
- A change is needed to - read.mark.binaryto accomodate the change to mark.exe with the version dated 1 Oct 2008. Some data types (notably Nest survival) may not work with the new version of mark.exe. Working with Gary to make the patch. If you need an older version of mark.exe contact me.
Version 1.8.5 (8 October 2008)
- A bug in - process.datawas fixed that prevented use of a dataframe contained in a list while using the- groupsargument.
- Profile intervals on the real parameters can now be obtained from MARK using the arguments - profile.intand optionally- chatin- mark. The argument- profile.intcan be set to- TRUEand a profile interval will be constructed for all real parameters, or a vector of parameter indices can be specified to restrict the profiling to certain parameters. The value specified by- chatis passed to MARK for over-dispersion.
- 
References to cjs, js etc have been removed from here because this code was removed 5/11/11. 
- Yet another fix to - summary.chwhich gave incorrect results for the number recaptured at least once when- marray=Fand the data contained non-unity values for- freq.
Version 1.8.4 (29 August 2008)
- A generic function - coef.markwas added to extract the table of betas from the model with the expression- coef(model)where- modelis a- markmodel that has been run and contains output. The table includes standard errors and confidence intervals.
- An argument - briefwas added to- summary.mark. If- brief=TRUEthe real parameters are not included in the summary.
- References to cjs, js etc have been removed from here because this code was removed 5/11/11. 
- A bug in - summary.chwas fixed. It would produce erroneous results when the data contained a non-constant- freqfield. Results with the default of- freq=1were fine.
- The function - adjust.chatand its help file were changed such that it was clear that a- model.listargument was needed.
Version 1.8.3 (25 July 2008)
- For robust design models, an error trap was added to - process.datato make sure that the capture history length matches the specification for the- time.intervals. This error was already trapped for non-robust models.
- Fixed an error in - make.mark.modelthat prevented interaction model of session/time-specific individual covariates in a robust design model.
- 
Fixed an error in process.dataso that the fieldfreqis optional for nest survival data sets.
-  print.markwas modified to add an argumentinputwhich if set toinput=TRUEwill have the MARK input file be displayed rather than the output file. Also,wait=FALSEwas set in the system command which means the viewer window will be opened and you can carry on with R. Before you had to close the viewer window before proceeding with R.
- An example - RDOccupancyprovided by Bret Collier was added for the Robust Occupancy model which shows the use of session and time-varying individual covariates in a robust design model.
Version 1.8.2 (26 June 2008)
-  summary.chwas modified to allow missing cohorts (no captures/recaptures) for an occasion and to fix a bug in whichbygroup=FALSEdid not work when groups were defined.
- To avoid running out of memory, an argument - externalhas been added to- collect.models,- mark,- rerun.mark, and- run.mark.model. As with all arguments of- mark,- externalcan also be set in- mark.wrapper. Likewise,- externalcan also be set in- run.modelsand it is passed to- run.mark.model. The default is- external=FALSEbut if it is set to- TRUEthen the mark model object is saved in an external file with an extension- .rdaand the same base filename as its matching MARK output files. The mark object in the workspace is a character string which is the name of the file with the saved image (e.g., "mark001.rda"). If- external=TRUEwith- mark.wrapperthen the resulting marklist contains a list entry for each mark model which is only the filename and then the last entry is the- model.table. All of the functions recognize the dual nature of the mark object (i.e., filename or mark object) in the workspace. So even if the mark object only contains the filename, functions like- print.markor- summary.markwill work. However, if you have used- external=TRUEand you want to look at part of a mark object without using one of the functions, then use the function- load.model. Whereas, before you may have typed- mymark$results, if you use- external=TRUE, you would replace the above with- load.model(mymark)$results.
- Functions - storeand- restorewere created to- storeexternally and- restoremodels from external storage into the R workspace. They work on a- marklistand are only needed to- storeexternally existing marklist models or ones originally created with- external=FALSEor to- restoreif you change your mind and decide to keep them in the R workspace.
- Error in setup for robust design occupancy models with more than 2 primary sessions was fixed. The error resulted in mark.exe crashing. 
- The concept of time-varying individual covariates has been expanded to include robust design models which have both primary (session) and secondary (time) occasion-specific data. For a robust design, a time-varying individual covariate can be either session-dependent or session-time dependent. As an example, if there are 3 primary sessions and each has 4 secondary occasions, then the individual covariates can be named x1,x2,x3 to be primary session-dependent or named x11,x12,x13,x14,x21,x22,x23,x24,x31,x32,x33,x34. The value of x can be any name for the covariate. In the formula only the base name is used (e.g., - ~x) and RMark fills in the individual covariate names that it finds that match either the session or session-time individual covariates.
Version 1.8.1 (19 May 2008)
- Added function - summary.chto provide summaries of the capture history data (resighting matrices and m-arrays). It will not work with all types of models at present. It will work with CJS and Jolly-type models.
- Added argument - model.nameto- model.tableto be able to use alternate names in the model table. It can use either the model name with each mark object which uses a formula notation (the current approach) or it can use the name of the R object containing the mark model (- model.name=FALSE). See- model.tablefor an example. Also, the help file for- model.tablewas updated to reflect the code changes implemented in version 1.7.3.
- Code in - mark.wrapperwas modified to output the number of columns and column names of the design matrix for each model if- run=FALSE. This allows a check of each of the columns included in the model. By reviewing these you can assess whether the model was constucted as you intended. If there is any question you can either use- model.matrixor- make.mark.modelto examine the design matrix more thoroughly.
- A bug in the new function - merge.design.covariateswas fixed in which- mergewas sorting the design data which does obvious bad things. Adding- sort=FALSEdoes not appear to mean that the data frame is left in its original order. To prevent this, the dataframe is forced to remain in its original order by adding a sequence field for re-sorting after the merge.
Version 1.8.0 (8 May 2008)
- 
Fixed a bug in model.averagethat caused it to fail and issue an error when any of the models included a time dependent covariate in the parameter being averaged.
- Added - merge.design.covariateswhich is meant to replace- merge.occasion.data. This new function allows covariates to be assigned by- time,- timeand- group, or just- group. It also uses a simplified list of arguments and works with individual design dataframes rather than the entire ddl. It uses the R function- mergewhich can be used on its own to merge design covariates into the design data. You can use- mergedirectly as this function only checks for some common mistakes before it calls- mergeand it handles reassignemnt of row names in the case were design data have been deleted. An example, where you might want to use- mergeinstead of this function would be situations where the design data are not just group, time or group-time specific. For example, if groups were specified by two different factor variables say initial age and region and the design covariates were only region-specific. It would be more efficient to use- mergedirectly rather than this function which would require an entry for each group which would be each pairing of inital age and region. If you use- mergeand you deleted design data prior to merging, save the row.names, merge and then reassign the row.names.
- An argument - runwas added to- mark.wrapper. If set to FALSE, then it will run through each set of models in- model.listand try to build each model but does not attempt to run it. This is useful to check for and fix any errors in the formula before setting off a large run. If you use- run=FALSEdo not include arguments that are meant to be passed to- run.mark.modellike- adjust.
Version 1.7.9 (7 April 2008)
-  make.design.datawas fixed so thatremove.unused=Twill work properly when differentbegin.timevalues are specified for each group.
Version 1.7.8 (12 March 2008)
- Changed the default link for N to log in the - setup.parametersfor the- HetClosedand- FullHet. It was incorrectly set to logit which created incorrect estimates to be computed in- model.averagebecause MARK forces the log link for N regardless of what is set in the input file.
Version 1.7.7 (6 March 2008)
- Supressed warning message that occurred with code to check the validity of the sin link in - make.mark.model.
- 
Fixed a couple of bugs in covariate.predictionsthat prevented it from working for some cases after including code for the sin link.
- 
Added function release.gofto construct the RELEASE goodness of fit test and extract the TEST2 and TEST3 final chi-square, df and P-values.
Version 1.7.6 (26 Feb 2008)
- 
make.mark.modelwas modified to change the capitalization of the link functions and to remove all spaces after "=" in the input file for mark.exe. These differences were preventing the MARK interface from fully importing the model. Although the model would be imported and could be run inside the MARK interface, median c-hat would not run and would give an error stating "Invalid Link" for any model imported from RMark. Now transfering a model from RMark to the MARK interface is fully functional (I hope). If you want to import an output file that was created with a prior version of RMark without re-running it, use a text editor on the output file and remove any spaces before and after an = sign. Then change the capitalization of the links to "Logit", "MLogit", "Log", "LogLog", "CLogLog", "Identity".
- The sin link is now supported if the formula for the parameter generates an identity matrix for the parameter. For example, if you use ~-1+time instead of ~time then the resulting design matrix will be an identity for time. Likewise, for interactions use ~-1+group:time instead of ~group*time. If you select the sin link and the resulting design matrix is not an identity for the parameter, an error will be given and the run will stop. 
- To match the output from MARK, the confidence intervals for real parameters using any 0-1 link including loglog,cloglog,logit and sin are now computed using the logit transformation. For previous versions this will only affect any results that were using loglog and cloglog. Previously, it was using the chosen link to compute the se and the interval endpoints. The latter is still used for the log and identity links which are not bounded in 0-1. 
- The model "Jolly" was added to the supported list of models. Parameters include Phi,p,Lambda,N. It is not a particularly numerically stable model and often will not converge. Use of options="SIMANNEAL" in call to - markis recommended for better convergence. It will take much longer to converge but is mroe reliable.
Version 1.7.5 (24 Jan 2008)
- 
model.averagewas modified to ignore any models that did not run and either had no attached output file or no results.
- 
read.mark.binaryandextract.mark.outputwere modified to extract and store the real.vcv matrix (var-cov matrix of the simplified real parameters) in the mark object if realvcv=TRUE. The default is realvcv=FALSE. This argument has been added to functionsmark,run.mark.modelandrerun.mark.
- An argument delete has also been added to - markand- run.mark.model. The default value is FALSE but if set to TRUE it deletes all output files created by MARK after extracting the results. This is most useful for simulations that could easily create thousands of output files and after extracting the results the model objects are no longer needed. This is just a convenience to replace the need to call- cleanup.
Version 1.7.4 (10 Jan 2008)
- A bug in - make.mark.modelwas fixed. It was preventing creation of individual (site) covariate models for parameters with only a single parameter (single index) in certain circumstances like Psi1 in the MSOccupancy model.
- The fix to - merge.occasion.datain version 1.7.1 did not work when design data had been deleted. That has been remedied.
- Various functions with some operating specific calls have been modified so they will work on either Windows or Linux. Thus, the there is a single file for all source/help for both operating systems in RMarkSource.zip. It can be downloaded to either Windows or Linux to build the package. You need to build the package for Linux but not for Windows. For Windows, you only need RMark.zip which contains the pre-built package which only needs to be installed. Currently, with Linux the variable MarkPath is ignored and mark.exe is assumed to be in the path. Also, for Linux the default for MarkViewer is "pico" (an editor on some Linux machines). This can be modified in - print.markor by setting MarkViewer to a different value. The one Linux specific function is- read.mark.binary.linux. The function- extract.mark.outputcalls either- read.mark.binary.linuxor- read.mark.binarydepending on the operating system. A Linux version of mark.exe (32 or 64 bit) can be obtained from Evan.
Version 1.7.3 (4 Jan 2008)
- In working with the occupancy models, it became apparent that it would be useful to have a new function called - make.time.factorwhich creates time-varying dummy variables from a time-varying factor variable. An example is given using observer with the occupancy dataset- wetafrom the MacKenzie et al Occupancy modelling book.
- To match the results in the book, I added arguments - use.AICand- use.lnlto function- model.tableto construct a results table with AIC rather than AICc and -2LnL values. The latter is more useful with a mix of models some using individual covariates and others not.
- A modification was made to - make.mark.modelwith the- MSOccupancymodel to fix the name of the added data for parameter- p1when- share=TRUEto be- p2. For an example which uses p2 to construct an additive model, see- NicholsMSOccupancy.
Version 1.7.2 (20 Dec 2007)
- In changing code for the occupancy models, a brace was misplaced which prevented the nest survival models from working. This has been fixed. Also, the example code for - mallardand- killdeerwas modified to exclude the calls to process the input file. This enables use of the function- example()to run the example code (e.g.- example(mallard)). From now on as I add examples they are being included in my test set to avoid this type of problem in the future.
Version 1.7.1 (14 Dec 2007)
- If you update with this version of RMark make sure to update MARK also, so you get the fixes for some of the occupancy models. 
- A minor bug was fixed in function - merge.occasion.datathat created duplicate row names and prevented the design data from being used in a model.
- Thirteen different occupancy models were added. Models in the following list use the designation from MARK: - Occupancy, OccupHet, RDOccupEG, RDOccupPE, RDOccupPG, RDOccupHetEG, RDOccupHetPE, RDOccupHetPG,OccupRNPoisson,- OccupRNNegBin,OccupRPoisson,OccupRNegBin,MSOccupancy.- Hetmeans it uses the Pledger mixture and those with- RDare the robust design models. The 2 letter designations for the RD models are shorthand for the parameters that are estimated. For- EG, Psi, Epsilon, and Gamma are estimated, for- PEgamma is dropped and for- PG, Epsilon is dropped. For the latter 2 models, Psi can be estimated for each primary occasion. The last 5 models include the Royle/Nichols count (RPoisson) and presence (RNPoisson) models and the multi-state occupancy model. See- salamanderfor an example of- Occupancy, OccupHet,- Donovan.7for an example of- OccupRNPoisson,OccupRNNegBin,- Donovan.8for an example of- OccupRPoisson,OccupRNegBin, see- RDSalamanderfor an example of the robust design models and- NicholsMSOccupancyfor an example of- MSOccupancy.- salamanderdata.
- The functions - create.model.listand- mark.wrapperwere modified so that a list of parameters can be used to loop. This is useful in the situation with shared parameters such as- p1and- p2in the- MSOccupancymodel,- closedmodels etc. See- p1.p2.different.dotin- NicholsMSOccupancyfor an example. It can also be useful if the model definitions are linked conceptually (e.g., when one parameter is time dependent, the other should also be time dependent).
- The "." value in an encounter history is now acceptable to RMark and gets passed to MARK for interpretation as a missing value. 
-  print.marklistwas fixed to show the model table properly after a c-hat adjustment was made. The change in the code in version 1.6.5 to add parameter specific values to the model table had the side-effect of dropping the model name if c-hat was adjusted.
Version 1.7.0 (7 Nov 2007)
- A function - deltamethod.specialfor computation of delta method variances of some special functions was added. It uses the function- deltamethodfrom the package- msm. You need to install the package- msmfrom CRAN to use it.
- A more complete example ( - mallard) created by Jay Rotella was added for the nest survival model. His script provides a nice tutorial for RMark and the utility of R to provide a wide-open capability to calculate/plot etc with the results. It also demonstrates the advantages of scripting in R to document your analysis and enable it to be repeated. Before you use his tutorial you need to install the package plotrix from CRAN. At a later date, Jay has said he will add some additional examples to demonstrate use of the- deltamethodfunction to create variances for functions of the results from MARK.
- Various changes were made to help files. A more complete description of - cleanupwas given to tie into- mallardexample.
Version 1.6.9 (10 Oct 2007)
- 
Nest survival model was added to list of MARK models supported by RMark. See killdeerfor an example. Note that the data structure for nest models is completely different from the standard capture history so the functionsimport.chdata,export.chdataandconvert.inpdo not work with nest data structure.
- Slight change was made to - run.mark.modeland- print.markto accomodate change in R 2.6.0.
Version 1.6.8 (2 Oct 2007)
- Changes were made to - merge.occasion.datato enable group and time-specific design covariates to be added to the design data.
- Change was made to - setup.parametersto use a log-link for N in the closed-capture models. MARK forces that link for N but the change was needed for- model.averagewhich does the inverse-link computation. Note that the reported N in- model.averageis actually f0 (number not seen). To get the correct values for N simply add M_t+1 (unique number captured) to f0. That is the way MARK computes N. The std error and confidence interval is on f0 such that the lower ci on N will never be less than M_t+1.
- An error was fixed in the output of - model.average. When you selected a specific parameter, it was giving a UCL which was a copy from one of the models and not the UCL from the model averaging. If you didn't specify- vcv=Tit only showed the errant UCL and if you did specify- vcv=Tthen it showed the correct LCL and UCL but then added the errant UCL in a column. This occurred because it was adding covariate data for the specific parameter and was shifted a column because of a change in 1.6.1.
Version 1.6.7 (7 Aug 2007)
- Changes were made in - print.mark,- print.summary.markand- compute.design.datato acommodate changes in V2.5.1 of R. When upgrading versions of R problems may occur if RMark was built with an earlier version of R. The version of R that was used to build RMark is listed on the screen each time it is loaded with library(RMark)- This is RMark 1.6.7 Built: R 2.5.1; i386-pc-mingw32; 2007-08-07 09:00:33; windows
- The help file for - import.chdatawas expanded to clarify the differences between it and- convert.inpand the use of the- freqfield.
Version 1.6.6 (14 May 2007)
- Function - make.mark.modelwas fixed so that the real label indices were properly written when- simplify=FALSEis used.
- Function - make.mark.modelwas also changed to remove the parameter simplification for- mlogitparameters that was added in v1.4.5. I mistakenly assumed that the- mlogitparameters were setup such that the normalization to sum to 1 was done will all the real parameters in the set (i.e., all PSI for a single stratum). In fact, the- mlogitvalues are only specified for the unique real parameters so if there is any simplification and the sum of the probabilities is close to 1 (excluding subtraction value) the values will not be properly constrained. For example, with the- mstratadata if the problem was constrained such that PSI from AtoB was equal to AtoC, it is still necessary to have these as separate real parameters and constrain them with the design matrix. As it turns out, with the- mstrataexample it does not matter because the problem is such that the sum of Psi for AtoB and AtoC is not close to 1 (same for other strata) and any link will work. This change will only be noticeable in situations in which the constraint matters (i.e., the probability for the subtraction parameter is near 0). The change back to non-simplification for- mlogitparameters may increase execution times because the design matrix size has been increased. Previous users of the Multistrata design will see very little difference in there results if they only used models containing stratum:tostratum because that will create an all-different PIM within each- mlogitset. When I ran the- mstrataexamples with this version and compared them to v1.6.5 the results were different but they were differences in the 5th or smaller decimal point due to differences in numerical optimization.
Version 1.6.5 ( 3 May 2007)
- Function - model.tablewas modified to include parameter formula fields in the- model.tabledataframe of a- marklist. Previously only the- model.namewas included which is a concatenation of the individual parameter formulas. The additional fields allows extracting the model table results based on one of the parameter formulas or to create a matrix of model AICc or other values with rows as one parameter and colums as the other. See- model.tablefor an example.
- Function - process.datawas modified such that factor variables used for grouping retain the ordering of the factor levels in the data file. Previously they would revert back to default ordering and the re-leveling would also have to be repeated on the design data also.
- An argument - briefwas added to- markto control amount of summary output.
- Fixed a bug in - get.realthat prevented computation for models without the stored covariate values.
- Added code to - make.mark.modelthat prevents constructing models with empty rows in the design matrix unless the parameter is fixed. For example, if you were to try ~-1+Time for the dipper data, it will fail now because there is no value for the intercept (Time=0).
- Function - mark.wrapperoutputs the model name to the screen before running the model which helps associate any error messages to the model if- output=F.
Version 1.6.4 (7 March 2007)
- A new function - export.modelwas created to copy the output files into the naming convention needed to append them into a MARK .dbf/.fpt database so they can be used with the MARK interface features. This is useful to be able to use some of the features not contained in RMark such as median c-hat and variance component estimation. To create a MARK database, first use- export.chdatato create a .inp file to pass the data into MARK. Start MARK and use File/New to create a new database. Select the appropriate Data Type (model in RMark) and fill in the appropriate values for encounter occasions etc. For the Encounter Histories File Name, select the file you created with- export.chdata. Once you have created the database in the Program MARK interface, click on the Browse menu item and then Output/Append and select the output file(s) (i.e those with a Y.tmp) that you exported with- export.model. Note that this will not work with output files run with versions of RMark prior to this one because the MARK interface will give a parse error for the design matrix. To get around that you can edit the output file and remove the spaces in the line with the design matrix header. For example, it should look as follows- design matrix constraints=7 covariates=7without spaces around the = sign.
- The minor change described above was made in the input file with spacing on the design matrix line to enable proper appending of the output into a MARK .dbf/.fpt database. 
- The function - cleanupwas modified to delete all mark*.tmp files. Do not use- cleanupuntil you have appended any exported models.
- An argument - use.commentswas added to- import.chdatato enable comment fields to be used as row names in the data frame. A comment is indicated as in MARK with /* comment */. They can be anywhere in the record but they must be unique and they can not have a column header (field name).
- Function - create.model.listwas modified such that it only includes lists with a- formulaelement. This prevents collecting other objects that are named similarly but are not model defintions.
Version 1.6.3 (5 March 2007)
- A minor change in - make.mark.modeland- find.covariateswas made to accommodate use of the same covariate in different formulas (e.g. Phi and p). Previous code worked except any call to- get.realwould fail. Previously a duplicate of the covariate was entered in the data file to MARK. Now only a single copy is passed.
- An argument - defaulthas been added to the model definition (- parametersin- make.mark.modeland- model.parametersin- mark). The argument sets the default value for parameters represented by design data that have been deleted.
- Checks were added in - make.mark.modelto fail if any of the individual covariates used are either factor variables or contain NAs. Both could fail in the MARK.EXE run but the error message would be less obvious. Factor variables can work as an individual covariate, if the levels are numeric. But it was easier to exclude all factor variables from being individual covariates. They can easily be converted to a continuous version (e.g. Blackduck$BirdAge=as.numeric(Blackduck$BirdAge)-1). The code for the- Blackduckwas changed to make BirdAge a continuous rather than factor variable. Factor variables can still be used to define groups and then used in the formula. They just can't be used as individual covariates. This change was made because a factor variable was in the data but not defined in- groupsand when it was used in the formula it would create a float error in MARK.EXE and that would be confusing and hard to track down.
Version 1.6.2 (28 Feb 2007)
- The fix in 1.6.1 to avoid the incorrect design matrix was not sufficiently general and created a parse error in R if you attempted to use any design data covariates that were created with a cut function to create factor variables by binning a variable. This has been corrected in this version. 
- The code in - read.mark.binaryhas been changed to skip over the v-c matrix for the derived parameters if it is not found in the file. This was causing an error with the PRADREC model type.
Version 1.6.1 (17 Jan 2007)
- An important bug was fixed in - make.mark.modelin which an incorrect design matrix would be created if you used two individual covariates in the same formula whereby one of the covariate names was contained within the other. For example, if you used ~mass+mass2 where mass2=mass^2, it would actually create a design matrix with columns mass product(mass,mass2) which would be the model mass+mass^3. This happened due to the way the code identified columns where it needed to replace dummy values with individual covariate names. Since mass was contained in mass2 it added mass to the column as a product. The code now does exact matching so the error can no longer occur.
- An argument - indiceswas added to the function- model.averagewhich enables restricting the model averaging to a specific set of parameters as identified by the all-different parameter indices. This is most useful in large models with many different indices such that memory limitations are encountered in constructing the variance-covariance matrix of the real parameters. For example, with a CJS analysis of data with 18 groups and 26 years of data, the number of parameter indices exceeds 22,000. Even by restricting the parameters to either Phi or p with the- parameterargument there are still 11,000 which would attempt to create a matrix containing 11,000 x 11,000 elements which can exceed the memory limit. In most cases, there are far fewer unique parameters and this argument allows you to select which parameters to average.
- Time-varying covariates are no longer needed for all times if the formula is correctly written to exclude them in the resulting design matrix. - make.mark.modelstill reports missing time-varying covariates but will continue to try and fit the model but if the missing variables are used in the design matrix the model will fail. As an example consider a time varying covariate x for recapture times 1990 to 1995. The code expects to find variables x1990, x1991, x1992, x1993, x1994, x1995. However, lets say that the values are only known for 1993-1995. If you define a variable I'll call recap in the design data which has a value 1 for 1993-1995 and a value 0 for 1990-1992 then if you use the formula ~recap:x the resulting design matrix will only use the known variables for 1993-1995 but you will still be warned that the other values (x1990 - x1992) are missing.
- A bug was fixed in - extract.mark.outputwhich prevented it from obtaining more than the last mean covariate value from the MARK output.
-  fill.covariateswas modified such that only a partial list of covariate values need to be specified withdataand the remainder are filled in with default values depending on argumentusemean.
- The output from - summary.markwas modified for real parameters when- se=Tto include- all.diff.indexto provide the indices of each real parameter in the all-different PIM structure. They are useful to restrict- covariate.predictionsand- model.averageto a specific set of real parameters.
- A new function - covariate.predictionswas created to compute real parameter values for multiple covariate values and their variance-covariance matrix. It will also model average those values if a marklist is passed to the function. Two examples from chapter 12 of Cooch and White are provided to give examples of models with individual covariates and the use of this function.
- The default value of - vcvin- model.averagehas been changed to- FALSE.
Version 1.6.0 (27 Nov 2006)
- A bug was fixed in - PIMSwhich prevented it from working with Multistrata models.
- Bugs were fixed in - make.design.datawhich prevented use of argument- remove.unused=Twith Multistrata models and also for any type of model when there were no grouping variables.
- Bugs were fixed in - process.datawhich gave incorrect ordering of intial ages if the factor variable for the age group was numeric and more than two digits. Also, the number of groups in the data was not correct if the number of loss on capture records exceeded the number without loss on capture within a group.
- Bugs were fixed in - setup.parametersand- setup.modelthat prevented use of the Barker model and that reported an erroneous list of model names when an incorrect type of model was selected.
Version 1.5.9 (26 June 2006)
- A bug was fixed in - convert.inpwhich prevented the code from working with groups and two or more covariates. Note that there are limitations to this function which may require some minor editing of the file. The limitations have been added to the help file (- convert.inp).
Version 1.5.8 (22 June 2006)
- Argument - optionswas added to- markand- make.mark.modelwith a default NULL value. It is simply a character string that is tacked onto the- Proc Estimatestatement for the MARK .inp file. It can be used to request options such as NoStandDM (to not standardize the design matrix) or SIMANNEAL (to request use of the simulated annealing optimization method) or any existing or new options that can be set on the estimate proc.
- A bug in - model.tablewas fixed so it would accomodate the change from v1.3 to a marklist in which the model.table was switched to the last entry in the list.
- A bug in - summary.markwas fixed so it would properly display QAICc when chat > 1.
- Function - adjust.chatwas modified such that it returns a marklist with each model having a new chat value and the model.table is adjusted for the new chat value.
- Function - adjust.parameter.countwas modfied so it returns the mark model object rather than using eval to modify the object in place. The latter does not work with models in a marklist and calls made within functions.
Version 1.5.7 (8 June 2006)
- 
Argument datawas added to functionmodel.averageto enable model averaging parameters at specific covariate values rather than the mean value of the observed data. An example is given in the help file.
- Argument - parameterof function- model.averagenow has a default of NULL and if it is not specified then all of the real parameters are model averaged rather than those for a particular type of parameter (eg p or Psi).
- A bug was fixed in function - compute.realthat caused the function to fail for computations of- Psi.
Version 1.5.6 (6 June 2006)
- 
print.summary.markwas modified so fixed parameters are noted.
- 
Argument show.fixedwas added tosummary.markto control whether fixed parameters are shown as NA (FALSE) or as the value at which they were fixed. Ifse=Tthe default isshow.fixed=Totherwiseshow.fixed=F. The latter is most useful in displaying values in PIM format (without std errors), so fixed values are displayed as blanks instead of NA.
- Argument - linkswas added to- convert.link.to.realand the default value for argument- modelis now NULL. One or the other must be given. If the value for- linksis given then they are used in place of the links specified in the- modelobject. This provides for additional flexibility in changing link values for computation (eg use of log with mlogit).
- 
Argument dropwas added tomodel.average. Ifdrop=TRUE(the default), then any model with one or more non-positive (0 or negative) variances is not used in the model averaging.
- An error in computation of the v-c matrix of mlogit link values in - compute.links.from.realswas fixed. This did not affect confidence intervals for real parameters (eg Psi) in- model.averagebecause it uses the logit transformation for confidence intervals on real parameters that use mlogit link (eg Psi).
-  get.realwas unable to extract a single parameter value(eg constant Phi model). This was fixed.
- The argument - parm.indiceswas removed from the functions- compute.realand- convert.link.to.realbecause the subsetting can be done easily with the complete results returned by the functions. This changed the examples in- fill.covariates.
-  compute.realand subsequentlyget.realreturn a fieldfixedwhen se=TRUE that denotes whether a real parameter is a fixed parameter or an estimated parameter at a boundary which is identified by having a standard error=0.
Version 1.5.5 (1 June 2006)
-  modelhas been deleted from the arguments in TransitionMatrix. It was only being used to ascertain whether the model was a Multistrata model. This is now determined more accurately by looking for the presence oftostratumin the argumentxwhich is a dataframe created forPsifrom the functionget.real. The function also works with the estimates dataframe generated frommodel.average. See help forTransitionMatrixfor an example.
- An argument - vcvwas added to function- model.average. If the argument is TRUE (the default value) then the var-cov matrix of the model averaged real parameters is computed and returned and the confidence intervals for the model averaged parameters are constructed. Models with non-positive variances for betas are reported and dropped from model averaging and the weights are renormalized for the remaining models.
- A new function - compute.links.from.realswas added to the library to transform real parameters to its link space. It has 2 functions both related to model averaged estimates. Firstly, it is used to transform model averaged estimates so the normal confidence interval can be constructed on the link values and then back-transformed to real space. The second function is to enable parametric bootstrapping in which the error distbution is assumed to be multivariate normal for the link values. From a single model, the link values are easily constructed from the betas and design matrix so this function is not needed. But for model averaging there is no equivalent because the real parameters are averaged over a variety of models with the same real parameter structure but differing design structures. This function allows for link values and their var-cov matrix to be created from the model averaged real estimates.
Version 1.5.4 (30 May 2006)
- In function - markan argument- retrywas added to enable the analysis to be re-run up to the number of times specified. An analysis is only re-run if there are "singular" beta parameters which means that they are either non-estimable (confounded) or they are at a boundary. Beginning with this version,- extract.mark.outputwas modified such that the singular parameters identified by MARK are extracted from the output (if any) and the indices for the beta parameters are stored in the list element- model$results$singular. The default value for- retryis 0 which means it will not retry. When the model is re-run the initial values are set to the values at the completion of the last run except for the "singular" parameters which are set to 0. Using- retrywill not help if the parameters are non-estimable. However, if the parameters are at a boundary because the optimization "converged" to a sub-optimal set of parameters, then setting- retryto 1 or a suitably small value will often help it find the MLEs by moving away from the boundary. If the parameters are estimable and setting- retrydoes not work, then it may be better to set new initial parameters by either specifying their values or using a model with similar parameters that did converge.
- A new function - rerun.markwas created to simplify the process of refitting models with new starting values when the models were initially created with- mark.wrapperwhich runs a list of models by using all combinations of the formulas defined for the various parameters in the model. Thus, individual calls to- markare not constructed by the user and re-running an analysis from the resulting list would require constructing those calls. The argument- model.parametersis now stored in the model object and it is used by this new function to avoid constructing calls to rerun the analysis. With this new function you only need to specify the model list element to be refitted, the processed dataframe, the design data and the model list element (or different model) to be used for initial values. See- rerun.markfor an example.
- To make - rerun.marka viable approach for all circumstances, the functions- mark.wrapperand- model.tablewere modified such that models that fail to converge at the outset (i.e., does not provide estimates in the output file) are stored in the model list created by the former function and they are reported as models that did not run and are skipped in the model.table by the latter function. This enables a failed model to be reanalyzed with- rerun.markusing another model that converged for starting values.
Version 1.5.3 (25 May 2006)
- In function - get.reala fix was made to accommodate constant pims and a warning is given if the v-c matrix for the betas has non-positive variances.
- In function - make.mark.model, the argument- initialcan now be a single value which is then assigned as the initial value for all betas. I have found this useful for POPAN models. For some models I have run, the models fail to converge in MARK with the default initial values it uses (I believe it uses- initial=0). I have had better luck using- initial=1. By allowing the use of a single value you can use the same generic starting value for each model without figuring out the number of betas in each model. Also note that you can specify another model that has already been run to use as initial values for a new model and it will match parameter values.
- A bad bug was fixed in - cleanupwhich was unfortunately deleting files containing "out", "inp", "res" or "vcv" rather than those having these as extensions. This happened without your knowledge if you chose ask=FALSE. Good thing I had a backup. Anyhow, I have now restricted it to files that are named by RMark with markxxxx.inp etc where xxxx is a numeric value. Thus if you assign your own basefile name for output files you'll have to delete them manually. Better safe than sorry.
Version 1.5.2 (18 May 2006)
- Two new functions were added in this version. - convert.inpconverts a MARK encounter history input file to an RMark dataframe. This will be particularly useful for those folks who have already been using MARK. Instead of converting and importing their data with- import.chdatathey can use the- convert.inpto import their .inp file directly. It can also be used to directly import any of the example .inp files that accompany MARK and the MARK electronic book (http://www.phidot.org/software/mark/docs/book/). The second new function is only useful for tutorials and for first time users trying to understand the way RMark works. The function- PIMSdisplays the full PIM structure or the simplified PIM structure for a parameter in a model. The user does not directly manipulate PIMS in RMark and they are essentially transparent to the user but for those with MARK experience being able to look at the PIMS may help with the transition.
Version 1.5.1 (11 May 2006)
- Functions - compute.linkand- get.linkwere added to compute link values rather than the parameter estimates.
- A function - convert.link.to.realwas added to convert link estimates to real parameter estimates. Previously a similar internal function was used within- compute.realbut to provide more flexibility it was put into a separate function.
- An argument - betawas added to- get.realto enable it to be changed in the computation of the real parameters rather than always using the values in- results$beta.
- A function - TransitionMatrixwas added to create a transition matrix for the Psi values. It is provided for all strata including the- subtract.stratum. Standard errors and confidence intervals can also be returned.
-  make.mark.modelwas modified to includetime.intervalsas an element in the mark object.
Version 1.5.0 (9 May 2006)
- If output file already exists user is given option to create mark model from existing files. Only really useful if a bug occurs (which occurred to me from 1.4.9 changes) and once fixed any models already run can be brought into R by running the same model over and specifying the existing base - filename. Base- filenamevalues are no longer prefixed with MRK to enable this change.
- On occasion MARK will complete the analysis but fail to create the v-c matrix and v-c file. The code has been modified to skip over the file if it is missing and output a warning. 
- Two new functions have been added to ease handling of marklist objects. - merge.markmerges an unspecified number of marklist and mark model objects into a new marklist with an optional model.table.- remove.markcan be used to remove mark models from a marklist. See- dipperfor examples of each function.
- Various changes were made to functions that compute real parameter estimates, their standard errors, confidence intervals and variance-covariance matrix. The functions that were changed include - compute.real,- find.covariates,- get.real,- fill.covariates. For examples, see help for latter two functions.
Version 1.4.9 (3 May 2006)
- Argument - initialof- make.mark.modelwas not working after model simplification was added in v1.2. This was modified to select initial values from the model based on names of design matrix columns rather than column contents which have different numbers of rows depending on the simplification.
- 
extract.mark.outputwas fixed to extract the correct -2LnL from the output file in situations in which initial values were specified.
Version 1.4.8 (25 April 2006)
- Argument - silentwas added to- markand- mark.wrapperwith a default value of- FALSE. This overcomes the problem described above in 1.4.7.
- Code was added to - collect.model.namesto prevent it from tripping up when files contain an asterisk which R uses for special names.
- Use of T and F was properly changed to TRUE and FALSE in various functions to prevent errors when T or F are R objects. 
- Code for naming files was modified to avoid problems when more than 999 analyses were run in the same directory. 
- Bug in setting fixed parameters with argument - fixed=list(index=,value=)was corrected.
- Argument - remove.interceptwas added to parameter definition to force removal of intercept in designs with nested factor interactions with additional factor variables (e.g.,- Psi=list(formula=~sex+stratum:tostratum,remove.intercept=TRUE)).
Version 1.4.7 (10 April 2006)
- An error was fixed in the - Psisimplification code. Note that with the fix in 1.4.2 to trap errors, a side effect is that non-trapped errors that occur in the R code will now fail without any error messages. If the error occurs in making the model, then the model will not be run, but you will not receive a message that the model failed. I may have to make the error trapping a user-settable option to provide better error tracking.
Version 1.4.6 (7 April 2006)
- Assurance code was added to test that the mlogits were properly assigned. An error message will be given if there has been any unforeseen problem created by the simplification. This eliminates any need for the user to check them as described under 1.4.5 above. 
Version 1.4.5 (6 April 2006)
- For multistrata models, the code for creating the mlogit links for Psi was not working properly if there was more than one group. This was fixed in this version. 
- 
Simplification of the PIMS has now been extended to include mlogit parameters. That was not a trivial exercise and while I feel confident it is correct, double check the assignment of mlogit links for complex models, as I have not checked many examples at present. Within a stratum, the corresponding elements for Psi for each of the tostratum (movement from stratum to each of the other strata excluding the subtract.stratum) should have the same mlogit(xxx) value such that it can properly compute the value forsubtract.stratumby subtraction.
Version 1.4.4 (4 April 2006)
- By including the test on model failure, errors that would stop program were not being displayed. This has been fixed in this version. 
- An error was fixed in using time-varying covariates when some of the design data had been deleted. 
Version 1.4.3 (30 March 2006)
- Problem with pop up window has been fixed. It will no longer appear if the model does not converge but the model will show as having failed. 
- An error was fixed in extracting output from the MARK output file when for some circumstances the label for beta parameters included spaces. This now works properly. 
Version 1.4.2 (14 March 2006)
- Errors in the FORTRAN code were preventing completion of large batch jobs. Now these errors are caught and models that fail are reported and skipped over. Unfortunately, it does require user intervention to close the popup window. Make sure you select Yes to close the window especially if you use the default - invisible=FALSEsuch that the window does not appear. If you select No, you will not able to close the window and R will hang.
- A new list element was added to - parametersin- make.design.datafor parameters such as- Psito set the value of- tostratumthat is computed by subtraction. The default is to compute the probabilitity of remaining in the stratum. The following is an example with strata A to D and setting A to be computed by subtraction for each stratum:
 ddl=make.design.data(data.processed,
 parameters=list(Psi=list(pim.type="constant",subtract.stratum=c("A","A","A","A")),
 p=list(pim.type="constant"),S=list(pim.type="constant")))
Version 1.4.1 (11 March 2006)
- A value "constant" was added for the argument - pim.type. Note that- pim.typeis only used for triangular PIMS. See- make.design.data
- Some code changes were made to - make.mark.modelwhich lessen time to create the MARK input file for large models.
- Function - add.design.datawas modified to accomodate robust design and deletion of design data; this was missed in v1.4 changes.
- 
model.nameargument inmarkandmake.mark.modelwas not working. This was fixed.
Version 1.4 (9 March 2006)
- Robust design models added. See - robustfor an example.
- Function - cleanupwas modified so warning messages/errors do not occur if no models/files are found.
- Parameters in the design matrix are now ordered in the same consistent arrangement. In prior versions they were arranged based on their order in the argument call. 
- Argument - rightwas added to- make.design.data,- add.design.dataand in- design.parametersof- make.mark.modelto control whether bins are inclusive on the right (default). The- robustexample uses this argument in a call to- mark.
Version 1.3 (22 Feb 2006)
- Time varying covariates can now be included in the model formula. See - make.mark.modelfor details.
- New model types for Known (Known-fate) and Multistrata (CJS with different strata) were added. See - Blackduckand- mstratafor examples.
- Specific rows of the design data can now be removed for parameters that should not be estimated. Default fixed values can be assigned. The function - make.design.datanow accepts an argument- remove.unusedwhich can be used to automatically remove unused design data for nested models. It's behavior is also determined by the new argument- default.fixedin- make.mark.model.
-  summary.marknow produces a summary object andprint.summary.markprints the summary object. Changes were made to output whense=T.
- A new function - merge.occasion.datawas created to add occasion specific covariates to the design data.
- New functions - mark.wrapperand- create.model.listwere created to automate running models from a set of model specifications for each model parameter.
- The argument - begin.timein- process.datacan now be a vector to enable a different beginning time for each group.
- An argument - pim.typewas added to parameter specification to enable using pims with time structure for data sets with a single release cohort for CJS. See- make.design.data
- Model lists created with - collect.modelsare now given the class "marklist" which is used with- cleanupand- print.marklist(see- print.mark).
- The function - collect.modelsnow places the model.table at the end of the returned list such that each model number in model.table is now the element number in the returned list. Previously it was 1+ that number.
- Input, output, v-c and residual results files from MARK are now stored in the directory containing the - .Rdataworkspace. They are numbered consecutively and the field- outputcontains the base filename. The function- cleanupwas created to delete files that are no longer linked to- markor- marklistobjects.
- Model averaged estimates and standard errors of real parameters can be obtained with the function - model.average.
Version 1.2 (4 Oct 2005)
- By default the PIM structure is simplified to use the fewest number of unique parameters. This reduces the size of the design matrix and should reduce run times. 
- The above change was made in some versions still numbered 1.1, but it contained an error that caused the links command for MARK to be constructed incorrectly. 
-  adjustargument has been added tocollect.modelsto enable control of number of parameters and resulting AIC values.
-  model.listinmodel.tableandadjust.chatcan now also be a list of models created bycollect.modelswhich allows operating on sets of models.
Author(s)
Jeff Laake
Add design data
Description
Creates new design data fields in (ddl) that bin the fields
cohort, age or time. Other fields (e.g., effort value
for time) can be added to ddl with R commands. Also see 
merge_design.covariates to merge data like environmental variables
or any other data that is common to all animals or group/time specific values.
Usage
add.design.data(
  data,
  ddl,
  parameter,
  type = "age",
  bins = NULL,
  name = NULL,
  replace = FALSE,
  right = TRUE
)
Arguments
| data | processed data list resulting from  | 
| ddl | current design dataframe initially created with
 | 
| parameter | name of model parameter (e.g., "Phi" for CJS models) | 
| type | either "age", "time" or "cohort" | 
| bins | bins for grouping | 
| name | name assigned to variable in design data | 
| replace | if TRUE, replace any variable with same name as  | 
| right | If TRUE, bin intervals are closed on the right | 
Details
Design data can be added to the parameter specific design dataframes with R
commands and this function does NOT have to be used to add design data. However,
often the additional fields will be functions of cohort,
age or time.  add.design.data provides an easy way to
add fields that bin (put into intervals) the original values of
cohort, age or time.  For example, age may have
levels from 0 to 10 which means the formula ~age will have 11
parameters, one for each level of the factor.  It might be more desirable
and more parsimonious to have a simpler 2 age class model of young and
adults.  This can be done easily by adding a new design data field that bins
age into 2 intervals (age 0 and 1+) as in the following example:
ddl=make.design.data(proc.example.data) ddl=add.design.data(proc.example.data,ddl,parameter="Phi",type="age", bins=c(0,.5,10),name="2ages")
By default, the bins are open on the left and closed on the right (i.e.,
binning x by (x1,x2] is equivalent to x1<x<=x2) except for the first
interval which is closed on the left.  Thus, for the above example, the age
bins are [0,.5] and (.5,10].  Since the ages in the example are 0,1,2...
using any value >0 and <1 in place of 0.5 would bin the ages into 2 classes
of 0 and 1+.  This behavior can be modified by changing the argument
right=FALSE to create an interval that is closed on the left and open on the
right.  In some cases this can make reading the values of the levels
somewhat easier. It is important to recognize that the new variable is only
added to the design data for the defined parameter and can only be
used in model formula for that parameter.  Multiple calls to
add.design.data can be used to add the same or different fields for
the various parameters in the model. For example, the same 2 age class
variable can be added to the design data for p with the command:
ddl=add.design.data(proc.example.data,ddl,parameter="p",type="age", bins=c(0,.5,10),name="2ages")
The name must be unique within the parameter design data, so they
should not use pre-defined values of group, age, Age, time, Time,
cohort, Cohort.  If you choose a name that already exists in the
design data for the parameter, it will not be added but it can
replace the variable if replace=TRUE. For example, the 2ages
variable can be re-defined to use 0-1 and 2+ with the command:
ddl=add.design.data(proc.example.data,ddl,parameter="Phi",type="age", bins=c(0,1,10),name="2ages",replace=TRUE)
Keep in mind that design data are stored with the mark model object
so if a variable is redefined, as above, this could become confusing if some
models have already been constructed using a different definition for the
variable. The model formula and names would appear to be identical but they
would have a different model structure.  The difference would be apparent if
you examined the design data and design matrix of the model object but would
the difference would be transparent based on the model names and formula.
Thus, it would be best to avoid constructing models from design data fields
with different structures but the same name.
Value
Design data list with new field added for the specified parameter.
See make.design.data for a description of the list structure.
Note
For the specific case of "closed" capture models, the parameters
p (capture probability) and c (recapture probability) can be
treated in a special fashion.  Because they really the same type of
parameter, it is useful to be able to share a common model structure (i.e.,
same columns in the design matrix). This is indicated with the
share=TRUE element in the model description for p.  If the
parameters are shared then the additional covariate c is added to the
design data, which is c=0 for parameter p and c=1 for
parameter c.  This enables an additive model to be developed where
recapture probabilities mimic the pattern in capture probabilities except
for an additive constant.  The covariate c can only be used in the
model for p if share=TRUE. If the latter is not set using
c in a formula will result in an error.  Likewise, if
share=TRUE, then the design data for p and c must be
the same because the design data are merged in constructing the design
matrix.  Thus if you add design data for parameter p, you should add
a similar field for parameter c if you intend to fit shared models
for the two parameters.  If the design data do not match and you try to fit
a shared model, an error will result.
Author(s)
Jeff Laake
See Also
make.design.data, process.data
Examples
# This example is excluded from testing to reduce package check time
data(example.data)
example.data.proc=process.data(example.data)
ddl=make.design.data(example.data.proc)
ddl=add.design.data(example.data.proc,ddl,parameter="Phi",type="age",
  bins=c(0,.5,10),name="2ages")
ddl=add.design.data(example.data.proc,ddl,parameter="p",type="age",
bins=c(0,.5,10),name="2ages")
ddl=add.design.data(example.data.proc,ddl,parameter="Phi",type="age",
bins=c(0,1,10),name="2ages",replace=TRUE)
Adjust count of estimated parameters
Description
Modifies number of estimated parameters and the resulting AICc value for model selection.
Usage
adjust.parameter.count(model, npar)
Arguments
| model | MARK model object | 
| npar | Value of count of estimated parameters | 
Details
When a model is run the parameter count determined by MARK is stored in
results$npar and the AICc value is stored in results$AICc.  If
the argument adjust is set to TRUE in the call to
run.mark.model and MARK determined that the design matrix was
not full rank (i.e., the parameter count is less than the columns of the
design matrix), then the parameter count from MARK is stored in
results$npar.unadjusted and AICc in results$AICc.unadjusted
and results$npar is set to the number of columns of the design matrix
and results$AICc uses the assumed full rank value of npar.
This function allows the parameter count to be reset to any value less than
or equal to the number of columns in the design matrix.  If
results$npar.unadjusted exists it is kept as is.  If it doesn't
exist, then the current values of results$npar and
results$AICc are stored in the .unadjusted fields to maintain
the values from MARK, and the new adjusted values defined by the function
argument npar are stored in results$npar and
results$AICc. In the example below, the CJS model Phi(t)p(t) is
fitted with the call to mark which defaults to
adjust=TRUE.  This is used to show how adjust.parameter.count
can be used to adjust the count to 11 from the full rank count of 12.
Alternatively, the argument adjust=FALSE can be added to prevent the
adjustment which is appropriate in this case because Phi(6) and p(6) are
confounded.
Value
model: the mark model object with the adjustments made
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
ptime=list(formula=~time)
Phitime=list(formula=~time)
dipper.phitime.ptime=mark(dipper,model.parameters=list(Phi=Phitime, p=ptime),delete=TRUE)
dipper.phitime.ptime=adjust.parameter.count(dipper.phitime.ptime,11)
dipper.phitime.ptime=mark(dipper,model.parameters=list(Phi=Phitime, p=ptime),
                           adjust=FALSE,delete=TRUE)
Adjust over-dispersion scale or a result value such as effective sample size
Description
Adjust value of over-dispersion constant or another result value for a collection of models which modifies model selection criterion and estimated standard errors.
Usage
adjust.value(field="n",value,model.list)
       adjust.chat(chat=1,model.list)
Arguments
| field | Character string containing name of the field; either
 | 
| value | new value for field | 
| model.list | marklist created by the function
 | 
| chat | Over-dispersion scale | 
Details
The value of chat is stored with the model object except when there
is no over-dispersion (chat=1). This function assigns a new value of
chat for the collection of models specified by model.list
and/or type.  The value of chat is used by
model.table for model selection in computing QAICc unless
chat=1.  It is also used in summary.mark,
get.real and compute.real to adjust standard
errors and confidence intervals. Note that the standard errors and
confidence intervals in results$beta,results$beta.vcv
results$real, results$derived and results$derived.vcv
are not modified and always assume chat=1.
It can also be used to modify a field in model$results such as
n which is ESS (effective sample size) from MARK output that is used
in AICc/QAICc calculations.
Value
model.list with all models given the new chat value and model.table adjusted for chat values
Note
See note in collect.models
Author(s)
Jeff Laake
See Also
model.table, summary.mark,
get.real ,compute.real
Examples
#
# The following are examples only to demonstrate selecting different 
# model sets for adjusting chat and showing model selection table. 
# It is not a realistic analysis.
#
# This example is excluded from testing to reduce package check time
data(dipper)
do_example=function()
{
mod1=mark(dipper,delete=TRUE)
mod2=mark(dipper,model.parameters=list(Phi=list(formula=~time)),delete=TRUE)
mod3=mark(dipper,model="POPAN",initial=1,delete=TRUE)
cjs.results=collect.models(type="CJS")
cjs.results  # show model selection results for "CJS" models
}
cjs.results=do_example()
cjs.results
# adjust chat for all models to 2
cjs.results=adjust.chat(2,cjs.results) 
cjs.results
San Luis Valley mallard data
Description
A recovery data set for mallards in San Luis Valley Colorado
Format
A data frame with 108 observations on the following 5 variables.
- ch
- a character vector containing the encounter history of each bird 
- freq
- frequency of that encounter history 
- ReleaseAge
- the age of the bird when it was released 
Details
This is a data set that accompanies program MARK as an example for Brownie and Seber recovery model. In those input files it is in a summarized format. Here it is in the LD encounter history format. The data can be stratified using ReleaseAge (Adult, Young) as a grouping variable.
Note that in the MARK example the variable is named Age. In the R code, the fields "age" and "Age" have specific meanings in the design data related to time since release. These will override the use of a field with the same name in the individual covariate data, so the names "time", "Time", "cohort", "Cohort", "age", and "Age" should not be used in the individual covariate data with possibly the exception of "cohort" which is not defined for models with "Square" PIMS such as POPAN and other Jolly-Seber type models.
Examples
# brownie=import.chdata("brownie.inp",field.types=c("n","f"))
# This example is excluded from testing to reduce package check time
 data(brownie)
# default ordering of ReleaseAge is alphabetic so it is 
# Adult, Young which is why initial.ages=c(1,0)
# Seber Recovery
br=process.data(brownie,model="Recovery",groups="ReleaseAge",age.var=1,initial.ages=c(1,0))
br.ddl=make.design.data(br,parameters=list(S=list(age.bins=c(0,1,10)),
                                           r=list(age.bins=c(0,1,10))),right=FALSE)
mod=mark(br,br.ddl,model.parameters=list(S=list(formula=~-1+age:time,link="sin"),
                                           r=list(formula=~-1+age:time,link="sin")),delete=TRUE)
# Brownie Recovery
br=process.data(brownie,model="Brownie",groups="ReleaseAge",age.var=1,initial.ages=c(1,0))
br.ddl=make.design.data(br,parameters=list(S=list(age.bins=c(0,1,10)),
                               f=list(age.bins=c(0,1,10))),right=FALSE)
mod=mark(br,br.ddl,model.parameters=list(S=list(formula=~-1+age:time,link="sin"),
                               f=list(formula=~-1+age:time,link="sin")),delete=TRUE)
mod=mark(br,br.ddl,model.parameters=list(S=list(formula=~-1+age,link="sin"),
                               f=list(formula=~-1+age,link="sin")),delete=TRUE)
#Random effects Seber recovery (not run to save computation time)
#br=process.data(brownie,model="REDead",groups="ReleaseAge",age.var=1,initial.ages=c(1,0))
#br.ddl=make.design.data(br,parameters=list(S=list(age.bins=c(0,1,10)),
#                                       r=list(age.bins=c(0,1,10))),right=FALSE)
#mod=mark(br,br.ddl,model.parameters=list(S=list(formula=~age),r=list(formula=~age)),delete=TRUE)
#Pledger Mixture Seber recovery
br=process.data(brownie,model="PMDead",groups="ReleaseAge",
                           mixtures=3,age.var=1,initial.ages=c(1,0))
br.ddl=make.design.data(br,parameters=list(S=list(age.bins=c(0,1,10)),
                            r=list(age.bins=c(0,1,10))),right=FALSE)
mod=mark(br,br.ddl,model.parameters=list(pi=list(formula=~mixture),
                     S=list(formula=~age+mixture),r=list(formula=~age)),delete=TRUE)
br=process.data(brownie,model="PMDead",groups="ReleaseAge",
                     mixtures=2,age.var=1,initial.ages=c(1,0))
br.ddl=make.design.data(br,parameters=list(S=list(age.bins=c(0,1,10)),
                      r=list(age.bins=c(0,1,10))),right=FALSE)
mod=mark(br,br.ddl,model.parameters=list(pi=list(formula=~age),
                      S=list(formula=~age+mixture),r=list(formula=~age)),delete=TRUE)
Removes unused MARK output files
Description
Identifies all unused (orphaned) mark*.inp, .vcv, .res and .out and .tmp
files in the working directory and removes them.  The orphaned files are
determined by examining all mark objects and lists of mark objects (created
by collect.models) to create a list of files in use. All other
files are treated as orphans to delete.
Usage
cleanup(lx = NULL, ask = TRUE, prefix = "mark")
Arguments
| lx | listing of R objects; defaults to list of workspace from calling environment; if NULL it uses ls(envir=parent.frame()) | 
| ask | if TRUE, prompt whether each file should be removed. Typically will be used with ask=FALSE but default of TRUE may avoid problems | 
| prefix | prefix for filename if different than "mark" | 
Details
This function removes orphaned output files from MARK. This occurs when there are output files in the subdirectory that are not associated with a mark object in the current R session (.Rdata). For example, if you repeat an analysis or set of analyses and store them in the same object then the original set of output files would no longer be linked to an R object and would be orphaned.
As an example, consider the mallard analysis. The first time
you run the analysis script in an empty subdirectory it would create 9 sets
of MARK output files (mark001.out,.vcv,.res,.inp to
mark009.out,.vcv,.res,.inp) and each would be linked to one of the objects
in nest.results. When the command
AgePpnGrass=nest.results$AgePpnGrass was issued, both of those
mark objects were linked to the same set of output files.  Now if you
were to repeat the above commands and re-run the models and stored the
results into nest.results again, it would create files with prefixes
10 to 18. Because that would have replaced nest.results, none of the
files numbered 1 to 9 would be linked to an object.
cleanup(ask=FALSE) automatically removes those orphan files.  If you
delete all objects in the R session with the command
rm(list=ls(all=TRUE)), then subsequently cleanup(ask=FALSE)
will delete all MARK output files because all of them will be orphans.
Output files can also become orphans if MARK finishes but for some reason R
crashes or you forget to save your session before you exit R.  Orphan output
files can be re-linked to an R object without re-running MARK by re-running
the mark function in R and specifing the filename
argument to match the base portion of the orphaned output file (eg
"mark067").  It will create all of the necessary R objects and then asks if
you want to use the existing file.  If you respond affirmatively it will
link to the orphaned files.
Value
None
Author(s)
Jeff Laake
MARK model beta parameters
Description
MARK model beta parameters
Usage
## S3 method for class 'mark'
coef(object,...)
Arguments
| object | a MARK model object | 
| ... | additional non-specified argument for S3 generic function | 
Value
A vector or dataframe of beta estimates
Author(s)
Jeff Laake
Collect names of MARK model objects from list of R objects (internal function)
Description
Either names of all mark model objects (type=NULL) or names of
mark model objects of a specific type (type) are extracted
from a vector of R objects (lx) that was collected from the parent
environment (frame) of the function that calls collect.model.names.
Thus, it is two frames back (parent.frame(2)).
Usage
collect.model.names(lx, type = NULL, warning = TRUE)
Arguments
| lx | vector of R object names from parent.frame(2) | 
| type | either NULL (for all types) or a character model type (eg "CJS") | 
| warning | if TRUE warning given when models of different types are collected | 
Details
If type=NULL then the names of all objects of
class(x)[1]="mark" in lx are returned.  If type is
specified, then the names of all objects of class(x)=c("mark",type)
in lx are returned.
This function was written with the intention that it would be called from
other functions ( e.g., collect.models,
run.models) but it will work if called directly (e.g.,
collect.model.names( lx=ls())). While this function returns a vector
of model names, collect.models returns a list of model
objects.  The latter can be used to easily create a list of models created
in a function to be used as a return value without listing all the names of
the functions.  It uses collect.model.names to perform that function.
Value
model.list: a vector of mark model names
Author(s)
Jeff Laake
See Also
collect.models, run.models,
model.table
Collect MARK models into a list and optionally construct a table of model results
Description
Collects mark models contained in lx of specified type
(if any) and returns models in a list with a table of model results if
table=TRUE.
Usage
collect.models(
  lx = NULL,
  type = NULL,
  table = TRUE,
  adjust = TRUE,
  external = FALSE
)
Arguments
| lx | if NULL, constructs vector of object names ( | 
| type | either NULL (for all types) or a character vector model type (e.g.
 | 
| table | if TRUE, a table of model results is also included in the returned list | 
| adjust | if TRUE, adjusts number of parameters (npar) to number of columns in design matrix which modifies AIC | 
| external | if TRUE the mark objects are saved externally rather than in the resulting marklist; the filename for each object is kept in its place | 
Details
If lx is NULL a vector of object names in the parent frame is
constructed for lx.  Within lx all mark model objects
(i.e., class(x)[1]=="mark") are returned if type is NULL.  If
type is specified and is valid, then the names of all mark
model objects of the specified type (i.e., class(x) =
c("mark",type)) in lx are returned.  If table=TRUE a table of
model selection results is also included in the returned list.
This function was written to be able to easily collect a series of mark
models in a list without specifying the names of each model object.  This is
useful in constructing a return value of a function that runs a series of
models for a particular analysis. For an example see dipper.
Value
model.list: a list of mark models and optionally a table of
model results.
Note
This function and others that use it or use
collect.model.names to collect a series of models or assign a
value to a series of models (e.g., adjust.chat) should be used
with a degree of caution. It is important to understand the scope of the
collection.  If the call to this function is made at the R prompt, then it
will collect all models (of a particular type if any) within the
current .Rdata file.  If the call to this function (or one like it) is
called from within a function that runs a series of analyses then the
collection is limited to the function frame (i.e., only models defined
within the function). Thus, it is wise to either use a different .Rdata file
for each data set (e.g., one for dipper, another for edwards.eberhardt, etc)
or to run everything within functions as illustrated by dipper
or edwards.eberhardt. Using a separate .Rdata file is
equivalent to having separate .DBF/.FPT files with MARK. It is important to
note that functions such as adjust.chat will adjust the value
of chat across analyses unless specifically given a list of models to
adjust.
Author(s)
Jeff Laake
See Also
merge.mark,remove.mark,
collect.model.names,run.models,model.table,dipper
Examples
# see examples in dipper, edwards.eberhardt and example.data
Compute design data for a specific parameter in the MARK model (internal use)
Description
For a specific type of parameter (e.g., Phi, p, r etc), it creates a data
frame containing design data for each parameter of that type in the model as
structured by an all different PIM (parameter information matrix). The
design data are used in constructing the design matrix for MARK with
user-specified model formulae as in make.mark.model.
Usage
compute.design.data(
  data,
  begin,
  num,
  type = "Triang",
  mix = FALSE,
  rows = 0,
  pim.type = "all",
  secondary,
  nstrata = 1,
  tostrata = FALSE,
  strata.labels = NULL,
  subtract.stratum = strata.labels,
  common.zero = FALSE,
  sub.stratum = 0,
  limits = NULL,
  events = NULL,
  use.events = NULL,
  mscale = 1,
  subtract.events = NULL
)
Arguments
| data | data list created by  | 
| begin | 0 for survival type, 1 for capture type | 
| num | number of parameters relative to number of occasions (0 or -1) | 
| type | type of parameter structure (Triang (STriang) or Square) | 
| mix | if TRUE this is a mixed parameter | 
| rows | number of rows relative to number of mixtures | 
| pim.type | type of pim structure; either all (all-different) or time | 
| secondary | TRUE if a parameter for the secondary periods of robust design | 
| nstrata | number of strata for multistrata | 
| tostrata | set to TRUE for transition parameters | 
| strata.labels | labels for strata as identified in capture history | 
| subtract.stratum | for each stratum, the to.strata that is computed by subtraction or for HidMarkov it is the strata computed by subtraction for pi parameter | 
| common.zero | if TRUE, uses a common begin.time to set origin (0) for Time variable defaults to FALSE for legacy reasons but should be set to TRUE for models that share formula like p and c with the Time model | 
| sub.stratum | the number of strata to subtract for parameters that use mlogit across strata like pi and Omega for RDMSOpenMisClass | 
| limits | For RDMSOccRepro values that set row and col (if any) start on states | 
| events | vector of events if needed for parameter | 
| use.events | if TRUE, adds events to design data | 
| mscale | scalar for multi-scale occupancy model (number of mixtures) | 
| subtract.events | for each stratum either the stratum or event to compute by subtraction for mlogit parameter | 
Details
This function is called by make.design.data to create all of
the default design data for a particular type of model and by
add.design.data to add binned design data fields for a
particular type of parameter. The design data created by this function
include group, age, time and cohort as factors
variables and continuous (non-factor) versions of all but group.  In
addition, if groups have been defined for the data, then a data column is
added for each factor variable used to define the groups.  Also for specific
closed capture heterogeneity models (model="HetClosed", "FullHet",
"HetHug", "FullHetHug") the data column mixture is added to the
design data. The arguments for this function are defined for each model by
the function setup.model.
Value
design.data: a data frame containing all of the design data fields for a particular type of parameter
| group | group factor level | 
| age | age factor level | 
| time | time factor level | 
| cohort | cohort factor level | 
| Age | age as a continuous variable | 
| Time | time as a continuous variable | 
| Cohort | cohort as a continuous variable | 
| mixture | mixture factor level | 
| other
fields | any factor variables used to define groups | 
Author(s)
Jeff Laake
See Also
make.design.data, add.design.data
Compute estimates of link values
Description
Computes link values (design*beta) for real parameters, and var-cov from design matrix (design) and coefficients (beta)
Usage
compute.link(
  model,
  beta = NULL,
  design = NULL,
  data = NULL,
  parm.indices = NULL,
  vcv = TRUE
)
Arguments
| model | MARK model object | 
| beta | Estimates of beta parameters | 
| design | numeric design matrix for MARK model with any covariate values filled in | 
| data | dataframe with covariate values that are averaged for estimates | 
| parm.indices | index numbers from PIMS for rows in design matrix to use | 
| vcv | logical; if TRUE, returns v-c matrix of link values | 
Details
This function is very similar to compute.real except that it
provides estimates of link values before they are transformed to real
estimates using the inverse-link.  It is called by get.link to
make calculations but can be called separately. The value is always a
dataframe for the estimates and design data and optionally a
variance-covariance matrix.  See get.real for further details
about the arguments.
Value
estimates: If vcv=TRUE, a list is returned with elements
vcv.link and the dataframe estimates. If vcv=FALSE,
only the estimates dataframe is returned which has the same structure as in
get.real.
Author(s)
Jeff Laake
See Also
Compute link values from real parameters
Description
Computes link values from reals using 1-1 real to beta(=link) transformation. Also, creates a v-c matrix for the link values if vcv.real is specified.
Usage
compute.links.from.reals(
  x,
  model,
  parm.indices = NULL,
  vcv.real = NULL,
  use.mlogits = TRUE
)
Arguments
| x | vector of real estimates to be converted to link values | 
| model | MARK model object used only to obtain model structure/links etc. If function is being called for model averaged estimates, then any model in the model list used to construct the estimates is sufficient | 
| parm.indices | index numbers from PIMS for rows in design matrix(non-simplified indices); x[parm.indices] are computed | 
| vcv.real | v-c matrix for the real parameters | 
| use.mlogits | logical; if FALSE then parameters with mlogit links are transformed with logit rather than mlogit for creating confidence intervals for each value | 
Details
It has 2 uses both related to model averaged estimates. Firstly, it is used to transform model averaged estimates so the normal confidence interval can be constructed on the link values and then back-transformed to real space. The second function is to enable parametric bootstrapping in which the error distbution is assumed to be multivariate normal for the link values. From a single model, the link values are easily constructed from the betas and design matrix so this function is not needed. But for model averaging there is no equivalent because the real parameters are averaged over a variety of models with the same real parameter structure but differing design structures. This function allows for link values and their var-cov matrix to be created from the model averaged real estimates.
Value
A list with the estimates (link values) and the links that were used. If vcv.real = TRUE, then the v-c matrix of the links is also returned.
Author(s)
Jeff Laake
See Also
Compute estimates of real parameters
Description
Computes real estimates and var-cov from design matrix (design) and coefficients (beta) using specified link functions
Usage
compute.real(
  model,
  beta = NULL,
  design = NULL,
  data = NULL,
  se = TRUE,
  vcv = FALSE
)
Arguments
| model | MARK model object | 
| beta | estimates of beta parameters for real parameter computation | 
| design | design matrix for MARK model | 
| data | dataframe with covariate values that are averaged for estimates | 
| se | if TRUE returns std errors and confidence interval of real estimates | 
| vcv | logical; if TRUE, sets se=TRUE and returns v-c matrix of real estimates | 
Details
The estimated real parameters can be derived from the estimated beta
parameters, a completed design matrix, and the link function specifications.
MARK produces estimates of the real parameters, se and confidence intervals
but there are at least 2 situations in which it is useful to be able to
compute them after running the analysis in MARK: 1) adjusting confidence
intervals for estimated over-dispersion, and 2) making estimates for
specific values of covariates.  The first case is done in
get.real with a call to this function.  It is done by
adjusting the estimated standard error of the beta parameters by multiplying
it by the square root of chat to adjust for over-dispersion.  A
normal 95
+/- 1.96*se) and this is then back-transformed to the real parameters using
inverse.link with the appropriate inverse link function for
the parameter to construct a 95
There is one exception. For parameters using the mlogit
transformation, a logit transformation of each individual real Psi
and its se are used to derive the confidence interval. The estimated
standard error for the real parameter is also scaled by the square root of
the over-dispersion constant chat stored in model$chat. But,
the code actually computes the variance-covariance matrix rather than
relying on the values from the MARK output because real estimates will
depend on any individual covariate values used in the model which is the
second reason for this function.
New values of the real parameter estimates can easily be computed by simply
changing the values of the covariate values in the design matrix and
computing the inverse-link function using the beta parameter estimates.  The
covariate values to be used can be specified in one of 2 ways. 1) Prior to
making a call to this function, use the functions
find.covariates to extract the rows of the design matrix with
covariate values and either fill in those values aautomatically with the
options provided by find.covariates or edit those values to be
the ones you want and then use fill.covariates to replace the
values into the design matrix and use it as the value for the argument
design, or 2) automate this step by specifying a value for the
argument data which is used to take averages of the covariate values
to fill in the covariate entries of the design matrix.  In computing real
parameter estimates from individual covariate values it is important to
consider the scale of the individual covariates. By default, an analysis
with MARK will standardize covariates by subtracting the mean and dividing
by the standard deviation of the covariate value. However, in the
RMark library all calls to MARK.EXE do not standardize the covariates
and request real parameter estimates based on the mean covariate values.
This was done because there are many instances in which it is not wise to
use the standardization implemented in MARK and it is easy to perform any
standardization of the covariates with R commands prior to fitting the
models.  Also, with pre-standardized covariates there is no confusion in
specifying covariate values for computation of real estimates.  If the model
contains covariates and the argument design is not specified, the
design matrix is extracted from model and all individual covariate
values are assigned their mean value to be consistent with the default in
the MARK analysis.
If a value for beta is given, those values are used in place of the
estimates model$results$beta$estimate.
Value
A data frame (real) is returned if vcv=FALSE;
otherwise, a list is returned also containing vcv.real: 
| real | data frame containing estimates, and if se=TRUE or vcv=TRUE it also contains standard errors and confidence intervals and notation of whether parameters are fixed or at a boundary | 
| vcv.real | variance-covariance matrix of real estimates | 
Author(s)
Jeff Laake
See Also
get.real,fill.covariates,find.covariates,inverse.link,deriv_inverse.link
Examples
# see examples in fill.covariates
Convert MARK input file to RMark dataframe
Description
Converts encounter history inp files used to create a MARK project into a dataframe for use with RMark. Group structure in frequencies is converted to factor variables that can be used to create groups in RMark. Covariates are copied straight across. Only works with encounter history format for input files and not specialized ones for known-fate or Brownie models.
Usage
convert.inp(
  inp.filename,
  group.df = NULL,
  covariates = NULL,
  use.comments = FALSE
)
Arguments
| inp.filename | name of input file; inp extension is assumed and does not need to be specified | 
| group.df | dataframe with grouping variables that contains a row for each group defined in the input file row1=group1, row2=group2 etc. Names and number of columns in the dataframe is set by user to define grouping variables in RMark dataframe | 
| covariates | names to be assigned to the covariates defined in the inp file | 
| use.comments | if TRUE values within /* and */ on data lines are used as row.names for the RMark dataframe. Only use this option if they are unique values. | 
Details
The encounter history format for MARK is structured as follows: capture (encounter) history, followed by a frequency field for each group, followed by any covariates and then a semi-colon at the end of the line. Comments are allowed within /* and */. The RMark format is a dataframe with a different structure. Each record(row) in the dataframe is for one or more animals within a single group and if there is group structure then the dataframe contains factor variables that can be used to create groups. For example, the following is a little snippet of the same data with 2 groups Males/Females and a covariate weight in the two different formats:
MARK encounter history file (in make believe test.inp): 1001 1 0 10; 1101 0 2 5; 0101 3 1 6; RMark dataframe: ch freq sex weight 1001 1 M 10 1101 2 F 5 0101 3 M 6 0101 1 F 6
 To convert from the MARK format to the RMark format it is necessary to
define the variables used to define the groups (if any) and to define the
covariate field names (if any).  For the example above, if test.inp
is in the same directory as the current working directory, the call would
be:
test = convert.inp("test",group.df=data.frame(sex=c("M","F")),
covariates="weight")
Comments spanning lines in the .inp file are ignored and deleted as are
blank lines.  If each line has a unique identifier in the comments then by
setting use.comments=TRUE, the text of the comment (e.g.,tag number)
will be assigned as the row name in the RMark dataframe. This will only work
if each line only represents a single animal or a set of animals in a single
group.  If file was structured as follows:
MARK encounter history file (in make believe test.inp): 1001 1 0 10 /*1*/; 1101 0 2 5 /*2*/; 0101 3 1 6 /*3*/;
an error would occur
 Error in convert.inp("test", group.df = data.frame(sex =
c("M", "F")),: Row names not unique. Set use.comments to default value FALSE
because it would try to use "3" as the row name for the 3 males and the 1 female represented by the last row.
The extension .inp is optional for files with that extension. If the file has a different extension the entire filename must be specified.
Note that there are limitations to this function. You cannot have extra blank lines in the file, the number of columns (tab, space or comma delimited) must be the same in each line unless the line is just a comment line /* */. In the latter case, the /* must begin the line and the */ must end the line with no extra characters (blanks included) in before or after.
Value
Dataframe with fields ch(character encounter history), freq (frequency of encounter history), followed by grouping variables (if any) and then covariates (if any)
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
# MARK example input file
pathtodata=paste(path.package("RMark"),"extdata",sep="/")
dipper=convert.inp(paste(pathtodata,"dipper",sep="/"),
                    group.df=data.frame(sex=c("M","F")))
# Example input files that accompany the MARK electronic book 
#  \url{http://www.phidot.org/software/mark/docs/book/}
bd=convert.inp(paste(pathtodata,"blckduck",sep="/"),
         covariates=c("age","weight","winglen","ci"),use.comments=TRUE)
aa=convert.inp(paste(pathtodata,"aa",sep="/"),
      group.df=data.frame(sex=c("Poor","Good")))
adult=convert.inp(paste(pathtodata,"adult",sep="/"))
age=convert.inp(paste(pathtodata,"age",sep="/"))
age_ya=convert.inp(paste(pathtodata,"age_ya",sep="/"),
      group.df=data.frame(age=c("Young","Adult")))
capsid=convert.inp(paste(pathtodata,"capsid",sep="/"))
clogit_demo=convert.inp(paste(pathtodata,"clogit_demo",sep="/"))
deer=convert.inp(paste(pathtodata,"deer",sep="/"))
ed_males=convert.inp(paste(pathtodata,"ed_males",sep="/"))
F_age=convert.inp(paste(pathtodata,"f_age",sep="/"))
indcov1=convert.inp(paste(pathtodata,"indcov1",sep="/"),
         covariates=c("cov1","cov2"))
indcov2=convert.inp(paste(pathtodata,"indcov2",sep="/"),
          covariates=c("cov1","cov2"))
island=convert.inp(paste(pathtodata,"island",sep="/"))
linear=convert.inp(paste(pathtodata,"linear",sep="/"))
young=convert.inp(paste(pathtodata,"young",sep="/"))
transient=convert.inp(paste(pathtodata,"transient",sep="/"))
ms_gof=convert.inp(paste(pathtodata,"ms_gof",sep="/"))
m_age=convert.inp(paste(pathtodata,"m_age",sep="/"))
ms_cjs=convert.inp(paste(pathtodata,"ms_cjs",sep="/"))
ms_directional=convert.inp(paste(pathtodata,"ms_directional",sep="/"))
ed=convert.inp(paste(pathtodata,"ed",sep="/"),
            group.df=data.frame(sex=c("Male","Female")))
multigroup=convert.inp(paste(pathtodata,"multi_group",sep="/"),
            group.df=data.frame(sex=c(rep("Female",2),rep("Male",2)),
            Colony=rep(c("Good","Poor"),2)))
LD1=convert.inp(paste(pathtodata,"ld1",sep="/"),
           group.df=data.frame(age=c("Young","Adult")))
yngadt=convert.inp(paste(pathtodata,"yngadt",sep="/"),
            group.df=data.frame(age=c("Young","Adult")))
effect_size=convert.inp(paste(pathtodata,"effect_size",sep="/"),
             group.df=data.frame(colony=c("Poor","Good")))
effect_size3=convert.inp(paste(pathtodata,"effect_size3",sep="/"),
             group.df=data.frame(colony=c("1","2","3")))
Convert link values to real parameters
Description
Computes real parameters from link values
Usage
convert.link.to.real(x, model = NULL, links = NULL, fixed = NULL)
Arguments
| x | Link values to be converted to real parameters | 
| model | MARK model object | 
| links | vector of character strings specifying links to use in computation of reals | 
| fixed | vector of fixed values for real parameters that are needed for calculation of reals from mlogits when some are fixed | 
Details
Computation of the real parameter from the link value is relatively
straightforward for most links and the function inverse.link
is used.  The only exception is parameters that use the mlogit link
which requires the transformation across sets of parameters.  This is a
convenience function that does the necessary work to convert from link to
real for any set of parameters.  The appropriate links are obtained from
model$links unless the argument links is specified and they
will over-ride those in model.
Value
vector of real parameter values
Author(s)
Jeff Laake
See Also
Compute estimates of real parameters for multiple covariate values
Description
Computes real estimates for a dataframe of covariate values and the var-cov matrix of the real estimates.
Usage
covariate.predictions(
  model,
  data = NULL,
  indices = NULL,
  drop = TRUE,
  revised = TRUE,
  mata = FALSE,
  normal.lm = FALSE,
  residual.dfs = 0,
  alpha = 0.025,
  ...
)
Arguments
| model | MARK model object or marklist | 
| data | dataframe with covariate values used for estimates; if it contains a field called index the covariates in each row are only applied to the parameter with that index and the argument indices is not needed; if data is not specified or all individual covariate values are not specified, the mean individual covariate value is used for prediction. | 
| indices | a vector of indices from the all-different PIM structure for parameters to be computed (model.index value in the design data) | 
| drop | if TRUE, models with any non-positive variance for betas are dropped | 
| revised | if TRUE it uses eq 6.12 from Burnham and Anderson (2002) for model averaged se; otherwise it uses eq 4.9 | 
| mata | if TRUE, create model averaged tail area confidence intervals as described by Turek and Fletcher | 
| normal.lm | Specify normal.lm=TRUE for the normal linear model case, and normal.lm=FALSE otherwise. When normal.lm=TRUE, the argument 'residual.dfs' must also be supplied. See USAGE section, and Turek and Fletcher (2012) for additional details. | 
| residual.dfs | A vector containing the residual (error) degrees of freedom under each candidate model. This argument must be provided when the argument normal.lm=TRUE. | 
| alpha | The desired lower and upper error rate. Specifying alpha=0.025 corresponds to a 95 alpha=0.05 to a 90 Default value is alpha=0.025. This argument now works to set standard confidence interval as well using qnorm(1-alpha) for critical value. | 
| ... | additional arguments passed to specific functions | 
Details
This function has a similar use as compute.real except that it
is specifically designed to compute real parameter estimates for multiple
covariate values for either a single model or to compute model averaged
estimates across a range of models within a marklist. This is particularly
useful for computing and plotting the real parameter as a function of the
covariate with pointwise confidence bands (see example below). The function
also computes a variance-covariance matrix for the real parameters.  For
example, assume you had a model with two age classes of young and adult and
survial for young was a function of weight and you wanted to estimate
survivorship to some adult age as a function of weight.  To do that you need
the survival for young as a function of weight, the adult survival, the
variance of each and their covariance.  This function will allow you to
accomplish tasks like these and many others.
When a variance-covariance matrix is computed for the real parameters, it
can get too large for available memory for a large set of real parameters.
Most models contain many possible real parameters to accomodate the general
structure even if there are very few unique ones. It is necessary to use the
most general structure to accomodate model averaging.  Most of the time you
will only want to compute the values of a limited set of real parameters but
possibly for a range of covariate values.  Use the argument indices
to select the real parameters to be computed.  The index is the value that
the real parameter has been assigned with the all-different PIM structure.
If you looked at the row numbers in the design data for the
dipper example, you would see that the parameter for p and Phi
are both numbered 1 to 21.  But to deal with multiple parameters effectively
they are given a unique number in a specific order.  For the CJS model, p
follows Phi, so for the dipper example, Phi are numbered 1 to 21 and then p
are numbered 22 to 42. You can use the function PIMS to lookup
the parameter numbers for a parameter type if you use
simplified=FALSE. For example, with the dipper data,
you could enter PIMS(dipper.model,"p",simplified=FALSE) and you would
see that they are numbered 22 to 42. Alternatively, you can use
summary.mark with the argument se=TRUE to see the list
of indices named all.diff.index.  They are included in a dataframe
for each model parameter which enables them to be selected based on the
attached data values (e.g., time, group etc).  For example, if you fitted a
model called dipper.model then you could use
summary(dipper.model,se=TRUE)$real to list the indices for each
parameter.
The argument data is a dataframe containing values for the covariates
used in the models.  The names for the fields should match the names of the
covariates used in the model.  If a time-varying covariate is used then you
need to specify the covariate name with the time index included as it is
specified in the data. You do not need to specify all covariates that were
used.  If a covariate in one or more models is not included in data
then the mean value will be used for each missing covariate.  That can be
very useful when you are only interested in prediction for one type of
parameters (eg Phi) when there are many covariates that are not interesting
in another parameter (e.g., p).  For each row in data, each parameter
specified in indices is computed with those covariate values.  So if
there were 5 rows in data and 10 parameters were specified there would be 10
sets of 5 (50) estimates produced.  If you do not want the full pairing of
data and estimates, create a field called index in data and
the estimate for that parameter will be computed with those specific values.
For example, if you wanted parameter 1 to be computed with 5 different
values of a covariate and then parameter 7 with 2 different covariate
values, you could create a dataframe with 5 rows each having an index
with the value 1 along with the relevant covariate data and an additional 2
rows with the index with the value 7.  If you include the field
index in data then you do not need to give a value for the
argument indices. However, if you are making the computations for parameters 
that use an mlogit link you must use the separate indices argument. If you try to
use the data.frame(index=...,cov) approach with mlogit parameters and you have
covariate values, the function will stop with an error.  Also, if you only include 
a portion of the indices in an mlogit set, it will also stop and issue an error and
tell you the set of indices that should be included for that mlogit set.  If you
were allowed to exclude some indices the result would be incorrect.
Value
A list is returned containing a dataframe of estimates, a var-cov matrix, and a reals list:
| estimates | data frame containing estimates, se, confidence interval and the data values used to compute the estimates | 
| vcv | variance-covariance matrix of real estimates | 
| reaks | list of dataframes with the estimate and se used for each model | 
Author(s)
Jeff Laake
See Also
Examples
pathtodata=paste(path.package("RMark"),"extdata",sep="/")
# This example is excluded from testing to reduce package check time
#
# indcov1.R
#
# CJS analysis of the individual covariate data from 12.2 of
# Cooch and White
#
# Import data (indcov1.inp) and convert it from the MARK inp file 
# format to the RMark format using the function convert.inp  
# It is defined with 1 group but 2 individual covariates of mass and 
# sqmass
#
indcov1=convert.inp(paste(pathtodata,"indcov1",sep="/"),
               covariates=c("mass","sqmass"))
#
# Next create the processed dataframe and the design data.
#
  ind1.process=process.data(indcov1,model="CJS")
  ind1.ddl=make.design.data(ind1.process)
#
# Next create the function that defines and runs the set of models
# and returns a marklist with the results and a model.table.  
# It does not have any arguments but does use the ind1.process 
# and ind1.ddl objects created above in the workspace. The function 
# create.model.list is the function that creates a dataframe of the
# names of the parameter specifications for each parameter in that 
# type of model. If none are given for any parameter, the default
# specification will be used for that parameter in mark.  The 
# first argument of mark.wrapper is the model list of parameter 
# specifications.  Remaining arguments that are passed to
# mark must be specified using the argument=value specification 
# because the arguments of mark were not repeated in mark.wrapper 
# so they must be passed using the argument=value syntax.
#
ind1.models=function()
{
  Phi.dot=list(formula=~1)
  Phi.mass=list(formula=~mass)
  Phi.mass.plus.mass.squared=list(formula=~mass + sqmass)
  p.dot=list(formula=~1)
  cml=create.model.list("CJS")
  results=mark.wrapper(cml,data=ind1.process,ddl=ind1.ddl,adjust=FALSE,delete=TRUE)
  return(results)
}
#
# Next run the function to create the models and store the results in
# ind1.results which is a marklist. Note that beta estimates will differ
# from Cooch and White results because we are using covariate values 
# directly rather than standardized values.
#
ind1.results=ind1.models()
#
# Next compute real parameter values for survival as a function of 
# mass which are model-averaged over the fitted models.
#
minmass=min(indcov1$mass)
maxmass=max(indcov1$mass)
mass.values=minmass+(0:30)*(maxmass-minmass)/30
Phibymass=covariate.predictions(ind1.results,
   data=data.frame(mass=mass.values,sqmass=mass.values^2),
    indices=c(1))
#
# Plot predicted model averaged estimates by weight with pointwise 
# confidence intervals
#
plot(Phibymass$estimates$mass, Phibymass$estimates$estimate,
   type="l",lwd=2,xlab="Mass(kg)",ylab="Survival",ylim=c(0,.65))
lines(Phibymass$estimates$mass, Phibymass$estimates$lcl,lty=2)
lines(Phibymass$estimates$mass, Phibymass$estimates$ucl,lty=2)
# indcov2.R
#
# CJS analysis of the individual covariate data from 12.3 of 
# Cooch and White
#
# Import data (indcov2.inp) and convert it from the MARK inp file 
# format to the RMark format using the function convert.inp  It is
# defined with 1 group but 2 individual covariates of mass and 
# sqmass
#
indcov2=convert.inp(paste(pathtodata,"indcov2",sep="/"),
        covariates=c("mass","sqmass"))
#
# Standardize covariates
#
actual.mass=indcov2$mass
standardize=function(x,z=NULL)
{
   if(is.null(z))
   {
       return((x-mean(x))/sqrt(var(x)))
   }else 
   {
      return((x-mean(z))/sqrt(var(z)))
   }
}
indcov2$mass=standardize(indcov2$mass)
indcov2$sqmass=standardize(indcov2$sqmass)
#
# Next create the processed dataframe and the design data.
#
  ind2.process=process.data(indcov2,model="CJS")
  ind2.ddl=make.design.data(ind2.process)
#
# Next create the function that defines and runs the set of models and 
# returns a marklist with the results and a model.table.  It does not
# have any arguments but does use the ind1.process and ind1.ddl 
# objects created above in the workspace. The function create.model.list 
# is the function that creates a dataframe of the names of the parameter
# specifications for each parameter in that type of model. If none are 
# given for any parameter, the default specification will be used for
# that parameter in mark.  The first argument of mark.wrapper is the
# model list of parameter specifications.  Remaining arguments that are
# passed to mark must be specified using the argument=value specification 
# because the arguments of mark were not repeated in mark.wrapper so 
# they must be passed using the argument=value syntax.
#
ind2.models=function()
{
  Phi.dot=list(formula=~1)
  Phi.time=list(formula=~time)
  Phi.mass=list(formula=~mass)
  Phi.mass.plus.mass.squared=list(formula=~mass + sqmass)
  Phi.time.x.mass.plus.mass.squared=
        list(formula=~time:mass + time:sqmass)
  Phi.time.mass.plus.mass.squared=
     list(formula=~time*mass + sqmass+ time:sqmass)
  p.dot=list(formula=~1)
  cml=create.model.list("CJS")
  results=mark.wrapper(cml,data=ind2.process,ddl=ind2.ddl,adjust=FALSE,threads=2,delete=TRUE)
  return(results)
}
#
# Next run the function to create the models and store the results in
# ind2.results which is a marklist. Note that beta estimates will differ
# because we are using covariate values directly rather than 
# standardized values.
#
ind2.results=ind2.models()
#
# Next compute real parameter values for survival as a function of 
# mass which are model-averaged over the fitted models. They are 
# standardized individually so the values have to be chosen differently.
#
minmass=min(actual.mass)
maxmass=max(actual.mass)
mass.values=minmass+(0:30)*(maxmass-minmass)/30
sqmass.values=mass.values^2
mass.values=standardize(mass.values,actual.mass)
sqmass.values=standardize(sqmass.values,actual.mass^2)
Phibymass=covariate.predictions(ind2.results,
 data=data.frame(mass=mass.values,sqmass=sqmass.values),
  indices=c(1:7))
#
# Plot predicted model averaged estimates by weight with pointwise 
# confidence intervals
#
par(mfrow=c(4,2))
for (i in 1:7)
{
   mass=minmass+(0:30)*(maxmass-minmass)/30
   x=Phibymass$estimates
   plot(mass,x$estimate[x$par.index==i],type="l",lwd=2,
    xlab="Mass(kg)",ylab="Survival",ylim=c(0,1),main=paste("Time",i))
   lines(mass, x$lcl[x$par.index==i],lty=2)
   lines(mass, x$ucl[x$par.index==i],lty=2)
}
Example data for Closed Robust Design Multistrata
Description
Data and Script to simulate the MSCRD example of 15.7.1 from the MARK book Cooch and White
Format
A data frame with 557 observations on the following 2 variables.
- ch
- a character vector of encounter histories 
- freq
- a numeric vector of frequencies of each history 
Source
This example was constructed by Andrew Paul who is with Fish and Wildlife Division of the Alberta Provincial Government
References
For Cooch and White book see http://www.phidot.org/software/mark/
Examples
# This example is excluded from testing to reduce package check time
#Script to simulate the MSCRD 
#example of 15.7.1 from the MARK
#book
#created by AJP 21 Dec 2010
#convert .inp data - only needed to create crdms
#ch.data<-convert.inp("rd_simple1.inp")
data(crdms)
#set time intervals
#4 primary periods each with 3 secondary occasions
t.int<-c(rep(c(0,0,1),3),c(0,0))
#process data for RMark
crdms.data<-process.data(crdms,model="CRDMS",time.interval=t.int,
			strata.labels=c("1","U"))
#change Psi parameters that are obtained by subtraction
crdms.ddl<-make.design.data(crdms.data,
		parameters=list(Psi=list(subtract.stratum=c("1","1"))))
#create grouping index for unobserved p and c (i.e., always zero)
#up=as.numeric(row.names(crdms.ddl$p[crdms.ddl$p$stratum=="U",]))
# changed 16 Oct 2020 - jll
crdms.ddl$p$fix=ifelse(crdms.ddl$p$stratum=="U",0,NA)
crdms.ddl$c$fix=ifelse(crdms.ddl$c$stratum=="U",0,NA)
#create grouping index to fix Psi for unobs to unbos at time 1
#this isn't necessary but it allows this Psi to be fixed to a value
#that can be flagged and not erroneously interpreted
Psiuu1=as.numeric(row.names(crdms.ddl$Psi[crdms.ddl$Psi$stratum=="U"&
			crdms.ddl$Psi$time==1,]))
#create dummy variable for constraining last Psi in Markovian model
#variable is called ctime for constrained time
crdms.ddl$Psi$ctime=crdms.ddl$Psi$time
crdms.ddl$Psi$ctime[crdms.ddl$Psi$time==3]=2
do_example=function()
{
#Initial assumptions
S.dot=list(formula=~1)  #S equal for both states and constant over time
# p.session=list(formula=~session, share=TRUE,  #p=c varies with session 
# 		fixed=list(index=up,value=0)) #p set to zero for unobs
# changed 16 Oct 2020 - jll
p.session=list(formula=~session, share=TRUE)
#Model 1 - Markovian movement
Psi.markov=list(formula=~ctime+stratum,
		fixed=list(index=Psiuu1,value=9e-99)) #9e-99 is a flag
model.1=mark(crdms.data,ddl=crdms.ddl,
	model.parameters=list(S=S.dot,
		p=p.session,
		Psi=Psi.markov),threads=2,delete=TRUE)
#Model 2 - Random movement
Psi.rand=list(formula=~time)
model.2=mark(crdms.data,ddl=crdms.ddl,
	model.parameters=list(S=S.dot,
		p=p.session,
		Psi=Psi.rand),threads=2,delete=TRUE)
#Model 3 - No movement
Psi.fix=list(formula=~1,fixed=0)
model.3=mark(crdms.data,ddl=crdms.ddl,
	model.parameters=list(S=S.dot,
		p=p.session,
		Psi=Psi.fix),threads=2,delete=TRUE)
		
#collect and store models
crdms.res<-collect.models()
print(crdms.res)
invisible()
}
do_example()
Create mcmc object for analysis with coda
Description
Reads in mcmc file from program MARK in binary and returns an mcmc object that can be used with coda functions which are most easily accessed via codamenu().
Usage
create.mark.mcmc(
  filename,
  ncovs,
  nmeans,
  ndesigns,
  nsigmas,
  nrhos,
  nlogit,
  include = F
)
Arguments
| filename | name of file containing mcmc values | 
| ncovs | number of covariates | 
| nmeans | number of means | 
| ndesigns | number of designs | 
| nsigmas | number of sigmas | 
| nrhos | number of rhos | 
| nlogit | number of logits | 
| include | if TRUE it includes ir/propJumps fields | 
Value
An mcmc object if one chain and an mcmc list if more than one chain. Each can be used with the coda package
Author(s)
Jeff Laake
Creates a dataframe of all combinations of parameter specifications
Description
Creates a dataframe of all combinations of parameter specifications for each
parameter in a particular type of MARK model. It is used together
with mark.wrapper to run a series of models from sets of
parameter specifications.
Usage
create.model.list(model)
Arguments
| model | character string identifying the type of model (e.g., "CJS") | 
Details
This function scans the frame of the calling enviroment and collects all
list objects that contain a formula and have names that match
parameter. where parameter is the name of a type of parameter in the
model type.  For example, it looks for Phi. and p. for
model="CJS". Any number of characters can follow the period.  Each of
the named objects should specify a list that matches the structure of a
parameter specification as described in make.mark.model. It
only collects list objects that contain an element named formula,
thus it will not collect one like Phi.fixed=list(fixed=1).  If you
want to do something like that, specify it as
Phi.fixed=list(formula=~1,fixed=1). It is safest to use this inside a
function that defines all of the parameter specifications as shown in the
example below.  The primary use for this function is to create a dataframe
which is passed to mark.wrapper to construct and run each of
the models.  It was written as a separate function to provide flexibility to
add/delete/modify the list prior to passing to mark.wrapper.
For example, only certain combinations may make sense for some parameter
specifications.  Thus you could define a set to create all the combinations
and then delete the ones from the dataframe that do not make sense. you
want, add others and re-run the function and merge the resulting dataframes.
If there are no specifications found for a particular model parameter, it is
not included in the list and when it is passed to
make.mark.model, the default specification will be used for
that parameter.
Value
dataframe of all combinations of parameter specifications for a model. Each field (column) is the name of a type of parameter (e.g., p and Phi for CJS). The values are character strings identifying particular parameter specifications.
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
#
# Compare this to the run.dipper shown under ?dipper
# It is only necessary to create each parameter specification and 
# create.model.list and mark.wrapper will create and run models for 
# each combination. Notice that the naming of the parameter 
# specifications has been changed to accommodate format for 
# create.model.list. Only a subset of the parameter specifications 
# are used here in comparison to other run.dipper
#
data(dipper)
run.dipper=function()
{
     #
     # Process data
     #
     dipper.processed=process.data(dipper,groups=("sex"))
     #
     # Create default design data
     #
     dipper.ddl=make.design.data(dipper.processed)
     #
     # Add Flood covariates for Phi and p that have different values
     #
     dipper.ddl$Phi$Flood=0
     dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 |
                  dipper.ddl$Phi$time==3]=1
     dipper.ddl$p$Flood=0
     dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
     #
     #  Define range of models for Phi
     #
     Phi.dot=list(formula=~1)
     Phi.time=list(formula=~time)
     Phi.sex=list(formula=~sex)
     Phi.Flood=list(formula=~Flood)
     #
     #  Define range of models for p
     #
     p.dot=list(formula=~1)
     p.time=list(formula=~time)
     p.Flood=list(formula=~Flood)
     #
     # Run all pairings of models
     #
     dipper.model.list=create.model.list("CJS")
     dipper.results=mark.wrapper(dipper.model.list,
              data=dipper.processed,ddl=dipper.ddl,delete=TRUE)
     #
     # Return model table and list of models
     #
     return(dipper.results)
}
dipper.results=run.dipper()
White-tailed deer double observer spotlight capture-recapture analysis
Description
This data represents a set of independent double observer road-transect survey data of white-tailed deer on Brosnan Forest, South Carolina surveyed in August, 2005-2009. The primary reason for this package is to provide a completely reproducible example of the analysis from Collier et al. (2012). We used a Huggins closed capture model implemented in MARK http://www.phidot.org/software/mark/ via RMark both of which will need to be installed on the system to use this package. The data have 2 time periods (primary observer (t1) was a thermal imager, secondary observer (t2) was a spotlight observer in the same vehicle on the same side) with the primary objective of the study being to evaluate the detection (recapture) rates of white-tailed deer using spotlights as a survey method.
Format
The format is a data frame with 4508 observations on the following 7 variables.
- SL (spotlight)
- 0/1 whether deer was missed/seen by the spotlight observer 
- TI (thermal imager)
- 0/1 whether deer was missed/seen by the thermal imager observer 
- Group
- Factor with 79 levels representing each unique paired (TI-SL) survey conducted 
- Year
- Factor with 5 levels for year of survey 
- MaxCount
- Count of maximum number of deer seen for each survey, only needed for bootstrapp analysis in MARK, not used in bfdeeR package 
- Cluster
- Value assigning each deer to a specific observation cluster, only needed for bootstrapp analysis in MARK, not used in bfdeeR package 
- MgmtUnit
- Management unit identification 
Details
In addition to detailing the analysis used by Collier et al. (2012), this example documents the 
use of the share argument in the RMark parameter specification because there is presently very little
documentation on the use of share. Parameters in MARK models rarely share columns of the design matrix. For
example while you might want to use the same covariate for survival and capture probability, you would never use the
same beta (same column of the design matrix) for each parameter.  However, there are exceptions when the parameters 
represent similar quantities and that is when the share argument is useful.  For example, in the closed capture models
p is initial capture probability and c is recapture probability. In this case, it would make perfect sense to use the same
column of the design matrix for both parameters. The most obvious case is to fit a model in which p=c.
In RMark, certain pairs of parameters have been identified as similar and shareable. These can be found in the file parameters.txt which is in the RMark directory in your R library. With each pair that is shareable, the first one listed is the primary parameter. When you want to share columns in the design matrix, share=TRUE is added to the specification of the primary parameter. A parameter specification is not given for the other secondary parameter when they are shared. When RMark, sees that the parameters are to be shared it creates a pooled set of design data and adds a column with the name of the secondary parameter and its value is 0 for the rows for the primary parameter and 1 for the rows for the secondary parameter. For example, with the closed capture model if share=TRUE is added to the parameter specification for p, a model is not specified for c, and the pooled design data set contains a field called c. The added field allows construction of models where there are restricted differences between the parameters. For example, p=list(formula=~time+c,share=TRUE) will fit a model in which capture probability varies by time and recapture probability includes an additive difference on the link scale. Because the design data are pooled when you share parameters, if you modify design data for one of the parameters, the other most be modified as as well, so the columns of the design data for both parameters are the same or RMark will give an error.
The argument share is used in all the candidate models in the below example analysis.  As a simplified example of how 
share works, look at the candidate models in the bfrun{} function call named mod.2 and mod.2a (note that
mod.2a was not included in the supplemental file available from the Journal of Wildlife Management and is only included in this package).  Both
of these models are conducting the exact same analysis, with the first mod.2, we used the formula ~time  (if you don't 
know what this means go read the MARKBOOK at http://www.phidot.org/software/mark/.  Notice, however, we used the argument
share in mod.2, which tells RMark to share columns of the MARK design matrix.  For comparison, so you can evaluate how 
share works for yourself, mod.2a recreates the same analysis as mod.2, but uses the approach more typical to MARK 
analyses where each parameter is specified independently and uniquely.
Author(s)
Bret Collier
References
Collier, B. A., S. S. Ditchkoff, J. B. Raglin, and C. R. Ruth. 2012. Spotlight surveys for white-tailed deer: monitoring panacea or exercise in futility? Journal of Wildlife Management, In Press.
Examples
# This example is excluded from testing to reduce package check time
data(deer)
x=data.frame(ch=paste(deer$TI, deer$SL, sep=""), Survey=factor(deer$Group), 
     Year=factor(deer$Year), Cluster=deer$Cluster, MgtUnit=factor(deer$MgmtUnit))
x$ch=as.character(x$ch)
bfrun=function(){
x.proc=process.data(x, model="Huggins", groups=c("Survey", "Year", "MgtUnit"))
x.ddl=make.design.data(x.proc)
#Silly Null model, constant p & c sharing 1 parameter (one detection estimate)
p.shared=list(formula=~1,share=TRUE)
mod.1=mark(x.proc, x.ddl, model.parameters=list(p=p.shared), invisible=FALSE,delete=TRUE)
 
#2 Parameter Null Model, constant p, constant c, different p and c (one estimate for each; p ne c)
#p(time), c(-), share=TRUE, detection is time dependent, with recapture parameter shared
p.sharetime=list(formula=~time, share=TRUE)
mod.2=mark(x.proc, x.ddl, model.parameters=list(p=p.sharetime), invisible=FALSE,delete=TRUE)
#2a Parameter Null Model, constant p, constant c,
# different p and c (one estimate for each; p ne c) not using share
mod.2a=mark(x.proc, x.ddl, model.parameters=list(p=list(formula=~1), c=list(formula=~1)),
            delete=TRUE)
#Fully parameterized model, different p and c for each survey transect replicate, 
# management unit, method (TI or SL) and any observers
p.survey=list(formula=~Survey*time, share=TRUE)
mod.3=mark(x.proc, x.ddl, model.parameters=list(p=p.survey), invisible=FALSE,delete=TRUE)
#p(MU), c(MU), initial detection and recapture differ and are management unit dependent
p.mu=list(formula=~MgtUnit*time, share=TRUE)
mod.4=mark(x.proc, x.ddl, model.parameters=list(p=p.mu), invisible=FALSE,delete=TRUE)
#p(MU) detection is management unit dependent
p.mu=list(formula=~MgtUnit, share=TRUE)
mod.5=mark(x.proc, x.ddl, model.parameters=list(p=p.mu), invisible=FALSE,delete=TRUE)
#p(Yr + MgtUnit),  detection is year + MgtUnit
p.yearMgtUnit=list(formula=~Year*time+MgtUnit, share=TRUE)
mod.6=mark(x.proc, x.ddl, model.parameters=list(p=p.yearMgtUnit), invisible=FALSE,delete=TRUE)
#p(Year), initial detection and recapture are year dependent
p.year=list(formula=~Year*time, share=TRUE)
mod.7=mark(x.proc, x.ddl, model.parameters=list(p=p.year), invisible=FALSE,delete=TRUE)
return(collect.models())
}
bf.out=bfrun()
bf.out
#export function to send dataset and covariates data to MARK for bootstrap analysis 
#(not run but here for completeness)
#export.MARK(x.proc, "BFdeer", mod.3, replace=TRUE, ind.covariates="all")
Compute delta method variance for sum, cumsum, prod and cumprod functions
Description
This function computes the delta method std errors or v-c matrix for a sum, cumsum (vector of cummulative sums), prod (product), or cumprod(vector of cummulative products) of a set of estimates.
Usage
deltamethod.special(function.name, mean, cov, ses = TRUE)
Arguments
| function.name | Quoted character string of either "sum", "cumsum", "prod" or "cumprod" | 
| mean | vector of estimates used in the function | 
| cov | variance-covariance matrix of the estimates | 
| ses | if TRUE it returns a vector of estimated standard errors of the function of the estimates and if FALSE it returns the variance-covariance matrix | 
Details
This function computes the delta method std errors or v-c matrix for a sum, cumsum (vector of cummulative sums), prod (product), or cumprod(vector of cummulative products). It uses the function deltamethod from the msm package and constructs the necessary formula for these special cases. It will load the msm pacakge but assumes that it has already been installed. See the msm documentation for a complete description on how the deltamethod function works. If ses=TRUE, it returns a vector of std errors for each of functions of the estimates contained in mean. If ses=F, then it returns a v-c matrix for the functions of the estimates contained in mean. cov is the input v-c matrix of the estimates.
Value
either a vector of standard errors (ses=TRUE) or a variance-covariance matrix (ses=FALSE)
Author(s)
Jeff Laake
Examples
# This example is excluded from testing to reduce package check time
#
# The following are examples only to demonstrate selecting different 
# model sets for adjusting chat and showing model selection table. 
# It is not a realistic analysis.
#
  data(dipper)
  mod=mark(dipper,model.parameters=list(Phi=list(formula=~time)),delete=TRUE)
  rr=get.real(mod,"Phi",se=TRUE,vcv=TRUE)
  deltamethod.special("prod",rr$estimates$estimate[1:6],rr$vcv.real)
  deltamethod.special("cumprod",rr$estimates$estimate[1:6],rr$vcv.real,ses=FALSE)
  deltamethod.special("sum",rr$estimates$estimate[1:6],rr$vcv.real)
  deltamethod.special("cumsum",rr$estimates$estimate[1:6],rr$vcv.real,ses=FALSE)
Derivatives of inverse of link function (internal use)
Description
Computes derivatives of inverse of link functions (real estimates) with respect to the beta parameters which are used for delta method computation of the var-cov matrix of real parameters.
Usage
deriv_inverse.link(real, x, link)
Arguments
| real | Vector of values of real parameters | 
| x | Matrix of design values | 
| link | Type of link function (e.g., "logit") | 
Details
Note: that function was renamed to deriv_inverse.link to avoid S3 generic class conflicts. The derivatives of the inverse of the link functions are simple computations using the real values and the design matrix values. The body of the function is as follows:
switch(link, logit=x*real*(1-real), log=x*real, loglog=-real*x*log(real), cloglog=-log(1-real)*x*(1-real), identity=x, mlogit=x*real*(1-real))
Value
Vector of derivative values computed at values of real parameters
Author(s)
Jeff Laake
See Also
Dipper capture-recapture data
Description
A capture-recapture data set on European dippers from France that accompanies MARK as an example analysis using the CJS and POPAN models. The dipper data set was orginally described as an example by Lebreton et al (1992).
Format
A data frame with 294 observations on the following 2 variables.
- ch
- a character vector containing the encounter history of each bird 
- sex
- the sex of the bird: a factor with levels - Female- Male
Details
This is a data set that accompanies program MARK as an example for CJS and
POPAN analyses.  The data can be stratified using sex as a grouping
variable.  The functions run.dipper, run.dipper.alternate,
run.dipper.popan defined below in the examples mimic the models used
in the dbf file that accompanies MARK. Note that the models used in the MARK
example use PIM coding with the sin link function which is often better at
identifying the number of estimable parameters.  The approach used in the R
code uses design matrices and cannot use the sin link and is less capable at
counting parameters.  These differences are illustrated by comparing the
results of run.dipper and run.dipper.alternate which fit the
same set of "CJS" models.  The latter fits the models with constraints on
some parameters to achieve identifiability and the former does not. Although
it does not influence the selection of the best model it does infleunce
parameter counts and AIC ordering of some of the less competitive models. In
using design matrices it is best to constrain parameters that are confounded
(e.g., last occasion parameters in Phi(t)p(t) CJS model) when possible to
achieve more reliable counts of the number of estimable parameters.  See
adjust.parameter.count for more dicussion on this point.
Note that the covariate "sex" defined in dipper has values "Male" and
"Female".  It cannot be used directly in a formula for MARK without using it
do define groups because MARK.EXE will be unable to read in a covariate with
non-numeric values.  By using groups="sex" in the call the
process.data a factor "sex" field is created that can be used
in the formula.  Alternatively, a new covariate could be defined in the data
with say values 0 for Female and 1 for Male and this could be used without
defining groups because it is numeric.  This can be done easily by
translating the values of the coded variables to a numeric variable.  Factor
variables are numbered 1..k for k levels in alphabetic order.  Since Female
< Male in alphabetic order then it is level 1 and Male is level 2.  So the
following will create a numeric sex covariate.
dipper$numeric.sex=as.numeric(dipper$sex)-1
See export.chdata for an example that creates a .inp file for
MARK with sex being used to describe groups and a numeric sex covariate.
Source
Lebreton, J.-D., K. P. Burnham, J. Clobert, and D. R. Anderson. 1992. Modeling survival and testing biological hypotheses using marked animals: case studies and recent advances. Ecol. Monogr. 62:67-118.
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
dipper.model=mark(dipper,delete=TRUE)
run.dipper=function()
{
#
# Process data
#
dipper.processed=process.data(dipper,groups=("sex"))
#
# Create default design data
#
dipper.ddl=make.design.data(dipper.processed)
#
# Add Flood covariates for Phi and p that have different values
#
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
#
#  Define range of models for Phi
#
Phidot=list(formula=~1)
Phitime=list(formula=~time)
Phisex=list(formula=~sex)
Phisextime=list(formula=~sex+time)
Phisex.time=list(formula=~sex*time)
PhiFlood=list(formula=~Flood)
#
#  Define range of models for p
#
pdot=list(formula=~1)
ptime=list(formula=~time)
psex=list(formula=~sex)
psextime=list(formula=~sex+time)
psex.time=list(formula=~sex*time)
pFlood=list(formula=~Flood)
#
# Run assortment of models
#
dipper.phidot.pdot          =mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phidot,p=pdot),delete=TRUE)
dipper.phidot.pFlood      	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phidot,p=pFlood),delete=TRUE)
dipper.phidot.psex        	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phidot,p=psex),delete=TRUE)
dipper.phidot.ptime       	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phidot,p=ptime),delete=TRUE)
dipper.phidot.psex.time		=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phidot,p=psex.time),delete=TRUE)
dipper.phitime.ptime      	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phitime, p=ptime),delete=TRUE)
dipper.phitime.pdot       	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phitime,p=pdot),delete=TRUE)
dipper.phitime.psex		=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phitime,p=psex),delete=TRUE)
dipper.phitime.psex.time	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phitime,p=psex.time),delete=TRUE)
dipper.phiFlood.pFlood    	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=PhiFlood, p=pFlood),delete=TRUE)
dipper.phisex.pdot        	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex,p=pdot),delete=TRUE)
dipper.phisex.psex        	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex,p=psex),delete=TRUE)
dipper.phisex.psex.time        	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex,p=psex.time),delete=TRUE)
dipper.phisex.ptime       	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex,p=ptime),delete=TRUE)
dipper.phisextime.psextime	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisextime,p=psextime),delete=TRUE)
dipper.phisex.time.psex.time	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex.time,p=psex.time),delete=TRUE)
dipper.phisex.time.psex 	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex.time,p=psex),delete=TRUE)
dipper.phisex.time.pdot		=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex.time,p=pdot),delete=TRUE)
dipper.phisex.time.ptime	=mark(dipper.processed,dipper.ddl,
                 model.parameters=list(Phi=Phisex.time,p=ptime),delete=TRUE)
#
# Return model table and list of models
#
return(collect.models() )
}
dipper.results=run.dipper()
run.dipper.alternate=function()
{
#
# Process data
#
dipper.processed=process.data(dipper,groups=("sex"))
#
# Create default design data
#
dipper.ddl=make.design.data(dipper.processed)
#
# Add Flood covariates for Phi and p that have different values
#
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
#
#  Define range of models for Phi
#
Phidot=list(formula=~1)
Phitime=list(formula=~time)
Phitimec=list(formula=~time,fixed=list(time=6,value=1))
Phisex=list(formula=~sex)
Phisextime=list(formula=~sex+time)
Phisex.time=list(formula=~sex*time)
PhiFlood=list(formula=~Flood)
#
#  Define range of models for p
#
pdot=list(formula=~1)
ptime=list(formula=~time)
ptimec=list(formula=~time,fixed=list(time=7,value=1))
psex=list(formula=~sex)
psextime=list(formula=~sex+time)
psex.time=list(formula=~sex*time)
psex.timec=list(formula=~sex*time,fixed=list(time=7,value=1))
pFlood=list(formula=~Flood)
#
# Run assortment of models
#
dipper.phidot.pdot          =mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phidot,p=pdot),delete=TRUE)
dipper.phidot.pFlood      	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phidot,p=pFlood),delete=TRUE)
dipper.phidot.psex        	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phidot,p=psex),delete=TRUE)
dipper.phidot.ptime       	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phidot,p=ptime),delete=TRUE)
dipper.phidot.psex.time		=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phidot,p=psex.time),delete=TRUE)
dipper.phitime.ptimec      	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phitime, p=ptimec),delete=TRUE)
dipper.phitime.pdot       	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phitime,p=pdot),delete=TRUE)
dipper.phitime.psex		=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phitime,p=psex),delete=TRUE)
dipper.phitimec.psex.time	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phitimec,p=psex.time),delete=TRUE)
dipper.phiFlood.pFlood    	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=PhiFlood, p=pFlood),delete=TRUE)
dipper.phisex.pdot        	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex,p=pdot),delete=TRUE)
dipper.phisex.psex        	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex,p=psex),delete=TRUE)
dipper.phisex.psex.time        	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex,p=psex.time),delete=TRUE)
dipper.phisex.ptime       	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex,p=ptime),delete=TRUE)
dipper.phisextime.psextime	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisextime,p=psextime),adjust=FALSE,delete=TRUE)
dipper.phisex.time.psex.timec	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex.time,p=psex.timec),delete=TRUE)
dipper.phisex.time.psex 	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex.time,p=psex),delete=TRUE)
dipper.phisex.time.pdot		=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex.time,p=pdot),delete=TRUE)
dipper.phisex.time.ptimec	=mark(dipper.processed,dipper.ddl,
                  model.parameters=list(Phi=Phisex.time,p=ptimec),delete=TRUE)
#
# Return model table and list of models
#
return(collect.models() )
}
dipper.results.alternate=run.dipper.alternate()
#
# Merge two sets of models into a single model list and include the 
# initial model as a demo for merge.mark
#
dipper.cjs=merge.mark(dipper.results,dipper.results.alternate,dipper.model)
dipper.cjs
#
# next delete some of the models to show how this is done with remove.mark
#
dipper.cjs=remove.mark(dipper.cjs,c(2,4,9))
dipper.cjs
run.dipper.popan=function()
{
#
# Process data
#
dipper.processed=process.data(dipper,model="POPAN",group="sex")
#
# Create default design data
#
dipper.ddl=make.design.data(dipper.processed)
#
# Add Flood covariates for Phi and p that have different values
#
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
#
#  Define range of models for Phi
#
Phidot=list(formula=~1)
Phitime=list(formula=~time)
Phisex=list(formula=~sex)
Phisextime=list(formula=~sex+time)
Phisex.time=list(formula=~sex*time)
PhiFlood=list(formula=~Flood)
#
#  Define range of models for p
#
pdot=list(formula=~1)
ptime=list(formula=~time)
psex=list(formula=~sex)
psextime=list(formula=~sex+time)
psex.time=list(formula=~sex*time)
pFlood=list(formula=~Flood)
#
#  Define range of models for pent
#
pentsex.time=list(formula=~sex*time)
#
#  Define range of models for N
#
Nsex=list(formula=~sex)
#
# Run assortment of models
#
dipper.phisex.time.psex.time.pentsex.time=mark(dipper.processed,dipper.ddl,
model.parameters=list(Phi=Phisex.time,p=psex.time,pent=pentsex.time,N=Nsex),
invisible=FALSE,adjust=FALSE,delete=TRUE)
dipper.phisex.time.psex.pentsex.time=mark(dipper.processed,dipper.ddl,
model.parameters=list(Phi=Phisex.time,p=psex,pent=pentsex.time,N=Nsex),
invisible=FALSE,adjust=FALSE,delete=TRUE)
#
# Return model table and list of models
#
return(collect.models() )
}
dipper.popan.results=run.dipper.popan()
# *****************************************************************
# Here is an example of user specified links for each real parameter
 data(dipper)
 dipper.proc=process.data(dipper)
 dipper.ddl=make.design.data(dipper.proc)
# dummy run of make.mark.model to get links and design data. 
# parm.specific set to TRUE so it will create a link for 
# each parameter because for this model they are all the
# same (logit) and if this was not specified you'ld get a vector with one element
 dummy=make.mark.model(dipper.proc,dipper.ddl,simplify=FALSE,parm.specific=TRUE)
 input.links=dummy$links
# get model indices for p where time=4
 log.indices=dipper.ddl$p$model.index[dipper.ddl$p$time==4]
# assign those links to log
 input.links[log.indices]="Log"
# Now these can be used with any call to mark
 mymodel=mark(dipper.proc,dipper.ddl,input.links=input.links,delete=TRUE)
 summary(mymodel)
Rabbit capture-recapture data
Description
A capture-recapture data set on rabbits derived from Edwards and Eberhardt (1967) that accompanies MARK as an example analysis using the closed population models.
Format
A data frame with 76 observations on the following variable.
- ch
- a character vector 
Details
This data set is used in MARK to illustrate the various closed population
models including "Closed", "HetClosed", "FullHet","Huggins","HugHet", and
"FullHugHet".  The first 3 include N in the likelihood whereas the last 3
are based on the Huggins approach which does not use N in the likelihood.
The Het... and FullHet... models are based on the Pledger mixture model
approach. Some of the examples demonstrate the use of the share
argument in the model.parameters list for parameter p which
allows sharing common values for p and c.
Source
Edwards, W.R. and L.L. Eberhardt 1967. Estimating cottontail abundance from live trapping data. J. Wildl. Manage. 31:87-96.
Examples
# This example is excluded from testing to reduce package check time
#
# get data
#
data(edwards.eberhardt)
#
# create function that defines and runs the analyses as defined in 
# MARK example dbf file
#
run.edwards.eberhardt=function()
{
#
#  Define parameter models
#
pdotshared=list(formula=~1,share=TRUE)
ptimeshared=list(formula=~time,share=TRUE)
ptime.c=list(formula=~time+c,share=TRUE)
ptimemixtureshared=list(formula=~time+mixture,share=TRUE)
pmixture=list(formula=~mixture)
#
# Run assortment of models
#
#
#   Capture Closed models
#
#  constant p=c
ee.closed.m0=mark(edwards.eberhardt,model="Closed",
                   model.parameters=list(p=pdotshared),delete=TRUE)
#  constant p and constant c but different
ee.closed.m0c=mark(edwards.eberhardt,model="Closed",delete=TRUE)
#  time varying p=c
ee.closed.mt=mark(edwards.eberhardt,model="Closed",
                   model.parameters=list(p=ptimeshared),delete=TRUE)
#
#  Closed heterogeneity models
#
#  2 mixtures Mh2
ee.closed.Mh2=mark(edwards.eberhardt,model="HetClosed",
                   model.parameters=list(p=pmixture),delete=TRUE)
#  Closed Mth2 - p different for time; mixture additive
ee.closed.Mth2.additive=mark(edwards.eberhardt,model="FullHet",
                   model.parameters=list(p=ptimemixtureshared),adjust=TRUE,delete=TRUE)
#
#    Huggins models
#
# p=c constant over time
ee.huggins.m0=mark(edwards.eberhardt,model="Huggins",
                   model.parameters=list(p=pdotshared),delete=TRUE)
# p constant c constant but different; this is default model for Huggins
ee.huggins.m0.c=mark(edwards.eberhardt,model="Huggins",delete=TRUE)
# Huggins Mt
ee.huggins.Mt=mark(edwards.eberhardt,model="Huggins",
                   model.parameters=list(p=ptimeshared),adjust=TRUE,delete=TRUE)
#
#    Huggins heterogeneity models
#
#  Mh2 - p different for mixture
ee.huggins.Mh2=mark(edwards.eberhardt,model="HugHet",
                   model.parameters=list(p=pmixture),delete=TRUE)
#  Huggins Mth2 - p different for time; mixture additive
ee.huggins.Mth2.additive=mark(edwards.eberhardt,model="HugFullHet",
                   model.parameters=list(p=ptimemixtureshared),adjust=TRUE,delete=TRUE)
#
# Return model table and list of models
#
return(collect.models() )
}
#
# fit models in mark by calling function created above
#
ee.results=run.edwards.eberhardt()
Simulated data from Cormack-Jolly-Seber model
Description
A simulated data set from CJS model to demonstrate use of grouping variables
and individual covariates in an analysis of a mark model. The true
model for the simulated data is S(age+year)p(year).
Format
A data frame with 6000 observations on the following 5 variables.
- ch
- a character vector 
- weight
- a numeric vector 
- age
- a factor with levels - 1- 2- 3
- sex
- a factor with levels - F- M
- region
- a factor with levels - 1- 2- 3- 4
Details
The weight, age and region are static variables that
are defined based on the values when the animal was released. age is
a factor variable representing an age level.  The actual ages (at time of
release) are 0,1,2 for the 3 levels respectively.
Examples
# This example is excluded from testing to reduce package check time
data(example.data)
run.example=function()
{
PhiTime=list(formula=~time)
pTimec=list(formula=~time,fixed=list(time=7,value=1))
pTime=list(formula=~time)
PhiAge=list(formula=~age)
Phidot=list(formula=~1)
PhiweightTime=list(formula=~weight+time)
PhiTimeAge=list(formula=~time+age)
mod1=mark(example.data,groups=c("sex","age","region"),
                           initial.ages=c(0,1,2),delete=TRUE)
mod2=mark(example.data,model.parameters=list(p=pTimec,Phi=PhiTime),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
mod3=mark(example.data,model.parameters=list(Phi=Phidot,p=pTime),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
mod4=mark(example.data,model.parameters=list(Phi=PhiTime),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
mod5=mark(example.data,model.parameters=list(Phi=PhiTimeAge),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
mod6=mark(example.data,model.parameters=list(Phi=PhiAge,p=pTime),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
mod7=mark(example.data,model.parameters=list(p=pTime,Phi=PhiweightTime),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
mod8=mark(example.data,model.parameters=list(Phi=PhiTimeAge,p=pTime),
          groups=c("sex","age","region"),initial.ages=c(0,1,2),delete=TRUE)
return(collect.models())
}
example.results=run.example()
Export data and models for import in MARK
Description
Creates a .Rinp, .inp and optionally renamed output files that can be imported into MARK to create a new MARK project with all the data and output files.
Usage
export.MARK(
  x,
  project.name,
  model = NULL,
  replace = FALSE,
  chat = 1,
  title = "",
  ind.covariates = "all"
)
Arguments
| x | processed data list used to build models | 
| project.name | character string to be used for prefix of filenames and for MARK project name; do not use "." in the filename | 
| model | either a single mark model or a marklist | 
| replace | if TRUE it will replace any existing files | 
| chat | user-specified chat value if desired | 
| title | MARK project title string | 
| ind.covariates | vector of character strings specifying names of individual covariates or "all" to use all present | 
Details
If you use Nest model and NestAge covariate you should use the processed data list (model$data) from a model using NestAge in the formula because the necessary individual covariates are added to the processed data list. Also, use default of ind.covariates="all".
After running this function to export the data and models to the working directory, start program MARK and select the File/RMARK Import menu item. Navigate to the working directory and select the "project.name".Rinp file. MARK will take over and create the project files and will import any specified model output files.
DO NOT use a "project name" that is the same as a filename you use as input (*.inp) to RMark because this function will create a file "project name".inp and it would over-write your existing file and the format could change. If you try this, the code will return an error that the filename is invalid because the file already exists.
Value
None
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
data(mallard)
Dot=mark(mallard,nocc=90,model="Nest",
	model.parameters=list(S=list(formula=~1)),delete=TRUE)
mallard.proc=process.data(mallard,nocc=90,model="Nest")
# removed # to use export.MARK
#export.MARK(mallard.proc,"mallard",Dot,replace=TRUE)
data(robust)
time.intervals=c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0)
S.time=list(formula=~time)
p.time.session=list(formula=~-1+session:time,share=TRUE)
GammaDoublePrime.random=list(formula=~time,share=TRUE)
model.1=mark(data = robust, model = "Robust",
	time.intervals=time.intervals,
	model.parameters=list(S=S.time,
			GammaDoublePrime=GammaDoublePrime.random,p=p.time.session),delete=TRUE)
robust.proc=process.data(data = robust, model = "Robust",
	time.intervals=time.intervals)
#remove # to use export.MARK
#export.MARK(robust.proc,"robust",	model.1,replace=TRUE)
Export capture-history data to MARK .inp format
Description
Creates a MARK .inp file from processed data list that can be used to create a MARK .dbf file for use with MARK directly rather than with the RMark package.
Usage
export.chdata(data, filename, covariates = NULL, replace = FALSE)
Arguments
| data | processed data list resulting from process.data | 
| filename | quoted filename (without .inp extension) | 
| covariates | vector of names of covariate variables in data to include | 
| replace | if file exists and replace=TRUE, file will be over-written | 
Details
The default is to include none of the covariates in the processed data list. All of the covariates can be passed by setting covariates="all".
After you have created the MARK .dbf/.fpt files with the exported .inp file,
then you can use export.chdata to export models that can be
imported into the MARK interface.  However note the following: ***Warning***
Make sure that you use the .inp created by export.chdata with
your processed data to create the MARK .dbf file rather than using a
separate similar .inp file.  It is essential that the group structure and
ordering of groups matches between the .inp file and the exported models or
you can get erroneous results.
Value
None
Author(s)
Jeff Laake
See Also
Examples
data(dipper)
dipper$numeric.sex=as.numeric(dipper$sex)-1
dipper.processed=process.data(dipper,group="sex")
export.chdata(dipper.processed, filename="dipper", 
          covariates="numeric.sex",replace=TRUE)
file.remove("dipper.inp")
#
# Had sex been used in place of numeric.sex in the above command, 
# MARK would have been unable to use it as a covariate
# because it is not a numeric field
Export output files for appending into MARK .dbf/.fpt format
Description
Creates renamed versions of the output,vcv and residual files so they can be appended into a MARK .dbf file.
Usage
export.model(model, replace = FALSE)
Arguments
| model | a mark model object or marklist object | 
| replace | if file exists and replace=TRUE, file will be over-written | 
Details
If model is a marklist then it exports each model in the marklist.
The function simply copies the files with new names so the MARK interface
will recognize them. The marknnn.out is copied as marknnnY.tmp, marknnn.res
is copied as marknnnx.tmp and marknnn.vcv is copied as marknnnV.tmp.  You
can create a MARK .dbf by using export.chdata to create an
input file for MARK, opening MARK (MARKINT.EXE) to create a new .dbf with
the input file, and then using the Output/Append to select the output file
(marknnnY.tmp) to append the model with its files. Then you can use any
facilities of MARK that are not already included in RMark.
***Warning*** Make sure that you use the .inp created by
export.chdata with your processed data to create the MARK .dbf
file rather than using a separate similar .inp file.  It is essential that
the group structure and ordering of groups matches between the .inp file and
the exported models or you can get erroneous results.
Value
None
Author(s)
Jeff Laake
See Also
Examples
data(dipper)
mymodel=mark(dipper,threads=1,delete=TRUE)
# remove # to use export.model
#export.model(mymodel,replace=TRUE)
Various utility functions
Description
Miscellaneous set of functions that can be used with results from the package.
Usage
extract.indices(model,parameter,df)
 
        nat.surv(model,df)
         pop.est(ns,ps,design,p.vcv)
         compute.Sn(x,df,criterion)
         logitCI(x,se)
         search.output.files(x,string)
Arguments
| model | a mark model object | 
| parameter | character string for a type of parameter for that model (eg, "Phi","p") | 
| df | dataframe containing the columns group, row, column which specify the group number, the row number and column number of the PIM | 
| ns | vector of counts of animals captured | 
| ps | vector of capture probability estimates which match counts | 
| design | design matrix that specifies how counts will be aggregate | 
| p.vcv | variance-covariance matrix for capture probability estimates | 
| x | marklist of models for compute.Sn and a vector of real estimates for logitCI | 
| se | vector of std errors for real estimates | 
| criterion | vector of model selection criterion values (eg AICc) | 
| string | string to be found in output files contained in models in x | 
Details
Function extract.indices extracts the parameter indices from the
parameter index matrices (PIMS) for a particular type of parameter
that match a set of group numbers and rows and columns that are defined in
the dataframe df. It returns a vector of indices which can be used to
specify the set of real parameters to be extracted by
covariate.predictions using the index column in data or
the indices argument. If df is NULL, it returns a dataframe with all of
the indices with model.index being the unique index across all parameters and the
par.index which is an index to the row in the design data. If parameter is NULL then
the the dataframe is given for all of the parameters.
Function nat.surv produces estimates of natural survival (Sn) from
total survival (S) and recovery rate (r) from a joint live-dead model in
which all harvest recoveries are reported. In that case, Taylor et al 2005
suggest the following estimator of natural survival Sn=S + (1-S)*r.  The
arguments for the function are a mark model object and a dataframe
df that defines the set of groups and times (row,col) for the natural
survival computations. It returns a list with elements: 1) Sn - a
vector of estimates for natural survival; one for each entry in df
and 2) vcv - a variance-covariance matrix for the estimates of
natural survival.
Function pop.est produces estimates of abundance using a vector of
counts of animals captured (ns) and estimates of capture
probabilities (ps).  The estimates can be aggregated or averaged
using the design matrix argument.  If individual estimates are
needed, use an nxn identity matrix for design where n is the length of
ns. To get a total of all the estimates use a nx1 column matrix of
1s.  Any other design matrix can be specified to subset, aggregate
and/or average the estimates.  The argument p.vcv is needed to
compute the variance-covariance matrix for the abundance estimates using the
formula described in Taylor et al. (2002).  The function returns a list with
elements: 1) Nhat - a vector of abundance estimates and 2) vcv
- variance-covariance matrix for the abundance estimates.
Function Compute.Sn creates list structure for natural survival using
nat.surv to be used for model averaging natural survival estimates
(e.g., model.average(compute.Sn(x,df,criterion))). It returns a list
with elements estimates, vcv, weight: 1) estimates - matrix of estimates of
natural survival, 2)vcv - list of var-cov matrix for the estimates, and 3)
weight - vector of model weights.
Function search.output.filessearches for occurrence of a specific
string in output files associated with models in a marklist x. It returns a
vector of model numbers in the marklist which have an output file containing
the string.
Author(s)
Jeff Laake
References
TAYLOR, M. K., J. LAAKE, H. D. CLUFF, M. RAMSAY and F. MESSIER. 2002. Managing the risk from hunting for the Viscount Melville Sound polar bear population. Ursus 13: 185-202.
TAYLOR, M. K., J. LAAKE, P. D. MCLOUGHLIN, E. W. BORN, H. D. CLUFF, S. H. FERGUSON, A. ROSING-ASVID, R. SCHWEINSBURG and F. MESSIER. 2005. Demography and viability of a hunted population of polar bears. Arctic 58: 203-214.
Examples
# This example is excluded from testing to reduce package check time
# Example of computing N-hat for occasions 2 to 7 for the p=~time model
data(dipper)
md=mark(dipper,model.parameters=list(p=list(formula=~time),
       Phi=list(formula=~1)),delete=TRUE)
# Create a matrix from the capture history strings 
xmat=matrix(as.numeric(unlist(strsplit(dipper$ch,""))),
      ncol=nchar(dipper$ch[1]),byrow=TRUE)
# sum number of captures in each column but don't use the first 
# column because p[1] can't be estimated
ns=colSums(xmat)[-1]
# extract the indices and then get covariate predictions for p(2),...,p(7)
# which are row-colums 1-6 in PIM for p 
p.indices=extract.indices(md,"p",df=data.frame(group=rep(1,6),
   row=1:6,col=1:6))
p.list=covariate.predictions(md,data=data.frame(index=p.indices))
# call pop.est using diagonal design matrix to get 
# separate estimate for each occasion
pop.est(ns,p.list$estimates$estimate,
  design=diag(1,ncol=6,nrow=6),p.list$vcv)
Extract results from MARK output file (internal use)
Description
Extracts the lnl, AICc, npar, beta and real estimates and returns a list of
these results for inclusion in the mark object. The elements
beta and real are dataframes with fields estimate,se,lcl,ucl.
This function was written for internal use and is called by
run.mark.model. It is documented here for more advanced users
that might want to modify the code or adapt for their own use.
Usage
extract.mark.output(out, model, adjust, realvcv = FALSE, vcvfile)
Arguments
| out | output from MARK analysis ( | 
| model | mark model object | 
| adjust | if TRUE, adjusts number of parameters (npar) to number of columns in design matrix, modifies AIC and records both | 
| realvcv | if TRUE the vcv matrix of the real parameters is extracted and stored in the model results | 
| vcvfile | name of vcv file output | 
Value
result: list of extracted output elements
| lnl | -2xLog-likelihood | 
| deviance | Difference between saturated model and lnl | 
| npar | Number of model parameters | 
| AICc | Small-sample corrected AIC value using npar and n | 
| npar.unadjusted | Number of model parameters as reported by MARK if npar was adjusted | 
| AICc.unadjusted | Small-sample corrected AIC value using npar.unadjusted and n | 
| n | Effective sample size reported by MARK; used in AICc calculation | 
| beta | Dataframe of beta parameters with fields: estimate, se, lcl, ucl | 
| real | Dataframe of real parameters with fields: estimate, se, lcl, ucl | 
| derived.vcv | variance-covariance matrix for derived parameters if any | 
| covariate.values | dataframe with fields Variable and Value which are the covariate names and value used for real parameter estimates in the MARK output | 
| singular | indices of beta parameters that are non-estimable or at a boundary | 
| real.vcv | variance-covariance matrix for real parameters (simplified) if realvcv=TRUE | 
Author(s)
Jeff Laake
See Also
Fill covariate entries in MARK design matrix with values
Description
Replaces covariate names in design matrix with specific values to compute
estimates of real parameters at those values using the dataframe from
find.covariates after any value replacement.
Usage
fill.covariates(model, values)
Arguments
| model | MARK model object | 
| values | a dataframe matching structure of output from find.covariates with the user-defined values entered | 
Details
The design matrix for a MARK model with individual covariates contains the
covariate names used in the model.  In computing the real parameters for the
encounter history of an individual it replaces instances of covariate names
with the individual covariate values.  This function replaces the cells in
the design matrix that contain individidual covariates with user-specified
values which is an edited version (if needed) of the dataframe returned by
find.covariates.
Value
New design matrix with user-defined covariate values entered in place of covariate names
Author(s)
Jeff Laake
See Also
Examples
data(dipper)
dipper$nsex=as.numeric(dipper$sex)-1
dipper$weight=rnorm(294)   
#NOTE:  This generates random valules for the weights so the answers using 
# ~weight will vary each time it is run
mod=mark(dipper,model.parameters=list(Phi=list(formula=~nsex+weight)),delete=TRUE)
# Show approach using individual calls to find.covariates, fill.covariates 
# and compute.real
fc=find.covariates(mod,dipper)
fc$value[fc$var=="nsex"]=0 # assign sex value to Female
design=fill.covariates(mod,fc) # fill design matrix with values
# compute and output survivals for females at average weight
female.survival=compute.real(mod,design=design)[1,] 
female.survival
# Next show same thing with a call to compute.real and a data frame for 
# females and then males
# compute and output survivals for females at average weight
female.survival=compute.real(mod,data=
      data.frame(nsex=0,weight=mean(dipper$weight)))[1,] 
female.survival
male.survival=compute.real(mod,data=data.frame(nsex=1,
         weight=mean(dipper$weight)))[1,] 
male.survival
# Fit model using sex as a group/factor variable and 
# compute v-c matrix for estimates
mod=mark(dipper,groups="sex",
     model.parameters=list(Phi=list(formula=~sex+weight)),delete=TRUE)
survival.by.sex=compute.real(mod,data=dipper,vcv=TRUE)
survival.by.sex$real[1:2]  # estimates
survival.by.sex$se.real[1:2] # std errors
survival.by.sex$vcv.real[1:2,1:2] # v-c matrix
survival.by.sex$vcv.real[1,2]/prod(survival.by.sex$se.real[1:2]) 
# sampling correlation of the estimates
Find covariates in MARK design matrix
Description
Finds and extracts cells in MARK design matrix containing covariates.
Computes mean values of the covariates and assigns those as default values.
Returns dataframe that can be edited to replace default values which are
then inserted into the design matrix with fill.covariates to
enable computation of estimates of real parameters with
compute.real.
Usage
find.covariates(model, data = NULL, usemean = TRUE)
Arguments
| model | MARK model object | 
| data | dataframe used to construct MARK model object; not processed data list | 
| usemean | logical; if TRUE uses mean value of covariate for default and otherwise uses 0 | 
Details
The design matrix for a MARK model with individual covariates contains
entries with the covariate names used in the model. In computing the real
parameters for the encounter history of an individual it replaces instances
of covariate names with the individual covariate values.  This function
finds all of the cells in the design matrix that contain individidual
covariates and constructs a dataframe of the name of the real parameter, the
position (row, col) in the design matrix and a default value for the
covariate. The default field value is assigned to one of three values in the
following priority order: 1) the mean value for the covariates in data (if
data is not NULL), 2) the mean values used in the MARK output (if
data=NULL,usemean=TRUE), 3) 0 (if usemean=FALSE and data=NULL). The values
can also be modified using fc=edit(fc) where fc is the value
from this function.
Value
A dataframe with the following fields
| rnames | name of real parameter | 
| row | row number in design matrix (equivalent to
 | 
| col | column number in design matrix | 
| var | name of covariate | 
| value | value for covariate | 
Author(s)
Jeff Laake
See Also
Examples
# see examples in fill.covariates
Compute sets of link values for real parameters
Description
Computes link values for real parameters for a particular type of parameter (parameter) and returns in a table (dataframe) format.
Usage
get.link(
  model,
  parameter,
  beta = NULL,
  design = NULL,
  data = NULL,
  vcv = FALSE
)
Arguments
| model | MARK model object | 
| parameter | type of parameter in model (character) (e.g.,"Phi") | 
| beta | values of beta parameters for computation of link values | 
| design | a numeric design matrix with any covariate values filled in with numerical values | 
| data | covariate data to be averaged for estimates if design=NULL | 
| vcv | if TRUE computes and returns the v-c matrix of the subset of the link values | 
Details
This function is very similar to get.real except that it
provides estimates of link values before they are transformed to real
estimates using the inverse-link.  Also, the value is always a dataframe for
the estimates and design data and optionally a variance-covariance matrix.
See get.real for further details about the arguments.
Value
estimates: If vcv=TRUE, a list is returned with elements
vcv.link and the dataframe estimates. If vcv=FALSE,
only the estimates dataframe is returned which has the same structure as in
get.real.
Author(s)
Jeff Laake
See Also
Extract or compute sets of real parameters
Description
Extracts or computes real parameters for a particular type of parameter (parameter) and returns in either a table (dataframe) format or in PIM format.
Usage
get.real(
  model,
  parameter,
  beta = NULL,
  se = FALSE,
  design = NULL,
  data = NULL,
  vcv = FALSE,
  show.fixed = TRUE,
  expand = FALSE,
  pim = TRUE
)
Arguments
| model | MARK model object | 
| parameter | type of parameter in model (character) (e.g.,"Phi") | 
| beta | values of beta parameters for computation of real parameters | 
| se | if TRUE uses table format and extracts se and confidence intervals, if FALSE uses PIM format with estimates only | 
| design | a numeric design matrix with any covariate values filled in with numerical values | 
| data | covariate data to be averaged for estimates if design=NULL | 
| vcv | if TRUE computes and returns the v-c matrix of the subset of the real parameters | 
| show.fixed | if TRUE fixed values are returned rather than NA in place of fixed values | 
| expand | if TRUE, returns vcv matrix for unique parameters and only simplified unique parameters if FALSE | 
| pim | if TRUE and se=FALSE, returns a list of estimates in PIM format | 
Details
This function is called by summary.mark to extract(from
model$results$real) sets of parameters for display.  But, it can also
be useful to compute particular sets of real parameters for output,
manipulation or plotting etc.  It is closely related to
compute.real and it uses that function when it computes
(rather than extracts) real parameters. It provides an easier way to
extract/compute real estimates of a particular type (parameter).
The real parameter estimates are computed when either 1) model$chat >
1, 2) design, data, or beta are specified with
non-NULL values for those arguments, or 3) vcv=TRUE.  If none of the above
hold, then the estimates are extracted.
If se=FALSE and estimates are shown in triangular or square PIM
structure depending on the model and parameter.  For triangular, the lower
half of the matrix is shown as NA (not applicable in this case). If
se=TRUE, the estimate, standard error and confidence interval for the
real parameters with the accompanying design data are combined into a
dataframe.
If the model contains individual covariates, there are 3 options for
specifying the covariate values that are used for the real estimates.  If
neither design nor data are specified, then the real estimates
are computed with the average covariate values used in the MARK output.
This is what is done in the call from summary.mark.
Alternatively, the argument design can be given a numeric design
matrix for the model with the covariates given specific values.  This can be
done with find.covariates and fill.covariates.
Finally, a quicker approach is to specify a dataframe for data which
is averaged for the numeric covariate values and these are automatically
filled into the design matrix of the model with calls to
find.covariates and fill.covariates from
compute.real which is used for computation.  The second and
third options are essentially the same but creating the completed design
matrix will be quicker if multiple calls are made with the same completed
design matrix for different parameters.  The dataframe for data can
contain a single entry to specify certain values for computation.
Value
estimates: if se=FALSE and Beta=NULL, a matrix of estimates
or list of matrices for more than one group, and if se=TRUE or beta=is
not NULL and vcv=FALSE a dataframe of estimates with attached design data.
If vcv=TRUE, a list is returned with elements vcv.real and the
dataframe estimates as returned with se=TRUE.
Author(s)
Jeff Laake
See Also
Examples
data(example.data)
pregion=list(formula=~region)
PhiAge=list(formula=~Age)
mod=mark(example.data,model.parameters=list(p=pregion,Phi=PhiAge),
 groups=c("sex","age","region"),age.var=2,initial.ages=c(0,1,2),threads=1,delete=TRUE)
# extract list of Phi parameter estimates for all groups in PIM format 
Phi.estimates=get.real(mod,"Phi")  
# print out parameter estimates in triangular PIM format
for(i in 1:length(Phi.estimates))  
{
  cat(names(Phi.estimates)[i],"\n")
  print(Phi.estimates[[i]]$pim,na.print="")
}
require(plotrix)
#extract parameter estimates of capture probability p with se and conf intervals
p.table=get.real(mod,"p",se=TRUE) 
print(p.table[p.table$region==1,])  # print values from region 1
estimates=by(p.table$estimate,p.table$region,mean)
lcl=by(p.table$lcl,p.table$region,mean)
ucl=by(p.table$ucl,p.table$region,mean)
plotCI(c(1:4),estimates,ucl-estimates,estimates-lcl,xlab="Region",
         ylab="Capture probability",
		ylim=c(.5,1),main="Capture probability estimates by region")
Import capture-recapture data sets from space or tab-delimited files
Description
A relatively flexible function to import capture history data sets that include a capture (encounter) history read in as a character string and an arbitrary number of user specified covariates for the analysis.
Usage
import.chdata(
  filename,
  header = TRUE,
  field.names = NULL,
  field.types = NULL,
  use.comments = TRUE
)
Arguments
| filename | file name and path for file to be imported; fields in file should be space or tab-delimited | 
| header | TRUE/FALSE; if TRUE first line is name of variables | 
| field.names | vector of field names if header=FALSE; first field should always be ch - capture history remaining number of fields and their names are arbitrary | 
| field.types | vector identifying whether fields (beyond ch) are numeric ("n") or factor ("f") or should be skipped ("s") | 
| use.comments | if TRUE values within /* and */ on data lines are used as row.names for the RMark dataframe. Only use this option if they are unique values. | 
Details
This function was written both to be a useful tool to import data and as an
example for more specific import functions that a user may want to write for
data files that do not satisfy the requirements of this function.  In
particular this function will not handle files with fixed-width format files
that do not contain appropriate tab or space delimiters between the fields.
It also requires that the first field is the capture (encounter) history
which is named "ch" and is a character string.  The remaining fields are
arbitrary in number and type and are user defined based on the arguments to
the functions. Variables that will be used for grouping should be defined
with the field.type="f". Numeric individual covariates (e.g., weight)
should be input as field.type="n". Fields in the file that should not
be imported should be assigned field.type="s".  The examples below
illustrate different uses of the calling arguments to import several
different data sets that meet the modest requirements of this function.
If you specify a frequency for the encounter history, the field name must be
freq.  If you use any other name or spelling it will not be
recognized and the default frequency of 1 will be used for each encounter
history.  This function should not be used with files structured for input
into the MARK interface.  To use those types of files, see
convert.inp.  It is not neccessary to use either function to
create a dataframe for RMark.  All you need to is create a dataframe that
meets the specification of the RMark format.  For example, if you are
simulating data, you only need to create a dataframe with the fields ch,
freq (if differs from 1) and any covariates you want and then you can use
process.data on the dataframe.
If you have comments in your data file, they should not have a column header
(field name in first row).  If use.comments=TRUE the comments are
used as row names of the data frame and they must be unique.  If
use.comments=FALSE and the file contains comments they are stripped
out.
Value
A dataframe for use in MARK analysis with obligate ch
character field representing the capture (encounter) history and optional
covariate/grouping variables.
Author(s)
Jeff Laake
See Also
Examples
# This example is excluded from testing to reduce package check time
pathtodata=paste(path.package("RMark"),"extdata",sep="/")
example.data<-import.chdata(paste(pathtodata,"example.data.txt",sep="/"),
      field.types=c("n","f","f","f"))
edwards.eberhardt<-import.chdata(paste(pathtodata,"edwardsandeberhardt.txt",
      sep="/"),field.names="ch",header=FALSE)
dipper<-import.chdata(paste(pathtodata,"dipper.txt",sep="/"),
      field.names=c("ch","sex"),header=FALSE)
Inverse link functions (internal use)
Description
Computes values of inverse of link functions for real estimates.
Usage
inverse.link(x, link)
Arguments
| x | Matrix of design values multiplied by the vector of the beta parameter values | 
| link | Type of link function (e.g., "logit") | 
Details
The inverse of the link function is the real parameter value. They are
simple functions of X*Beta where X is the design matrix values
and Beta is the vector of link function parameters. The body of the
function is as follows:
switch(link, logit=exp(x)/(1+exp(x)), log=exp(x), loglog=exp(-exp(-x)), cloglog=1-exp(-exp(x)), identity=x, mlogit=exp(x)/(1+sum(exp(x))) )
The link="mlogit" only works if the set of real parameters are
limited to those within the set of parameters with that specific link.  For
example, in POPAN, the pent parameters are of type "mlogit" so the
probabilities sum to 1.  However, if there are several groups then each
group will have a different set of pent parameters which are
identified by a different grouping of the "mlogit" parameters (i.e.,
"mlogit(1)" for group 1, "mlogit(2)" for group 2 etc).  Thus, in computing
real parameter values (see compute.real) which may have
varying links, those with "mlogit" are not used with this function using
link="mlogit".  Instead, the link is temporarily altered to be of
type "log" (i.e., inverse=exp(x)) and then summed over sets with a common
value for "mlogit(j)" to construct the inverse for "mlogit" as
exp(x)/(1+sum(exp(x)).
Value
Vector of real values computed from x=X*Beta
Author(s)
Jeff Laake
See Also
compute.real,deriv_inverse.link
Killdeer nest survival example data
Description
A data set on killdeer that accompanies MARK as an example analysis for the nest survival model.
Format
A data frame with 18 observations on the following 6 variables.
- id
- a MARK comment field with a nest id 
- FirstFound
- the day the nest was first found 
- LastPresent
- the last day that chicks were present 
- LastChecked
- the last day the nest was checked 
- Fate
- the fate of the nest; 0=hatch and 1 depredated 
- Freq
- the frequency of nests with this data; usually 1 
Details
This is a data set that accompanies program MARK as an example for nest
survival. The data structure for the nest survival model is completely
different from the capture history structure used for most MARK models.  To
cope with these data you must import them into a dataframe using R commands
and assign the specific variable names shown above. The id and Freq fields
are optional.  Freq is assumed to be 1 if not given.  You cannot import the
MARK .inp file structure directly into R without some manipulation.  Also
note that import.chdata and convert.inp do NOT
work for nest survival data. In the examples section below, the first
section of code provides an example of converting the killdeer.inp file into
a dataframe for RMark.
If your dataframe contains a variable AgeDay1, which is the age of the nest on the first occasion then you can use a variable called NestAge which will create a set of time-dependent covariates named NestAge1,NestAge2 ...NestAge(nocc-1) which will provide a way to incorporate the age of the nest in the model. This was added because the age covariate in the design data for S assumes all nests are the same age and is not particularly useful. This effect could be incorporated by using the add() function in the design matrix but RMark does not have any capability for doing that and it is easier to create a time-dependent covariate to do the same thing.
Examples
# This example is excluded from testing to reduce package check time
# EXAMPLE CODE FOR CONVERSION OF .INP TO NECESSARY DATA STRUCTURE
# read in killdeer.inp file
#killdeer=scan("killdeer.inp",what="character",sep="\n")
# strip out ; and write out all but first 2 lines which contain comments
#write(sub(";","",killdeer[3:20]),"killdeer.txt")
# read in as a dataframe and assign names
#killdeer=read.table("killdeer.txt")
#names(killdeer)=c("id","FirstFound","LastPresent","LastChecked","Fate","Freq")
#
# EXAMPLE CODE TO RUN MODELS CONTAINED IN THE MARK KILLDEER.DBF
data(killdeer)
# produce summary
summary(killdeer)
# Define function to run models that are in killdeer.dbf
# You must specify either the number of occasions (nocc) or the time.intervals 
# between the occasions.
run.killdeer=function()
{
   Sdot=mark(killdeer,model="Nest",nocc=40,delete=TRUE)
   STime=mark(killdeer,model="Nest",
       model.parameters=list(S=list(formula=~I(Time+1))),nocc=40,threads=2,delete=TRUE)
   STimesq=mark(killdeer,model="Nest",
       model.parameters=list(S=list(formula=~I(Time+1)+I((Time+1)^2))),nocc=40,threads=2,
            delete=TRUE)
   STime3=mark(killdeer,model="Nest",
      model.parameters=list(S=list(formula=~I(Time+1)+I((Time+1)^2)+I((Time+1)^3))),
                   nocc=40,threads=2,delete=TRUE)
   return(collect.models())
}
# run defined models
killdeer.results=run.killdeer()
Lark Sparrow
Description
An example of Multiple Scale Occupancy model for some lark sparrow data that was contributed by David Pavlacky at Rocky Mountain bird observatory. The study design was a GRTS selection of paired "Deferred" and "Grazed" pastures. The point count locations within each pasture were a random selection of systematic point count locations separated by 250 m. Each point count had a radius of 125m. A removal design was used to the set the data to missing after the first detection. The point count data were set to missing when fewer than nine points were surveyed.
Format
A data frame with 52 observations on the following 20 variables.
- ch
- a character vector containing the encounter history 
- ceap
- a factor with two levels "Deferred" and "Grazed" corresponding to a rest rotation grazing system with pastures either rested (Deferred) or grazed (Grazed) during the spring bird breeding season. 
- cwx
- a continuous covariate for primary occasion x, representing an ocular estimate of the proportion of area covered by crested wheatgrass in a 50-m radius around the point count location. 
- tdx
- a continuous covariate for primary occasion x, representing the starting time (h) of each 6-min point count survey measured on the ratio scale (1.5 h = 1 h 30 min). 
Examples
# This example is excluded from testing to reduce package check time
# Create dataframe
data(LASP)
mscale=LASP
# Process data with MultScalOcc model and use group variables
mscale.proc=process.data(mscale,model="MultScalOcc",groups=c("ceap"),begin.time=1,mixtures=3)
# Create design data
ddl=make.design.data(mscale.proc)
# Create function to build models
do.Species=function()
{
p.1=list(formula=~1)   
p.2=list(formula=~ceap)    
p.3=list(formula=~td)
Theta.1=list(formula=~1)    
Theta.2=list(formula=~ceap)   
Theta.3=list(formula=~cw)
Psi.1=list(formula=~1)    
Psi.2=list(formula=~ceap)    
cml=create.model.list("MultScalOcc")
return(mark.wrapper(cml,data=mscale.proc,ddl=ddl,adjust=FALSE,realvcv=TRUE,delete=TRUE))
}
# Run function to get results
Species.results=do.Species()
# Output model table and estimates
Species.results$model.table
Species.results[[as.numeric(rownames(Species.results$model.table[1,]))]]$results$real
Species.results[[as.numeric(rownames(Species.results$model.table[1,]))]]$results$beta
#write.csv(Species.results$model.table,file="lasp_model_selection.csv",row.names=FALSE)
#write.csv(Species.results[[as.numeric(rownames(Species.results$model.table[1,]))]]$results$real,
#  file="lasp_m1_real.csv")
#write.csv(Species.results[[as.numeric(rownames(Species.results$model.table[1,]))]]$results$beta,
#  file="lasp_m1_beta.csv")
# Covariate prediction and model averaging
# p(time of day)
mintd <- min(mscale[,12:20])
maxtd <- max(mscale[,12:20])
td.values <- mintd+(0:100)*(maxtd-mintd)/100
PIMS(Species.results[[1]],"p",simplified=FALSE)
td <- covariate.predictions(Species.results,data=data.frame(td1=td.values),indices=c(21))
#write.table(td$estimates,file="lasp_cov_pred_p_td.csv",sep=",",col.names=TRUE,
#             row.names=FALSE)
# Theta(crested wheatgrass cover)
mincw <- min(mscale[,3:11])
maxcw <- max(mscale[,3:11])
cw.values <- mincw+(0:100)*(maxcw-mincw)/100
PIMS(Species.results[[1]],"Theta",simplified=FALSE)
cw <- covariate.predictions(Species.results,data=data.frame(cw1=cw.values),indices=c(3))
#write.table(cw$estimates,file="lasp_cov_pred_theta_cw.csv",sep=",",col.names=TRUE,
# row.names=FALSE)
# Psi(ceap grazing for wildlife practice)
ceap.values <- as.data.frame(matrix(c(1,2),ncol=1))
names(ceap.values) <- c("index")
PIMS(Species.results[[1]],"Psi",simplified=FALSE)
ceap <- covariate.predictions(Species.results,data=data.frame(ceap=ceap.values))
#write.table(ceap$estimates,file="lasp_cov_pred_psi_ceap.csv",sep=",",col.names=TRUE,
# row.names=FALSE)
Load external model
Description
Loads external model into workspace with name model
Usage
load.model(model)
Arguments
| model | name of MARK model object stored externally | 
Value
None; side effect only to load object with name model (not name given to model)
Author(s)
Jeff Laake
Create design dataframes for MARK model specification
Description
For each type of parameter in the analysis model (e.g, p, Phi, r), this
function creates design data based on the parameter index matrix (PIM) in
MARK terminology. Design data relate parameters to the sampling and data
structures; whereas data relate to the object(animal) being sampled.
Design data relate parameters to time, age, cohort and group structure.  Any
variables in the design data can be used in formulas to specify the model in
make.mark.model.
Usage
make.design.data(
  data,
  parameters = list(),
  remove.unused = FALSE,
  right = TRUE,
  common.zero = FALSE
)
Arguments
| data | Processed data list; resulting value from process.data | ||||||||||||||||
| parameters | Optional list containing a list for each type of parameter (list of lists); each parameter list is named with the parameter name (eg Phi); each parameter list can contain vectors named age.bins,time.bins and cohort.bins 
 | ||||||||||||||||
| remove.unused | If TRUE, unused design data are deleted; see details below (as of v3.0.0 this argument is no longer used) | ||||||||||||||||
| right | If TRUE, bin intervals are closed on the right | ||||||||||||||||
| common.zero | if TRUE, uses a common begin.time to set origin (0) for Time variable defaults to FALSE for legacy reasons but should be set to TRUE for models that share formula like p and c with the Time model | 
Details
After processing the data, the next step is to create the design data for building the models which is done with this function. The design data are different than the capture history data that relates to animals. The types of parameters and design data are specific to the type of analysis. For example, consider a CJS analysis that has parameters Phi and p. If there are 4 occasions, there are 3 cohorts and potentially 6 different Phi and 6 different p parameters for a single group. The format for each parameter information matrix (PIM) in MARK is triangular. RMark uses the all different formulation for PIMS by default, so the PIMs would be
Phi p 1 2 3 7 8 9 4 5 10 11 6 12
If you chose pim.type="time" for each parameter in "CJS", then the PIMS are structured as
Phi p 1 2 3 4 5 6 2 3 5 6 3 6
That structure is only useful if there is only a single release cohort represented by the PIM. If you choose this option and there is more than one cohort represented by the PIM then it will restrict the possible set of models that can be represented.
Each of these parameters relates to different times, different cohorts (time of initial release) and different ages (at least in terms of time since first capture). Thus we can think of a data frame for each parameter that might look as follows for Phi for the all different structure:
Index time cohort age 1 1 1 0 2 2 1 1 3 3 1 2 4 2 2 0 5 3 2 1 6 3 3 0
With this design data, one can envision models that describe Phi in terms of the variables time, cohort and age. For example a time model would have a design matrix like:
Int T2 T3 1 1 0 0 2 1 1 0 3 1 0 1 4 1 1 0 5 1 0 1 6 1 0 1
Or a time + cohort model might look like
Int T2 T3 C2 C3 1 1 0 0 0 0 2 1 1 0 0 0 3 1 0 1 0 0 4 1 1 0 1 0 5 1 0 1 1 0 6 1 0 1 0 1
 While you could certainly develop these designs
manually within MARK, the power of the R code rests with the internal R
function model.matrix which can take the design data and
create the design matrix from a formula specification such as ~time
or ~time+cohort alleviating the need to create the design matrix
manually.  While many analyses may only need age, time or cohort, it is
quite possible to extend the kind of design data, to include different
functions of these variables or add additional variables such as effort.
One could consider design data for p as follows: 
Index time cohort age effort juvenile 7 1 1 1 10 1 8 2 1 2 5 0 9 3 1 3 15 0 10 2 2 1 5 1 11 3 2 2 15 0 12 3 3 1 15 1
 The added columns represent a time dependent
covariate (effort) and an age variable of juvenile/adult. With these design
data, it is easy to specify different models such as ~time,
~effort, ~effort+age or ~effort+juvenile.
With the simplest call:
ddl=make.design.data(proc.example.data)
the function creates default design data for each type of parameter in the
model as defined by proc.example.data$model.  If
proc.example.data was created with the call in the first example of
process.data, the model is "CJS" (the default model) so the
return value is a list with 2 data frames: one for Phi and one for p. They
can be accessed as ddl$Phi (or ddl[["Phi"]]) and ddl$p
(or ddl[["p"]]) or as ddl[[1]] and ddl[[2]]
respectively.  Using the former notation is easier and safer because it is
independent of the ordering of the parameters in the list. For this example,
there are 16 groups and each group has 21 Phi parameters and 21 p
parameters. Thus, there are 336 rows (parameters) in the design data frame
for both Phi and p and thus a total of 772 parameters.
The default fields in each dataframe are typically group, cohort,
age, time, Cohort, Age, and Time. The
first 4 fields are factor variables, whereas Cohort, Age and
Time are numeric covariate versions of cohort, age, and
time shifted so the first value is always zero. However, there are
additional fields that are added depending on the capture-recapture model and 
the parameter in the model. For example, in multistrata models the default data 
include stratum in survival(S) and stratum and tostratum in Psi, the transition 
probabilities.  Also, for closed capture heterogeneity models a factor variable 
mixture is included. It is always best to examine the design data after 
creating them because those fields are your "data" for building models in 
addition to individual covariates in the capture history data.
If groups were created in the call to
process.data, then the factor variables used to create the
groups are also included in the design data for each type of
parameter.  If one of the grouping variables is an age variable it is named
initial.age.class to recognize explicitly that it represents a static
initial age and to avoid naming conflicts with age and Age
variables which represent dynamic age variables of the age of the animal
through time.  Non-age related grouping variables are added to the design
data using their names in data.  For example if
proc.example.data is defined using the first example in
process.data, then the fields sex, initial.age.class
(in place of age in this case), and region are added to the
design data in addition to the group variable that has 16 levels. The
levels of the group variable are created by pasting (concatenating)
the values of the grouping factor in order.  For example, M11 is sex=M, age
class=1 and region=1.
By default, the factor variables age, time, and cohort
are created such that there is a factor level for each unique value.  By
specfying values for the argument parameters, the values of
age, time, and cohort can be binned (put into
intervals) to reduce the number of levels of each factor variable. The
argument parameters is specified as a list of lists. The first level
of the list specifies the parameter type and the second level specifies the
variables (age, time, or cohort) that will be binned
and the cutpoints (endpoints) for the intervals.  For example, if you
expected that survival may change substantially to age 3 (i.e. first 3 years
of life) but remain relatively constant beyond then, you could bin the ages
for survival as 0,1,2,3-8.  Likewise, as well, you could decide to bin time
into 2 intervals for capture probability in which effort and expected
capture probability might be constant within each interval.  This could be
done with the following call using the argument parameters:
ddl=make.design.data(proc.example.data, parameters=list(Phi=list(age.bins=c(0,0.5,1,2,8)), p=list(time.bins=c(1980,1983,1986))))
In the above example, age is binned for Phi but not for p; likewise
time is binned for p but not for Phi. The bins for age were defined
as 0,0.5,1,2,8 because the intervals are closed ("]" - inclusive) on the
right by default and open ("(" non-inclusive) on the left, except for the
first interval which is closed on the left.  Had we used 0,1,2,8, 0 and 1
would have been in a single interval.  Any value less than 1 and greater
than 0 could be used in place of 0.5. Alternatively, the same bins could be
specified as:
ddl=make.design.data(proc.example.data, parameters=list(Phi=list(age.bins=c(0,1,2,3,8)), p=list(time.bins=c(1980,1984,1986))),right=FALSE)
To create the design data and maintain flexibility, I recommend creating the
default design data and then adding other binned variables with the function
add.design.data.  The 2 binned variables defined above can be
added to the default design data as follows:
ddl=make.design.data(proc.example.data) ddl=add.design.data(proc.example.data,ddl,parameter="Phi",type="age", bins=c(0,.5,1,2,8),name="agebin") ddl=add.design.data(proc.example.data,ddl,parameter="p",type="time", bins=c(1980,1983,1986),name="timebin")
Adding the other binned variables to the default design data, allows models
based on either time, timebin, or Time for p and age, agebin or Age for Phi.
Any number of additional binned variables can be defined as long as they are
given unique names. Note that R is case-specific so ~Time specifies a
model which is a linear trend over time ((e.g. Phi(T) or p(T) in MARK)
whereas ~time creates a model with a different factor level for each
time in the data (e.g. Phi(t) or p(t) in MARK) and ~timebin
creates a model with 2 factor levels 1980-1983 and 1984-1986.
Some circumstances may require direct manipulation of the design data to
create the needed variable when simple binning is not sufficient or when the
design data is a variable other than one related to time, age,
cohort or group (e.g., effort index).  This can be done with
any of the vast array of R commands.  For example, consider a situation in
which 1983 and 1985 were drought years and you wanted to develop a model in
which survival was different in drought and non-drought years.  This could
be done with the following commands:
ddl$Phi$drought=0
ddl$Phi$drought[ddl$phi$time==1983 | ddl$Phi$time==1985]= 1
The first command creates a variable named drought in the Phi design data
and initializes it with 0.  The second command changes the drought variable
to 1 for the years 1983 and 1985. The single brackets [] index a data frame,
matrix or vector.  In this case ddl$Phi$drought is a vector and
ddl$Phi$time==1983 | ddl$Phi$time==1985 selects the values in which
time is equal (==) to 1983 or ("|") 1985.  A simpler example might occur if
we want to create a function of one of the continuous variables.  If we
wanted to define a model for p that was a function of age and age squared,
we could add the age squared variable as:
ddl$p$Agesq=ddl$p$Age^2
Any of the fields in the design data can be used in formulae for the
parameters.  However, it is important to recognize that additional variables
you define and add to the design data are specific to a particular type of
parameter (e.g., Phi, p, etc). Thus, in the above example, you could not use
Agesq in a model for Phi without also adding it to the Phi design data.  As
described in make.mark.model, there is actually a simpler way
to add simple functions of variables to a formula without defining them in
the design data.
The above manipulations are sufficient if there is only one or two variables
that need to be added to the design data.  If there are many covariates that
are time(occasion)-specific then it may be easier to setup a dataframe with
the covariate data and use merge_design.covariates.
The fields that are automatically created in the design data depends on the
model.  For example, with models such as "POPAN" or any of the "Pradel"
models, the PIM structure is called square which really means that it is a
single row and all the rows are the same length for each group.  Thus,
rectangular or row may have been a better descriptor.  Regardless, in this
case there is no concept of a cohort within the PIM which is equivalent to a
row within a triangular PIM for "CJS" type models. Thus, for parameters with
"Square" PIMS the cohort (and Cohort) field is not generated.  The cohort
field is also not created if pim.type="time" for "Triangular" PIMS,
because that structure has the same structure for each row (cohort) and
adding cohort effects would be inappropriate.
For models with "Square" PIMS or pim.type="time" for "Triangular"
PIMS, it is possible to create a cohort variable by defining the cohort
variable as a covariate in the capture history data and using it as a
variable for creating groups.  As with all grouping variables, it is added
to the design data.  Now the one caution with "Square" PIMS is that they are
all the same length.  Imagine representing a triangular PIM with a set of
square PIMS with each row being a cohort.  The resulting set of PIMS is now
rectangular but the lower portion of the matrix is superfluous because the
parameters represent times prior to the entry of the cohort, assuming that
the use of cohort is to represent a birth cohort.  This is only problematic
for these kinds of models when the structure accomodates age and the concept
of a birth cohort.  The solution to the problem is to delete the design data
for the superfluous parameters after is is created (see warning below).
For example, let us presume that you used cohort with 3 levels 
as a grouping variable for a model with "Square" PIMS which has 3 occasions.  
Then, the PIM structure would look as follows for Phi: 
Phi 1 2 3 4 5 6 7 8 9
. If each row represented a cohort that entered at occasions 1,2,3 then parameters 4,7,8 are superfluous or could be thought of as representing cells that are structural zeros in the model because no observations can be made of those cohorts at those times.
After creating the design data, the unneeded rows can be deleted with R
commands or you can use the argument remove.unused=TRUE.  As an
example, a code segment might look as follows if chdata was defined
properly: 
mydata=process.data(chdata,model="POPAN",groups="cohort") ddl=make.design.data(mydata) ddl$Phi=ddl$Phi[-c(4,7,8),]
If cohort and time were suitably defined an easier solution especially for a larger problem would be
ddl$Phi=ddl$Phi[as.numeric(ddl$Phi$time)>=as.numeric(ddl$Phi$cohort),]
Which would only keep parameters in which the time is the same or greater
than the cohort.  Note that time and cohort would be factor variables and <
and > do not make sense which is the reason for the as.numeric which
translates the factor to a numeric ordering of factors (1,2,...) but not the
numeric value of the factor level (e.g., 1987,1998).  Thus, the above
assumes that both time and cohort have the same factor levels. The design
data is specific to each parameter, so the unneeded parameters need to be
deleted from design data of each parameter.
However, all of this is done automatically by setting the argument
remove.unused=TRUE.  It functions differently depending on the type
of PIM structure.  For models with "Triangular" PIMS, unused design data are
determined based on the lack of a release cohort.  For example, if there
were no capture history data that started with 0 and had a 1 in the second
position ("01.....") that would mean that there were no releases on occasion
2 and row 2 in the PIM would not be needed so it would be removed from the
design data.  If remove.unused=TRUE the design data are removed for
any missing cohorts within each group. For models with "Square" PIMS, cohort
structure is defined by a grouping variable.  If there is a field named
"cohort" within the design data, then unused design data are defined to
occur when time < cohort.  This is particularly useful for age structured
models which define birth cohorts.  In that case there will be sampling
times prior to the birth of the cohort which are not relevant and should be
treated as "structural zeros" and not as a zero based on stochastic events.
If the design data are removed, when the model is constructed with
make.mark.model, the argument default.fixed controls
what happens with the real parameters defined by the missing design data.
If default.fixed=TRUE, then the real parameters are fixed at values
that explain with certainty the observed data (e.g., p=0).  That is
necessary for models with "Square" PIMS (eg, POPAN and Pradel models) that
include each capture-history position in the probability calculation.  For
"Triangular" PIMS with "CJS" type models, the capture(encounter) history
probability is only computed for occasions past the first "1", the release.
Thus, when a cohort is missing there are no entries and the missing design
data are truly superfluous and default.fixed=FALSE will assign the
missing design data to a row in the design matrix which has all 0s.  That
row will show as a real parameter of (0.5 for a logit link) but it is not
included in any parameter count and does not affect any calculation. The
advantage in using this approach is that the R code recognizes these and
displays blanks for these missing parameters, so it makes for more readable
output when say every other cohort is missing. See
make.mark.model for more information about deleted design data
and what this means to development of model formula.
For design data of "Multistrata" models, additional fields are added to
represent strata.  A separate PIM is created for each stratum for each
parameter and this is denoted in the design data with the addition of the
factor variable stratum which has a level for each stratum.  In
addition, for each stratum a dummy variable is created with the name of the
stratum (strata.label)and it has value 1 when the parameter is for
that stratum and 0 otherwise.  Using these variables with the interaction
operator ":" in formula allows more flexibility in creating model structure
for some strata and not others.  All "Multistrata" models contain "Psi"
parameters which represent the transitions from a stratum to all other
strata.  Thus if there are 3 strata, there are 6 PIMS for the "Psi"
parameters to represent transition from A to B, A to C, B to A, B to C, C to
A and C to B.  The "Psi" parameters are represented by multimonial logit
links and the probability of remaining in the stratum is determined by
substraction.  To represent these differences, a factor variable
tostratum is created in addition to stratum. Likewise, dummy
variables are created for each stratum with names created by pasting "to"
and the strata label (e.g., toA, toB etc). Some examples of using these
variables to create models for "Psi" are given in
make.mark.model.
######WARNING######## Deleting design data for mlogit parameters like Psi in the multistate model can fail if you do things like delete certain transitions. Deleting design data is no longer allowed. It is better to add the field fix. It should be assigned the value NA for parameters that are estimated and a fixed real value for those that are fixed. Here is an example with the mstrata data example: data(mstrata) # deleting design data approach to fix Psi A to B to 0 (DON'T use this approach) dp=process.data(mstrata,model="Multistrata") ddl=make.design.data(dp) ddl$Psi=ddl$Psi[!(ddl$Psi$stratum=="A" & ddl$Psi$tostratum=="B"),] ddl$Psi summary(mark(dp,ddl,output=FALSE,delete=TRUE),show.fixed=TRUE) #new approach using fix to set Phi=1 for time 2 (USE this approach) ddl=make.design.data(dp) ddl$Psi$fix=NA ddl$Psi$fix[ddl$Psi$stratum=="A" & ddl$Psi$tostratum=="B"]=0 ddl$Psi summary(mark(dp,ddl,output=FALSE,delete=TRUE),show.fixed=TRUE)
Value
The function value is a list of data frames. The list contains a data frame for each type of parameter in the model (e.g., Phi and p for CJS). The names of the list elements are the parameter names (e.g., Phi). The structure of the dataframe depends on the calling arguments and the model & data structure as described in the details above.
Author(s)
Jeff Laake
See Also
process.data,merge_design.covariates,
add.design.data, make.mark.model,
run.mark.model
Examples
data(example.data)
proc.example.data=process.data(example.data)
ddl=make.design.data(proc.example.data)
ddl=add.design.data(proc.example.data,ddl,parameter="Phi",type="age",
  bins=c(0,.5,1,2,8),name="agebin")
ddl=add.design.data(proc.example.data,ddl,parameter="p",type="time",
  bins=c(1980,1983,1986),name="timebin")
Create a MARK model for analysis
Description
Creates a MARK model object that contains a MARK input file with PIMS and design matrix specific to the data and model structure and formulae specified for the model parameters.
Usage
make.mark.model(
  data,
  ddl,
  parameters = list(),
  title = "",
  model.name = NULL,
  initial = NULL,
  call = NULL,
  default.fixed = TRUE,
  options = NULL,
  profile.int = FALSE,
  chat = NULL,
  simplify = TRUE,
  input.links = NULL,
  parm.specific = FALSE,
  mlogit0 = TRUE,
  hessian = FALSE,
  accumulate = TRUE,
  icvalues = NULL,
  wrap = TRUE,
  nodes = 101,
  useddl = FALSE,
  check.model = FALSE
)
Arguments
| data | Data list resulting from function  | 
| ddl | Design data list from function  | 
| parameters | List of parameter formula specifications | 
| title | Title for the analysis (optional) | 
| model.name | Model name to override default name (optional) | 
| initial | Vector of named or unnamed initial values for beta parameters or previously run model (optional) | 
| call | Pass function call when this function is called from another
function (e.g. | 
| default.fixed | if TRUE, real parameters for which the design data have been deleted are fixed to default values | 
| options | character string of options for Proc Estimate statement in MARK .inp file | 
| profile.int | if TRUE, requests MARK to compute profile intervals | 
| chat | pre-specified value for chat used by MARK in calculations of model output | 
| simplify | if FALSE, does not simplify PIM structure | 
| input.links | specifies set of link functions for parameters with non-simplified structure | 
| parm.specific | if TRUE, forces a link to be specified for each parameter | 
| mlogit0 | if TRUE, any real parameter that is fixed to 0 and has an mlogit link will have its link changed to logit so it can be simplified | 
| hessian | if TRUE specifies to MARK to use hessian rather than second partial matrix | 
| accumulate | if TRUE accumulate like data values into frequencies | 
| icvalues | numeric vector of individual covariate values for computation of real values | 
| wrap | if TRUE, data lines are wrapped to be length 80; if length of a row is not a problem set to FALSE and it will run faster | 
| nodes | number of integration nodes for individual random effects (min 15, max 505, default 101) | 
| useddl | If TRUE and there are no missing rows or parameters (deleted) then it will use ddl in place of full.ddl that is created internally. | 
| check.model | if TRUE, code does an internal consistency check between PIMs and design data when making model. | 
Details
This function is called by mark to create the model but it can
be called directly to create but not run the model.  All of the arguments
have default values except for the first 2 that specify the processed data
list (data) and the design data list (ddl). If only these 2
arguments are specified default models are used for the parameters.  For
example, following with the example from process.data and
make.design.data, the default model can be created with:
mymodel=make.mark.model(proc.example.data,ddl)
The call to make.mark.model creates a model object but does not do
the analysis.  The function returns a list that includes several fields
including a design matrix and the MARK input file that will be used with
MARK.EXE to run the analysis from function
run.mark.model. The following shows the names of the list
elements in mymodel:
names(mymodel) [1] "data" "model" "title" "model.name" "links" [6] "mixtures" "call" "parameters" "input" "number.of.groups" [11] "group.labels" "nocc" "begin.time" "covariates" "fixed" [16] "design.matrix" "pims" "design.data" "strata.labels" "mlogit.list" [21] "simplify"
The list is defined to be a mark object which is done by assigning a class vector to the list. The classes for an R object can be viewed with the class function as shown below:
class(mymodel) [1] "mark" "CJS"
 Each MARK model has 2 class
values.  The first identifies it as a mark object and the second identifies
the type of mark analysis, which is the default "CJS" (recaptures only) in
this case.  The use of the class feature has advantages in using generic
functions and identifying types of objects.  An object of class mark
is defined in more detail in function mark.
To fit non-trivial models it is necessary to understand the remaining
calling arguments of make.mark.model and R formula notation. The
model formulae are specified with the calling argument parameters.
It uses a similar list structure as the parameters argument in
make.design.data. It expects to get a list with elements named
to match the parameters in the particular analysis (e.g., Phi and p in CJS)
and each list element is a list, so it is a list of lists).  For each
parameter, the possible list elements are formula, link, fixed,
component, component.name, remove.intercept. In addition, for closed
capture models and robust design model, the element share is included
in the list for p (capture probabilities) and GammaDoublePrime
(respectively) to indicate whether the model is shared (share=TRUE) or
not-shared (the default) (share=FALSE) with c (recapture probabilities) and
GammaPrime respectively.
formula specifies the model for the parameter using R formula
notation. An R formula is denoted with a ~ followed by variables in
an equation format possibly using the + , *, and :
operators.  For example, ~sex+age is an additive model with the main
effects of sex and age. Whereas, ~sex*age includes the
main effects and the interaction and it is equivalent to the formula
specified as ~sex+age+sex:age where sex:age is the interaction
term.  The model ~age+sex:age is slightly different in that the main
effect for sex is dropped which means that intercept of the
age effect is common among the sexes but the age pattern could vary
between the sexes.  The model ~sex*Age which is equivalent to
~sex + Age + sex:Age has only 4 parameters and specifies a linear
trend with age and both the intercept and slope vary by sex.  One additional
operator that can be useful is I() which allows computation of
variables on the fly.  For example, the addition of the Agesq variable in
the design data (as described above) can be avoided by using the notation
~Age + I(Age^2) which specifies use of a linear and quadratic effect
for age.  Note that specifying the model ~age + I(age^2) would be
incorrect and would create an error because age is a factor variable
whereas Age is not.
As an example, consider developing a model in which Phi varies by age and p follows a linear trend over time. This model could be specified and run as follows:
p.Time=list(formula=~Time) Phi.age=list(formula=~age) Model.p.Time.Phi.age=make.mark.model(proc.example.data,ddl, parameters=list(Phi=Phi.age,p=p.Time)) Model.p.Time.Phi.age=run.mark.model(Model.p.Time.Phi.age)
The first 2 commands define the p and Phi models that are used in the
parameter list in the call to make.mark.model. This is a good
approach for defining models because it clearly documents the models, the
definitions can then be used in many calls to make.mark.model and it
will allow a variety of models to be developed quickly by creating different
combinations of the parameter models.  Using the notation above with the
period separating the parameter name and the description (eg., p.Time) gives
the further advantage that all possible models can be developed quickly with
the functions create.model.list and
mark.wrapper.
Model formula can use any field in the design data and any individual
covariates defined in data.  The restrictions on individual
covariates that was present in versions before 1.3 have now been removed.
You can now use interactions of individual covariates with all design data
covariates and products of individual covariates.  You can specify
interactions of individual covariates and factor variables in the design
data with the formula notation.  For example, ~region*x1 describes a
model in which the slope of x1 varies by region. Also,
~time*x1 describes a model in which the slope for x1 varied by
time; however, there would be only one value of the covariate per animal so
this is not a time varying covariate model.  Models with time varying
covariates are now more easily described with the improvements in version
1.3 but there are restrictions on how the time varying individual covariates
are named.  An example would be a trap dependence model in which capture
probability on occasion i+1 depends on whether they were captured on
occasion i.  If there are n occasions in a CJS model, the 0/1 (not
caught/caught) for occasions 1 to n-1 would be n-1 individual covariates to
describe recapture probability for occasions 2 to n. For times 2 to n, a
design data field could be defined such that the variable timex is 1 if
time==x and 0 otherwise.  The time varying covariates must be named with a
time suffix on the base name of the covariate. In this example they would be
named as x2,. . .,xn and the model could be specified as ~time
+ x for time variation and a constant trap effect or as ~time +
time:x for a trap effect that varied by time.  If in the
process.data call, the argument begin.time was set to
the year 1990, then the variables would have to be named x1991,x1992,...
because the first recapture occasion would be 1991.  Note that the times are
different for different parameters.  For example, survival is labeled based
on the beginning of the time interval which would be 1990 so the variables
should be named appropriately for the parameter model in which they will be
used.
In previous versions to handle time varying covariates with a constant effect, it was necessary to use the component feature of the parameter specification to be able to append one or more arbitrary columns to the design matrix. That is no longer required for individual covariates and the component feature was removed in v2.0.8.
There are three other elements of the parameter list that can be useful on
occasion.  The first is link which specifies the link function for
transforming between the beta and real parameters.  The possible values are
"logit", "log", "identity" and "mlogit(*)" where * is a numeric identifier.
The "sin" link is not allowed because all models are specified using a
design matrix.  The typical default values are assigned to each parameter
(eg "logit" for probabilities, "log" for N, and "mlogit" for pent in POPAN),
so in most cases it will not be necessary to specify a link function.
The second is fixed which allows real parameters to be set at fixed
values.  While this is still allowed, if parameters are to be fixed for all models,
it is best to assign the field fix in the design dataframe and use NA for the values
to be estimated and the real fixed value for those values to be fixed. But if the 
values are only to be fixed for a few models then you can still use the fixed approach 
described here. The values for fixed can be either a single value or a list
with 5 alternate forms for ease in specifying the fixed parameters.
Specifying fixed=value will set all parameters of that type to the
specified value.  For example, Phi=list(fixed=1) will set all Phi to
1.  This can be useful for situations like specifying F in the
Burnham/Barker models to all have the same value of 1.  Fixed values can
also be specified as a list in which values are specified for certain
indices, times, ages, cohorts, and groups.  The first 3 will be the most
useful.  The first list format is the most general and flexible but it
requires an understanding of the PIM structure and index numbers for the
real parameters.  For example,
Phi=list(formula=~time, fixed=list(index=c(1,4,7),value=1))
specifies Phi varying by time, but the real parameters 1,4,7 are set to 1.
The value field is either a single constant or its length must match
the number of indices.  For example,
Phi=list(formula=~time, fixed=list(index=c(1,4,7),value=c(1,0,1)))
sets real parameters 1 and 7 to 1 and real parameter 4 to 0.  Technically,
the index/value format for fixed is not wedded to the parameter type (i.e.,
values for p can be assigned within Phi list), but for the sake of clarity
they should be restricted to fixing real parameters associated with the
particular parameter type.  The time and age formats for fixed
will probably be the most useful.  The format fixed=list(time=x, value=y)
will set all real parameters (of that type) for time x to value y.  For
example,
p=list(formula=~time,fixed=list(time=1986,value=1))
sets up time varying capture probability but all values of p for 1986 are set to 1. This can be quite useful to set all values to 0 in years with no sampling (e.g.,
fixed=list(time=c(1982,1984,1986), value=0)
).
The age, cohort and group formats work in a similar
fashion.  It is important to recognize that the value you specify for
time, age, cohort and group must match the
values in the design data list.  This is another reason to add binned fields
for age, time etc with add.design.data after creating the
default design data with make.design.data.  Also note that the
values for time and cohort are affected by the
begin.time argument specified in process.data.  Had I
not specified begin.time=1980, to set p in the last occasion (1986),
the specification would be
p=list(formula=~time,fixed=list(time=7,value=1))
because begin.time defaults to 1.  The advantage of the time-, age-, and
cohort- formats over the index-format is that it will work regardless of the
group definition which can easily be changed by changing the groups
argument in process.data.  The index-format will be dependent
on the group structure because the indexing of the PIMS will most likely
change with changes in the group structure.
Starting with version 3.0.0 deleting design data as deacribed in this paragraph is no longer allowed. Instead, add a fix field to the design data and set the fixed values as described above.
#### No longer supported###
Parameters can also be fixed at default values by deleting the specific rows
of the design data. See make.design.data and material below.
The default value for fixing parameters for deleted design data can be
changed with the default=value in the parameter list.
Another useful element of the parameter list is the
remove.intercept argument.  It is set to TRUE to forcefully remove
the intercept.  In R notation this can be done by specifying the formula
notation ~-1+... but in formula with nested interactions of factor variables
and additive factor variables the -1 notation will not remove the intercept.
It will simply adjust the column definitions but will keep the same number
of columns and the model will be over-parameterized.  The problem occurs with
nested factor variables like tostratum within stratum for multistrata
designs (see mstrata).  As shown in that example, you can
build a formula -1+stratum:tostratum to have transitions that are
stratum-specific.  If however you also want to add a sex effect and you
specify -1+sex+stratum:tostratum it will add 2 columns for sex labelled M
and F when in fact you only want to add one column because the intercept is
already contained within the stratum:tostratum term.  The argument
remove.intercept will forcefully remove the intercept but it needs to be
able to find a column with all 1's.  For example,
Psi=list(formula=~sex+stratum:tostratum,remove.intercept=TRUE) will work but
Psi=list(formula=~-1+sex+stratum:tostratum,remove.intercept=TRUE) will not
work.  Also, the -1 notation should be used when there is not an added
factor variable because
Psi=list(formula=~stratum:tostratum,remove.intercept=TRUE)
will not work because while the stratum:tostratum effectively includes an intercept it is equivalent to using an identity matrix and is not specified as treatment contrast with one of the columns as all 1's.
The final argument of the parameter list is contrast which can be used to change the contrast used for model.matrix. It uses the default if none specified. The form is shown in ?model.matrix.
The argument simplify determines whether the pims are simplified such that only indices for unique and fixed real parameters are used. For example, with an all different PIM structure with CJS with K occasions there are K*(K-1) real parameters for Phi and p. However, if you use simplify=TRUE with the default model of Phi(.)p(.), the pims are re-indexed to be 1 for all the Phi's and 2 for all the p's because there are only 2 unique real parameters for that model. Using simplify can speed analysis markedly and probably should always be used. This was left as an argument only to test that the simplification was working and produced the same likelihood and real parameter estimates with and without simplification. It only adjust the rows of the design matrix and not the columns. There are some restrictions for simplification. Real parameters that are given a fixed value are maintained in the design matrix although it does simplify among the fixed parameters. For example, if there are 50 real parameters all fixed to a value of 1 and 30 all fixed to a value of 0, they are reduced to 2 real parameters fixed to 1 and 0. Also, real parameters such as Psi in Multistrata and pent in POPAN that use multinomial logits are not simplified because they must maintain the structure created by the multinomial logit link. All other parameters in those models are simplified. The only downside of simplification is that the labels for real parameters in the MARK output are unreliable because there is no longer a single label for the real parameter because it may represent many different real parameters in the all-different PIM structure. This is not a problem with the labels in R because the real parameter estimates are translated back to the all-different PIM structure with the proper labels.
Be careful specifying interactions and make sure that all combinations are present when interacting 2 or more factor variables. For example, with 2 factors A and B, each with 2 levels, the data are fully crossed if there are data with values A1&B1, A1&B2, A2&B1 and A2&B2. If data exist for each of the 4 combinations then you can described the interaction model as ~A*B and it will estimate 4 values. However, if data are missing for one of more of the 4 cells then the "interaction" formula should be specified as ~-1+A:B or ~-1+B:A or ~-1+A the combinations with data. An example of this could be a marking program with multiple sites which resighted at all occasions but which only marked at sites on alternating occasions. In that case time is nested within site and time-site interaction models would be specified as ~-1+time:site. Another example is age*time when you only mark young of the year. In that case, the first occasion only contains young, the second occasion newly marked young and age 1 if occasions are a year apart. In that case, using something like above ~-1+time:age is useful.
The argument title supplies a character string that is used to label
the output. The argument model.name is a descriptor for the model
used in the analysis.  The code constructs a model name from the formula
specified in the call (e.g., Phi(~1)p(~time)) but on occasion the
name may be too long or verbose, so it can be over-ridden with the
model.name argument.
The final argument initial can be used to provide initial estimates
for the beta parameters. It is either 1) a single starting value for each
parameter, 2) an unnamed vector of values (one for each parameter), 3) named
vector of values, or 4) the name of mark object that has already been
run. For cases 3 and 4, the code only uses appropriate initial beta
estimates in which the column names of the design matrix (for model) or
vector names match the column names in the design matrix of the model to be
run.  Any remaining beta parameters without an initial value specified are
assigned a 0 initial value.  If case 4 is used the models must have the same
number of rows in the design matrix and thus presumably the same structure.
As long as the vector elements are named (#3), the length of the
initial vector no longer needs to match the number of parameters in
the new model as long as the elements are named. The names can be retrieved
either from the column names of the design matrix or from
rownames(x$results$beta) where x is the name of the
mark object.
options is a character string that is tacked onto the Proc
Estimate statement for the MARK .inp file.  It can be used to request
options such as NoStandDM (to not standardize the design matrix) or
SIMANNEAL (to request use of the simulated annealing optimization method) or
any existing or new options that can be set on the estimate proc.
Value
model: a MARK object except for the elements output and
results. See mark for a detailed description of the
list contents.
Author(s)
Jeff Laake
See Also
process.data,make.design.data,
run.mark.model mark
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
#
# Process data
#
dipper.processed=process.data(dipper,groups=("sex"))
#
# Create default design data
#
dipper.ddl=make.design.data(dipper.processed)
#
# Add Flood covariates for Phi and p that have different values
#
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
#
#  Define range of models for Phi
#
Phidot=list(formula=~1)
Phitime=list(formula=~time)
PhiFlood=list(formula=~Flood)
#
#  Define range of models for p
#
pdot=list(formula=~1)
ptime=list(formula=~time)
#
# Make assortment of models
#
dipper.phidot.pdot=make.mark.model(dipper.processed,dipper.ddl,
  parameters=list(Phi=Phidot,p=pdot))
dipper.phidot.ptime=make.mark.model(dipper.processed,dipper.ddl,
  parameters=list(Phi=Phidot,p=ptime))
dipper.phiFlood.pdot=make.mark.model(dipper.processed,dipper.ddl,
  parameters=list(Phi=PhiFlood, p=pdot))
Make time-varying dummy variables from time-varying factor variable
Description
Create a new dataframe with time-varying dummy variables from a time-varying factor variable. The time-varying dummy variables are named appropriately to be used as a set of time dependent individual covariates in a parameter specification
Usage
make.time.factor(x, var.name, times, intercept = NULL, delete = TRUE)
Arguments
| x | dataframe containing set of factor variables with names composed of var.name prefix and times suffix | 
| var.name | prefix for variable names | 
| times | numeric suffixes for variable names | 
| intercept | the value of the factor variable that will be used for the intercept | 
| delete | if TRUE, the origninal time-varying factor variables are removed from the returned dataframe | 
Details
An example of the var.name and times is var.name="observer", times=1:5. The code expects to find observer1,...,observer5 to be factor variables in x. If there are k unique levels (excluding ".") across the time varying factor variables, then k-1 dummy variables are created for each of the named factor variables. They are named with var.name, level[i], times[j] concatenated together where level[i] is the name of the facto level i. If there a m times then the new data set will contain m*(k-1) dummy variables. If the factor variable includes any "." values these are ignored because they are used to indicate a missing value that is paired with a missing value in the encounter history. Note that it will create each dummy variable for each factor even if a particular level is not contained within a factor (eg observers 1 to 3 used but only 1 and 2 on occasion 1).
Value
x: a dataframe containing the original data (with time-varying factor variables removed if delete=TRUE) and the time-varying dummy variables added.
Author(s)
Jeff Laake
Examples
# see example in weta
Mallard nest survival example
Description
A nest survival data set on mallards. The data set and analysis is described by Rotella et al.(2004).
Format
A data frame with 565 observations on the following 13 variables.
- FirstFound
- the day the nest was first found 
- LastPresent
- the last day that chicks were present 
- LastChecked
- the last day the nest was checked 
- Fate
- the fate of the nest; 0=hatch and 1 depredated 
- Freq
- the frequency of nests with this data; always 1 in this example 
- Robel
- Robel reading of vegetation thickness 
- PpnGrass
- proportion grass in vicinity of nest 
- Native
- dummy 0/1 variable; 1 if native vegetation 
- Planted
- dummy 0/1 variable; 1 if planted vegetation 
- Wetland
- dummy 0/1 variable; 1 if wetland vegetation 
- Roadside
- dummy 0/1 variable; 1 if roadside vegetation 
- AgeFound
- age of nest in days the day the nest was found 
- AgeDay1
- age of nest at beginning of study 
Details
The first 5 fields must be named as they are shown for nest survival models.
Freq and the remaining fields are optional.  See
killdeer for more description of the nest survival data
structure and the use of the special field AgeDay1. The habitat
variables Native,Planted,Wetland,Roadside were
originally coded as 0/1 dummy variables to enable easier modelling with
MARK.  A better alternative in RMark is to create a single variable
habitat with values of "Native","Planted",
"Wetland", and "Roadside". Then the Hab model in the example
below would become:
Hab=mark(mallard,nocc=90,model="Nest", model.parameters=list(S=list(formula=~habitat)), groups="habitat")
For this example, that doesn't make a big difference but if you have more than one factor and you want to construct an interaction or you have a factor with many levels, then it is more efficient to use factor variables rather than dummy variables.
Author(s)
Jay Rotella
Source
Rotella, J.J., S. J. Dinsmore, T.L. Shaffer. 2004. Modeling nest-survival data: a comparison of recently developed methods that can be implemented in MARK and SAS. Animal Biodiversity and Conservation 27:187-204.
Examples
# Last updated September 28, 2019
# The original mallard example shown below uses idividual calls to mark function.
# This is not as efficient as using mark.wrapper and can cause difficulties if different
# groups arguments are used and model averaging is attempted.  Below this original example
# the more efficient approach is shown. 
# Read in data, which are in a simple text file that
# looks like a MARK input file but (1) with no comments or semicolons and
# (2) with a 1st row that contains column labels
# mallard=read.table("mallard.txt",header=TRUE)
# The mallard data set is also incuded with RMark and can be retrieved with
# data(mallard)
# This example is excluded from testing to reduce package check time
# ggplot commands have been commented out so RMark doesn't require ggplot
# scripted analysis of mallard nest-survival data in RMark
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Example of use of RMark for modeling nest survival data -   #
# Mallard nests example                                       #
# The example runs the 9 models that are used in the Nest     #
# Survival chapter of the Gentle Introduction to MARK and that#
# appear in Table 3 (page 198) of                             #
# Rotella, J.J., S. J. Dinsmore, T.L. Shaffer.  2004.         #
# Modeling nest-survival data: a comparison of recently       #
# developed methods that can be implemented in MARK and SAS.  #
#   Animal Biodiversity and Conservation 27:187-204.          #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# The mallard data set is also incuded with RMark and can be retrieved with
data(mallard)
# use the indicator variables for the 4 habitat types to yield
# 1 variable with habitat as a factor with 4 levels that
# can be used for a group variable in RMark
mallard$habitat <- ifelse(mallard$Native == 1, "Native",
                         ifelse(mallard$Planted == 1, "Planted",
                                ifelse(mallard$Roadside == 1, "Roadside",
                                       "Wetland")))
# make the new variable a factor
mallard$habitat <- as.factor(mallard$habitat)
mallard.pr <- process.data(mallard,
                          nocc=90,
                          model="Nest",
                          groups=("habitat"))
# Write a function for evaluating a set of competing models
run.mallard <- function()
{
 # 1. A model of constant daily survival rate (DSR)
 S.Dot = list(formula = ~1)
 
 # 2. DSR varies by habitat type - treats habitats as factors
 #  and the output provides S-hats for each habitat type
 S.Hab = list(formula = ~habitat)
 
 # 3. DSR varies with vegetation thickness (Robel reading)
 S.Robel = list(formula = ~Robel)
 
 # 4. DSR varies with the amount of native vegetation in the surrounding area
 S.PpnGr = list(formula = ~PpnGrass)
 
 # 5. DSR follows a trend through time
 S.TimeTrend = list(formula = ~Time)
 
 # 6. DSR varies with nest age
 S.Age = list(formula = ~NestAge)
 
 # 7. DSR varies with nest age & habitat type
 S.AgeHab = list(formula = ~NestAge + habitat)
 
 # 8. DSR varies with nest age & vegetation thickness
 S.AgeRobel = list(formula = ~NestAge + Robel)
 
 # 9. DSR varies with nest age & amount of native vegetation in
 #  surrounding area
 S.AgePpnGrass = list(formula = ~NestAge + PpnGrass)
 
 # Return model table and list of models
 
 mallard.model.list = create.model.list("Nest")
 
 mallard.results = mark.wrapper(mallard.model.list,
                                data = mallard.pr,
                                adjust = FALSE,delete=TRUE)
}
# The next line runs the 9 models above and takes a minute or 2
mallard.results <- run.mallard()
mallard.results
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Examine table of model-selection results #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# next line exports files; emove # and delete=TRUE in mark.wrapper above
#export.MARK(mallard.results$S.Age$data,
#            "MallDSR",
#            mallard.results,
#            replace = TRUE,
#            ind.covariates = "all")
mallard.results                        # print model-selection table to screen
options(width = 100)                   # set page width to 100 characters
#sink("results.table.txt")              # capture screen output to file
#print(mallard.results)                 # send output to file
#sink()                                 # return output to screen
# remove "#" on next line to see output in notepad in Windows             
# system("notepad results.table.txt",invisible=FALSE,wait=FALSE)
# remove "#" on next line to see output in texteditor editor on Mac
# system("open -t  results.table.txt", wait = FALSE)
names(mallard.results)
mallard.results$S.Dot$results$beta
mallard.results$S.Dot$results$real
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Examine output for 'DSR by habitat' model #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Remove "#" on next line to see output
# mallard.results$S.Hab                  # print MARK output to designated text editor
mallard.results$S.Hab$design.matrix      # view the design matrix that was used
mallard.results$S.Hab$results$beta       # view estimated beta for model in R
mallard.results$S.Hab$results$beta.vcv   # view variance-covariance matrix for beta's
mallard.results$S.Hab$results$real       # view the estimates of Daily Survival Rate
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Examine output for best model #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Remove "#" on next line to see output
# mallard.results$AgePpnGrass            # print MARK output to designated text editor
mallard.results$S.AgePpnGrass$results$beta # view estimated beta's in R
mallard.results$S.AgePpnGrass$results$beta.vcv # view estimated var-cov matrix in R
# To obtain estimates of DSR for various values of 'NestAge' and 'PpnGrass'
#   some work additional work is needed.
# Store model results in object with simpler name
AgePpnGrass <- mallard.results$S.AgePpnGrass
# Build design matrix with ages and ppn grass values of interest
# Relevant ages are age 1 to 35 for mallards
# For ppngrass, use a value of 0.5
fc <- find.covariates(AgePpnGrass,mallard)
fc$value[1:35] <- 1:35                      # assign 1:35 to 1st 35 nest ages
fc$value[fc$var == "PpnGrass"] <- 0.1       # assign new value to PpnGrass
design <- fill.covariates(AgePpnGrass, fc)  # fill design matrix with values
# extract 1st 35 rows of output
AgePpn.survival <- compute.real(AgePpnGrass, design = design)[1:35, ]
# insert covariate columns
AgePpn.survival <- cbind(design[1:35, c(2:3)], AgePpn.survival)     
colnames(AgePpn.survival) <- c("Age", "PpnGrass","DSR", "seDSR", "lclDSR",
                              "uclDSR")
# view estimates of DSR for each age and PpnGrass combo
AgePpn.survival
#library(ggplot2)
#ggplot(AgePpn.survival, aes(x = Age, y = DSR)) +
#  geom_line() +
#  geom_ribbon(aes(ymin = lclDSR, ymax = uclDSR), alpha = 0.3) +
#  xlab("Nest Age (days)") +
#  ylab("Estimated DSR") +
#  theme_bw()
# assign 17 to 1st 50 nest ages
fc$value[1:89] <- 17                     
# assign range of values to PpnGrass
fc$value[fc$var == "PpnGrass"] <- seq(0.01, 0.99, length = 89)
# fill design matrix with values
design <- fill.covariates(AgePpnGrass,fc)
AgePpn.survival <- compute.real(AgePpnGrass, design = design)
# insert covariate columns
AgePpn.survival <- cbind(design[ , c(2:3)], AgePpn.survival)     
colnames(AgePpn.survival) <-
 c("Age", "PpnGrass", "DSR", "seDSR", "lclDSR", "uclDSR")
# view estimates of DSR for each age and PpnGrass combo   
AgePpn.survival   
# Plot results
#ggplot(AgePpn.survival, aes(x = PpnGrass, y = DSR)) +
#  geom_line() +
#  geom_ribbon(aes(ymin = lclDSR, ymax = uclDSR), alpha = 0.3) +
#  xlab("Proportion Grass on Site") +
#  ylab("Estimated DSR") +
#  theme_bw()
# If you want to clean up the mark*.inp, .vcv, .res and .out
#  and .tmp files created by RMark in the working directory,
#  execute 'rm(list = ls(all = TRUE))' - see 2 lines below.
# NOTE: this will delete all objects in the R session.
# rm(list = ls(all=TRUE))
# Then, execute 'cleanup(ask = FALSE)' to delete orphaned output
#  files from MARK. Execute '?cleanup' to learn more
# cleanup(ask = FALSE)
Interface to MARK for fitting capture-recapture models
Description
Fits user specified models to various types of capture-recapture data by creating input file and running MARK software and retrieving output
Usage
mark(
  data,
  ddl = NULL,
  begin.time = 1,
  model.name = NULL,
  model = "CJS",
  title = "",
  model.parameters = list(),
  initial = NULL,
  design.parameters = list(),
  right = TRUE,
  groups = NULL,
  age.var = NULL,
  initial.ages = 0,
  age.unit = 1,
  time.intervals = NULL,
  nocc = NULL,
  output = TRUE,
  invisible = TRUE,
  adjust = TRUE,
  mixtures = 1,
  se = FALSE,
  filename = NULL,
  prefix = "mark",
  default.fixed = TRUE,
  silent = FALSE,
  retry = 0,
  options = NULL,
  brief = FALSE,
  realvcv = FALSE,
  delete = FALSE,
  external = FALSE,
  profile.int = FALSE,
  chat = NULL,
  reverse = FALSE,
  run = TRUE,
  input.links = NULL,
  parm.specific = FALSE,
  mlogit0 = TRUE,
  threads = -1,
  hessian = FALSE,
  accumulate = TRUE,
  allgroups = FALSE,
  strata.labels = NULL,
  counts = NULL,
  icvalues = NULL,
  wrap = TRUE,
  events = NULL,
  nodes = 101,
  useddl = FALSE,
  check.model = FALSE
)
Arguments
| data | Either the raw data which is a dataframe with at least one column named ch (a character field containing the capture history) or a processed dataframe | 
| ddl | Design data list which contains a list element for each parameter type; if NULL it is created | 
| begin.time | Time of first capture(release) occasion | 
| model.name | Optional name for the model | 
| model | Type of c-r model (eg CJS, Burnham, Barker) | 
| title | Optional title for the MARK analysis output | 
| model.parameters | List of model parameter specifications | 
| initial | Optional vector of named or unnamed initial values for beta parameters or previously run model object | 
| design.parameters | Specification of any grouping variables for design data for each parameter | 
| right | if TRUE, any intervals created in design.parameters are closed on the right and open on left and vice-versa if FALSE | 
| groups | Vector of names factor variables for creating groups | 
| age.var | Optional index in groups vector of a variable that represents age | 
| initial.ages | Optional vector of initial ages for each age level | 
| age.unit | Increment of age for each increment of time | 
| time.intervals | Intervals of time between the capture occasions | 
| nocc | number of occasions for Nest model; either time.intervals or nocc must be specified for this model | 
| output | If TRUE produces summary of model input and model output | 
| invisible | If TRUE, window for running MARK is hidden | 
| adjust | If TRUE, adjusts npar to number of cols in design matrix, modifies AIC and records both | 
| mixtures | number of mixtures for heterogeneity model or number of secondary samples for MultScaleOcc model | 
| se | if TRUE, se and confidence intervals are shown in summary sent to screen | 
| filename | base filename for files created by MARK.EXE. Files are named filename.*. | 
| prefix | base filename prefix for files created by MARK.EXE; for example if prefix="SpeciesZ" files are named "SpeciesZnnn.*" | 
| default.fixed | if TRUE, real parameters for which the design data have been deleted are fixed to default values | 
| silent | if TRUE, errors that are encountered are suppressed | 
| retry | number of reanalyses to perform with new starting values when one or more parameters are singular | 
| options | character string of options for Proc Estimate statement in MARK .inp file | 
| brief | if TRUE and output=TRUE then a brief summary line is given instead of a full summary for the model | 
| realvcv | if TRUE the vcv matrix of the real parameters is extracted and stored in the model results | 
| delete | if TRUE the output files are deleted after the results are extracted | 
| external | if TRUE the mark object is saved externally rather than in the workspace; the filename is kept in its place | 
| profile.int | if TRUE will compute profile intervals for each real parameter; or you can specify a vector of real parameter indices | 
| chat | value of chat used for profile intervals | 
| reverse | if set to TRUE, will reverse timing of transition (Psi) and survival (S) in Multistratum models(starting with version 3.0.0 setting reverse=TRUE is not allowed) | 
| run | if FALSE does not run model after creation | 
| input.links | specifies set of link functions for parameters with non-simplified structure | 
| parm.specific | if TRUE, forces a link to be specified for each parameter | 
| mlogit0 | if TRUE, any real parameter that is fixed to 0 and has an mlogit link will have its link changed to logit so it can be simplified | 
| threads | number of cpus to use with mark.exe if positive or number of cpus to remain idle if negative | 
| hessian | if TRUE specifies to MARK to use hessian rather than second partial matrix | 
| accumulate | if TRUE accumulate like data values into frequencies | 
| allgroups | Logical variable; if TRUE, all groups are created from
factors defined in  | 
| strata.labels | vector of single character values used in capture history(ch) for ORDMS, CRDMS, RDMSOccRepro models; it can contain one more value beyond what is in ch for an unobservable state except for RDMSOccRepro which is used to specify strata ordering (eg 0 not-occupied, 1 occupied no repro, 2 occupied with repro. | 
| counts | named list of numeric vectors (one group) or matrices (>1 group) containing counts for mark-resight models | 
| icvalues | numeric vector of individual covariate values for computation of real values | 
| wrap | if TRUE, data lines are wrapped to be length 80; if length of a row is not a problem set to FALSE and it will run faster | 
| events | vector of character events for Hidden Markov models | 
| nodes | number of integration nodes for individual random effects (min 15, max 505, default 101) | 
| useddl | if TRUE and no rows of ddl are missing (deleted) then it will use ddl in place of full.ddl that is created internally. | 
| check.model | if TRUE, code does an internal consistency check between PIMs and design data when making model. | 
Details
This function acts as an interface to the FORTRAN program MARK written by
Gary White (http://www.phidot.org/software/mark/). It
creates the input file for MARK based on user-specified sub-models
(model.parameters) for each of the parameters in the
capture-recapture model being fitted to the data. It runs MARK.EXE (see note
below) and then imports the text output file and binary variance-covariance
file that were created.  It extracts output values from the text file and
creates a list of results that is returned as part of the list (of class
mark) which is the return value for this function.
The models that are currently supported are listed in MarkModels.pdf which you can find in the RMark sub-directory of your R Library. Also, they are listed under Help/Data Types in the MARK interface.
The function mark is a shell that calls 5 other functions in the following
order as needed: 1) process.data, 2)
make.design.data, 3) make.mark.model, 4)
run.mark.model, and 5) summary.mark. A MARK
model can be fitted with this function (mark) or by calling the
individual functions that it uses.  The calling arguments for mark
are a compilation of the calling arguments for each of the functions it
calls (with some arguments renamed to avoid conflicts). If data is a
processed dataframe (see process.data) then it expects to find
a value for ddl.  Likewise, if the data have not been processed, then
ddl should be NULL.  This dual calling structure allows either a
single call approach for each model or alternatively for the data to be
processed and the design data (ddl) to be created once and then a
whole series of models can be analyzed without repeating those steps.
For descriptions of the arguments data, begin.time,
groups, age.var, initial.ages, age.unit,
time.intervals and mixtures see process.data.
For descriptions of ddl, design.parameters=parameters,
and right, see make.design.data.
For descriptions of model.name , model,
title,model.parameters=parameters ,
default.fixed , initial, options, see
make.mark.model.
And finally, for descriptions of arguments invisible, filename
and adjust,see run.mark.model.
output,silent, and retry are the only arguments
specific to mark.  output controls whether a summary of the model
input and output are given(if output=TRUE). silent controls
whether errors are shown when fitting a model. retry controls the
number of times a model will be refitted with new starting values (uses 0)
when some parameters are determined to be non-estimable or at a boundary.
The latter is the only time it makes sense to retry with new starting values
but MARK cannot discern between these two instances. The indices of the beta
parameters that are "singular" are stored in results$singular.
Value
model: a MARK object containing output and extracted results. It is a list with the following elements
| data | name of the processed data frame | 
| model | type of analysis model (see list above) | 
| title | title used for analysis | 
| model.name | descriptive name of model | 
| links | vector of link function(s) used for parameters, one for each row in design matrix or only one if all parameters use the same function | 
| mixtures | number of mixtures in Pledger-style closed capture-recapture models | 
| call | call to make.mark.model used to construct the model | 
| parameters | a list of parameter descriptions including the formula, pim.type, link etc. | 
| model.parameters | the list
of parameter descriptions used in the call to mark; this is used only by
 | 
| time.intervals | Intervals of time between the capture occasions | 
| number.of.groups | number of groups defined in the data | 
| group.labels | vector of labels for the groups | 
| nocc | number of capture occasions | 
| begin.time | single time of vector of times (if different for groups) for the first capture occasion | 
| covariates | vector of covariate names (as strings) used in the model | 
| fixed | dataframe of parameters set at fixed values;  | 
| design.matrix | design matrix used in the input to MARK.EXE | 
| pims | list of pims used for each parameter including any group or strata designations; each parameter in list is denoted by name and within each parameter one or more sub-lists represent groups and strata if any | 
| design.data | design data used to construct the design matrix | 
| strata.labels | labels for strata if any | 
| mlogit.list | structure used to simplify parameters that use mlogit links | 
| simplify | list containing  | 
| results | List of values extracted from MARK ouput | 
| lnl | -2xLog Likelihood value | |
| npar | Number of parameters (always the number of columns in design matrix) | |
| npar.unadjusted | number of estimated parameters from MARK if different than npar | |
| n | effective sample size | |
| AICc | Small sample corrected AIC using npar | |
| AICc.unadjusted | Small sample corrected AIC using npar.unadjusted | |
| beta | data frame of beta parameters with estimate, se, lcl, ucl | |
| real | data frame of real parameters with estimate, se, lcl, ucl and fixed | |
| beta.vcv | variance-covariance matrix for beta | |
| derived | dataframe of derived parameters if any | |
| derived.vcv | variance-covariance matrix for derived parameters if any | |
| covariate.values | dataframe with fields VariableandValue | |
| which are the covariate names and value used for real parameter | ||
| estimates in the MARK output | ||
| singular | indices of beta parameters that are non-estimable or at a boundary | |
| real.vcv | variance-covariance matrix for real parameters (simplified) if realvcv=TRUE | |
| chat | over-dispersion constant; if not present assumed to be 1 | 
| output | base portion of filenames for input,output, vc and residual files output from MARK.EXE | 
Note
It is assumed that MARK.EXE is located in directory "C:/Program
Files/Mark".  There are now mark32.exe and mark64.exe files and the 32 or 64 bit versions
of R determine which is used.  You can force one or the other by naming the file mark.exe
but probably no reason to do so. If it is in a different location set the variable MarkPath to
the directory location. For example, seting MarkPath="C:/Mark/" at the R
prompt will assign run "c:/mark/markxx.exe" where xx is 32 or 64 to do the analysis.  If you have
chosen a non-default path for Mark.exe, MarkPath needs to be defined for
each R session.  It is easiest to do this assignment automatically by
putting the MarkPath assignment into your .First function which is run each
time an R session is initiated.  In addition to MarkPath, the variable
MarkViewer can be assigned to a program other than notepad.exe (see
print.mark).
Author(s)
Jeff Laake
See Also
make.mark.model, run.mark.model,
make.design.data, process.data,
summary.mark
Examples
data(dipper)
dipper.Phidot.pdot=mark(dipper,threads=1,delete=TRUE)
Constructs and runs a set of MARK models from a dataframe of parameter specifications
Description
This is a convenience function that uses a dataframe of parameter
specifications created by create.model.list and it constructs
and runs each model and names the models by concatenating each of the
parameter specification names separated by a period.  The results are
returned as a marklist with a model.table constructed by
collect.models.
Usage
mark.wrapper(
  model.list,
  silent = FALSE,
  run = TRUE,
  use.initial = FALSE,
  initial = NULL,
  ...
)
Arguments
| model.list | a dataframe of parameter specification names in the calling frame | 
| silent | if TRUE, errors that are encountered are suppressed | 
| run | if FALSE, only calls  | 
| use.initial | if TRUE, initial values are constructed for new models using completed models that have already been run in the set | 
| initial | vector, mark model or marklist for defining initial values | 
| ... | arguments to be passed to  | 
Details
The model names in model.list must be in the frame of the function
that calls run.models. If model.list=NULL or the MARK models
are collected from the frame of the calling function (the parent). If
type is specified only the models of that type (e.g., "CJS") are run.
In each case the models are run and saved in the parent frame. To fully
understand, how this function works and its limitations, see
create.model.list.
If use.initial=TRUE, prior to running a model it looks for the first
model that has already been run (if any) for each parameter formula and
constructs an initial vector from that previous run. For example, if
you provided 5 models for p and 3 for Phi in a CJS model, as soon as the
first model for p is run, in the subsequent 2 models with different Phi
models, the initial values for p are assigned based on the run with the
first Phi model.  At the outset this seemed like a good idea to speed up
execution times, but from the one set of examples I ran where several
parameters were at boundaries, the results were discouraging because the
models converged to a sub-optimal likelihood value than the runs using the
default initial values.  I've left sthis option in but set its default value
to FALSE.
A possibly more useful argument is the argument initial.  Previously,
you could use initial=model as part of the ... arguments and it would
use the estimates from that model to assign initial values for any model in
the set. Now I've defined initial as a specific argument and it can
be used as above or you can also use it to specify a marklist of
previously run models. When you do that, the code will lookup each new model
to be run in the set of models specified by initial and if it finds
one with the matching name then it will use the estimates for any matching
parameters as initial values in the same way as initial=model does.
The model name is based on concatenating the names of each of the parameter
specification objects.  To make this useful, you'll want to adapt to an
approach that I've started to use of naming the objects something like
p.1,p.2 etc rather than naming them something like p.dot, p.time as done in
many of the examples.  I've found that using numeric approach is much less
typing and cumbersome rather than trying to reflect the formula in the name.
By default, the formula is shown in the model selection results table, so it
was a bit redundant.  Now where I see this being the most benefit.
Individual covariate models tend to run rather slowly. So one approach is to
run the sequence of models (eg results stored in initial_marklist),
including the set of formulas with all of the variables other than
individual covariates.  Then run another set with the same numbering scheme,
but adding the individual covariates to the formula and using
initial=initial_marklist That will work if each parameter
specification has the same name (eg., p.1=list(formula=~time) and then
p.1=list(formula=~time+an_indiv_covariate)).  All of the initial values will
be assigned for the previous run except for any added parameters (eg.
an_indiv_covariate) which will start with a 0 initial value.
Value
if(run) marklist - list of mark models that were run and a model.table of results; if(!run) a list of models that were constructed but not run.
Author(s)
Jeff Laake
See Also
collect.models, mark,
create.model.list
Constructs and runs in parallel a set of MARK models from a dataframe of parameter specifications
Description
This is a convenience function that uses a dataframe of parameter
specifications created by create.model.list and it constructs
and runs each model and names the models by concatenating each of the
parameter specification names separated by a period.  The results are
returned as a marklist with a model.table constructed by
collect.models.
Usage
mark.wrapper.parallel(
  model.list,
  silent = FALSE,
  use.initial = FALSE,
  initial = NULL,
  parallel = TRUE,
  cpus = 2,
  threads = 1,
  ...
)
Arguments
| model.list | a dataframe of parameter specification names in the calling frame | 
| silent | if TRUE, errors that are encountered are suppressed | 
| use.initial | if TRUE, initial values are constructed for new models using completed models that have already been run in the set | 
| initial | vector, mark model or marklist for defining initial values | 
| parallel | if TRUE, runs models in parallel on multiple cpus | 
| cpus | number of cpus to use in parallel | 
| threads | number of cpus to use with mark.exe if positive or number of cpus to remain idle if negative | 
| ... | arguments to be passed to  | 
Details
The model names in model.list must be in the frame of the function
that calls run.models. If model.list=NULL or the MARK models
are collected from the frame of the calling function (the parent). If
type is specified only the models of that type (e.g., "CJS") are run.
In each case the models are run and saved in the parent frame. To fully
understand, how this function works and its limitations, see
create.model.list.
If use.initial=TRUE, prior to running a model it looks for the first
model that has already been run (if any) for each parameter formula and
constructs an initial vector from that previous run. For example, if
you provided 5 models for p and 3 for Phi in a CJS model, as soon as the
first model for p is run, in the subsequent 2 models with different Phi
models, the initial values for p are assigned based on the run with the
first Phi model.  At the outset this seemed like a good idea to speed up
execution times, but from the one set of examples I ran where several
parameters were at boundaries, the results were discouraging because the
models converged to a sub-optimal likelihood value than the runs using the
default initial values.  I've left sthis option in but set its default value
to FALSE.
A possibly more useful argument is the argument initial.  Previously,
you could use initial=model as part of the ... arguments and it would
use the estimates from that model to assign initial values for any model in
the set. Now I've defined initial as a specific argument and it can
be used as above or you can also use it to specify a marklist of
previously run models. When you do that, the code will lookup each new model
to be run in the set of models specified by initial and if it finds
one with the matching name then it will use the estimates for any matching
parameters as initial values in the same way as initial=model does.
The model name is based on concatenating the names of each of the parameter
specification objects.  To make this useful, you'll want to adapt to an
approach that I've started to use of naming the objects something like
p.1,p.2 etc rather than naming them something like p.dot, p.time as done in
many of the examples.  I've found that using numeric approach is much less
typing and cumbersome rather than trying to reflect the formula in the name.
By default, the formula is shown in the model selection results table, so it
was a bit redundant.  Now where I see this being the most benefit.
Individual covariate models tend to run rather slowly. So one approach is to
run the sequence of models (eg results stored in initial_marklist),
including the set of formulas with all of the variables other than
individual covariates.  Then run another set with the same numbering scheme,
but adding the individual covariates to the formula and using
initial=initial_marklist That will work if each parameter
specification has the same name (eg., p.1=list(formula=~time) and then
p.1=list(formula=~time+an_indiv_covariate)).  All of the initial values will
be assigned for the previous run except for any added parameters (eg.
an_indiv_covariate) which will start with a 0 initial value.
Value
marklist - list of mark models that were run and a model.table of results
Author(s)
Eldar Rakhimberdiev
See Also
collect.models, mark,
create.model.list
Examples
# example not run to reduce time required for checking
do.MSOccupancy=function()
{
#  Get the data
	data(NicholsMSOccupancy)
# Define the models; default of Psi1=~1 and Psi2=~1 is assumed
# p varies by time but p1t=p2t
	p1.p2equal.by.time=list(formula=~time,share=TRUE)
# time-varying p1t and p2t
	p1.p2.different.time=list(p1=list(formula=~time,share=FALSE),p2=list(formula=~time))
#  delta2 model with one rate for times 1-2 and another for times 3-5;
# delta2 defined below
	Delta.delta2=list(formula=~delta2)
	Delta.dot=list(formula=~1)  # constant delta
	Delta.time=list(formula=~time) # time-varying delta
# Process the data for the MSOccupancy model
	NicholsMS.proc=process.data(NicholsMSOccupancy,model="MSOccupancy")
# Create the default design data
	NicholsMS.ddl=make.design.data(NicholsMS.proc)
# Add a field for the Delta design data called delta2.  It is a factor variable
# with 2 levels: times 1-2, and times 3-5.
	NicholsMS.ddl=add.design.data(NicholsMS.proc,NicholsMS.ddl,"Delta",
			type="time",bins=c(0,2,5),name="delta2")
# Create a list using the 4 p modls and 3 delta models (12 models total)
	cml=create.model.list("MSOccupancy")
# Fit each model in the list and return the results
	return(mark.wrapper.parallel(cml,data=NicholsMS.proc,ddl=NicholsMS.ddl,
    cpus=2,parallel=TRUE,delete=TRUE))
}
xx=do.MSOccupancy()
Model-Averaged Tail Area Wald (MATA-Wald) confidence intervals
Description
A generic function to compute model averaged estimates and their standard errors or variance-covariance matrix model-averaged tail area (MATA) construction.
Usage
mata.wald(theta.hats, se.theta.hats, model.weights, normal.lm=FALSE, 
                          residual.dfs=0, alpha=0.025)
 
       tailarea.z(theta, theta.hats, se.theta.hats, model.weights, alpha)
       tailarea.t(theta, theta.hats, se.theta.hats, model.weights, alpha, residual.dfs)
Arguments
| theta.hats | A vector containing the estimates of theta under each candidate model. | 
| se.theta.hats | A vector containing the estimated standard error of each estimate in 'theta.hats'. | 
| model.weights | A vector containing the model weights for each candidate model. Calculated from an information criterion, such as AIC. See Turek and Fletcher (2012) for details of calculation. All model weights must be non-negative, and sum to one. | 
| normal.lm | Specify normal.lm=TRUE for the normal linear model case, and normal.lm=FALSE otherwise. When normal.lm=TRUE, the argument 'residual.dfs' must also be supplied. See USAGE section, and Turek and Fletcher (2012) for additional details. | 
| residual.dfs | A vector containing the residual (error) degrees of freedom under each candidate model. This argument must be provided when the argument normal.lm=TRUE. | 
| alpha | The desired lower and upper error rate. Specifying alpha=0.025 corresponds to a 95 alpha=0.05 to a 90 Default value is alpha=0.025. | 
| theta | value for root finding in tailarea.z and tailarea.t | 
Details
The main function, mata.wald(...), may be used to construct model-averaged confidence intervals, using the model-averaged tail area (MATA) construction. The idea underlying this construction is similar to that of a model-averaged Bayesian credible interval. This function returns the lower and upper confidence limits of a MATA-Wald interval.
Two usages are supported. For the normal linear model case, and quantity of interest theta, > mata.wald(theta.hats, se.theta.hats, model.weights, alpha, normal.lm=TRUE, residual.dfs) returns a (1-2*alpha)100 Corresponds to the solutions of equations (2) and (3) of Turek and Fletcher (2012). The argument 'residual.dfs' is required for this usage.
When the sampling distribution for the estimate of theta is asymptotically normal (e.g. MLEs), possibly after a transformation, > mata.wald(theta.hats, se.theta.hats, model.weights, alpha, normal.lm=FALSE) returns a (1-2*alpha)100 on a transformed scale. Back-transformation of both confidence limits may be necessary. Corresponds to solutions to the equations in Section 3.2 of Turek and Fletcher (2012).
Author(s)
Daniel Turek<danielturek@gmail.com>
References
Turek, D. and Fletcher, D. (2012). Model-Averaged Wald Confidence Intervals. Computational Statistics and Data Analysis, 56(9), p.2809-2815.
Examples
# The example code below, uncommented, generates single-model Wald 
# and model-averaged MATA-Wald 95% confidence intervals for theta.
#
#    EXAMPLE: Normal linear prediction
#    =================================
#
# Data 'y', covariates 'x1' and 'x2', all vectors of length 'n'.
# 'y' taken to have a normal distribution.
# 'x1' specifies treatment/group (factor).
# 'x2' a continuous covariate.
#
# Take the quantity of interest (theta) as the predicted response 
# (expectation of y) when x1=1 (second group/treatment), and x2=15.
n = 20                              # 'n' is assumed to be even
x1 = c(rep(0,n/2), rep(1,n/2))      # two groups: x1=0, and x1=1
x2 = rnorm(n, mean=10, sd=3)
y = rnorm(n, mean = 3*x1 + 0.1*x2)  # data generation
x1 = factor(x1)
m1 = glm(y ~ x1)                    # using 'glm' provides AIC values.
m2 = glm(y ~ x1 + x2)               # using 'lm' doesn't.
aic = c(m1$aic, m2$aic)
delta.aic = aic - min(aic)
model.weights = exp(-0.5*delta.aic) / sum(exp(-0.5*delta.aic))
residual.dfs = c(m1$df.residual, m2$df.residual)
p1 = predict(m1, se=TRUE, newdata=list(x1=factor(1), x2=15))
p2 = predict(m2, se=TRUE, newdata=list(x1=factor(1), x2=15))
theta.hats = c(p1$fit, p2$fit)
se.theta.hats = c(p1$se.fit, p2$se.fit)
#  AIC model weights
model.weights
#  95% Wald confidence interval for theta (under Model 1)
theta.hats[1] + c(-1,1)*qt(0.975, residual.dfs[1])*se.theta.hats[1]
#  95% Wald confidence interval for theta (under Model 2)
theta.hats[2] + c(-1,1)*qt(0.975, residual.dfs[2])*se.theta.hats[2]
#  95% MATA-Wald confidence interval for theta (model-averaging)
mata.wald(theta.hats=theta.hats, se.theta.hats=se.theta.hats, 
        model.weights=model.weights, normal.lm=TRUE, residual.dfs=residual.dfs)
Merge mark model objects and lists of mark model objects
Description
Merge an unspecified number of marklist and mark model objects into a single
marklist with an optional table of model results if table=TRUE.
Usage
## S3 method for class 'mark'
merge(...,table=TRUE)
Arguments
| ... | an unspecified number of marklist and/or mark model objects | 
| table | if TRUE, a table of model results is also included in the returned list | 
Value
model.list: a list of mark models and optionally a table of
model results.
Author(s)
Jeff Laake
See Also
collect.models,remove.mark,run.models,model.table,dipper
Examples
# see example in dipper
Merge time (occasion) and/or group specific covariates into design data
Description
Adds new design data fields from a dataframe into a design data list
(ddl) by matching via time and/or group field in the design data.
Usage
merge_design.covariates(ddl, df, bygroup = FALSE, bytime = TRUE)
Arguments
| ddl | current design dataframe for a specific parameter and not the entire design data list (ddl); see example below | 
| df | dataframe with time(occasion) and/or group-specific data | 
| bygroup | logical; if TRUE, then a field named  | 
| bytime | logical; if TRUE, then a field named  | 
Details
Design data can be added to the parameter specific design dataframes with R
commands.  This function simplifies the process by enabling the merging of a
dataframe with a time and/or group field and one or more time and/or group
specific covariates into the design data list for a specific model
parameter. This is a replacement for the older function
merge.occasion.data. Unlike the older function, it uses the R
function merge but before merging it makes sure all of the
fields exist and that you are not merging data that already exists in the
design data.  It also maintains the row names in the case where design data
have been deleted prior to merging the design covariate data.
If bytime=TRUE,the dataframe df must have a field named
time that matches 1-1 for each value of time in the design
data list (ddl).  All fields in df (other than time/group) are
added to the design data.  If you set bygroup=TRUE and have a field
named group in df and its values match the group fields in the
design data then group-specific values can be assigned for each time if
bytime=TRUE. If bygroup=TRUE and bytime=FALSE then it
matches by group and not by time.
Value
Design dataframe (for a particular parameter) with new fields added.
See make.design.data for a description of the design data list
structure. The return value is only one element in the list rather than the
entire list as with the older function merge.occasion.data.
Author(s)
Jeff Laake
See Also
make.design.data, process.data,
add.design.data
Examples
data(dipper)
dipper.proc=process.data(dipper)
ddl=make.design.data(dipper.proc)
df=data.frame(time=c(1:7),effort=c(10,5,2,8,1,2,3))
# note that the value for time 1 is superfluous for CJS but not for POPAN
# the value 10 will not appear in the summary because there is no p for time 1
summary(ddl$p)
ddl$p=merge_design.covariates(ddl$p,df)
summary(ddl$p)
# This example is excluded from testing to reduce package check time
#
# Assign group-specific values
#
dipper.proc=process.data(dipper,groups="sex")
dipper.ddl=make.design.data(dipper.proc)
df=data.frame(group=c(rep("Female",6),rep("Male",6)),time=rep(c(2:7),2),
  effort=c(10,5,2,8,1,2,3,20,10,4,16,2))
merge_design.covariates(dipper.ddl$p,df,bygroup=TRUE)
Compute model averaged estimates
Description
A generic function to compute model averaged estimates and their standard errors or variance-covariance matrix.
Usage
model.average(x, ...)
Arguments
| x | is either a list with a prescribed structure as defined in
 | 
| ... | additional arguments passed to specific functions | 
Value
The structure of the returned value depends on which function is called.
Author(s)
Jeff Laake
See Also
model.average.marklist,model.average.list
Compute model averaged estimates of real parameters from a list structure for estimates
Description
A generic function to compute model averaged estimates and their standard errors or variance-covariance matrix
Usage
## S3 method for class 'list'
model.average(x, revised=TRUE, mata=FALSE, normal.lm=FALSE, 
                                       residual.dfs=0, alpha=0.025,...)
Arguments
| x | a list containing the following elements: 1)  | 
| revised | if TRUE it uses eq 6.12 from Burnham and Anderson (2002) for model averaged se; otherwise it uses eq 4.9 | 
| mata | if TRUE, create model averaged tail area confidence intervals as described by Turek and Fletcher | 
| normal.lm | Specify normal.lm=TRUE for the normal linear model case, and normal.lm=FALSE otherwise. When normal.lm=TRUE, the argument 'residual.dfs' must also be supplied. See USAGE section, and Turek and Fletcher (2012) for additional details. | 
| residual.dfs | A vector containing the residual (error) degrees of freedom under each candidate model. This argument must be provided when the argument normal.lm=TRUE. | 
| alpha | The desired lower and upper error rate. Specifying alpha=0.025 corresponds to a 95 alpha=0.05 to a 90 Default value is alpha=0.025. | 
| ... | additional arguments passed to specific functions | 
Details
If a single estimate is being model-averaged then estimate and
se are vectors with an entry for each model.  However, if there are
several estimates being averaged then both estimate and se
should be matrices in which the estimates for each model are a row in the
matrix.  Regardless, if vcv is specified it should be a list of
matrices and in the case of a single estimate, each matrix is 1x1 containing
the estimated sample-variance but that would be rather useless and se
should be used instead.
If the list contains an element named AIC,AICc,QAIC, or
QAICc, then the minimum value is computed and subtracted to compute
delta values relative to the minimum. These are then converted to Akaike
weights which are exp(-.5*delta) and these are normalized to sum to
1.  If the list does not contain one of the above values then it should have
a variable named weight.  It is normalized to 1.  The model-averaged
estimates are computed using equation 4.1 of Burnham and Anderson (2002).
If the contains a matrix named vcv, then a model-averaged
variance-covariance matrix is computed using formulae given on page 163 of
Burnham and Anderson (2002).  If there is no element named vcv then
there must be an element se which contains the model-specific
estimates of the standard errors.  The unconditional standard error for the
model-averaged estimates is computed using equation 4.9 of Burnham and
Anderson (2002) if if revised=FALSE; otherwise it uses eq 6.12.
Value
A list containing elements:
| estimate | vector of model-averaged estimates | 
| se | vector of unconditional standard errors (square root of unconditional variance estimator) | 
| vcv | model-averaged
variance-covariance matrix if  | 
| lcl | lower confidence interval if mata=TRUE | 
| ucl | upper confidence interval if mata=TRUE | 
Author(s)
Jeff Laake
References
BURNHAM, K. P., AND D. R. ANDERSON. 2002. Model selection and multimodel inference. A practical information-theoretic approach. Springer, New York. Turek, D. and Fletcher, D. (2012). Model-Averaged Wald Confidence Intervals. Computational Statistics and Data Analysis, 56(9), p.2809-2815.
See Also
Examples
# This example is excluded from testing to reduce package check time
# Create a set of models from dipper data
data(dipper)
run.dipper=function()
{
dipper$nsex=as.numeric(dipper$sex)-1
mod1=mark(dipper,groups="sex",
   model.parameters=list(Phi=list(formula=~sex)),delete=TRUE)
mod2=mark(dipper,groups="sex",
   model.parameters=list(Phi=list(formula=~1)),delete=TRUE)
mod3=mark(dipper,groups="sex",
   model.parameters=list(p=list(formula=~time),
   Phi=list(formula=~1)),delete=TRUE)
dipper.list=collect.models()
return(dipper.list)
}
dipper.results=run.dipper()
# Extract indices for first year survival from 
#  Females (group 1) and Males (group 2)
Phi.indices=extract.indices(dipper.results[[1]],
   "Phi",df=data.frame(group=c(1,2),row=c(1,1),col=c(1,1)))
# Create a matrix for estimates
estimate=matrix(0,ncol=length(Phi.indices),
    nrow=nrow(dipper.results$model.table))
# Extract weights for models
weight=dipper.results$model.table$weight
# Create an empty list for vcv matrices
vcv=vector("list",length=nrow(dipper.results$model.table))
# Loop over each model in model.table for dipper.results
for (i in 1:nrow(dipper.results$model.table))
{
# The actual model number is the row number for the model.table
  model.numbers= as.numeric(row.names(dipper.results$model.table))
# For each model extract those real parameter values and their
# vcv matrix and store them
  x=covariate.predictions(dipper.results[[model.numbers[i]]],
      data=data.frame(index=Phi.indices))
  estimate[i,]=x$estimates$estimate
  vcv[[i]]=x$vcv
}
# Call model.average using the list structure which includes 
#  estimate, weight and vcv list in this case
model.average(list(estimate=estimate,weight=weight,vcv=vcv))
#
# Now get same model averaged estimates using model.average.marklist
# Obviously this is a much easier approach and what would be used 
# if all you are doing is model averaging real parameters in the model.  
# The other form is more useful for model averaging
# functions of estimates of the real parameters (eg population estimate)
#
mavg=model.average(dipper.results,"Phi",vcv=TRUE)
print(mavg$estimates[Phi.indices,])
print(mavg$vcv.real[Phi.indices,Phi.indices])
Compute model averaged estimates of real parameters
Description
Computes model averaged estimates and standard errors of real parameters for
a list of models with a model.table constructed from
collect.models. It can also optionally compute the var-cov
matrix of the averaged parameters and their confidence intervals by
transforming with the link functions, setting normal confidence intervals on
the transformed values and then back-transforming for the real estimates.
Usage
## S3 method for class 'marklist'
model.average(x, parameter, data, vcv, drop=TRUE, indices=NULL, revised=TRUE, mata=FALSE,
        normal.lm=FALSE, residual.dfs=0, alpha=0.025,...)
Arguments
| x | a list of mark model results and a model.table constructed by
 | 
| parameter | name of model parameter (e.g., "Phi" for CJS models); if left NULL all real parameters are averaged | 
| data | dataframe with covariate values that are averaged for estimates | 
| vcv | logical; if TRUE then the var-cov matrix and confidence intervals are computed | 
| drop | if TRUE, models with any non-positive variance for betas are dropped | 
| indices | a vector of parameter indices from the all-different PIM formulation of the parameter estimates that should be presented. This argument only works if the parameter argument = NULL. The primary purpose of the argument is to trim the list of parameters in computing a vcv matrix of the real parameters which can get too big to be computed with the available memory | 
| revised | if TRUE, uses revised variance formula (eq 6.12 from Burnham and Anderson) for model averaged estimates and eq 6.11 when FALSE | 
| mata | if TRUE, create model averaged tail area confidence intervals as described by Turek and Fletcher | 
| normal.lm | Specify normal.lm=TRUE for the normal linear model case, and normal.lm=FALSE otherwise. When normal.lm=TRUE, the argument 'residual.dfs' must also be supplied. See USAGE section, and Turek and Fletcher (2012) for additional details. | 
| residual.dfs | A vector containing the residual (error) degrees of freedom under each candidate model. This argument must be provided when the argument normal.lm=TRUE. | 
| alpha | The desired lower and upper error rate. Specifying alpha=0.025 corresponds to a 95 alpha=0.05 to a 90 Default value is alpha=0.025. | 
| ... | additional arguments passed to specific functions | 
Details
If there are any models in the model.list which do not have any
output or results they are dropped.  If any have non-positive variances for
the betas and drop=TRUE, then the model is reported and dropped from
the model averaging.  The weights are renormalized for the remaining models
that are not dropped before they are averaged.
If parameter=NULL, all real parameters are model averaged but the
design data is not copied over because it can vary by the type of parameter.
It is only necessary to model average all parameters at once to get
covariances of model averaged parameters of differing types.
If data=NULL, the average covariate values are used for any models
using covariates. Note that this will only work with models created after
v1.5.0 such that average covariate values are stored in each model object.
Value
If vcv=FALSE, the return value is a dataframe of model averaged estimates and standard errors for a particular type of real parameter (e.g., Phi). The design data are appended to the dataframe to enable subsettting of the estimates based on features of the design data such as age, time, cohort and grouping variables.
If vcv=TRUE, confidence interval (lcl,ucl) limits are added to the dataframe which is contained in a list with the var-cov matrix.
Author(s)
Jeff Laake
References
Burnham, K. P. and D. R. Anderson. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second edition. Springer, New York.
See Also
collect.models, covariate.predictions,
model.table, compute.links.from.reals,
model.average.list
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
run.dipper=function()
{
#
# Process data
#
dipper.processed=process.data(dipper,groups=("sex"))
#
# Create default design data
#
dipper.ddl=make.design.data(dipper.processed)
#
# Add Flood covariates for Phi and p that have different values
#
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
#
#  Define range of models for Phi
#
Phi.dot=list(formula=~1)
Phi.time=list(formula=~time)
Phi.sex=list(formula=~sex)
Phi.sextime=list(formula=~sex+time)
Phi.sex.time=list(formula=~sex*time)
Phi.Flood=list(formula=~Flood)
#
#  Define range of models for p
#
p.dot=list(formula=~1)
p.time=list(formula=~time)
p.sex=list(formula=~sex)
p.sextime=list(formula=~sex+time)
p.sex.time=list(formula=~sex*time)
p.Flood=list(formula=~Flood)
#
# Collect pairings of models
#
cml=create.model.list("CJS")
#
# Run and return the list of models
#
return(mark.wrapper(cml,data=dipper.processed,ddl=dipper.ddl,delete=TRUE))
}
dipper.results=run.dipper()
Phi.estimates=model.average(dipper.results,"Phi",vcv=TRUE)
p.estimates=model.average(dipper.results,"p",vcv=TRUE)
run.dipper=function()
{
data(dipper)
dipper$nsex=as.numeric(dipper$sex)-1
#NOTE:  This generates random valules for the weights so the answers using
#  ~weight will vary
dipper$weight=rnorm(294)  
mod1=mark(dipper,groups="sex",
  model.parameters=list(Phi=list(formula=~sex+weight)),delete=TRUE)
mod2=mark(dipper,groups="sex",
  model.parameters=list(Phi=list(formula=~sex)),delete=TRUE)
mod3=mark(dipper,groups="sex",
  model.parameters=list(Phi=list(formula=~weight)),delete=TRUE)
mod4=mark(dipper,groups="sex",
  model.parameters=list(Phi=list(formula=~1)),delete=TRUE)
dipper.list=collect.models()
return(dipper.list)
}
dipper.results=run.dipper()
real.averages=model.average(dipper.results,vcv=TRUE)
# get model averaged estimates for all parameters and use average 
# covariate values in models with covariates
real.averages$estimates 
# get model averaged estimates for Phi using a value of 2 for weight
model.average(dipper.results,"Phi", 
  data=data.frame(weight=2),vcv=FALSE)  
# what you can't do yet is use different covariate values for 
# different groups to get covariances of estimates based on different
# covariate values; for example, you can get average survival of females 
# at average female weight and average survival of males at average 
# male weight in separate calls to model.average but not in the same call 
# to get covariances; however, if you standardized weight by group 
# (ie stdwt = weight - groupmean) then using 0 for the covariate value would give
# the model averaged Phi by group at the average group weights and its 
# covariance. You can do the above for
# a single model with find.covariates/fill.covariates.
# get model averaged estimates of first Phi(1) and first p(43) and v-c matrix
model.average(dipper.results,vcv=TRUE,indices=c(1,43))  
Create table of MARK model selection results
Description
Constructs a table of model selection results for MARK analyses. The table includes the formulas, model name, number of parameters, deviance, AICc, DeltaAICc, model weight and residual deviance. If chat>1 QAICc, QDeltaAICc and QDeviance are used instead.
Usage
model.table(
  model.list = NULL,
  type = NULL,
  sort = TRUE,
  adjust = TRUE,
  ignore = TRUE,
  pf = 1,
  use.lnl = FALSE,
  use.AIC = FALSE,
  model.name = TRUE
)
Arguments
| model.list | a vector of model names or a list created by the function
 | 
| type | type of model (eg "CJS") | 
| sort | if true sorts models by criterion | 
| adjust | if TRUE adjusts # of parameters to # of cols in design matrix | 
| ignore | if TRUE collects all models and ignores that they are from different models | 
| pf | parent frame value; default=1 so it looks in calling frame of model.table; used in other functions with pf=2 when functions are nested two-deep | 
| use.lnl | display -2lnl instead of deviance | 
| use.AIC | use AIC instead of AICc | 
| model.name | if TRUE uses the model.name in each mark object which uses formula notation. If FALSE it uses the R names for the model obtained from collect.model.names or names assigned to marklist elements | 
Details
This function is used by collect.models to construct a table
of model selection results with the models that it collects; however it can
be called directly to construct the table.
Value
result.table - dataframe containing summary of models
| model.name | name of fitted model | 
| parameter.name - an entry for
each parameter | formula for parameter | 
| npar | number of estimated parameters | 
| AICc or QAICc | AICc value or QAICc if chat>1 | 
| DeltaAICc or DeltaQAICc | difference between AICc or QAICc value from model with smallest value | 
| weight | model weight based on exp(-.5*DeltaAICc) or exp(-.5*QDeltaAICc) | 
| Deviance or
QDeviance | residual deviance from saturated model | 
| chat | overdispersion constant if not 1 | 
Author(s)
Jeff Laake
See Also
collect.model.names, collect.models
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
run.dipper=function()
{
#
# Process data
#
dipper.processed=process.data(dipper,groups=("sex"))
#
# Create default design data
#
dipper.ddl=make.design.data(dipper.processed)
#
# Add Flood covariates for Phi and p that have different values
#
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1
#
#  Define range of models for Phi
#
Phi.dot=list(formula=~1)
Phi.time=list(formula=~time)
Phi.sex=list(formula=~sex)
Phi.sextime=list(formula=~sex+time)
Phi.sex.time=list(formula=~sex*time)
Phi.Flood=list(formula=~Flood)
#
#  Define range of models for p
#
p.dot=list(formula=~1)
p.time=list(formula=~time)
p.sex=list(formula=~sex)
p.sextime=list(formula=~sex+time)
p.sex.time=list(formula=~sex*time)
p.Flood=list(formula=~Flood)
#
# Return model table and list of models
#
cml=create.model.list("CJS")
return(mark.wrapper(cml,data=dipper.processed,ddl=dipper.ddl,delete=TRUE))
}
dipper.results=run.dipper()
dipper.results
dipper.results$model.table=model.table(dipper.results,model.name=FALSE)
dipper.results
#
# Compute matrices of model weights, number of parameters and Delta AICc values
#
model.weight.matrix=tapply(dipper.results$model.table$weight,
 list(dipper.results$model.table$Phi,dipper.results$model.table$p),mean)
model.npar.matrix=tapply(dipper.results$model.table$npar,
 list(dipper.results$model.table$Phi,dipper.results$model.table$p),mean)
model.DeltaAICc.matrix=tapply(dipper.results$model.table$DeltaAICc,
 list(dipper.results$model.table$p,dipper.results$model.table$Phi),mean)
#
# Output DeltaAICc as a tab-delimited text file that can be read into Excel 
# (to do that directly use RODBC or xlsreadwrite package for R)
#
# remove # to use next line
#write.table(model.DeltaAICc.matrix,"DipperDeltaAICc.txt",sep="\t")
Multistrata example data
Description
An example data set which appears to be simulated data that accompanies MARK as an example analysis using the Multistrata model.
Format
A data frame with 255 observations on the following 2 variables.
- ch
- a character vector containing the encounter history of each bird with strata 
- freq
- the number of birds with that capture history 
Details
This is a data set that accompanies program MARK as an example for the
Multistrata model. The models created by RMark are all "Parm-specific"
models by default. The sin link is not allowed because all models are
specified via the design matrix. Although you can set links for the
parameters, usually the default values are preferable.  See
make.mark.model for additional help building formula for Psi
using the remove.intercept argument.
Examples
 
# This example is excluded from testing to reduce package check time
data(mstrata)
run.mstrata=function()
{
#
# Process data
#
mstrata.processed=process.data(mstrata,model="Multistrata")
#
# Create default design data
#
mstrata.ddl=make.design.data(mstrata.processed)
#
#  Define range of models for S; note that the betas will differ from the output
#  in MARK for the ~stratum = S(s) because the design matrix is defined using
#  treatment contrasts for factors so the intercept is stratum A and the other
#  two estimates represent the amount that survival for B abd C differ from A.
#  You can use force the approach used in MARK with the formula ~-1+stratum which
#  creates 3 separate Betas - one for A,B and C.
#
S.stratum=list(formula=~stratum)
S.stratumxtime=list(formula=~stratum*time)
#
#  Define range of models for p
#
p.stratum=list(formula=~stratum)
#
#  Define range of models for Psi; what is denoted as s for Psi
#  in the Mark example for Psi is accomplished by -1+stratum:tostratum which
#  nests tostratum within stratum.  Likewise, to get s*t as noted in MARK you
#  want ~-1+stratum:tostratum:time with time nested in tostratum nested in
#  stratum.
#
Psi.s=list(formula=~-1+stratum:tostratum)
#
# Create model list and run assortment of models
#
model.list=create.model.list("Multistrata")
#
# Add on specific model that is paired with fixed p's to remove confounding
#
p.stratumxtime=list(formula=~stratum*time)
p.stratumxtime.fixed=list(formula=~stratum*time,fixed=list(time=4,value=1))
model.list=rbind(model.list,c(S="S.stratumxtime",p="p.stratumxtime.fixed",
  Psi="Psi.s"))
#
# Run the list of models
#
mstrata.results=mark.wrapper(model.list,data=mstrata.processed,ddl=mstrata.ddl,threads=2,
                            delete=TRUE)
#
# Return model table and list of models
#
return(mstrata.results)
}
mstrata.results=run.mstrata()
mstrata.results
# Example of reverse Multistratum model
#data(mstrata)
#mod=mark(mstrata,model="Multistrata",delete=TRUE)
#mod.rev=mark(mstrata,model="Multistrata",reverse=TRUE,delete=TRUE)
#Psilist=get.real(mod,"Psi",vcv=TRUE)
#Psilist.rev=get.real(mod.rev,"Psi",vcv=TRUE)
#Psivalues=Psilist$estimates
#Psivalues.rev=Psilist.rev$estimates
#TransitionMatrix(Psivalues[Psivalues$time==1,])
#TransitionMatrix(Psivalues.rev[Psivalues.rev$occ==1,])
Computes some derived abundance estimates for POPAN models
Description
Computes estimates, standard errors, confidence intervals and var-cov matrix
for population size of each group at each occasion and the sum across groups
by occasion for POPAN models. If a marklist is provided the estimates
are model averaged.
Usage
popan.derived(x,model,revised=TRUE,normal=TRUE,N=TRUE,NGross=TRUE,drop=FALSE)
 
popan.Nt(Phi,pent,Ns,vc,time.intervals) 
 
popan.NGross(Phi,pent,Ns,vc,time.intervals)
Arguments
| x | processed data list resulting from  | 
| model | a single mark POPAN model or a  | 
| revised | if TRUE, uses revised version of model averaged standard error eq 6.12; otherwise uses eq 4.9 of Burnham and Anderson (2002) | 
| normal | if TRUE, uses confidence interval based on normal distribution; otherwise, uses log-normal | 
| N | if TRUE, will return abundance estimates by group and occasion and total by occasion | 
| NGross | if TRUE, will return gross abundance estimate per group | 
| drop | if TRUE, models with any non-positive variance for betas are dropped | 
| Phi | interval-specific survival estimates for each group | 
| pent | occasion-specific prob of entry estimates (first computed by subtraction) for each group | 
| Ns | group specific super-population estimate | 
| vc | variance-covariance matrix of the real parameters | 
| time.intervals | vector of time interval values | 
Details
popan.derived computes all of the real parameters using
covariate.predictions and handles all of the computation using
popan.Nt.  Description for functions popan.Nt and
popan.NGross are given here for completeness but it is not intended
that they be called directly.
If a model is a marklist of models, the values returned by
popan.derived are model averaged using model weights in the
model.table; otherwise, it returns the values for the specified
model.
Value
popan.derived returns a list with the following elements
depending on the values of N and NGross: 
N - dataframe of estimates by group and occasion and se, lcl,ucl and group/occasion data N.vcv - variance-covariance matrix of abundance estimates in N Nbyocc - dataframe of estimates by occasion (summed across groups) and se, lcl,ucl and occasion data Nbyocc.vcv - variance-covariance matrix of abundance estimates in Nbyocc NGross - dataframe of gross abundance estimates by group and se, lcl,and ucl NGross.vcv - variance-covariance matrix of NGross abundance estimates
popan.Nt returns a list with the following elements: 
N - dataframe of estimates by group and occasion and se, lcl,ucl and group/occasion data N.vcv - variance-covariance matrix of abundance estimates in N
popan.NGross returns a list with the following elements:
NGross - vector of gross abundance estimates by group vcv - variance-covariance matrix of abundance estimates in NGross
Author(s)
Jeff Laake
References
BURNHAM, K. P., AND D. R. ANDERSON. 2002. Model selection and multimodel inference. A practical information-theoretic approach. Springer, New York.
Examples
# This example is excluded from testing to reduce package check time
# Example
data(dipper)
dipper.processed=process.data(dipper,model="POPAN",groups="sex")
run.dipper.popan=function()
{
dipper.ddl=make.design.data(dipper.processed)
Phidot=list(formula=~1)
Phitime=list(formula=~time)
pdot=list(formula=~1)
ptime=list(formula=~time)
pentsex.time=list(formula=~time)
Nsex=list(formula=~sex)
#
# Run assortment of models
#
dipper.phisex.time.psex.time.pentsex.time=mark(dipper.processed,
     dipper.ddl,model.parameters=list(Phi=Phidot,p=ptime,
     pent=pentsex.time,N=Nsex),invisible=FALSE,adjust=FALSE,delete=TRUE)
dipper.psex.time.pentsex.time=mark(dipper.processed,dipper.ddl,
     model.parameters=list(Phi=Phitime,p=pdot,
     pent=pentsex.time,N=Nsex),invisible=FALSE,adjust=FALSE,delete=TRUE)
#
# Return model table and list of models
#
return(collect.models() )
}
dipper.popan.results=run.dipper.popan()
popan.derived(dipper.processed,dipper.popan.results)
Compute estimates of real parameters with individual and design covariates
Description
Unlike covariate.predictions this function allows modification of design data covariates as well as individual covariates for the computation of the real parameters. It does this by modifying values in the design data, adding individual covariate values to the design data and then applying the model formula to the modified design data to construct a design matrix that is used with the beta estimates to construct the real parameter estimates.
Usage
predict_real(
  model,
  df,
  parameter,
  replicate = FALSE,
  beta = NULL,
  data = NULL,
  se = TRUE,
  vcv = FALSE
)
Arguments
| model | MARK model object | 
| df | design dataframe subset | 
| parameter | names of the parameter | 
| replicate | if FALSE then number of rows in data must match the number of rows in df | 
| beta | estimates of beta parameters for real parameter computation | 
| data | dataframe with covariate values | 
| se | if TRUE returns std errors and confidence interval of real estimates | 
| vcv | logical; if TRUE, sets se=TRUE and returns v-c matrix of real estimates | 
Details
There are two restrictions. First,it only works with a single parameter type (eg Phi or p) whereas covariate.predictions allows simultaneous estimation of multiple parameter types. Second, if the row from df is a fixed parameter that will only be identified if it uses the fix parameter in the design data. It will not work with the original approach of using index and fixed as arguments. This should be a minor restrictioin because it doesn't make sense to compute a range of real parameter values for a fixed parameter.
The primary arguments are model which is a fitted mark model object and df which is one of the design dataframes from the design data list (ddl) used to fit the model. The value of df can be a subset of the original design dataframe but do NOT create an arbitrary dataframe for use as df because if you don't get the indices correct or you don't construct the factor variables correctly, it will likely fail or the results could be bogus.
The argument parameter is just the character name for the parameter ("Phi"). It is only used to extract the correct formula from the fitted model.
The argument beta can provide values for the beta coefficients other than the fitted ones but if you do so, don't expect the variances to make sense as the beta.vcv is computed at the fitted values. The default for beta is to use the fitted values so it need not be specified.
The argument data is a dataframe containing values for individual covariates or replacement values for design data covariates. Do not replace values of factor variables which probably would not make sense anyhow. If replicate=FALSE then the number of rows in df must match the number of rows in data and the values in df are replaced (if design covariate) or added if an individual covariate. If replicate=TRUE, then design data (df) is replicated for each row in data and the values in data are computed for each one of the rows in df. Using replicate=TRUE would make sense in a case where the value can differ for the index. For example, if one of the design covariates was temperature then you might want to compute the values at a range of temperatures for a row in the design data representing each age class in the data.
The arguments se and vcv control computation of the std errors and v-c matrix.
Value
A data frame (real) is returned if vcv=FALSE;
otherwise, a list is returned also containing vcv.real: 
| real | data frame containing estimates, and if se=TRUE or vcv=TRUE it also contains standard errors and confidence intervals and notation of whether parameters are fixed or at a boundary | 
| vcv.real | variance-covariance matrix of real estimates | 
Author(s)
Jeff Laake
See Also
inverse.link,deriv_inverse.link
Examples
data(dipper)
dp=process.data(dipper)
ddl=make.design.data(dp)
model=mark(dp,ddl,model.parameters=list(Phi=list(formula=~Time)),delete=TRUE)
predict_real(model,ddl$Phi[1,,drop=FALSE],"Phi",replicate=TRUE,data=data.frame(Time=-12:12))
Print MARK objects
Description
Displays MARK output file or input file with MarkViewer (notepad.exe by default) so it can be viewed.
Usage
## S3 method for class 'mark'
print(x,...,input=FALSE)
Arguments
| x | mark model object; or list of mark model objects created with
 | 
| ... | additional non-specified argument for S3 generic function | 
| input | if TRUE, prints mark input file; otherwise the output file | 
Details
If the model has been run (model$output exists) the output file
stored in the directory as identified by the basefile name
(model$output) and the suffix ".out" is displayed with a call to
MarkViewer. If input is set to TRUE then the MARK input
file is displayed instead. By default the MarkViewer is notepad but
any program can be used in its place that accepts the filename as the first
argument. For example setting MarkViewer="wp" will use wordperfect
(wp.exe) as long as wp.exe is in the search path.  MarkViewer must be
set during each R session, so it is best to include it in your .First
function to change it permanently.  Since print.mark is the generic
function to print mark objects you can use it by just typing the name
of a mark object at the R prompt and it will call print.mark.
For example, if mod is a mark object then typing mod is
the same as print.mark(mod)
Value
None
Author(s)
Jeff Laake
See Also
Print MARK list objects
Description
Displays the model.table if it exists. To display the
output for a mark model contained in a list, simply type the list value
(e.g., typing mymarklist[[2]] will display output for the second model).
The function print.marklist was created to avoid accidental typing of
the model list which would call print.mark for each of the models.
Usage
## S3 method for class 'marklist'
print(x,...)
Arguments
| x | mark model object; or list of mark model objects created with
 | 
| ... | additional non-specified argument for S3 generic function | 
Details
If the model has been run (model$output exists) the output file
stored in the directory as identified by the basefile name
(model$output) and the suffix ".out" is displayed with a call to
MarkViewer. If input is set to TRUE then the MARK input
file is displayed instead. By default the MarkViewer is notepad but
any program can be used in its place that accepts the filename as the first
argument. For example setting MarkViewer="wp" will use wordperfect
(wp.exe) as long as wp.exe is in the search path.  MarkViewer must be
set during each R session, so it is best to include it in your .First
function to change it permanently.  Since print.mark is the generic
function to print mark objects you can use it by just typing the name
of a mark object at the R prompt and it will call print.mark.
For example, if mod is a mark object then typing mod is
the same as print.mark(mod)
Value
None
Author(s)
Jeff Laake
See Also
Prints summary of MARK model parameters and results
Description
Prints summary of MARK model parameters and results
Usage
## S3 method for class 'summary.mark'
print(x,...)
Arguments
| x | list resulting from call to  | 
| ... | additional non-specified argument for S3 generic function | 
Value
None
Author(s)
Jeff Laake
See Also
summary.mark
Process release-recapture history data
Description
Creates needed constructs from the release-recapture history.
Usage
process.ch(ch, freq = NULL, all = FALSE)
Arguments
| ch | Vector of character strings; each character string is composed of either a constant length sequence of single characters (01001) or the character string can be comma separated if more than a single character is used (1S,1W,0,2W). If comma separated, each string must contain a constant number of elements. | 
| freq | Optional vector of frequencies for ch; if missing assumed to be a; if <0 indicates a loss on capture | 
| all | FALSE is okay for cjs unless R code used to compute lnl instead of FORTRAN; must be true for js because it returns additional quantities needed for entry prob. | 
Value
| nocc | number of capture occasions | 
| freq | absolute value of frequency for each ch | 
| first | vector of occasion numbers for first 1 | 
| last | vector of occasion numbers for last 1 | 
| loc | vector of indicators of a loss on capture if set to 1 | 
| chmat | capture history matrix | 
| FtoL | 1's from first (1) to last (1) and 0's elsewhere; only if all==TRUE | 
| Fplus | 1's from occasion after first (1) to nocc(last occasion); only if all==TRUE | 
| Lplus | 1's from occasion after last (1) to nocc; only if all==TRUE | 
| L | 1's from last (1) to nocc; only if all==TRUE | 
Author(s)
Jeff Laake
Process encounter history dataframe for MARK analysis
Description
Prior to analyzing the data, this function initializes several variables (e.g., number of capture occasions, time intervals) that are often specific to the capture-recapture model being fitted to the data. It also is used to 1) define groups in the data that represent different levels of one or more factor covariates (e.g., sex), 2) define time intervals between capture occasions (if not 1), and 3) create an age structure for the data, if any.
Usage
process.data(
  data,
  begin.time = 1,
  model = "CJS",
  mixtures = 1,
  groups = NULL,
  allgroups = FALSE,
  age.var = NULL,
  initial.ages = c(0),
  age.unit = 1,
  time.intervals = NULL,
  nocc = NULL,
  strata.labels = NULL,
  counts = NULL,
  reverse = FALSE,
  areas = NULL,
  events = NULL
)
Arguments
| data | A data frame with at least one field named  | 
| begin.time | Time of first capture occasion or vector of times if different for each group | 
| model | Type of analysis model. See  | 
| mixtures | Number of mixtures in closed capture models with heterogeneity or number of secondary samples for MultScalOcc/RDMultScalOcc model | 
| groups | Vector of factor variable names (in double quotes) in
 | 
| allgroups | Logical variable; if TRUE, all groups are created from
factors defined in  | 
| age.var | An index in vector  | 
| initial.ages | A vector of initial ages that contains a value for each
level of the age variable  | 
| age.unit | Increment of age for each increment of time as defined by
 | 
| time.intervals | Vector of lengths of time between capture occasions | 
| nocc | number of occasions for Nest type; either nocc or time.intervals must be specified | 
| strata.labels | vector of single character values used in capture history(ch) for ORDMS, CRDMS, RDMSOccRepro, HidMarkov models; it can contain more values beyond what is in ch for unobservable states except for RDMSOccRepro which is used to specify strata ordering (eg 0 not-occupied, 1 occupied no repro, 2 occupied with repro. | 
| counts | named list of numeric vectors (one group) or matrices (>1 group) containing counts for mark-resight models | 
| reverse | if set to TRUE, will reverse timing of transition (Psi) and survival (S) in Multistratum models | 
| areas | values of areas (1 per group) for Densitypc set of models | 
| events | vector of character events for Hidden Markov models | 
Details
For examples of data, see
dipper,edwards.eberhardt,example.data.
The structure of the encounter history and the analysis depends on the
analysis model to some extent. Thus, it is necessary to process a dataframe
with the encounter history (ch) and a chosen model to define
the relevant values.  For example, number of capture occasions (nocc)
is automatically computed based on the length of the encounter history
(ch) in data; however, this is dependent on the type of
analysis model.  For models such as "CJS", "Pradel" and others, it is simply
the length of ch.  Whereas, for "Burnham" and "Barker" models,the
encounter history contains both capture and resight/recovery values so
nocc is one-half the length of ch. Likewise, the number of
time.intervals depends on the model.  For models, such as "CJS",
"Pradel" and others, the number of time.intervals is nocc-1;
whereas, for capture&recovery(resight) models the number of
time.intervals is nocc. The default time interval is unit time
(1) and if this is adequate, the function will assign the appropriate
length.  A processed data frame can only be analyzed using the model that
was specified.  The model value is used by the functions
make.design.data, add.design.data, and
make.mark.model to define the model structure as it relates to
the data. Thus, if the data are going to be analysed with different
underlying models, create different processed data sets with the model name
as an extension.  For example, dipper.cjs=process.data(dipper) and
dipper.popan=process.data(dipper,model="POPAN").
This function will report inconsistencies in the lengths of the capture history values and when invalid entries are given in the capture history. For example, with the "CJS" model, the capture history should only contain 0 and 1 whereas for "Barker" it can contain 0,1,2. For "Multistrata" models, the code will automatically identify the number of strata and strata labels based on the unique alphabetic codes used in the capture histories.
The argument begin.time specifies the time for the first capture
occasion.  This is used in creating the levels of the time factor variable
in the design data and for labelling parameters. If the begin.time
varies by group, enter a vector of times with one for each group. Note that
the time values for survivals are based on the beginning of the survival
interval and capture probabilities are labeled based on the time of the
capture occasion.  Likewise, age labels for survival are the ages at the
beginning times of the intervals and for capture probabilities it is the age
at the time of capture/recapture.
groups is a vector of variable names that are contained in
data.  Each must be a factor variable. A group is created for each
unique combination of the levels of the factor variables.  In the first
example given below groups=c("sex","age","region"). which creates
groups defined by the levels of sex, age and region.
There should be 2(sexes)*3(ages)*4(regions)=24 groups but in actuality there
are only 16 in the data because there are only 2 age groups for each sex.
Age group 1 and 2 for M and age groups 2 and 3 for F.  This was done to
demonstrate that the code will only use groups that have 1 or more capture
histories unless allgroups=TRUE.
The argument age.var=2 specifies that the second grouping variable in
groups represents an age variable.  It could have been named
something different than age. If a variable in groups is named
age then it is not necessary to specify age.var.
initial.age specifies that the age at first capture of the age levels
is 0,1 and 2 while the age classes were designated as 1,2,3. The actual ages
for the age classes do not have to be sequential or ordered, but ordering
will cause less confusion.  Thus levels 1,2,3 could represent initial ages
of 0,4,6 or 6,0,4. The argument age.unit is the amount an animal ages for
each unit of time and the default is 1.  The default for initial.age
is 0 for each group, in which case, age represents time since marking
(first capture) rather than the actual age of the animal.
Value
processed.data (a list with the following elements)
| data | original raw dataframe with group factor variable added if groups were defined | 
| model | type of analysis model (eg, "CJS", "Burnham", "Barker") | 
| freq | a dataframe of frequencies (same number of rows as data, number of columns is the number of groups in the data. The column names are the group labels representing the unique groups that have one or more capture histories. | 
| nocc | number of capture occasions | 
| time.intervals | length of time intervals between capture occasions | 
| begin.time | time of first capture occasion | 
| age.unit | increment of age for each increment of time | 
| initial.ages | an initial age for
each group in the data; Note that this is not the original argument but is a
vector with the initial age for each group. In the first example below
 | 
| nstrata | number of strata in Multistrata models | 
| strata.labels | vector of alphabetic characters used to identify strata in Multistrata models | 
| group.covariates | factor covariates used to define groups | 
Author(s)
Jeff Laake
See Also
import.chdata, dipper,
edwards.eberhardt, example.data
Examples
data(example.data)
proc.example.data=process.data(data=example.data,begin.time=1980,
groups=c("sex","age","region"),
age.var=2,initial.age=c(0,1,2))
data(dipper)
dipper.process=process.data(dipper)
Reads binary file output from MARK and returns a list of the results
Description
Window and linux versions to read binary files created by MARK. This function written by Jim Hines and modified by Jeff Laake to add derived_labels, replaces read.mark.binary and read.mark.binary.linux
Usage
readMarkVcv(f = "mark001.vcv", derived_labels)
Arguments
| f | filename specification for binary output file from MARK;named here as markxxx.vcv | 
| derived_labels | vector of labels for derived parameters; NULL if no derived parameters for model | 
Value
List of estimates, se, lcl, ucl and var-cov matrices for beta, real and derived estimates
| beta | Dataframe for beta parameters containing estimates, se, lcl, ucl | 
| beta.vcv | variance-covariance matrix for beta estimates | 
| real | Dataframe for real parameters containing estimates, se, lcl, ucl | 
| real.vcv | variance-covariance matrix for real estimates | 
| derived | list of Dataframes for derived parameters (if any) containing estimates, se, lcl, ucl | 
| derived.vcv | variance-covariance matrix for derived estimates (if any) | 
Author(s)
Jim Hines, Jeff Laake
See Also
Examples
#a=readMarkVcv('~/../Downloads/mark005.vcv')
#str(a)
Runs RELEASE for goodness of fit test
Description
Creates input file for RELEASE with the specified data, runs RELEASE and extracts the summary results for TEST2 and TEST3. Output file is named Releasennn.tmp where nnn is an increasing numeric value to create a unique filename.
Usage
release.gof(data, invisible = TRUE, title = "Release-gof", view = FALSE)
Arguments
| data | processed RMark data | 
| invisible | if TRUE, RELEASE run window is hidden from view | 
| title | title for output | 
| view | if TRUE, shows release output in a viewer window | 
Value
results: a dataframe giving chi-square, degrees of freedom and P value for TEST2, TEST3 and total of tests
Author(s)
Jeff Laake
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
dipper.processed=process.data(dipper,groups=("sex"))
release.gof(dipper.processed)
file.remove("release001.out")
file.remove("mxxx.tmp")
Remove mark models from list
Description
Remove one or more mark models from a marklist
Usage
remove.mark(marklist, model.numbers)
Arguments
| marklist | an object of class "marklist" created by
 | 
| model.numbers | vector of one more model numbers to remove from the marklist | 
Value
model.list: a list of mark models and a table of model
results.
Author(s)
Jeff Laake
See Also
collect.models,merge.mark,run.models,model.table,dipper
Examples
# see example in dipper
Runs a previous MARK model with new starting values
Description
Runs a previous MARK model with new starting values but without specifying
the model parameter formulas.  This function is most useful with
mark.wrapper in which a list of models is analyzed and the set of
formulas are not specified for each model.
Usage
rerun.mark(
  model,
  data,
  ddl,
  initial,
  output = TRUE,
  title = "",
  invisible = TRUE,
  adjust = TRUE,
  se = FALSE,
  filename = NULL,
  prefix = "mark",
  default.fixed = TRUE,
  silent = FALSE,
  retry = 0,
  realvcv = FALSE,
  external = FALSE,
  threads = -1,
  ...
)
Arguments
| model | previously run MARK model | 
| data | processed dataframe used with model | 
| ddl | design data list used with model | 
| initial | vector of initial values for beta parameters or previously run model object of similar structure | 
| output | If TRUE produces summary of model input and model output | 
| title | Optional title for the MARK analysis output | 
| invisible | if TRUE, exectution of MARK.EXE is hidden from view | 
| adjust | if TRUE, adjusts number of parameters (npar) to number of columns in design matrix, modifies AIC and records both | 
| se | if TRUE, se and confidence intervals are shown in summary sent to screen | 
| filename | base filename for files created by MARK.EXE. Files are named filename.*. | 
| prefix | base filename prefix for files created by MARK.EXE; the files are named prefixnnn.* | 
| default.fixed | if TRUE, real parameters for which the design data have been deleted are fixed to default values | 
| silent | if TRUE, errors that are encountered are suppressed | 
| retry | number of reanalyses to perform with new starting values when one or more parameters are singular | 
| realvcv | if TRUE the vcv matrix of the real parameters is extracted and stored in the model results | 
| external | if TRUE the mark object is saved externally rather than in the workspace; the filename is kept in its place | 
| threads | number of cpus to use with mark.exe if positive or number of cpus to remain idle if negative | 
| ... | argument values like nodes etc for call to make.mark.model | 
Details
This is a simple function that restarts an analysis with MARK typically
using another model for initial values of the beta parameters.  The
processed dataframe (data) and design data list (ddl) must be
specified but the model.parameters are extracted from model.
initial values are not optional otherwise this would be no different
than the original call to mark. More complete definitions of the
arguments can be found in mark or run.mark.model
or make.mark.model.
Value
model: MARK model object with the base filename stored in
output and the extracted results from the output file appended
onto list; see mark for a detailed description of a
mark object.
Author(s)
Jeff Laake
See Also
make.mark.model, run.models,
extract.mark.output, adjust.parameter.count,
mark, cleanup
Examples
## Not run: 
# The following example will not run because the data are not included in the
# examples.  It illustrates the use of rerun.mark with mark.wrapper.  With this
# particular data set the POPAN models were having difficulty converging.  After
# running the set of models using mark.wrapper and looking at the results it
# was clear that in several instances the model did not converge.  This is easiest
# to discern by comparing nested models in the model.table.  If one model 
# is nested within another,then the deviance of the model with more
# parameters should be as good or better than the smaller model.  If that 
# is not the case then the model that converged can be used for initial 
# values in a call to rerun.mark for the model that did not converge.
#
do.nat=function()
{
Phi.ageclass=list(formula=~ageclass)
Phi.dot=list(formula=~1)
p.area=list(formula=~area)
p.timebin.plus.area=list(formula=~timebin+area)
p.timebin.x.area=list(formula=~-1+timebin:area)
pent.ageclass=list(formula=~ageclass)
pent.ageclass.plus.EN=list(formula=~ageclass+EN)
pent.ageclass.plus.diffEN=list(formula=~ageclass+EN92+EN97+EN02)
cml=create.model.list("POPAN")
nat=mark.wrapper(cml,data=zc.proc,ddl=zc.ddl,
  invisible=FALSE,initial=1,retry=2,delete=TRUE)
return(nat)
}
nat=do.nat()
# model list
#            Phi                   p                      pent
#1  Phi.ageclass              p.area             pent.ageclass
#2  Phi.ageclass              p.area pent.ageclass.plus.diffEN
#3  Phi.ageclass              p.area     pent.ageclass.plus.EN
#4  Phi.ageclass p.timebin.plus.area             pent.ageclass
#5  Phi.ageclass p.timebin.plus.area pent.ageclass.plus.diffEN
#6  Phi.ageclass p.timebin.plus.area     pent.ageclass.plus.EN
#7  Phi.ageclass    p.timebin.x.area             pent.ageclass
#8  Phi.ageclass    p.timebin.x.area pent.ageclass.plus.diffEN
#9  Phi.ageclass    p.timebin.x.area     pent.ageclass.plus.EN
#10      Phi.dot              p.area             pent.ageclass
#11      Phi.dot              p.area pent.ageclass.plus.diffEN
#12      Phi.dot              p.area     pent.ageclass.plus.EN
#13      Phi.dot p.timebin.plus.area             pent.ageclass
#14      Phi.dot p.timebin.plus.area pent.ageclass.plus.diffEN
#15      Phi.dot p.timebin.plus.area     pent.ageclass.plus.EN
#16      Phi.dot    p.timebin.x.area             pent.ageclass
#17      Phi.dot    p.timebin.x.area pent.ageclass.plus.diffEN
#18      Phi.dot    p.timebin.x.area     pent.ageclass.plus.EN
#
# use model 9 as starting values for model 7
nat[[7]]= rerun.mark(nat[[7]],data=zc.proc,ddl=zc.ddl,initial=nat[[9]],delete=TRUE)
# use model 3 as starting values for model 1
nat[[1]]= rerun.mark(nat[[1]],data=zc.proc,ddl=zc.ddl,initial=nat[[3]],delete=TRUE)
# use model 14 as starting values for model 15
nat[[15]]= rerun.mark(nat[[15]],data=zc.proc,ddl=zc.ddl,initial=nat[[14]],delete=TRUE)
# use model 5 as starting values for model 6
nat[[6]]= rerun.mark(nat[[6]],data=zc.proc,ddl=zc.ddl,initial=nat[[5]],delete=TRUE)
# use model 10 as starting values for model 11
nat[[11]]= rerun.mark(nat[[11]],data=zc.proc,ddl=zc.ddl,initial=nat[[10]],delete=TRUE)
# use model 10 as starting values for model 12
nat[[12]]= rerun.mark(nat[[12]],data=zc.proc,ddl=zc.ddl,initial=nat[[10]],delete=TRUE)
# reconstruct model table with new results
nat$model.table=model.table(nat[1:18])
# show new model table
nat
## End(Not run)
Robust design example data
Description
A robust design example data set that accompanies MARK as an example analysis using the various models for the robust design.
Format
A data frame with 668 observations on the following 2 variables.
- ch
- a character vector containing the encounter history 
- freq
- the number of critters with that capture history 
Details
This is a data set that accompanies program MARK as an example for robust
models. The data are entered with the summary format using the variable
freq which represents the number of critters with that capture
(encounter) history.  The data set represents a robust design with 5 primary
occasions and within each primary occasion the number of secondary occasions
is 2,2,4,5,2 respectively.  This is represented with the
time.intervals argument of process.data which are
0,1,0,1,0,0,0,1,0,0,0,0,1,0. The 0 time intervals represent the secondary
sessions in which the population is assumed to be closed. The non-zero
values are the time intervals between the primary occasions.  They are all 1
in this example but they can have different non-zero values.  The code
determines the structure of the robust design based on the time intervals.
The intervals must begin and end with at least one 0 and there must be at
least one 0 between any 2 non-zero elements. The number of occasions in a
secondary session is one plus the number of contiguous zeros.
Examples
# This example is excluded from testing to reduce package check time
data(robust)
run.robust=function()
{
#
# data from Robust.dbf with MARK
# 5 primary sessions with secondary sessions of length 2,2,4,5,2
#
time.intervals=c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0)
#
# Random emigration, p=c varies by time and session, S by time
#
S.time=list(formula=~time)
p.time.session=list(formula=~-1+session:time,share=TRUE)
GammaDoublePrime.random=list(formula=~time,share=TRUE)
model.1=mark(data = robust, model = "Robust", 
            time.intervals=time.intervals,
            model.parameters=list(S=S.time,
            GammaDoublePrime=GammaDoublePrime.random,p=p.time.session),threads=2,delete=TRUE)
#
# Random emigration, p varies by session, uses Mh but pi fixed to 1, 
# S by time.This model is in the example Robust with MARK but it is
# a silly example because it uses the heterogeneity model but then fixes 
# pi=1 which means there is no heterogeneity.Probably the data were 
# not generated under Mh.  See results of model.2.b
#
pi.fixed=list(formula=~1,fixed=1)
p.session=list(formula=~-1+session,share=TRUE)
model.2.a=mark(data = robust, model = "RDHet", 
            time.intervals=time.intervals,
            model.parameters=list(S=S.time,
            GammaDoublePrime=GammaDoublePrime.random,
            p=p.session,pi=pi.fixed),threads=2,delete=TRUE)
#
# Random emigration, p varies by session, uses Mh and in this 
# case pi varies and so does p across
# mixtures with an additive session effect.
#
pi.dot=list(formula=~1)
p.session.mixture=list(formula=~session+mixture,share=TRUE)
model.2.b=mark(data = robust, model = "RDHet", 
            time.intervals=time.intervals,
            model.parameters=list(S=S.time,
            GammaDoublePrime=GammaDoublePrime.random,
            p=p.session.mixture,pi=pi.dot),threads=2,delete=TRUE)
#
# Markov constant emigration rates, pi varies by session, 
# p=c varies by session, S constant
# This model is in the example Robust with MARK 
# but it is a silly example because it
# uses the heterogeneity model but then fixes pi=1 
# which means there is no heterogeneity.
# Probably the data were not generated under Mh.  
# See results of model.3.b
#
S.dot=list(formula=~1)
pi.session=list(formula=~session)
p.session=list(formula=~-1+session,share=TRUE)
GammaDoublePrime.dot=list(formula=~1)
GammaPrime.dot=list(formula=~1)
model.3.a=mark(data = robust, model = "RDHet", 
            time.intervals=time.intervals,
            model.parameters=list(S=S.dot,
            GammaPrime=GammaPrime.dot,
            GammaDoublePrime=GammaDoublePrime.dot,
            p=p.session,pi=pi.session),threads=2,delete=TRUE)
#
# Markov constant emigration rates, pi varies by session, 
# p=c varies by session+mixture, S constant. This is model.3.a 
# but allows pi into the model by varying p/c by mixture.
#
S.dot=list(formula=~1)
pi.session=list(formula=~session)
GammaDoublePrime.dot=list(formula=~1)
GammaPrime.dot=list(formula=~1)
model.3.b=mark(data = robust, model = "RDHet", 
            time.intervals=time.intervals,
            model.parameters=list(S=S.dot,
            GammaPrime=GammaPrime.dot,
            GammaDoublePrime=GammaDoublePrime.dot,
            p=p.session.mixture,pi=pi.session),threads=2,delete=TRUE)
#
# Huggins Random emigration, p=c varies by time and session, 
# S by time
# Beware that this model is not quite the same 
# as the others above that say random emigration because
# the rates have been fixed for the last 2 occasions.  
# That was done with PIMS in the MARK example and
# here it is done by binning the times so that times 3 and 4 
# are in the same bin, so the time model
# has 3 levels (1,2, and 3-4).  By doing so the parameters 
# become identifiable but this may not be
# reasonable depending on the particulars of the data.  
# Note that the same time binning must be done both for
# GammaPrime and GammaDoublePrime because the parameters are 
# the same in the random emigration model.  If you
# forget to bin one of the parameters across time it will fit 
# a model but it won't be what you expect as it will
# not share parameters.  Note the use of the argument "right".  
# This controls whether binning is inclusive on the right (right=TRUE) 
# or on the left (right=FALSE).  Using "right" nested in the list
# of design parameters is equivalent to using it as a calling 
# argument to make.design.data or add.design.data.
#
S.time=list(formula=~time)
p.time.session=list(formula=~-1+session:time,share=TRUE)
GammaDoublePrime.random=list(formula=~time,share=TRUE)
model.4=mark(data = robust, model = "RDHuggins", 
        time.intervals=time.intervals,design.parameters=
        list(GammaDoublePrime=list(time.bins=c(1,2,5))),
        right=FALSE, model.parameters=
        list(S=S.time,GammaDoublePrime=GammaDoublePrime.random,
        p=p.time.session),threads=2,delete=TRUE)
return(collect.models())
}
robust.results=run.robust()
#
#  You will receive a warning message that the model list 
#  includes models of different types which are not compatible 
#  for comparisons of AIC.  That is because
#  the runs include closed models which include N 
#  in the likelihood and Huggins models which don't include 
#  N in the likelihood.  That can be avoided by running
#  the two types of models in different sets.
#
robust.results
Runs analysis with MARK model using MARK.EXE
Description
Passes input file from model (model$input) to MARK, runs MARK, gets
output and extracts relevant values into results which is
appended to the mark model object.
Usage
run.mark.model(
  model,
  invisible = FALSE,
  adjust = TRUE,
  filename = NULL,
  prefix = "mark",
  realvcv = FALSE,
  delete = FALSE,
  external = FALSE,
  threads = -1,
  ignore.stderr = FALSE
)
Arguments
| model | MARK model created by  | 
| invisible | if TRUE, exectution of MARK.EXE is hidden from view | 
| adjust | if TRUE, adjusts number of parameters (npar) to number of columns in design matrix, modifies AIC and records both | 
| filename | base filename for files created by MARK.EXE. Files are named filename.*. | 
| prefix | base filename prefix for files created by MARK.EXE; the files are named prefixnnn.* | 
| realvcv | if TRUE the vcv matrix of the real parameters is extracted and stored in the model results | 
| delete | if TRUE the output files are deleted after the results are extracted | 
| external | if TRUE the mark object is saved externally rather than in the workspace; the filename is kept in its place | 
| threads | number of cpus to use with mark.exe if positive or number of cpus to remain idle if negative | 
| ignore.stderr | If set TRUE, messages from mark.exe are suppressed; they are automatically suppressed with Rterm | 
Details
This is a rather simple function that initiates the analysis with MARK and
extracts the output. An analysis was split into two functions
make.mark.model and run.mark.model to allow a set of
models to be created and then run individually or collectively with
run.models.  By default, the execution of MARK.EXE will appear
in a separate window in which the progress can be monitored. The window can
be suppressed by setting the argument invisible=TRUE. The function
returns a mark object and it should be assigned to the same object to
replace the original model (e.g., mymodel=run.mark.model(mymodel)).
The element output is the base filename that links the objects to the
output files stored in the same directory as the R workspace.  To removed
unneeded output files after deleting mark objects in the workspace, see
cleanup. results is a list of specific output values
that are extracted from the output. In extracting the results, the number of
parameters can be adjusted (adjust=TRUE) to match the number of
columns in the design matrix, which assumes that it is full rank and that
all of the parameters are estimable and not confounded.  This can be useful
if that assumption is true, because on occasion MARK.EXE will report an
incorrect number of parameters in some cases in which the parameters are at
boundaries (e.g., 0 or 1 for probabilities).  If the true parameter count is
neither that reported by MARK.EXE nor the number of columns in the design
matrix, then it can be adjusted using adjust.parameter.count.
If filename is assigned a value it is used to specify files with
those names. This is most useful to capture output from a model that has
already been run.  If it finds the files with those names already exists, it
will ask if the results should be extracted from the files rather than
re-running the models.
Value
model: MARK model object with the base filename stored in
output and the extracted results from the output file appended
onto list; see mark for a detailed description of a
mark object.
Author(s)
Jeff Laake
See Also
make.mark.model, run.models,
extract.mark.output, adjust.parameter.count,
mark, cleanup
Examples
# This example is excluded from testing to reduce package check time
test=function()
{
  data(dipper)
  for(sex in unique(dipper$sex))
  {
	  x=dipper[dipper$sex==sex,]
	  x.proc=process.data(x,model="CJS")
	  x.ddl=make.design.data(x.proc)
	  Phi.dot=list(formula=~1)
	  Phi.time=list(formula=~time)
	  p.dot=list(formula=~1)
	  p.time=list(formula=~time)
	  cml=create.model.list("CJS")
	  x.results=mark.wrapper(cml,data=x.proc,ddl=x.ddl,prefix=sex,delete=TRUE)
	  assign(paste(sex,"results",sep="."),x.results)
  }
  rm(Male.results,Female.results,x.results)
}
test()
cleanup(ask=FALSE,prefix="Male")
cleanup(ask=FALSE,prefix="Female")
Runs a set of MARK models
Description
Runs either a collection of models as defined in model.list or runs
all defined MARK object models in the frame of the calling function with no
output (model.list=NULL) or just those of a particular type
(e.g., type="CJS")
Usage
run.models(model.list = NULL, type = NULL, save = TRUE, ...)
Arguments
| model.list | either a vector of model names or NULL to run all MARK
models possibly of a particular  | 
| type | either a model type (eg "CJS", "Burnham" or "Barker") or NULL for all types | 
| save | if TRUE, the R data directory is saved (i.e.,
 | 
| ... | any additional parameters to be passed to
 | 
Details
The model names in model.list must be in the frame of the function
that calls run.models. If model.list=NULL or the MARK models
are collected from the frame of the calling function (the parent). If
type is specified only the models of that type (e.g., "CJS") are run.
In each case the models are run and saved in the parent frame.
Value
None; models are stored in parent frame.
Author(s)
Jeff Laake
See Also
collect.model.names, run.mark.model
Salamander occupancy data
Description
An occupancy data set for modelling presence/absence data for salamanders.
Format
A data frame with 39 observations (sites) on the following 2 variables.
- ch
- a character vector containing the presence (1) and absence (0) for each visit to the site 
- freq
- frequency of sites (always 1) 
Details
This is a data set that accompanies program PRESENCE and is explained on page 99 of MacKenzie et al. (2006).
References
MacKenzie, D.I., Nichols, J. D., Royle, J.A., Pollock, K.H., Bailey, L.L., and Hines, J.E. 2006. Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurence. Elsevier, Inc. 324p.
Examples
# This example is excluded from testing to reduce package check time
do.salamander=function()
{
   data(salamander)
   occ.p.dot=mark(salamander,model="Occupancy",delete=TRUE)
   occ.p.time=mark(salamander,model="Occupancy",
         model.parameters=list(p=list(formula=~time)),delete=TRUE)
   occ.p.mixture=mark(salamander,model="OccupHet",
         model.parameters=list(p=list(formula=~mixture)),delete=TRUE)
   return(collect.models())
}
salamander.results=do.salamander()
print(salamander.results)
Defines model specific parameters (internal use)
Description
Compares model, the name of the type of model (eg "CJS") to the list
of acceptable models to determine if it is supported and then creates some
global fields specific to that type of model that are used to modify the
operation of the code.
Usage
setup.model(model, nocc, mixtures = 1)
Arguments
| model | name of model type (must be in vector  | 
| nocc | length of capture history string | 
| mixtures | number of mixtures | 
Details
In general, the structure of the different types of models (e.g., "CJS","Recovery",...etc) are very similar with some minor exceptions. This function is not intended to be called directly by the user but it is documented to enable other models to be added. This function is called by other functions to validate and setup model specific parameters. For example, for live/dead models, the length of the capture history is twice the number of capture occasions and the number of time intervals equals the number of capture occasions because the final interval is included with dead recoveries. Whereas, for recapture models, the length of the capture history is the number of capture occasions and the number of time intervals is 1 less than the number of occasions. This function validates that the model is valid and sets up some parameters specific to the model that are used in the code.
Value
model.list - a list with following elements
| etype | encounter type string for MARK input; typically same as model | 
| nocc | number of capture occasions | 
| num | number of time intervals relative to number of occasions (0 or -1) | 
| mixtures | number of mixtures if any | 
| derived | logical; TRUE if model produces derived estimates | 
Author(s)
Jeff Laake
See Also
setup.parameters, valid.parameters
Setup parameter structure specific to model (internal use)
Description
Defines list of parameters used in the specified type of model
(model) and adds default values for each parameter to the list of
user specified values (eg formula, link etc) defined in the call to
make.mark.model
Usage
setup.parameters(
  model,
  parameters = list(),
  nocc = NULL,
  check = FALSE,
  number.of.groups = 1
)
Arguments
| model | type of model ("CJS", "Burnham" etc) | 
| parameters | list of model parameter specifications | 
| nocc | number of occasions (value only specified if needed) | 
| check | if TRUE only the vector of parameter names is returned
 | 
| number.of.groups | number of groups defined for data | 
Details
The primary difference in setting up models for MARK is the number and types
of parameters that are included in the model.  This function sets up the
list of parameters used in the model and defines values for each parameter
that affect how the PIM and design data are structured in the input file for
program MARK.  Some of the values of the parameter list are user specified
such as formula, link,fixed so this function only adds
to the list of values that are not specified by the user.  That is, it takes
the input argument parameters and adds list elements for parameters
not specified by the user and adds default values for each type of parameter
and then returns the modified list. The structure of the argument
parameters and the return value of this function are the same as the
structure of the argument parameters in make.mark.model
and argument model.parameters in mark.  They are lists
with an element for each type of parameter in the model and the name of each
list element is the parameter name (e.g., "p", "Phi","S", etc).  For each
parameter there are a list of values (e.g., formula, link, num etc as
defined below).  Thus parameters is a list of lists.
Value
The return value depends on the argument check. If it is TRUE
then the return value is a vector of the names of the parameters used in the
specified type of model. For example, if model="CJS" then the return
value is c("Phi","p").  This is used by the function
valid.parameters to make sure that parameter specifications
are valid for the model (i.e., specifying recovery rate r for "CJS" would
give an error).  If the function is called with the default of
check=FALSE, the function returns a list of parameter specifications
which is a modification of the argument parameters which adds
parameters not specified and default values for all types of parameters that
were not specified. The list length and names of the list elements depends
on the type of model. Each element of the list is itself a list with varying
numbers of elements which depend on the type of parameter although some
elements are the same for all parameters.  Below the return value list is
shown generically with parameters named p1,...,pk.  
| p1 | List of specifications for parameter 1 | 
| p2 | List of specifications for parameter 2 | 
| . | |
| . | |
| . | |
| pk | List of specifications for parameter k | 
The elements for each parameter list all include:
| begin | 0 or 1; beginning time for the first | 
| parameter relative to first occasion | |
| num | 0 or -1; number of parameters relative to | 
| number of occassions | |
| type | type of PIM structure; either "Triang" or "Square" | 
| formula | formula for parameter
model (e.g., ~time) | 
| link | link function for parameter
(e.g., "logit") | 
and may include:
| share | only valid for p in closed capture models; | 
| if TRUE p and c models shared | |
| mix | only valid for closed capture heterogeneity | 
| models; if TRUE mixtures are used | |
| rows | only valid for closed capture heterogeneity models | 
| fixed | fixed values specified by user and | 
| not used modified in this function | |
Author(s)
Jeff Laake
See Also
An example of the Multistrata (multi-state) model in which states are routes taken by migrating fish.
Description
An example of the Multistrata (multi-state) model in which states are routes taken by migrating fish.
Author(s)
Megan Moore <megan.moore at noaa.gov>
Examples
# There are just two states which correspond to route A and route B. There are 6 occasions
# which are the locations rather than times. After release at 1=A there is no movement 
# between states for the first segment, fish are migrating downriver together and all pass 2A. 
# Then after occasion 2, migrants go down the North Fork (3A) or the South Fork (3B), 
# which both empty into Skagit Bay. Once in saltwater, they can go north to Deception Pass (4A)
# or South to a receiver array exiting South Skagit Bay (4B). Fish in route A can then only go
# to the Strait of Juan de Fuca, while fish in route B must pass by Admiralty Inlet (5B). 
# Then both routes end with the array at the Strait of Juan de Fuca.
#
#       1A
#        |
#        2A
#      /     \
#    3A        3B
#   /  \      /  \ 
# 4A   4B    4A  4B       
#  |     \    /   |
#   5A    5B  5A   5B
#      \   \   /    /
#            6
# 
# from 3A and 3B they can branch to either 4A or 4B; branches merge at 6  
# 5A does not exist so p=0; only survival from 4A to 6 can be 
# estimated which is done by setting survival from 4A to 5A to 1 and
# estimating survival from 5A to 6 which is then total survival from 4A to 6.
pathtodata=paste(path.package("RMark"),"extdata",sep="/")
skagit=import.chdata(paste(pathtodata,"skagit.txt",sep="/"),field.types=c("f"),header=TRUE)
skagit.processed=process.data(skagit,model="Multistrata",groups=c("tag"))
skagit.ddl=make.design.data(skagit.processed)
#
# p
#
# Can't be seen at 5A or 2B,6B (the latter 2 don't exist)
skagit.ddl$p$fix=ifelse((skagit.ddl$p$stratum=="A"&skagit.ddl$p$time==5) | 
 (skagit.ddl$p$stratum=="B"&skagit.ddl$p$time%in%c(2,6)),0,NA)
# Estimated externally from current data to allow estimation of survival at last interval
skagit.ddl$p$fix[skagit.ddl$p$tag=="v7"&skagit.ddl$p$time==6&skagit.ddl$p$stratum=="A"]=0.687
skagit.ddl$p$fix[skagit.ddl$p$tag=="v9"&skagit.ddl$p$time==6&skagit.ddl$p$stratum=="A"]=0.975
#
# Psi
#
# only 3 possible transitions are A to B at time interval 2 to 3 and 
# for time interval 3 to 4 from A to B and from B to A
# rest are fixed values
skagit.ddl$Psi$fix=NA
# stay in A for intervals 1-2, 4-5 and 5-6
skagit.ddl$Psi$fix[skagit.ddl$Psi$stratum=="A"&
  skagit.ddl$Psi$tostratum=="B"&skagit.ddl$Psi$time%in%c(1,4,5)]=0
# stay in B for interval 4-5
skagit.ddl$Psi$fix[skagit.ddl$Psi$stratum=="B"&skagit.ddl$Psi$tostratum=="A"
  &skagit.ddl$Psi$time==4]=0
# leave B to go to A for interval 5-6
skagit.ddl$Psi$fix[skagit.ddl$Psi$stratum=="B"&skagit.ddl$Psi$tostratum=="A"&
 skagit.ddl$Psi$time==5]=1
# "stay" in B for interval 1-2 and 2-3 because none will be in B
skagit.ddl$Psi$fix[skagit.ddl$Psi$stratum=="B"&skagit.ddl$Psi$tostratum=="A"&
 skagit.ddl$Psi$time%in%1:2]=0
# 
# S
#
# None in B, so fixing S to 1
skagit.ddl$S$fix=ifelse(skagit.ddl$S$stratum=="B"&skagit.ddl$S$time%in%c(1,2),1,NA)
skagit.ddl$S$fix[skagit.ddl$S$stratum=="A"&skagit.ddl$S$time==4]=1
# fit model
p.timexstratum.tag=list(formula=~time:stratum+tag,remove.intercept=TRUE)
Psi.sxtime=list(formula=~-1+stratum:time)
S.stratumxtime=list(formula=~-1+stratum:time)
#
S.timexstratum.p.timexstratum.Psi.sxtime=mark(skagit.processed,skagit.ddl,
 model.parameters=list(S=S.stratumxtime,p= p.timexstratum.tag,Psi=Psi.sxtime),delete=TRUE)
# calculation of cummulative survival for entire route
Sest=plogis(coef(S.timexstratum.p.timexstratum.Psi.sxtime)$estimate)
# A
prod(Sest[c(1:3,6)])
#[1] 0.1644
# B
prod(Sest[c(1,2,4,5,7)])
#[1] 0.1154
Split/collapse capture histories
Description
splitCH will split a character string vector of capture histories into a matrix. The matrix is appended to the original data set (data) if one is specified. Will handle character and numeric values in ch. Results will differ depending on content of ch. collapseCH will collapse a capture history matrix back into a character vector. Argument can either be a capture history matrix (chmat) or a dataframe (data) that contains fields with a specified prefix.
Usage
splitCH(x="ch", data=NULL, prefix="Time")
 
        collapseCH(chmat=NULL, data=NULL, prefix="Time")
Arguments
| x | A vector containing the character strings of capture histories or the column number or name in the data set  | 
| data | A data frame containing columnwith value in x if x indicates a column in a data frame | 
| prefix | first portion of field names for split ch | 
| chmat | capture history matrix | 
Value
A data frame if data specified and a matrix if vector ch is specified
Author(s)
Devin Johnson; Jeff Laake
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
# following returns a matrix
chmat=splitCH(dipper$ch)
# following returns the original dataframe with the ch split into columns
newdipper=splitCH(data=dipper)
# following collapses chmat
ch=collapseCH(chmat)
# following finds fields in newdipper and creates ch
newdipper$ch=NULL
newdipper=collapseCH(data=newdipper)
Store models externally or restore to workspace from external storage
Description
Stores/restores all mark model objects in a marklist either to or from external storage.
Usage
store(x)
Arguments
| x | marklist of models | 
Details
For store, each mark model is stored externally and the object in the
list is replaced with the filename of the object. restore does the
opposite of storing the saved external object into the marklist and then
deleting the saved file.
Value
A modified marklist to replace the previous marklist specified as the argument.
Author(s)
Jeff Laake
Strip comments
Description
Read in file and strip out comments and blank lines.
Usage
strip.comments(inp.filename, use.comments = TRUE, header = TRUE)
Arguments
| inp.filename | name of input file; inp extension is assumed and does not need to be specified | 
| use.comments | if TRUE values within /* and */ on data lines are used as row.names for the RMark dataframe. Only use this option if they are unique values. | 
| header | if TRUE, input file has header line with field names | 
Value
| rn | row names | 
| out.filename | output filename | 
Author(s)
Jeff Laake
Summary of MARK model parameters and results
Description
Creates a summary object of either a MARK model input or model output which includes number of parameters, deviance, AICc, the beta and real parameter estimates and optionally standard errors, confidence intervals and variance-covariance matrices. If there are several groups in the data, the output is structured by group.
Usage
## S3 method for class 'mark'
summary(object,...,se=FALSE,vc=FALSE,showall=TRUE,show.fixed=FALSE,brief=FALSE)
Arguments
| object | a MARK model object | 
| ... | additional non-specified argument for S3 generic function | 
| se | if FALSE the real parameter estimates are output in PIM format (eg. triangular format); if TRUE, they are displayed as a list with se and confidence interval | 
| vc | if TRUE the v-c matrix of the betas is included | 
| showall | if FALSE it only returns the values of each unique parameter value | 
| show.fixed | if FALSE, each fixed value given NA; otherwise the fixed real value is used. If se=TRUE, default for show.fixed=TRUE | 
| brief | if TRUE, does not show real parameter estimates | 
Details
The structure of the summary of the real parameters depends on the type of
model and the value of the argument se and showall.  If
se=F then only the estimates of the real parameters are shown and
they are summarized the result element reals in PIM format.  The
structure of reals depends on whether the PIMS are upper triangular
("Triang") or a row ("Square" although not really square). For the upper
triangular format, the values are passed back as a list of matrices where
the list is a list of parameter types (eg Phi and p) and within each type is
a list for each group containing the pim as an upper triangular matrix
containing the real parameter estimate.  For square matrices, reals
is a list of matrices with a list element for each parameter type, but there
is not a second list layer for groups because in the returned matrix each
group is a row in the matrix of real estimates.  If se=TRUE then
estimates, standard error (se), lower and upper confidence limits (lcl, ucl)
and a "Fixed" indicator is passed for each real parameter.  If the pims for
the model were simplified to represent the unique real parameters (unique
rows in the design matrix), then it is possible to restict the summary to
only the unique parameters with showall=FALSE.  This argument only
has an affect if se=TRUE. If showall=FALSE, reals is
returned as a dataframe of the unique real parameters specified in the
model.  This does not mean they will all have unique values and it includes
all "Fixed" real parameters and any real parameters that cannot be
simplified in the case of parameters such as "pent" in POPAN or "Psi" in
"Multistrata" that use the multinomial logit link. Use of
showall=FALSE is of limited use but provided for completeness.  In
most cases the default of showall=TRUE will be satisfactory.  In this
case, reals is a list of dataframes with a list element for each
parameter type.  The dataframe contains the estimate, se,lcl, ucl,fixed and
the associated default design data for that parameter (eg time,age, cohort
etc).  The advantage of retrieving the reals in this format is that it is
the same regardless of the model, so it enables model averaging the real
parameters over different models with differing numbers of unique real
parameters.
Value
A list with each of the summarized objects that depends on the argument values. Only the first 4 are given if it is a summary of a model that has not been run.
| model | type of model (e.g., CJS) | 
| title | user define title if any | 
| model.name | descriptive name of fitted model | 
| call | call to make.mark.model used to construct the model | 
| npar | number of fitted parameters | 
| lnl | -2xLog Likelihood value | 
| npar | Number of parameters (always the number of columns in design matrix) | 
| chat | Value of over-dispersion constant if not equal to 1 | 
| npar.unadjusted | number of estimated parameters from MARK if different than npar | 
| AICc | Small sample corrected AIC using npar; named qAICc if chat not equal to 1 | 
| AICc.unadjusted | Small sample corrected AIC using npar.unadjusted; prefix of q if chat not equal to 1 | 
| beta | dataframe of beta parameters with estimate, se, lcl, ucl | 
| vcv | variance-covariance matrix for beta | 
| reals | list of lists, dataframes or matrices depending on value of se and the type of model (triangular versus square PIMS) (see details above) | 
Author(s)
Jeff Laake
Provides a summary for the capture histories
Description
For each release (initial capture) cohort, the number of recaptured
(resighted) individuals from that cohort is tallied for each of the
following occasions.  A summary table with number released (initially
caught) and the number recaptured is given for each group if
bygroup=TRUE.
Usage
summary_ch(x, bygroup = TRUE, marray = FALSE)
Arguments
| x | Processed data list; resulting value from process.data | 
| bygroup | if TRUE, summary tables are created for each group defined in the data | 
| marray | if TRUE, summary tables are m-arrays as in MARK | 
Value
list of dataframes (one for each group in the data); each dataframe
has rows for each release cohort and columns for each recapture occasion.
The rows and columns are labelled with the occasion time labels.  If
marray==FALSE the first column is the number initially released and
the remaining columns (one for each recapture/resighting occasion) are the
number recaught in each of the following occasions and the number caught in
at least one of the occasions.  If marray==TRUE the first column is
the number released which includes those initially released and ones
released after recapture from a previous cohort. The remaining columns are
the number first recaught in each of the following occasions.  Once
re-caught they become one of the following rows (ie release-recap pairs)
unless it is the last time they were captured and they were not released (eg
negative frequency).
Author(s)
Jeff Laake
Examples
data(dipper)
dipper.processed=process.data(dipper,groups=("sex"))
summary_ch(dipper.processed)
#$sexFemale
#  Released 2  3 4  5  6  7 Total
#1       10 5  3 3  2  1  0     6
#2       29 0 11 6  6  4  2    11
#3       27 0  0 9  5  3  2     9
#4       23 0  0 0 11  7  4    13
#5       19 0  0 0  0 12  6    12
#6       23 0  0 0  0  0 11    11
#
#$sexMale
#  Released 2 3  4  5  6  7 Total
#1       12 6 3  2  1  1  0     7
#2       20 0 9  2  1  0  0     9
#3       25 0 0 13  6  2  0    14
#4       22 0 0  0 15  9  7    16
#5       22 0 0  0  0 13 10    13
#6       23 0 0  0  0  0 12    12
summary_ch(dipper.processed,marray=TRUE)
#$sexFemale
#  Released 2  3  4  5  6  7 Total
#1       10 5  1  0  0  0  0     6
#2       34 0 13  1  0  0  0    14
#3       41 0  0 17  1  0  0    18
#4       41 0  0  0 23  1  1    25
#5       43 0  0  0  0 26  0    26
#6       50 0  0  0  0  0 24    24
#
#$sexMale
#  Released 2  3  4  5  6  7 Total
#1       12 6  1  0  0  0  0     7
#2       26 0 11  0  0  0  0    11
#3       37 0  0 17  1  0  0    18
#4       39 0  0  0 22  0  1    23
#5       45 0  0  0  0 25  0    25
#6       48 0  0  0  0  0 28    28
Determine validity of parameters for a model (internal use)
Description
Checks to make sure specified parameters are valid for a particular type of model.
Usage
valid.parameters(model, parameters)
Arguments
| model | type of c-r model ("CJS", "Burnham" etc) | 
| parameters | vector of parameter names (for example "Phi" or "p" or "S") | 
Value
Logical; TRUE if all parameters are acceptable and FALSE otherwise
Author(s)
Jeff Laake
See Also
Variance components estimation
Description
Computes estimated effects, standard errors and process variance for a set of estimates
Usage
var.components(
  theta,
  design,
  vcv,
  alpha = 0.05,
  upper = 10 * max(vcv),
  LAPACK = TRUE
)
Arguments
| theta | vector of parameter estimates | 
| design | design matrix for combining parameter estimates | 
| vcv | estimated variance-covariance matrix for parameters | 
| alpha | sets 1-alpha confidence limit on sigma | 
| upper | upper limit for process variance | 
| LAPACK | argument passed to call to  | 
Details
Computes estimated effects, standard errors and process variance for a set
of estimates using the method of moments estimator described by Burnham and
White (2002). The design matrix specifies the manner in which the
estimates (theta) are combined.  The number of rows of the design
matrix must match the length of theta. 
If you select specific values
of theta, you must select the equivalent sub-matrix of the variance-covariance
matrix.  For instance, if the parameter indices are $estimates[c(1:5,8)]
then the appropriate definition of the vcv matrix would be vcv=vcv[c(1:5,8), c(1:5,8)], if
vcv is nxn for n estimates. Note that get.real will only return the vcv matrix of the unique 
reals so the dimensions of estimates and vcv will not always match as in the example below
where estimates has 21 rows but with the time model there are only 6 unique Phis so vcv is 6x6.
To get a mean estimate use a column matrix of 1's (e.g.,
design=matrix(1,ncol=1,nrow=length(theta)). The function returns a
list with the estimates of the coefficients for the design matrix
(beta) with one value per column in the design matrix and the
variance-covariance matrix (vcv.beta) for the beta estimates.
The process variance is returned as sigma.
Value
A list with the following elements
| sigmasq | process variance estimate and confidence interval; estimate may be <0 | 
| sigma | sqrt of process variance; set to o if sigmasq<0 | 
| beta | dataframe with estimates and standard errors of betas for design | 
| betarand | dataframe of shrinkage estimates | 
| vcv.beta | variance-covariance matrix for beta | 
| GTrace | trace of matrix G | 
Author(s)
Jeff Laake; Ben Augustine
References
BURNHAM, K. P. and G. C. WHITE. 2002. Evaluation of some random effects methodology applicable to bird ringing data. Journal of Applied Statistics 29: 245-264.
Examples
# This example is excluded from testing to reduce package check time
data(dipper)
md=mark(dipper,model.parameters=list(Phi=list(formula=~time)),delete=TRUE)
md$results$AICc
zz=get.real(md,"Phi",vcv=TRUE)
z=zz$estimates$estimate[1:6]
vcv=zz$vcv.real
varc=var.components(z,design=matrix(rep(1,length(z)),ncol=1),vcv) 
df=md$design.data$Phi
shrinkest=data.frame(time=1:6,value=varc$betarand$estimate)
df=merge(df,shrinkest,by="time")
md=mark(dipper,model.parameters=list(Phi=list(formula=~time,
  fixed=list(index=df$par.index,value=df$value))),adjust=FALSE,delete=TRUE)
npar=md$results$npar+varc$GTrace
md$results$lnl+2*(npar + (npar*(npar+1))/(md$results$n-npar-1))
Variance components estimation using REML or maximum likelihood
Description
Computes estimated effects, standard errors and variance components for a set of estimates
Usage
var.components.reml(
  theta,
  design,
  vcv = NULL,
  rdesign = NULL,
  initial = NULL,
  interval = c(-25, 10),
  REML = TRUE
)
Arguments
| theta | vector of parameter estimates | 
| design | design matrix for fixed effects combining parameter estimates | 
| vcv | estimated variance-covariance matrix for parameters | 
| rdesign | design matrix for random effect (do not use intercept form; eg use ~-1+year instead of ~year); if NULL fits only iid error | 
| initial | initial values for variance components | 
| interval | interval bounds for log(sigma) to help optimization from going awry | 
| REML | if TRUE uses reml else maximum likelihood | 
Details
The function var.components uses method of moments to estimate
a single process variance but cannot fit a more complex example.  It can
only estimate an iid process variance.  However, if you have a more
complicated structure in which you have random year effects and want to
estimate a fixed age effect then var.components will not work
because it will assume an iid error rather than allowing a common error for
each year as well as an iid error.  This function uses restricted maximum
likelihood (reml) or maximum likelihood to fit a fixed effects model with an
optional random effects structure.  The example below provides an
illustration as to how this can be useful.
Value
A list with the following elements
| neglnl | negative log-likelihood for fitted model | 
| AICc | small sample corrected AIC for model selection | 
| sigma | variance component estimates; if rdesign=NULL, only an iid error; otherwise, iid error and random effect error | 
| beta | dataframe with estimates and standard errors of betas for design | 
| vcv.beta | variance-covariance matrix for beta | 
Author(s)
Jeff Laake
Examples
# This example is excluded from testing to reduce package check time
# Use dipper data with an age (0,1+)/time model for Phi
data(dipper)
dipper.proc=process.data(dipper,model="CJS")
dipper.ddl=make.design.data(dipper.proc,
   parameters=list(Phi=list(age.bins=c(0,.5,6))))
levels(dipper.ddl$Phi$age)=c("age0","age1+")
md=mark(dipper,model.parameters=list(Phi=list(formula=~time+age)),delete=TRUE)
# extract the estimates of Phi 
zz=get.real(md,"Phi",vcv=TRUE)
# assign age to use same intervals as these are not copied 
# across into the dataframe from get.real
zz$estimates$age=cut(zz$estimates$Age,c(0,.5,6),include=TRUE)
levels(zz$estimates$age)=c("age0","age1+")
z=zz$estimates
# Fit age fixed effects with random year component and an iid error
var.components.reml(z$estimate,design=model.matrix(~-1+age,z),
        zz$vcv,rdesign=model.matrix(~-1+time,z)) 
# Fitted model assuming no covariance structure to compare to 
# results with lme
xx=var.components.reml(z$estimate,design=model.matrix(~-1+age,z),
 matrix(0,nrow=nrow(zz$vcv),ncol=ncol(zz$vcv)),
 rdesign=model.matrix(~-1+time,z)) 
xx
sqrt(xx$sigmasq)
library(nlme)
nlme::lme(estimate~-1+age,data=z,random=~1|time)
Occupancy data for Mahoenui Giant Weta
Description
An occupancy data set for modelling presence/absence data for salamanders.
Format
A data frame with 72 observations (sites) on the following 7 variables.
- ch
- a character vector containing the presence (1) and absence (0), or (.) not visited for each of 5 visits to the site 
- Browse
- 0/1 dummy variable to indicate browsing 
- Obs1
- observer number for visit 1; . used when site not visited 
- Obs2
- observer number for visit 2; . used when site not visited 
- Obs3
- observer number for visit 3; . used when site not visited 
- Obs4
- observer number for visit 4; . used when site not visited 
- Obs5
- observer number for visit 5; . used when site not visited 
Details
This is a data set that accompanies program PRESENCE and is explained on pages 116-122 of MacKenzie et al. (2006).
References
MacKenzie, D.I., Nichols, J. D., Royle, J.A., Pollock, K.H., Bailey, L.L., and Hines, J.E. 2006. Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurence. Elsevier, Inc. 324p.
Examples
#  The data can be imported with the following command using the
#  tab-delimited weta.txt file in the data subdirectory.
#   weta=import.chdata("weta.txt",field.types=c(rep("f",6)))
#  Below is the first few lines of the data file that was constructed
#  from the .xls file that accompanies PRESENCE.
#ch	Browse	Obs1	Obs2	Obs3	Obs4	Obs5
#0000.	1	1	3	2	3	.
#0000.	1	1	3	2	3	.
#0001.	1	1	3	2	3	.
#0000.	0	1	3	2	3	.
#0000.	1	1	3	2	3	.
#0000.	0	1	3	2	3	.
#
# This example is excluded from testing to reduce package check time
# retrieve weta data
data(weta)
# Create function to fit the 18 models in the book
fit.weta.models=function()
{
#  use make.time.factor to create time-varying dummy variables Obs1 and Obs2
#  observer 3 is used as the intercept
   weta=make.time.factor(weta,"Obs",1:5,intercept=3)
#  Process data and use Browse covariate to group sites; it could have also
#  been used an individual covariate because it is a 0/1 variable.
   weta.process=process.data(weta,model="Occupancy",groups="Browse")
   weta.ddl=make.design.data(weta.process)
#  time factor variable copied to Day to match names used in book
   weta.ddl$p$Day=weta.ddl$p$time
# Define p models
   p.dot=list(formula=~1)
   p.day=list(formula=~Day)
   p.obs=list(formula=~Obs1+Obs2)
   p.browse=list(formula=~Browse)
   p.day.obs=list(formula=~Day+Obs1+Obs2)
   p.day.browse=list(formula=~Day+Browse)
   p.obs.browse=list(formula=~Obs1+Obs2+Browse)
   p.day.obs.browse=list(formula=~Day+Obs1+Obs2+Browse)
# Define Psi models
   Psi.dot=list(formula=~1)
   Psi.browse=list(formula=~Browse)
# Create model list
   cml=create.model.list("Occupancy")
# Run and return marklist of models
   return(mark.wrapper(cml,data=weta.process,ddl=weta.ddl,delete=TRUE))
}
weta.models=fit.weta.models()
# Modify the model table to show -2lnl and use AIC rather than AICc
weta.models$model.table=model.table(weta.models,use.AIC=TRUE,use.lnl=TRUE)
# Show new model table which duplicates the results except they have
# some type of error with the model Psi(.)P(Obs+Browse) which should have
# 5 parameters rather than 4 and the -2lnl also doesn't agree with the results here
weta.models
#
# display beta vcv matrix of the Psi parameters (intercept + browse=1)
# matches what is shown on pg 122 of Occupancy book
weta.models[[7]]$result$beta.vcv[8:9,8:9]
# compute variance-covariance matrix of Psi0(6; unbrowsed) ,Psi1(7; browsed)
vcv.psi=get.real(weta.models[[7]],"Psi",vcv=TRUE)$vcv.real
vcv.psi
# Compute proportion unbrowsed and browsed
prop.browse=c(37,35)/72
prop.browse
# compute std error of overall estimate as shown on pg 121-122
sqrt(sum(prop.browse^2*diag(vcv.psi)))
# compute std error and correctly include covariance between Psi0 and Psi1
sqrt( t(prop.browse) %*% vcv.psi %*% prop.browse )
# show missing part of variance 2 times cross-product of prop.browse * covariance
2*prod(prop.browse)*vcv.psi[1,2]
sqrt(sum(prop.browse^2*diag(vcv.psi))+2*prod(prop.browse)*vcv.psi[1,2])
White-winged dove Jolly-Seber POPAN Analysis Details
Description
This dataset represents 2 years of capture-mark-recapture data collected on uniquely identifiable leg-banded (size 4) white-winged dove captured in Alice, Texas, USA (Latitude 27.25, Longitude -98.07) between mid-February and mid-September during 2009 and 2010. The package was developed such that others could recreate the analysis developed by Collier, B. A., S. R. Kremer, C. D. Mason, J. Stone, K. W. Calhoun, and M. J. Peterson. 2012. Immigration and recruitment in an urban white-winged dove breeding colony. Journal of Fish and Wildlife Management, In review., and see how the data and results were used to estimate population level recruitment (number juveniles in population over number adults in population).
Format
The format is 2 data frames, one for 2009 and one for 2010. 2009 has 5101 unique captures, 2010 has 3502 unique captures.
- Prefix
- Unique band prefix identifier (usually 0914 or 0984) 
- Suffix
- Unique band suffix (numeric value) 
- E0-E13
- 0/1 representing whether a dove was captured (1) or not captured (0) during that (E) sampling occasion 
- Age
- Age class with AHY=after-hatch year and HY=hatch year 
Details
White-winged doves were captured (aged: AHY=after-hatch year, HY=hatch year) continuously in baited walk-in dove traps beginning in Februrary and ending in September in each year (2009 and 2010). During 2009 5,101 white-winged doves were captured (2,894 AHY, 2,207 HY) while in 2010 3,502 white-winged doves were captured (3,106 AHY, 486 HY). We used approximately 2 week date windows to categorized our encounter histories for analysis in MARK http://www.phidot.org/software/mark/ via RMark using these dates: 27 Feb; 13 March; 27 March; 10 April; 24 April; 8 May; 22 May; 5 June; 19 June; 3 July; 17 July; 31 July; 14 August-End; giving us 13 encounter occasions.
I wanted to force b0=0 for the first time frame, as none could be there when we started as they had not arrived yet in any real number. If you take the first column out, the numbers get ridiculously screwy for the super population size and the entry parameters, because the initial population has individuals in it, thus the JSPOPAN estimates something like 40 the population pre-trapping, which is biologically impossible.
Initially, because of the parameter structure in a JS-POPAN model and the fact that the initial entry probability is 1 minus the sum of the resultant entry probabilities 
for subsequent sampling occasions, and because occasionally a couple of doves were captured during the initial time frame, we were getting entry values for the initial time period
representing >40
To obtain reasonable results a 'fake' encounter occasion (time -1) with no captures ("0") was appended to beginning of each capture history to force b0=0.
As such, the data for wwdo.09 and wwdo.10 will have 14, not 13 encounter histories.
Important to note, in case you don't read Collier et al. (2012), is the fact that the 2010 dataset is kind of screwy relative to the estimation of the entry parameters for HY wwdo.  Basically,
what happened was we caught a bunch of AHY birds, but when we captured HY birds, we only caught a few (~400 in 2010) and of those we captured, we rarely, if ever had any recaptures.  Without getting
into a bunch of speculation on what happened, as we don't really know, we suspect it had something to do with the fact that 2009 was a extreme drought in Texas, as to where 2010 was extremely wet, 
so mast based food sources (mulberry tree's are everywhere) and such were readily available in the urban environment, as such, lower trapping success.  So, when you fit the 2010 dataset using the
same candidate model set at the 2009, you get pretty nonsensical answers for the entry b parameters.  As such, we did not use a group specific entry model for 2010.  If you want more 
detail, see Collier et al. (2012).
Within the model code below, you can see that we fixed the both the Phi and p parameters for those periods in the analysis when HY birds could not be
in the population (e.g., when all birds migrated into the breeding colony and no HY birds had been produced yet).  There is probably a little slack in the range, as 
it is possible that there were some HY white-winged doves in the population in the last period we fixed, but we did not catch any there so we opted to fix it as well. 
Note that the R function wwdo.popan is set up in RMark speak, and will run either the wwdo.09 or wwdo.10 datasets as long as you specify one
in the lines where I have listed data(wwdo.09) and wwdo=wwdo.09.  If you want to see the 2010 analysis, just change those lines to wwdo.10.
References
Collier, B. A., S. R. Kremer, C. D. Mason, J. Stone, K. W. Calhoun, and M. J. Peterson. 2012. Immigration and recruitment in an urban white-winged dove breeding colony. Journal of Fish and Wildlife Management, In review.
Examples
# This example is excluded from testing to reduce package check time
# Starting with R version 4, examples wrapped in donttest are still run. Therefore,
# I have commented out the model runs below because this example takes a lot of computing
# time. If you want to run, you can copy to an editor and remove the # to run this example
data(wwdo.09)
wwdo=wwdo.09
wwdo.popan=function(){
wwdo.proc=process.data(wwdo, model="POPAN", groups="Age")
wwdo.ddl=make.design.data(wwdo.proc)
#Fixing Phi Parameters for sampling periods where HY WWDO were not available in population
	hy.phi1=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" & 
                      wwdo.ddl$Phi$time==1,]))
	hy.phi2=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" &
                      wwdo.ddl$Phi$time==2,]))
	hy.phi3=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" & 
                      wwdo.ddl$Phi$time==3,]))
	hy.phi4=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" & 
                      wwdo.ddl$Phi$time==4,]))
	hy.phi5=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" & 
                      wwdo.ddl$Phi$time==5,]))
	hy.phi6=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" & 
                      wwdo.ddl$Phi$time==6,]))
	hy.phi7=as.numeric(row.names(wwdo.ddl$Phi[wwdo.ddl$Phi$group=="HY" & 
                      wwdo.ddl$Phi$time==7,]))
	hy.phi.fix=c(hy.phi1, hy.phi2, hy.phi3, hy.phi4, hy.phi5, hy.phi6, hy.phi7)
	
#Fixing PENT Parameters for sampling period where HY WWDO were not available in population
	hy.pent2=as.numeric(row.names(wwdo.ddl$pent[wwdo.ddl$pent$group=="HY" & 
                       wwdo.ddl$pent$time==2,]))
	hy.pent3=as.numeric(row.names(wwdo.ddl$pent[wwdo.ddl$pent$group=="HY" & 
                       wwdo.ddl$pent$time==3,]))
	hy.pent4=as.numeric(row.names(wwdo.ddl$pent[wwdo.ddl$pent$group=="HY" & 
                       wwdo.ddl$pent$time==4,]))
	hy.pent5=as.numeric(row.names(wwdo.ddl$pent[wwdo.ddl$pent$group=="HY" & 
                       wwdo.ddl$pent$time==5,]))
	hy.pent6=as.numeric(row.names(wwdo.ddl$pent[wwdo.ddl$pent$group=="HY" & 
                       wwdo.ddl$pent$time==6,]))
	hy.pent7=as.numeric(row.names(wwdo.ddl$pent[wwdo.ddl$pent$group=="HY" & 
                       wwdo.ddl$pent$time==7,]))
	hy.pent.fix=c(hy.pent2, hy.pent3, hy.pent4, hy.pent5, hy.pent6, hy.pent7)
	
#####
#Real Parameter Definitions
#####
#Detection process
	p.dot=list(formula=~1)
	p.time=list(formula=~time)
	p.group=list(formula=~group)
	p.g.time=list(formula=~group:time)
	
#Survival process
	Phi.dot.fix=list(formula=~1, fixed=list(index=hy.phi.fix, value=c(0,0,0,0,0,0,0)))
	Phi.time.fix=list(formula=~time, fixed=list(index=hy.phi.fix, value=c(0,0,0,0,0,0,0)))
	Phi.age.fix=list(formula=~group, fixed=list(index=hy.phi.fix, value=c(0,0,0,0,0,0,0)))
	Phi.timeage.fix=list(formula=~time:group, 
     fixed=list(index=hy.phi.fix, value=c(0,0,0,0,0,0,0)))
	
#Entry Process-always time dependent, otherwise makes no sense in my situation
	pent.time.fix=list(formula=~time, fixed=list(index=hy.pent.fix, value=c(0,0,0,0,0,0)))
	
	#Model.1=mark(wwdo.proc, wwdo.ddl, 
 #  model.parameters=list(Phi=Phi.dot.fix, p=p.dot, pent=pent.time.fix, N=list(formula=~group)),
 #  invisible=FALSE,threads=1,options="SIMANNEAL",delete=TRUE)
	#Model.2=mark(wwdo.proc, wwdo.ddl, 
 # model.parameters=list(Phi=Phi.time.fix, p=p.dot, pent=pent.time.fix, N=list(formula=~group)),
 #  invisible=FALSE,threads=1,options="SIMANNEAL",delete=TRUE)
	#Model.3=mark(wwdo.proc, wwdo.ddl, 
 # model.parameters=list(Phi=Phi.age.fix, p=p.dot, pent=pent.time.fix, N=list(formula=~group)), 
 #   invisible=FALSE,threads=1,options="SIMANNEAL",delete=TRUE)
	#Model.4=mark(wwdo.proc, wwdo.ddl, 
 # model.parameters=list(Phi=Phi.timeage.fix, p=p.dot, pent=pent.time.fix, N=list(formula=~group)), 
 #   invisible=FALSE,threads=1,options="SIMANNEAL",delete=TRUE)
	#Model.5=mark(wwdo.proc, wwdo.ddl,  
 # model.parameters=list(Phi=Phi.timeage.fix, p=p.time, pent=pent.time.fix, N=list(formula=~group)), 
 #  invisible=FALSE,threads=1,options="SIMANNEAL",delete=TRUE)
	#Model.6=mark(wwdo.proc, wwdo.ddl, 
 #  model.parameters=list(Phi=Phi.timeage.fix,p=p.g.time, pent=pent.time.fix,
 #             N=list(formula=~group)), 
 #  invisible=FALSE,threads=1,options="SIMANNEAL",delete=TRUE)
	#collect.models()
}
wwdo.out=wwdo.popan()
wwdo.out