1 Availability of code

The code for this vignette is available in the GitHub repository noted in the Description file of this package.

2 Introduction

This document illustrates how to use the MoosePopR() and SightabilityPopR() functions to replace the MoosePop (Reed, 1989) and AerialSurvey (Unsworth, 1999) programs. The results from a 2018 survey of moose conducted according to methods described in RISC (2002) will be used for illustrative purposes.

2.1 Survey protocol

The survey area is divided in blocks (which are the primary sampling units, PSUs). Blocks could be defined using a grid-system, or based on ecological factors such as elevation, type of vegetation, etc. The blocks do not have the same area, nor do they have to rectangular. If blocks are of different size and the area of the block is not used in the analysis, no biases are introduced into the estimation procedure, but the survey design could be inefficient, i.e., the uncertainty in the estimates could be larger compared to the standard error obtained from methods that account for different block area (see later in this document).

Blocks are then stratified using a habitat model into categories of relative moose densities. In this case, the blocks are stratified into three strata corresponding to low, medium, and high density depending on past years results, vegetation cover, and other attributes. There is no limit on the number of strata that can be defined, but usually three to five blocks would be sufficient. Blocks within strata should have similar densities and strata should “as different” as possible. If the stratification is incorrect (e.g., a block is classified into the high stratum when in fact is should be in the low stratum), no biases are introduced into the estimates

A simple random sample of blocks is selected from each stratum, i.e., each block is selected with equal probability within each stratum. The number of block to select from each stratum is determined using different allocation methods. The analysis of the completed survey does not depend on how the sample size was determined in each stratum. Some selected blocks may not be surveyed due to bad weather etc. These missed blocks are assumed to be missing completely at random (MCAR) within each stratum, i.e., the missingness is unrelated to the response or any other covariate. If a missed block is not replaced by another block chosen at random from the stratum, the sample size in this stratum will be reduced. The impact of a reduction in sample size in each stratum (with increased uncertainty in the estimates for each stratum compared to a survey with no missing blocks), but no biases are introduced.

When a block is surveyed, moose are observed in groups. When a group is identified, a count of the number of moose is made separated by age (adult, juvenile, calves) and sex (bull or cow).

Not all groups of moose can be observed due to sightability issues. A previous sightability study (e.g., Quayle et al. 2001; RISC, 2002) is used to estimate the probability of detecting a group of moose based on a categorical vegetation cover variable with five classes. NOTE that sightability operates at the group level and not at the moose level. This means that if a group is sighted, then all members of the group are enumerated. But entire groups could be “invisible” due to vegetation cover.

This is a stratified random sampling design where the sampling unit is the block surveyed in each stratum and the observational unit is a group of moose. The observations on groups are pseudo-replicates and must be treated as observations within a cluster (being the block).

This document will illustrate how to estimate the total number and density of moose (in various classes) and the ratio of types of moose (e.g., bulls to cow ratio) without and with correcting for sightability. The estimates are compared to those generated by the MoosePop program (Reed, 1989; no correction for sightability) and the AerialSurvey program (Unsworth, 1999; correction for sightability).

3 Data input

Data input consists of several datasets placed into an Excel workbook using different worksheets for the different information.

3.1 Moose data

This consists of information on group of moose observed in a selected blocks. The data should have a header row with the variable names for each column.

# Get the actual survey data
dir(system.file("extdata", package = "SightabilityModel"))

## [1] "ExampleDomainStratification.xlsx" "SampleMooseSurveyData.xlsx"

survey.data <- readxl::read_excel(system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
                                  sheet="BlockData",
                                  skip=1,.name_repair="universal")
# Skip =xx says to skip xx lines at the top of the worksheet. In this case skip=1 implies start reading in line 2 of
# the worksheet.
# .name_repair="universal" changes all variable names to be compatible with R, e.g., no spaces, no special characters, etc

Variable names for the counts are somewhat arbitrary. Notice how R converts the column names from the Excel workbook to to proper R variable names (.name_repair=“universal” controls this conversion. These (modified) variable names will be used to identify the quantities to be estimated so simpler names are easier to use than more complicated names.

# Check the variable names in the input data
names(survey.data)

##  [1] "Domain"         "Block.ID"       "Stratum"        "Bulls"          "Lone.Cows"      "Cow.W.1...calf" "Cow.W.2.calves" "Lone...calf"    "Unk.Age.Sex"    "NMoose"         "..Veg.Cover"

Missing values in the count fields are assume to be zeros and a zero value must be substituted for the missing values.

# convert all NA in moose counts to 0
survey.data$Bulls          [ is.na(survey.data$Bulls)         ] <- 0
survey.data$Lone.Cows      [ is.na(survey.data$Lone.Cows)     ] <- 0
survey.data$Cow.W.1...calf [ is.na(survey.data$Cow.W.1...calf)] <- 0
survey.data$Cow.W.2.calves [ is.na(survey.data$Cow.W.2.calves)] <- 0
survey.data$Lone...calf    [ is.na(survey.data$Lone...calf)   ] <- 0
survey.data$Unk.Age.Sex    [ is.na(survey.data$Unk.Age.Sex)   ] <- 0
#head(survey.data)

We check that the NMoose is computed properly by comparing the recorded total number of moose with a computed total number of moose.

# check the total moose read in vs computed number
survey.data$myNMoose <- survey.data$Bulls +
                       survey.data$Lone.Cows+
                       survey.data$Cow.W.1...calf*2+
                       survey.data$Cow.W.2.calves*3+
                       survey.data$Lone...calf+
                       survey.data$Unk.Age.Sex

ggplot(data=survey.data, aes(x=myNMoose, y=NMoose))+
  ggtitle("Compare my count vs. count on spreadsheet")+
  geom_point(position=position_jitter(h=.1,w=.1))+
  geom_abline(intercept=0, slope=1)

We count the number of observations by sub-unit and stratum to ensure that Stratum has been coded properly. For example ensure that

stratum case is consistent (e.g., h vs. H),
stratum codes are consistent (e.g., H vs. HIGH).

xtabs(~Stratum, data=survey.data, exclude=NULL, na.action=na.pass)

## Stratum
##   H   L   M 
## 236  92 286

We can also add additional computed variables. For example, the number of cows is found as

survey.data$Cows <- survey.data$Lone.Cows + survey.data$Cow.W.1...calf + survey.data$Cow.W.2.calves

The first few records of the counts of interest are:

##   Block.ID Stratum Bulls Lone.Cows Cow.W.1...calf Cow.W.2.calves Cows Lone...calf Unk.Age.Sex NMoose
## 1        9       M     1         0              0              0    0           0           0      1
## 2        9       M     0         0              1              0    1           0           0      2
## 3        9       M     0         0              1              0    1           0           0      2
## 4        9       M     0         0              1              0    1           0           0      2
## 5        9       M     0         0              1              0    1           0           0      2
## 6        9       M     0         1              0              0    1           0           0      1

3.2 Block area

The area of each surveyed block is also needed.

# Get the area of each block
survey.block.area <- readxl::read_excel(
                        system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
                        sheet="BlockArea",
                        skip=1,.name_repair="universal")
head(survey.block.area)

## # A tibble: 6 × 2
##   Block.ID Block.Area
##      <dbl>      <dbl>
## 1        9       25.7
## 2       12       19  
## 3       13       24  
## 4       20       30  
## 5       27       26.6
## 6       31       30

The name of the variable identifying the block should match the name used in the raw data from the previous section.

names(survey.block.area)

## [1] "Block.ID"   "Block.Area"

names(survey.data)

##  [1] "Domain"         "Block.ID"       "Stratum"        "Bulls"          "Lone.Cows"      "Cow.W.1...calf" "Cow.W.2.calves" "Lone...calf"    "Unk.Age.Sex"    "NMoose"         "..Veg.Cover"   
## [12] "myNMoose"       "Cows"

Every block that is surveyed should have an area, but the block area file can have areas for blocks not surveyed (e.g., every block in the population of blocks). If a area is available for a block that was not surveyed, it is ignored.

# Check that every surveyed block has an area. The setdiff() should return null.
setdiff(survey.data$Block.ID, survey.block.area$Block.ID)

## numeric(0)

# It is ok if the block area file has areas for blocks not surveyed
setdiff(survey.block.area$Block.ID, survey.data$Block.ID)

## numeric(0)

3.3 Stratum information

Information about each stratum such as the total area and total number of blocks is required

stratum.data <- readxl::read_excel(
      system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
      sheet="Stratum",
      skip=2,.name_repair="universal")
stratum.data

## # A tibble: 3 × 3
##   Stratum Stratum.Area Stratum.Blocks
##   <chr>          <dbl>          <dbl>
## 1 L               3795            127
## 2 M               7006            242
## 3 H               2071             73

Ensure that the stratum labels match those used in the raw data

3.4 Sightability model.

A logistic regression comparing observed groups of moose to known groups of moose (e.g., based on radio collared moose) was used to estimate the sightability model (e.g., Quayle, 2001).

3.4.1 Extracting from AerialSurvey

The sightability model used was extracted from the AerialSurvey (Unsworth, 1999) program input files and entered on the worksheet in the Excel workbook. The AerialSurvey program uses two files to describe the survey (and computations) and the model for the sightability.

The Moose.mdf is a text-file that describes the model used for sightability corrections. A portion of the file is shown below

;==============================================================================
; MOOSE.MDF - Model description file (MDF) for Aerial Survey - 7-Oct-94
; -comments start with a semicolon (any text after it will be ignored)
; -blank value fields (right hand side) will default to nil, zero, or no!
;==============================================================================
[Model]
Title    = "Moose, Bell JetRanger"
Subtitle = '<Uses only the % veg cover>' ; optional
Species  = Moose
Aircraft = "Bell JetRanger"
Locale   = xxxxxx
Version  = 1.00
Terms    = 2           ; number of terms in the model, max = 10
Date     = 04.11.00

This appears to be the model for Bell JetRanger aircraft for moose. The model for sightability appears to be function only of vegetation cover with two term (intercept and vegetation cover class).

Vegetation cover appears to be divided into 5 classes:

[Transformations]
;Type of transformations can be:
;  Class, LookupT1, LookupT2, LookupT3, LookupT4, Table1, Table2, Power, None
Number = 1  ; number of transformations, max = 5
; for each transformation, give the name/index of the variable,
; and the type of transformation required on the variable
; TransformX = variable, type
Transform1 = VegCover, Class

[Class]  ; Class - e.g., vegetation cover classes into intervals
; for each interval give lower and upper limits and the assigned value
Intervals = 6  ; number of class intervals, max = 9
; ClassX = >lower, <=upper, value - ascending, sequential order
Class1 = -0.1,  20.0, 1.0
Class2 = 20.0,  40.0, 2.0
Class3 = 40.0,  60.0, 3.0
Class4 = 60.0,  80.0, 4.0
Class5 = 80.0,  100.0, 5.0

For example, if vegetation cover is from 0 to 20%, this is class 1; from 20%+ to 40% is class 2; etc.

The file also contains the estimated model coefficients and estimated covariance matrix for the sightability model..

; Map input variables into X's
[X] ; label/index from a list of variables
X01 = 1.0       ; constant
X02 = VegCover  ; class transformation

;------------------------------------------------------------------------------
; Model coefficients
;------------------------------------------------------------------------------
[Beta]   ; beta vector
Beta01 =  4.9604    ; constant
Beta02 = -1.8437    ; vegetation cover

[Sigma]  ; sigma matrix
Sigma01 =  0.7822
Sigma02 = -0.2820,  0.1115

The estimated probability of sighting a moose is \[logit(p)=4.9604 -1.8437(vegetation~class)\] Notice that vegetation class is treated as continuous variable and used directly with values of 1 to 5 rather than as 5 indicator variables. This implies that the $logit(p)$ increases linearly at the same rate as the vegetation cover class goes from 1 to 2 or from 2 to 3, etc. In some cases, sightability may depend on a categorical variable (e.g., north or south aspect). Then a series of indicator variables will have to be created, e.g AspectNorth takes the value of 1 if the aspect is north and the value of 0 if the aspect is south. If the categorical variable has $K$ classes, you will need $K-1$ indicator variables. Contact me for help on this issue.

3.4.2 Reading coefficients of the model

The information from the AerialSurvey (Unsworth, 1999) program was transferred to the Excel workbook containing the survey information for this study.

The estimated coefficients of the model and the covariance matrix (sigma matrix) of the estimated coefficients are input

# number of beta coefficients
nbeta <- readxl::read_excel(
   system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
   sheet="SightabilityModel",
   range="B2", col_names=FALSE)[1,1,drop=TRUE]
# Here Range="B2" refers to a SINGLE cell (B2) on the spreadsheet

# extract the names of the terms of the model
beta.terms <- unlist(readxl::read_excel(
     system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
     sheet="SightabilityModel",
     range=paste0("B3:",letters[1+nbeta],"3"), 
     col_names=FALSE, col_type="text")[1,,drop=TRUE], use.names=FALSE)
# Here the range is B3: C3 in the case of nbeta=2 etc
cat("Names of the variables used in the sightability model:", beta.terms, "\n")

## Names of the variables used in the sightability model: Intercept VegCoverClass

# extract the beta coefficients
beta <- unlist(readxl::read_excel(
    system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
    sheet="SightabilityModel",
    range=paste0("B4:",letters[1+nbeta],"4"), 
    col_names=FALSE, col_type="numeric")[1,,drop=TRUE], use.names=FALSE)
# Here the range is B4: C4 in the case of nbeta=2 etc
cat("Beta coefficients used in the sightability model:", beta, "\n")

## Beta coefficients used in the sightability model: 4.9604 -1.8437

# extract the beta covariance matrix
beta.cov <- matrix(unlist(readxl::read_excel(
    system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
    sheet="SightabilityModel",
    range=paste0("B5:",letters[1+nbeta],4+nbeta), 
    col_names=FALSE, col_type="numeric"), use.names=FALSE), ncol=nbeta, nrow=nbeta)
cat("Beta covariance matrix used in the sightability model:\n")

## Beta covariance matrix used in the sightability model:

beta.cov

##         [,1]    [,2]
## [1,]  0.7822 -0.2820
## [2,] -0.2820  0.1115

These values match those in the AerialSurvey (Unsworth, 1999) package.

The estimated sightability and expansion factors at the group level (inverse of sightability) are:

Estimated sightability of groups and expansion by Vegetation Cover Class
Vegetation Cover Class	logit(p)	SE logit(p)	p	SE(p)	Expansion Factor	SE(EF)
1	3.117	0.574	0.958	0.023	1.044	0.025
2	1.273	0.317	0.781	0.054	1.280	0.089
3	-0.571	0.306	0.361	0.071	2.770	0.542
4	-2.414	0.557	0.082	0.042	12.183	6.228
5	-4.258	0.866	0.014	0.012	71.676	61.195

So a single moose sighted in Vegetation Cover Class 5 is expanded to almost 72 moose (!).

3.4.3 Creation of VegCoverClass variable

We need to compute a VegCoverClass variable in the survey data. The recode() function from the car package is used.

# convert the Veg Cover to Veg.Cover.Class

xtabs(~..Veg.Cover, data=survey.data, exclude=NULL, na.action=na.pass)

## ..Veg.Cover
##    0    1    5   10   15   20   25   30   35   40   50   55 <NA> 
##   39    1  131  152   82   98   21   38   12   26    5    1    8

survey.data$VegCoverClass <- car::recode(survey.data$..Veg.Cover,
                                " lo:20=1; 20:40=2; 40:60=3; 60:80=4; 80:100=5")
xtabs(~VegCoverClass+..Veg.Cover, data=survey.data, exclude=NULL, na.action=na.pass)

##              ..Veg.Cover
## VegCoverClass   0   1   5  10  15  20  25  30  35  40  50  55 <NA>
##          1     39   1 131 152  82  98   0   0   0   0   0   0    0
##          2      0   0   0   0   0   0  21  38  12  26   0   0    0
##          3      0   0   0   0   0   0   0   0   0   0   5   1    0
##          <NA>   0   0   0   0   0   0   0   0   0   0   0   0    8

Notice that there are 8 groups of moose where the Vegetation Cover was not recorded. This has implications for estimation as noted later. Generally speaking, you should not simply delete these observations because this will introduce substantial bias in the estimates. Rather, impute a vegetation cover class that gives the highest sightability for these groups to give a lower bound on the number of moose in this block. This will reduce (but not eliminate) the bias caused by simply dropping these observations.

3.5 Key variables.

There are several key variables used to match records across the raw data, block area, and stratum data frames. The default variable names are:

Block.ID representing the block identification number. The Block.ID should not be duplicated in different strata. It can be numeric or characters.
Block.Area representing the area of each surveyed block. This should be numeric.
Stratum representing the stratum identification. It can be character or numeric does not have to be a single character.
Stratum.Blocks representing the total number of blocks in each stratum.
Stratum.Area representing the total area of each stratum in the same units as the Block.Area.

The analysis function has arguments that can be changed if the names of these key variables differ from above.

3.6 Dealing with missing and zero values

3.6.1 Group information

For each group sighted, the number of animals is recorded and may be divided by age and/or sex groupings. If no animals of a particular sex/age are seen in a group, then it should be recorded as zero rather than as missing.

In some cases, sex/age information is not available for an animal. It should be counted in a separate category for unknown sexed/aged animals. Of course the, a naive application of estimating the number of bulls may be biased downwards is some bulls are recorded in the unknown sexed/aged category; however, estimates of the total number moose will be fine.

No missing values are allowed in any count.

3.6.2 Covariate information

The sightability models use covariates at the group level to adjust for detection less than 100%. No missing values are allowed for the covariates. If the covariates are not available, then you should not simply delete the group, but rather impute a value of the covariate that gives the highest sightability to form a lower bound on the number of moose in the block because this particular block is not fully corrected for sightability.

There are two sources of bias that could occur when you drop a group with missing sightability covariate values:

If you delete the group and the block is now empty (and so vanishes), then your sample size (number of blocks surveyed) and area surveyed are too small and your total observed animals is too small. The bias could be positive or negative depending if the deleted groups are large or small.
If the number of blocks does not change (i.e., deleting this group does make the block empty), then deleting the group makes the total count for this block too small. If you keep the block with sightability set to 1, you partially correct against dropping the block, because rather than having a block correction factor of 2 (e.g., some lack of sightability due to high vegetation cover), you are using a sightability correction factor of 1.

Trying to predict the combined effect of these sources of bias is not easy. The effect of the second item is to partially correct the abundance/density estimate compared to dropping the block.

See the example in the Sightability section below on dealing with missing covariates.

3.6.3 Blocks with no observed groups.

It is possible that a block will be surveyed and NO animals seen in the entire block. In this case, you should insert a “dummy” observation for this block with counts of 0 for the number of moose to ensure that this surveyed block (with no animals seen) is properly accounted for. You can set the sightability variables to any value, because $0 \times \textrm{sightability correction}$ is still 0.

3.6.4 Block information

Each block must have a known area. Missing values are not allowed.

3.6.5 A single block in a stratum

If a stratum has only a single block surveyed, then no design based estimate of variance is possible. This is known as the lonely primarily sampling unit (lonely PSU). Ad hoc adjustments are possible; refer to http://r-survey.r-forge.r-project.org/survey/exmample-lonely.html for some suggestions.

The MoosePopR() function has an argument that can be set using

MoosePopR(...,  survey.lonely.psu="remove")

The the variance for this stratum is not considered when estimating the variance of the grand totals or density. Unfortunately, it is not possible, at this time, to estimate a ratio with some strata having lonely PSU.

The SightabilityPopR() function can deal with lonely PSUs by using within block variability (e.g., among groups) so this will provide a lower bound to the true uncertainty.

The bottom line is avoid designs where you sample only a single block in a stratum!

3.6.6 Stratum information

Each stratum must have the known total area for the MoosePop() function and the known total number of blocks for the SightabilityPopR() function.

4 Analysis

4.1 Analysis not adjusting for sightability

4.1.1 Within a stratum

4.1.1.1 Density

Suppose a sample of $n_h$ blocks were surveyed in stratum $h$. For each block, the number of moose $m_{hi}$ and the area of the block $a_{hi}$ are obtained for block $i$ in stratum $h$. The estimated density of moose per unit area in stratum $h$ is obtained as: \[\widehat{D}_h = \frac{\sum_i m_{hi}}{\sum_i a_{hi}}\] i.e., as the total number of moose in the surveyed blocks divided by the total survey area. This is a standard ratio estimator and the estimated standard error is found as using equation 6.9 of Cochran (1977). The estimator and its standard error can be automatically computed using the svyratio() function in the survey package (Lumley, 2019) of R.

Similar formula are used for estimating the density of other sex/age categories.

4.1.1.2 Abundance

The estimated total number of moose in stratum $h$ ($\widehat{M}_h$) is found by multiplying the density estimate and its standard error by the stratum area $A_h$

\[\widehat{M}_h = \widehat{D}_h \times A_h\] \[se(\widehat{M}_h) = se(\widehat{D}_h) \times A_h\] Similar formula are used for estimating the abundance of other sex/age categories.

4.1.1.3 Bull to Cow and other ratios

Suppose a sample of $n_h$ blocks were surveyed in stratum $h$. For each block, the number of bulls $b_{hi}$ and the number of cows $c_{hi}$ are obtained for block $i$ in stratum $h$. The estimated bull to cow ratio in stratum $h$ is obtained as: \[\widehat{BC}_h = \frac{\sum_i b_{hi}}{\sum_i c_{hi}}\] i.e., as the total number of bulls ($\sum_i b_{hi}$) in the surveyed blocks divided by the total number of cows ($\sum_i c_{hi}$) in the surveyed blocks. This is a standard ratio estimator and the estimated standard error is found as using equation 6.9 of Cochran (1977). The estimator and its standard error are automatically computed using the svyratio() function in the survey package (Lumley, 2019) of R.

If the number of bulls per 100 cows is wanted, simply multiply the estimate and its standard error by $100\times$.

Other ratios such as calves:cows are computed in similar ways.

4.1.2 Across all strata

4.1.2.1 Density

The overall density for the number of moose is found as a weighted averaged of the stratum specific densities: \[\widehat{D}= \sum_h \frac{A_h}{\sum_h A_h}\widehat{D}_h\]

\[se(\widehat{D}) = \sqrt{\sum_h \left( \frac{A_h}{\sum_h A_h} \right)^2se(\widehat{D}_h)^2}\]

which is equivalent to the estimated total abundance (see below) divided by the total stratum area. This is known as a separate-ratio estimator of the total and is possible because the expansion factor going from density to abundance (the stratum area) is known for each stratum.

Similar formula are used for estimating the density of other sex/age categories.

4.1.2.2 Abundance

The total abundance of moose is found by summing the estimated abundances across strata:

\[\widehat{M} = \sum_h \widehat{M}_h\] and the standard error of the overall abundance is found as: \[se(\widehat{M}) = \sqrt{\sum_h se(\widehat{M}_h)^2}\] This is known as a separate-ratio estimator of the total and is possible because the expansion factor going from density to abundance (the stratum area) is known for each stratum.

Similar formula are used for estimating the abundance of other sex/age categories.

4.1.2.3 Bull to Cow and other ratios

The estimation of the overall bull to cow ratio is computed using the double-ratio estimate (Cochran, 1977, Section 6.19) as \[\widehat{BC}_{\textit{double~ratio}}=\frac{\widehat{B}}{\widehat{C}}\] where $\widehat{B}$ and $\widehat{C}$ are the estimated total number of bull and cows individually computed. This is equivalent to the ratio of the overall density of bulls and cows. The above formula can be expanded to give

\[\widehat{BC}_{\textit{double~ratio}}=\frac{\sum_h \frac{\sum_i b_{hi}}{\sum_i a_{hi}} \times A_h} {\sum_h \frac{\sum_i c_{hi}}{\sum_i a_{hi}} \times A_h}\]

Its standard error is NOT directly available in the survey package but is readily computed by finding the estimates and the covariance of the estimates of number of bull and cows for each stratum and then using the delta method to find the overall ratio and standard error of the overall ratio.

4.1.3 The MoosePopR() function.

A MoosePopR() function was created in R to replicate the results from MoosePop (Reed, 1989). The key arguments to this function are:

survey.data - information on each group measured in the survey. This was read from the Excel workbook earlier.
survey.block.area - information about the area of each block surveyed. This was read from the Excel workbook earlier.
stratum.data - information about each stratum. This was read from the Excel workbook earlier.
density, abundance, numerator, denominator - variables for which the density, abundance, ratio are to be formed as right-sided formula (e.g., density=~Cows to estimate the density of Cows).
conf.level=0.90 indicates the confidence level for confidence intervals to compute with the default being a 90% confidence interval.

4.1.4 Sample Analyses

Results differ slightly from those reported on an Excel spreadsheet because of slight differences in areas of blocks between data provided to me and that used in MoosePop (Reed, 1989).

We are now ready to do the analysis. I’ve gathered all of the analysis results into a list object for extraction later.

The UC suffix that I have placed on the estimate name indicates it is not corrected for sightability.

I have only computed abundance and density estimates for all moose, but similar code can be used for other age/sex categories. Similarly, I have only considered the bull:cow ratio and other ratio can be computed in similar fashion.

4.1.4.1 Density of all moose

We compute the (uncorrected for sightability) density of all moose (regardless of sex or age) using the number of moose (NMoose) variable.

result <- NULL
result$"All.Density.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~NMoose                         # which density
)
result$"All.Density.UC"

##    Stratum   Var1       Var2 Var1.obs.total Var2.obs.total  estimate         SE conf.level       LCL       UCL
## 1        H NMoose Block.Area            409          448.9 0.9111161 0.12361519        0.9 0.7077872 1.1144450
## 2        L NMoose Block.Area            161          415.5 0.3874850 0.07772304        0.9 0.2596419 0.5153280
## 3        M NMoose Block.Area            499          981.4 0.5084573 0.07429716        0.9 0.3862493 0.6306653
## 4 .OVERALL   <NA>       <NA>           1069         1845.8 0.5375760 0.05055619        0.9 0.4544185 0.6207336

The estimate and SE column as the estimated (uncorrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall density is found as a weighted average (by stratum area) of the density in each stratum.

Similar function calls are used for estimating the density of other sex/age categories by changing the density argument.

4.1.4.2 Abundance of all moose

We compute the estimated (uncorrected for sightability) abundance of all moose (regardless of sex or age) by expanding the density by the Stratum.Area. The overall abundance is found as the sum of the expanded estimates.

result$"All.Abund.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # which density
)
result$"All.Abund.UC"

##    Stratum   Var1       Var2 Var1.obs.total Var2.obs.total estimate       SE conf.level       LCL      UCL Stratum.Area
## 1        H NMoose Block.Area            409          448.9 1886.921 256.0071        0.9 1465.8272 2308.016         2071
## 2        L NMoose Block.Area            161          415.5 1470.505 294.9589        0.9  985.3412 1955.670         3795
## 3        M NMoose Block.Area            499          981.4 3562.252 520.5259        0.9 2706.0629 4418.441         7006
## 4 .OVERALL   <NA>       <NA>           1069         1845.8 6919.679 650.7593        0.9 5849.2749 7990.082        12872

The estimate and SE column as the estimated (uncorrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall abundance is found as the sum of the estimates in each stratum.

Similar function calls are used for estimating the abundance of other sex/age categories by changing the density argument.

4.1.4.3 Number of bulls and number of cows

We estimate the (uncorrected for sightability) abundance of Bulls and Cows.

result$"Bulls.Abund.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~Bulls                      # which density
)
result$"Bulls.Abund.UC"

##    Stratum  Var1       Var2 Var1.obs.total Var2.obs.total  estimate        SE conf.level       LCL       UCL Stratum.Area
## 1        H Bulls Block.Area             65          448.9  299.8775  51.82288        0.9 214.63643  385.1185         2071
## 2        L Bulls Block.Area             27          415.5  246.6065 101.66226        0.9  79.38696  413.8260         3795
## 3        M Bulls Block.Area             70          981.4  499.7147  91.22475        0.9 349.66333  649.7661         7006
## 4 .OVERALL  <NA>       <NA>            162         1845.8 1046.1987 146.09168        0.9 805.89923 1286.4981        12872

result$"Cows.Abund.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~Cows                      # which density
)  
result$"Cows.Abund.UC"

##    Stratum Var1       Var2 Var1.obs.total Var2.obs.total  estimate       SE conf.level       LCL      UCL Stratum.Area
## 1        H Cows Block.Area            265          448.9 1222.5774 172.6861        0.9  938.5340 1506.621         2071
## 2        L Cows Block.Area             95          415.5  867.6895 174.7329        0.9  580.2796 1155.099         3795
## 3        M Cows Block.Area            325          981.4 2320.1039 364.3373        0.9 1720.8224 2919.385         7006
## 4 .OVERALL <NA>       <NA>            685         1845.8 4410.3709 439.4243        0.9 3687.5822 5133.160        12872

These values are computed similarly as the total abundance.

4.1.4.4 Bull to Cow and other ratios

Finally, we estimate the Bulls:Cows ratio:

result$"Bulls/Cow.UC" <- MoosePopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,numerator=~Bulls, denominator=~Cows    # which density
) 
result$"Bulls/Cow.UC"

##    Stratum  Var1 Var2 Var1.obs.total Var2.obs.total  estimate         SE conf.level       LCL       UCL
## 1        H Bulls Cows             65            265 0.2452830 0.02742502        0.9 0.2001729 0.2903932
## 2        L Bulls Cows             27             95 0.2842105 0.09651642        0.9 0.1254551 0.4429659
## 3        M Bulls Cows             70            325 0.2153846 0.02831107        0.9 0.1688170 0.2619522
## 4 .OVERALL  <NA> <NA>            162            685 0.2372133 0.02578894        0.9 0.1947943 0.2796323

Other ratios would be estimated in similar ways by changing the numerator and denominator arguments to the function call.

4.1.4.5 Final object

names(result)

## [1] "All.Density.UC" "All.Abund.UC"   "Bulls.Abund.UC" "Cows.Abund.UC"  "Bulls/Cow.UC"

4.2 Adjusting for sightability

The above computations are not adjusted for sightability, i.e., not all moose can be detected due to visibility concerns.

An adjustment for sightability can be done by also performing a study where a known number of groups of moose (e.g., radio collared) are surveyed and information about detection/non detection is recorded at the group level. A logistic regression (see previous sections) is used to estimate the probability of detection of a group given covariates (such as Vegetation Class).

Then the sightability model is used to expand the observed number of groups of moose by the a sightability correction factor (SCF). This is related to the inverse of the probability of detection, but includes a correction to account for the uncertainty in the estimates of the coefficients of the sightability model (see Fieberg, 2012, Equation 1 and later in this document).

For example, consider a model to estimate detectability given in Quayle et al. (2001):

Estimated sightability correction factor for each vegetation cover class
Veg Cover Class	Veg Cover %	Detection probability	Sightability Correction Factor
1	00-20	0.933	1.06
2	21-40	0.740	1.33
3	41-60	0.368	2.64
4	61-80	0.107	8.17
5	81-100	0.024	29.08

Notice that that, for example, that 1/= is only approximately equal to 1.3347203.

It is assumed that all animals in a group (e.g., bulls, cows, calves) are observed if a group is detected.

Let $\theta_{hjk}$ represent the expansion factor for group $k$ in block $j$ of stratum $h$. In the analyses that follow the block area is NOT used in the analysis. In its place, a simple expansion based on the number of blocks surveyed within a stratum are used. This will be less efficient (i.e., have a larger standard error) compared to an analysis that uses the individual block areas, but unless there is a high correlation between the number of animals and the block area, the loss of efficiency is small. If all blocks had the same area, the two approaches are identical.

We examined the correlation for the existing survey data and it is typically small. Cochran (1977) shows that unless the correlation is larger than a specified cutoff that depends on the coefficient of variation of the numerator and denominator variables, then there is no gain in using the block area, i.e., unless \[\rho > \frac{1}{2}\frac{SD(denominator~variable)/mean(denominator~variable)}{SD(numerator~variable)/mean(numerator~variable)} \] where $\rho$ is the correlation between the numerator and denominator variable (i.e., between abundance of the numerator variable and block area)$, then a ratio estimator using the area of blocks is not more efficient.

Estimated correlation between animal counts and block area
Stratum	Type of Animal	Corr	Cutoff
H	Bulls	0.42	0.12
H	Lone.cows	0.24	0.13
H	Cow.W.1…calf	0.13	0.12
H	Cow.W.2.calves	NA	NaN
H	Lone…calf	NA	NaN
H	Unk.Age.Sex	-0.37	0.04
H	Cows	0.25	0.15
H	NMoose	0.28	0.16
L	Bulls	0.02	0.08
L	Lone.cows	0.39	0.12
L	Cow.W.1…calf	0.55	0.15
L	Cow.W.2.calves	NA	NaN
L	Lone…calf	NA	NaN
L	Unk.Age.Sex	-0.29	0.07
L	Cows	0.47	0.14
L	NMoose	0.39	0.15
M	Bulls	-0.31	0.10
M	Lone.cows	-0.12	0.10
M	Cow.W.1…calf	-0.32	0.11
M	Cow.W.2.calves	-0.19	0.04
M	Lone…calf	-0.23	0.02
M	Unk.Age.Sex	0.17	0.05
M	Cows	-0.21	0.12
M	NMoose	-0.28	0.13
Note:
Cutoff is 0.5 ratio of sd/mean of area and abundance of each type of animal and unless the correlation exceeds the cutoff, there is no advantage in using the block area in the analysis

The correlations are not high. While about half of the response variables have correlations of abundance with block area that exceed the cutoff, the variation in the block areas is not that large, so we don’t think that also adjusting for block area is necessary. Of course, this also does not account for sightability issues.

A plot of the two variables for each types of animals shows no strong relationship between the number of observed animals and the block area:

Consequently, it not expected that ignoring block area in the analysis will have a material effect.

4.2.1 Within a stratum

4.2.1.1 Abundance

Suppose a sample of $n_h$ blocks were surveyed in stratum $h$ out of $N_h$ total number of blocks. For each block, the number of moose $m_{hjk}$ are obtained for group $k$ in block $j$ in stratum $h$. The estimated correction factor (for visibility of the group) is $\widehat{\theta}_{hjk}$ which is computed give the vector of covariates (x_{hjk}) as (see page 4 of Fieberg, 2012)

\[\widehat{\theta}_{hjk}= 1+\exp{(-x_{hjk}^t \widehat{\beta}-x_{hjk}^t \Sigma x_{hjk} / 2)}\]

Notice that the sightability correction factor is not simply $1/\textit{probability of detection}$ or $1+\exp{(-x_{hjk}^t \widehat{\beta}$ because it includes a correction factor to account for the non-linear transformation from the logit to the probability scale.

The estimated abundance of moose in stratum $h$ is obtained as: \[\widehat{M}_h = N_h \times \frac{\sum_{jk} m_{hjk}\widehat{\theta}_{hjk}}{n_h}\] i.e., as the mean number of (sightability corrected) moose per surveyed block times the total number of blocks. The standard error of this estimate is complicated because of the need to account for the estimated sightability but the theory is covered in Steinhorst and Samuel (1989), Fieberg (2012) and Wong (1996) and implemented in the AerialSurvey (Unsworth, 1999) and SightabilityModel packages of R. The latter package corrects the variance computations used by Steinhorst and Samuel (1989) and Fieberg (2012).

Similar computations are used for estimating the abundance of other sex/age categories.

4.2.1.2 Density

The estimated density of moose in stratum $h$ ($\widehat{D}_h$) is found by dividing the abundance estimate and its standard error by the stratum area $A_h$

\[\widehat{D}_h = \frac{\widehat{M{_h}}}{A_h} \] \[se(\widehat{M}_h) = \frac{se(\widehat{M}_h)}{A_h}\]

Similar computations are used for estimating the density of other sex/age categories.

4.2.1.3 Bull to Cow and other ratios

The ratio of bulls to cows within a stratum is computed as the estimated abundances in each stratum: \[\widehat{BC}_h = \frac{\widehat{Bulls}_h}{\widehat{Cows}_h}\] i.e., as the estimated total number of bulls in the stratum divided by the estimated total number of cows in the stratum. This ratio estimator is not available in the SightabilityModel package (Fieberg, 2012), but a new function was written and added to the package to add this functionality. The standard error is difficult to compute because of the need to account for the correlation between the number of bulls and moose in individual groups and the the uncertainty of the correction factors but is given in Wong (1996).

Note that if the sightability is equal for all groups, then this common sightability correction factor would cancel everywhere and you would get the same results as the ratio estimator based on the observed groups only. In most cases, the impact of sightability will be small on the ratio estimator and its standard error.

If the number of bulls per 100 cows is wanted, simply multiply the estimate and its standard error by $100\times$.

Similar computations occur for other ratios such as calves:cows.

4.2.2 Across all strata

4.2.2.1 Abundance

The total abundance of moose is found by summing the estimated abundances across strata:

\[\widehat{M} = \sum_h \widehat{M}_h\] The standard error is difficult to compute because the same sightability model is used across strata. This has been implemented in the SightabilityModel package (Fieberg, 2012) and the AerialSurvey (Unsworth, 1999) program.

Similar computations are used for estimating the abundance of other sex/age categories.

4.2.2.2 Density

The overall density of moose and its standard error is found as estimated total abundance and its standard error divided by the total area over all strata.. \[\widehat{D}= \frac{ \widehat{M}}{\sum_h A_h}\] \[se(\widehat{D}) = \frac{se(\widehat{M})}{{\sum_h A_h}}\] Because the total area of all strata is known quantity, this is straightforward.

Similar computations are used for estimating the density of other sex/age categories.

4.2.2.3 Bull to Cow and other ratios

The estimation of the overall bull to cow ratio is computed as the ratios of their respective estimated abundances \[\widehat{BC}=\frac{\widehat{Bulls}}{\widehat{Cows}}\] This ratio estimator was not available in the SightabilityModel package (Fieberg, 2012), but its functionality has been added as part of this contract. Standard errors are computed as described in Wong (1996).

Similar computations can be used for other ratios such as calves:cows.

4.2.3 The SightabilityPopR() function

I have created a function with the same calling arguments as the MoosePopR() function to estimate abundance, density, and ratios using the SightabilityModel package. Notice that additional functionality was added to the SightabilityModel package (Fieberg, 2012) to estimate ratios.

A SightabilityPopR() function has a similar calling sequence as the MoosePopR function with the same key input variables.

survey.data - information on each group measured in the survey. This was read from the Excel workbook earlier.
survey.block.area - information about the area of each block surveyed. This was read from the Excel workbook earlier.
stratum.data - information about each stratum. This was read from the Excel workbook earlier.
density, abundance, numerator, denominator - variables for which the density, abundance, ratio are to be formed as right-sided formula (e.g., density=~Cows to estimate the density of Cows).
conf.level=0.90 indicates the confidence level for confidence intervals to compute with the default being a 90% confidence interval.

Additionally, the sightability formula, the beta coefficients and the estimated covariance matrix must also be specified using the arguments:

sight.formula - the formula for the sightability model. This should use the variables in the survey.data frame
sight.beta - the beta coefficients for the sightability model
sight.beta.cov - the covariance matrix for the beta coefficients
sight.logCI - should confidence intervals for abundance be computed using a logarithmic transformation
sight.var.method - what method (either “Wong” or “SS”) should be used to estimate the variances . using the methods of Wong (1996) or Steinhorst and Samuels (2012). The default (and preferred) method is “Wong” because she corrects the variance computations used by Steinhorst and Samuel (1989) and Fieberg (2012).

These additional parameters are passed to the appropriate functions in the SightabilityModel package (Fieberg, 2012). Refer to the examples for details.

4.2.4 Sample Analyses

4.2.4.1 Dealing with missing values in the sightability covariates

Because we are correcting for sightability, we need the VegCoverClass for all groups. However there are some groups of moose with missing data for this covariate:

select <- is.na(survey.data$VegCoverClass)
survey.data[select,]

## # A tibble: 8 × 14
##   Domain Block.ID Stratum Bulls Lone.Cows Cow.W.1...calf Cow.W.2.calves Lone...calf Unk.Age.Sex NMoose ..Veg.Cover myNMoose  Cows VegCoverClass
##   <chr>     <dbl> <chr>   <dbl>     <dbl>          <dbl>          <dbl>       <dbl>       <dbl>  <dbl>       <dbl>    <dbl> <dbl>         <dbl>
## 1 A             9 M           0         0              1              0           0           0      2          NA        2     1            NA
## 2 A            27 M           0         0              0              0           0           0      0          NA        0     0            NA
## 3 A            86 L           0         0              0              0           0           0      0          NA        0     0            NA
## 4 A           182 L           0         0              0              0           0           0      0          NA        0     0            NA
## 5 B           405 M           0         0              0              0           0           0      0          NA        0     0            NA
## 6 B           408 H           0         1              0              0           0           0      1          NA        1     1            NA
## 7 B           408 H           1         2              1              0           0           0      5          NA        5     3            NA
## 8 B           408 H           0         1              0              0           0           0      1          NA        1     1            NA

If you simply delete these observations, then estimates can be severely biased because these groups would be excluded from the expansions when corrected for sightability. About the best that can be done is to assign an imputed vegetation class to these observations with the highest sightability to reduce the bias.

# change the missing values to cover class with highest sightability
survey.data$VegCoverClass[select] <- 1
survey.data[select,]

## # A tibble: 8 × 14
##   Domain Block.ID Stratum Bulls Lone.Cows Cow.W.1...calf Cow.W.2.calves Lone...calf Unk.Age.Sex NMoose ..Veg.Cover myNMoose  Cows VegCoverClass
##   <chr>     <dbl> <chr>   <dbl>     <dbl>          <dbl>          <dbl>       <dbl>       <dbl>  <dbl>       <dbl>    <dbl> <dbl>         <dbl>
## 1 A             9 M           0         0              1              0           0           0      2          NA        2     1             1
## 2 A            27 M           0         0              0              0           0           0      0          NA        0     0             1
## 3 A            86 L           0         0              0              0           0           0      0          NA        0     0             1
## 4 A           182 L           0         0              0              0           0           0      0          NA        0     0             1
## 5 B           405 M           0         0              0              0           0           0      0          NA        0     0             1
## 6 B           408 H           0         1              0              0           0           0      1          NA        1     1             1
## 7 B           408 H           1         2              1              0           0           0      5          NA        5     3             1
## 8 B           408 H           0         1              0              0           0           0      1          NA        1     1             1

We are now ready to do the analysis. I’ve gathered all of the analysis results into a list object for extraction later.

The C suffix that I have placed on the estimate name indicates it is corrected for sightability.

I have only computed abundance and density estimates for all moose, but similar code can be used for other age/sex categories. Similarly, I have only considered the bull:cow ratio and other ratios can be estimated in similar fashion.

4.2.4.2 Abundance of all moose

We compute the (corrected for sightability) abundance of all moose (regardless of sex or age) using the number of moose (NMoose) variable.

result <- NULL
result$"All.Abund.C" <- SightabilityPopR(
      survey.data        = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- result$"All.Abund.C"
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp

##    Stratum   Var1 Var2 Var1.obs.total Var2.obs.total estimate  SE conf.level  LCL  UCL tau.hat VarTot VarSamp VarSight VarMod
## 1        H NMoose   NA            409             NA     1991 290        0.9 1514 2467    1991  83874   80172     1299   2402
## 2        L NMoose   NA            161             NA     1645 360        0.9 1053 2237    1645 129500  121646     4794   3059
## 3        M NMoose   NA            499             NA     3948 590        0.9 2978 4919    3948 348050  311067    23882  13100
## 4 .OVERALL   <NA>   NA           1069             NA     7584 768        0.9 6321 8847    7584 589625  512886    29976  46764

The estimate and SE column as the estimated (corrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall abundance is found as the sum of the estimates in each stratum. The right most columns are information about the sources of uncertainty (due to sampling, sightability, or the model for sightability).

Similar function calls are used for estimating the abundance of other sex/age categories by changing the abundance argument.

We can show the impact of simply deleting those values missing the VegCoverClass covariate by setting the observed values of moose to .001 for these groups. This will retain the correct number of blocks surveyed. If you simply deleted those groups of moose with missing VegCoverClass values and a block then had no observations, the block would “disappear” from the analysis which is not correct.

# mimic impact of deleting observations with missing VegCoverClass
survey.data2 <- survey.data
survey.data2$NMoose[select] <- .001
wrong.way <- SightabilityPopR( # show impact of deleting observations but keeping # blocks the same
      survey.data        = survey.data2         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
wrong.way

##    Stratum   Var1 Var2 Var1.obs.total Var2.obs.total estimate       SE conf.level      LCL      UCL  tau.hat    VarTot   VarSamp  VarSight    VarMod
## 1        H NMoose   NA        402.003             NA 1957.543 276.4433        0.9 1502.834 2412.251 1957.543  76420.91  72801.74  1277.509  2341.656
## 2        L NMoose   NA        161.002             NA 1645.393 359.8552        0.9 1053.484 2237.303 1645.393 129495.79 121642.25  4794.383  3059.154
## 3        M NMoose   NA        497.003             NA 3933.512 588.6957        0.9 2965.193 4901.830 3933.512 346562.62 309641.47 23874.233 13046.925
## 4 .OVERALL   <NA>   NA       1060.008             NA 7536.448 761.8704        0.9 6283.282 8789.613 7536.448 580446.44 504085.46 29946.124 46414.851

Notice the negative bias in the estimated abundance computed the incorrect way (7536) vs. the value when the VegCoverClass is set to the class with highest sightability (7584) computed above.

In the real world, if a group has missing values for covariates using in the sightability model, do not change the count data, but change the missing covariates to values that result in the highest possible sightability for this group. In this way, the number of blocks surveyed is computed properly and you have partially the bias in dropping these groups by including them with an imputed high sightability value.

4.2.4.3 Density of all moose

We compute the estimated (corrected for sightability) density of all moose (regardless of sex or age).

result$"All.Density.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp<- result$"All.Density.C"
temp[,-(1:3)] <- round(temp[,-(1:3)]+.00001,2)  # force to be numeric
temp

##    Stratum   Var1 Var2 Var1.obs.total Var2.obs.total estimate   SE conf.level  LCL  UCL tau.hat    VarTot   VarSamp VarSight   VarMod Stratum.Area
## 1        H NMoose   NA            409             NA     0.96 0.14        0.9 0.73 1.19 1990.67  83873.94  80172.48  1299.20  2402.26         2071
## 2        L NMoose   NA            161             NA     0.43 0.09        0.9 0.28 0.59 1645.37 129499.99 121646.49  4794.38  3059.12         3795
## 3        M NMoose   NA            499             NA     0.56 0.08        0.9 0.43 0.70 3948.26 348049.57 311067.03 23882.05 13100.49         7006
## 4 .OVERALL   <NA>   NA           1069             NA     0.59 0.06        0.9 0.49 0.69 7584.30 589625.35 512885.99 29975.63 46763.73        12872

The estimate and SE column as the estimated (corrected for sightability) density of all moose with a measure of uncertainty for each stratum. The overall density is found as a estimated abundance/total area over all strata. The variance components are the same for the total abundance because these values are used to derive the density.

Similar function calls are used for estimating the density of other sex/age categories by changing the density argument.

4.2.4.4 Number of bulls and number of cows

We estimate the (corrected for sightability) abundance of Bulls and cows.

result$"Bulls.Abund.C" <- result$"All.Density.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~Bulls                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
result$"Bulls.Abund.C"

##    Stratum  Var1 Var2 Var1.obs.total Var2.obs.total   estimate         SE conf.level        LCL       UCL   tau.hat    VarTot   VarSamp  VarSight     VarMod Stratum.Area
## 1        H Bulls   NA             65             NA 0.15160119 0.02898037        0.9 0.10393273 0.1992697  313.9661  3602.201  3421.794  124.9116   55.49554         2071
## 2        L Bulls   NA             27             NA 0.07473830 0.03072910        0.9 0.02419343 0.1252832  283.6319 13599.513 12786.485  702.1133  110.91511         3795
## 3        M Bulls   NA             70             NA 0.07946567 0.01502809        0.9 0.05474667 0.1041847  556.7365 11085.303  9569.687 1247.2808  268.33503         7006
## 4 .OVERALL  <NA>   NA            162             NA 0.08967794 0.01322383        0.9 0.06792667 0.1114292 1154.3344 28973.872 25777.966 2074.3057 1121.60046        12872

result$"Cows.Abund.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,density=~Cows                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
result$"Cows.Abund.C"

##    Stratum Var1 Var2 Var1.obs.total Var2.obs.total  estimate         SE conf.level       LCL       UCL   tau.hat    VarTot   VarSamp   VarSight     VarMod Stratum.Area
## 1        H Cows   NA            265             NA 0.6213613 0.09387521        0.9 0.4669503 0.7757722 1286.8392  37797.41  36163.85   601.7188  1031.8369         2071
## 2        L Cows   NA             95             NA 0.2526849 0.05723467        0.9 0.1585422 0.3468275  958.9391  47178.26  44630.73  1575.6599   971.8729         3795
## 3        M Cows   NA            325             NA 0.3673888 0.06188781        0.9 0.2655925 0.4691852 2573.9262 187996.80 168711.74 13669.7257  5615.3341         7006
## 4 .OVERALL <NA>   NA            685             NA 0.3744332 0.04139389        0.9 0.3063463 0.4425201 4819.7044 283899.43 249506.33 15847.1044 18545.9933        12872

These values are computed similarly as the total abundance.

4.2.4.5 Bull to Cow and other ratios

Finally we estimates the Bulls/Cow ratio using the corrected for sightability values.

result$"Bulls/Cow.C" <- SightabilityPopR(
      survey.data       = survey.data         # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,numerator=~Bulls, denominator=~Cows                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)

## [1] "Dropping records with 0 animals in both numerator and denominator in the observational data set"
## [1] "Dropping records with 0 animals in both numerator and denominator in the observational data set"
## [1] "Dropping records with 0 animals in both numerator and denominator in the observational data set"
## [1] "Dropping records with 0 animals in both numerator and denominator in the observational data set"

temp<- result$"Bulls/Cow.C"
temp[,-(1:3)] <- round(temp[,-(1:3)]+.00001,3)  # force to be numeric
temp

##    Stratum  Var1 Var2 Var1.obs.total Var2.obs.total estimate    SE conf.level   LCL   UCL ratio.hat VarRatio VarSamp VarSight VarMod
## 1        H Bulls Cows             65            265    0.244 0.027        0.9 0.200 0.288     0.244    0.001      NA       NA     NA
## 2        L Bulls Cows             27             95    0.296 0.102        0.9 0.128 0.464     0.296    0.010      NA       NA     NA
## 3        M Bulls Cows             70            325    0.216 0.027        0.9 0.171 0.261     0.216    0.001      NA       NA     NA
## 4 .OVERALL  <NA> <NA>            162            685    0.240 0.027        0.9 0.196 0.283     0.240    0.001      NA       NA     NA

The ratio is computed within each stratum and overall The sources of variation computations are set to missing as they are not sensible in this case.

Notice that this ratio estimator is not that different from that computed from values uncorrected for sightability. This is a consequence of the approximately equal sightability correction factor for all groups.

Other ratios can be computed by changing the numerator and denominator arguments in the function call.

4.2.4.6 Final object

The result is a list object with one element each of the analyses done above.

names(result)

## [1] "All.Abund.C"   "All.Density.C" "Bulls.Abund.C" "Cows.Abund.C"  "Bulls/Cow.C"

5 Domain estimation

Domain estimation refer to estimating density, abundance, or ratios for subsets of the survey area. For example, in British Columbia, a survey may take place over a combined Buckley Valley and Lake District (sub units) survey area, but estimates of abundance are also wanted for Buckley Valley and Lake District individually. In this case, the sub-units are domains.

There are four cases to consider depending if information about the primary sampling units (PSU) (i.e., blocks) and stratum areas are known or unknown at the domain level $\times$ the domain separation operates at the PSU (block) or group level.

5.1 Domain at PSU (block) level; domain population information known

This is the simplest case. Here the primary sampling units (PSU) (or blocks) are classified as belonging to the domain or not. For example, the study area could be divided into two or more management units and each PSU (block) belongs to one and only one management units in its entirety. The total area of the management unit and the total number of potential PSUs is known for each management unit.

Because PSUs (blocks) were randomly selected within each stratum with equal probability, they are also randomly selected from each domain (management unit).

Consequently, just subset the survey data (observed animals) to those PSUs (blocks) belonging to the domain of interest and change the stratum information to refer to the total area and total PSUs (block) for that domain and estimation for density, abundance, and ratios occurs exactly as before.

5.1.1 Example

For example, the survey data used in the examples was classified in Domain $A$ and $B$, and a separate worksheet in the workbook has information about stratum information for Domain $A$. We subset the survey data, read in the stratum information for Domain $A$ and then use the same methods as shown before.

The PSUs (blocks) in the example are split between two domains (A and B) as shown below:

# PSU's are split between the domains
xtabs(~Domain+Block.ID, data=survey.data, exclude=NULL, na.action=na.pass)

##       Block.ID
## Domain  9 12 13 20 27 31 39 56 83 85 86 97 101 117 119 123 128 129 136 142 143 146 150 156 160 168 175 181 182 198 207 216 218 226 232 240 244 246 256 260 269 295 299 308 314 316 317 318 327 342 348
##      A 13 14  7  4  1 11 32 12 14  9  1  6   7  29  14  14  18   9   3  10   6  11   5   4  12   6   4   7   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
##      B  0  0  0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   8   9  13   8   4   6   2   4   2  11   1   3  23   8   2   3   2  10   4   1   6  17
##       Block.ID
## Domain 351 352 354 360 379 380 398 401 405 408 409 428 436
##      A   0   0   0   0   0   0   0   0   0   0   0   0   0
##      B  19   8  21  10  15   6  13  14   1  31   3  15  27

The number of groups of moose in each block is shown.

We subset the survey data:

# Subset the survey data
survey.data.A <- survey.data[ survey.data$Domain == "A",]

The block information (block area) is unchanged and does not need to be subset.

We need information at the stratum level for domain A which is available on the workbook:

# Domain information for the population for each stratum
stratum.data.A <- readxl::read_excel(
      system.file("extdata", "SampleMooseSurveyData.xlsx", package = "SightabilityModel", mustWork=TRUE),
      sheet="Stratum-DomainA",
      skip=2,.name_repair="universal")

kable(stratum.data.A, row.names=FALSE,
      caption="Stratum totals for Domain A",
      col.names=c("Stratum","Stratum Area","Stratum # blocks")) %>%
      column_spec(column=c(1:3),         width="2cm") %>%
      kable_styling("bordered",position = "center", full_width=FALSE)

Stratum totals for Domain A
Stratum	Stratum Area	Stratum # blocks
L	2453	65
M	4000	120
H	950	35

We can now estimate abundance, density, or ratios for Domain A in the same way as before with and without corrections for sightability. We show only the estimated abundance for all moose after correcting for sightability. Note the use of survey.data.A and stratum.data.A*

All.Abund.A.corrected <- SightabilityPopR(
      survey.data       = survey.data.A         # raw data for domain A
      ,survey.block.area = survey.block.area    # area of each block
      ,stratum.data      = stratum.data.A       # stratum information for domain A
      ,abundance=~NMoose                        # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- All.Abund.A.corrected
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp[,1:10]

##    Stratum   Var1 Var2 Var1.obs.total Var2.obs.total estimate  SE conf.level  LCL  UCL
## 1        H NMoose   NA            197             NA     1239 203        0.9  905 1574
## 2        L NMoose   NA            106             NA      978 273        0.9  530 1426
## 3        M NMoose   NA            198             NA     1712 254        0.9 1295 2129
## 4 .OVERALL   <NA>   NA            501             NA     3929 434        0.9 3215 4643

5.2 Domain at PSU (block) level; domain population information unknown.

In some cases, only the surveyed PSUs can be classified by domain and the stratum total area and stratum total blocks is unknown for the domain. Cochran (1977, Sections 2.13 and 5A.14) discuss this case.

Abundance and ratios can be estimated by setting all survey data to 0 for blocks/groups not part of the domain, and then using the estimating functions on the complete data after zeroing. This may seem strange, but consider the following table

                    PSU
Domain     A     A     A     A     B     B     B     B    B    B
Y          2     3     3     4     5     3     2     3    2    4

The total of $Y$ in Domain $A$ is $2+3+3+4=12$. If we consider the modified data

                     PSU
Domain     A     A     A     A     B     B     B     B    B    B
Y'         2     3     2     3     0     0     0     0    0    0

and find the mean of $\overline{Y'}=(2+3+3+4+0+0+0+0+0+0)/10=1.2$ and multiply the mean of Y’ by the population number of PSU (10) we get the domain total of $Total(y)=\overline{Y'} \times 10=1.2 \times 10=12$.

5.2.1 Example

We set the survey data to 0 if the domain is not A (notice that we do NOT delete observations).

# Set number of moose to zero if not part of domain A
survey.data.A.z <- survey.data
survey.data.A.z$NMoose[ survey.data$Domain != "A"] <- 0

We now use the modified survey data (survey.data.A.Z but the original group and stratum data:

All.Abund.A.corrected.z <- SightabilityPopR(
      survey.data        = survey.data.A.z    # raw data with non-domain counts set to zero
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- All.Abund.A.corrected.z
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp[,1:10]

##    Stratum   Var1 Var2 Var1.obs.total Var2.obs.total estimate  SE conf.level  LCL  UCL
## 1        H NMoose   NA            197             NA      969 321        0.9  442 1497
## 2        L NMoose   NA            106             NA     1092 387        0.9  455 1729
## 3        M NMoose   NA            198             NA     1523 353        0.9  942 2104
## 4 .OVERALL   <NA>   NA            501             NA     3584 620        0.9 2565 4604

The estimates will be similar (but not exactly the same) between this analysis and the previous section.

Estimates of ratios can be found in the same way because these are based on abundances

However, estimates of density cannot be found using this method. Refer to Cochran (1977, Section 5A.14, Estimating Domain Means) for more details.

5.3 Domain at group level; domain population information known

The analysis in this case is similar to the comparable case when the domain occurs at the PSU (block) level. Now groups are classified as belonging to the domain or not. For example, interest may lie in the number of moose in each vegetation class.

All blocks are used, but now in each block, you will subset the groups that belong to the domain. The hard part is the area in this domain of EACH PSU (block) must also be known. Presumably, this is a GIS exercise in the case of vegetation cover, assuming that entire survey area has been mapped.

Similarly, the total area in the stratum for this domain must be known. This is presumably a GIS exercise.

The analysis proceeds by subsetting the survey data to groups that fall within this domain, but you need to ensure that groups without this domain are still included (with 0 counts for the response variable). You also need to use the modified block areas and modified stratum area in this domain. Please contact me if you are attempting to do this as the programming R is tricky, to say the least.

5.4 Domain at group level; domain population information unknown.

In some cases, only the surveyed groups can be classified by domain and the block area, stratum total area and stratum total blocks is unknown for the domain.

We use the same trick as above by setting the survey data to 0 if the domain is not A (notice that we do NOT delete observations).

5.4.1 Example

Suppose we want moose abundance by vegetation cover class. The estimate of total abundance is

All.Abund <- SightabilityPopR(
      survey.data        = survey.data        # raw data
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
)
temp <- All.Abund
temp[,-c(1:3,8)] <- round(temp[,-c(1:3,8)]+.00001,0)  # force to be numeric
temp[,1:10]

##    Stratum   Var1 Var2 Var1.obs.total Var2.obs.total estimate  SE conf.level  LCL  UCL
## 1        H NMoose   NA            409             NA     1991 290        0.9 1514 2467
## 2        L NMoose   NA            161             NA     1645 360        0.9 1053 2237
## 3        M NMoose   NA            499             NA     3948 590        0.9 2978 4919
## 4 .OVERALL   <NA>   NA           1069             NA     7584 768        0.9 6321 8847

We will compute the total abundance for each vegetation cover class

All.Abund.vc <- plyr::llply(unique(survey.data$VegCoverClass), function(VegCover){
   # Set number of moose to zero if not part of domain A
   survey.data.z <- survey.data
   survey.data.z$NMoose[ survey.data$VegCoverClass != VegCover] <- 0  # set to zero
   #browser()
   All.Abund.vc <- SightabilityPopR(
      survey.data        = survey.data.z      # raw data that has been zeroed out
      ,survey.block.area = survey.block.area  # area of each block
      ,stratum.data      = stratum.data       # stratum information
      ,abundance=~NMoose                      # quantifies to estimate
      ,sight.formula     = observed ~ VegCoverClass # sightability mode;
      ,sight.beta        = beta                     # the beta coefficients for the sightability model
      ,sight.beta.cov    = beta.cov                 # the covariance matrix for the beta coefficients
      ,sight.var.method  = "Wong"                   # method  used to estimate the variances 
   )
   list(VegCoverClass=VegCover, est=All.Abund.vc)
})

# extract the total abundance

All.Abund.vc.df <- plyr::ldply(All.Abund.vc, function(x){
   #browser()
   res <- data.frame(VegCoverClass=as.character(x$VegCoverClass), stringsAsFactors=FALSE)
   res$estimate  <- x$est[x$est$Stratum==".OVERALL", c("estimate")]
   res$SE        <- x$est[x$est$Stratum==".OVERALL", c("SE"      )]
   res
   
})
All.Abund.vc.df <- plyr::rbind.fill( All.Abund.vc.df,
                                     data.frame(VegCoverClass=".OVERALL", estimate=sum(All.Abund.vc.df$estimate, stringAsFactors=FALSE)))
All.Abund.vc.df

##   VegCoverClass  estimate       SE
## 1             1 5862.2862 506.1674
## 2             2 1406.9505 258.7836
## 3             3  315.0633 185.5436
## 4      .OVERALL 7584.3000       NA

The estimates sum to the estimate firstly computed, but the estimates from the individual vegetation cover classes are not independent and so the overall standard error cannot be found from the separate estimates by vegetation cover class.

Estimates of ratios can be found in the same way because they are based on abundances.

However, estimates of density cannot be found using this method. Refer to Cochran (1977, Section 5A.14, Estimating Domain Means) for more details.

6 References

Cochran, W.G. (1977). Sampling techniques. 3rd Edition. Wiley. New York.

Fieberg, J. (2012). Estimating Population Abundance Using Sightability Models: R SightabilityModel Package. Journal of Statistical Software, 51(9), 1-20. URL https://doi.org/10.18637/jss.v051.i09.

Gasaway, W.C., S.D. Dubois, D.J. Reed and S.J. Harbo. (1986). Estimating moose population parameters from aerial surveys. Biol. Pap. Univ. Alaska, No. 22. Institute of Arctic Biology, U. of Alaska, Fairbanks. 108 pp.

Lumley, T. (2019) “survey: analysis of complex survey samples”. R package version 3.35-1.

Quayle, J. F., A. G. MacHutchon, and D. N. Jury. (2001). Modeling moose sightability in south-central British Columbia. Alces 37:43–54.

R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Resource Inventory Standards Committee (RISC). (2002). Aerial-based inventory methods for selected ungulates bison, mountain goat, mountain sheep, moose, elk, deer, and caribou. Version 2. Standards for Components of British Columbia’s Biodiversity No 32. British Columbia Ministry of Sustainable Resource Management, Victoria, B.C.

Reed, D.J. (1989). MOOSEPOP program documentation and instructions. Alaska Department of Fish and Game. Unpublished report.

Steinhorst, K. R. and Samuel, M. D. (1989). Sightability Adjustment Methods for Aerial Surveys of Wildlife Populations. Biometrics 45:415–425.

Unsworth JW, A LF, O GE, Leptich DJ, Zager P (1999). Aerial Survey: User’s Manual. Idaho Department of Fish and Game, electronic edition edition.

Wong, C. (1996). Population size estimation using the modified Horvitz-Thompson estimator with estimated sighting probabilities. Dissertation, Colorado State University, Fort Collins, USA.

c-MoosePopR and SightabilityPopR Demonstration

Carl James Schwarz

2023-08-19

1 Availability of code

2 Introduction

2.1 Survey protocol

3 Data input

3.1 Moose data

3.2 Block area

3.3 Stratum information

3.4 Sightability model.

3.4.1 Extracting from AerialSurvey

3.4.2 Reading coefficients of the model

3.4.3 Creation of VegCoverClass variable

3.5 Key variables.

3.6 Dealing with missing and zero values

3.6.1 Group information

3.6.2 Covariate information

3.6.3 Blocks with no observed groups.

3.6.4 Block information

3.6.5 A single block in a stratum

3.6.6 Stratum information

4 Analysis

4.1 Analysis not adjusting for sightability

4.1.1 Within a stratum

4.1.1.1 Density

4.1.1.2 Abundance

4.1.1.3 Bull to Cow and other ratios

4.1.2 Across all strata

4.1.2.1 Density

4.1.2.2 Abundance

4.1.2.3 Bull to Cow and other ratios

4.1.3 The MoosePopR() function.

4.1.4 Sample Analyses

4.1.4.1 Density of all moose

4.1.4.2 Abundance of all moose

4.1.4.3 Number of bulls and number of cows

4.1.4.4 Bull to Cow and other ratios

4.1.4.5 Final object

4.2 Adjusting for sightability

4.2.1 Within a stratum

4.2.1.1 Abundance

4.2.1.2 Density

4.2.1.3 Bull to Cow and other ratios

4.2.2 Across all strata

4.2.2.1 Abundance

4.2.2.2 Density

4.2.2.3 Bull to Cow and other ratios

4.2.3 The SightabilityPopR() function

4.2.4 Sample Analyses

4.2.4.1 Dealing with missing values in the sightability covariates

4.2.4.2 Abundance of all moose

4.2.4.3 Density of all moose

4.2.4.4 Number of bulls and number of cows

4.2.4.5 Bull to Cow and other ratios

4.2.4.6 Final object

5 Domain estimation

5.1 Domain at PSU (block) level; domain population information known

5.1.1 Example

5.2 Domain at PSU (block) level; domain population information unknown.

5.2.1 Example

5.3 Domain at group level; domain population information known

5.4 Domain at group level; domain population information unknown.

5.4.1 Example

6 References