d-Domain Stratification - Method of Heard 2008

Carl James Schwarz

2023-08-19

1 Availability of code

The code for this vignette is available in the GitHub repository noted in the Description file of this package.

2 Introduction

Domain stratification was introduced by Heard et al. (2008) as opposed to standard stratification. In this vignette we:

3 Types of stratification

Stratification is a device to improve precision by grouping sampling-units into more homogeneous groups and the sampling these homogeneous groups.

There are two-types of stratification- across sampling unit stratification and within-sampling-unit stratification (more often called domain stratification).

3.1 Across sampling-unit stratification

The traditional use of the term stratification is the among sampling-unit stratification where sampling units are classified into one and only stratum, and separate, independent samples are taken from each stratum

For example Gasaway et al (1986, p.8) defines stratification as

The stratified sampling design is a method that reduces variance. It is used to pool SUs into strata of different moose density, thereby assigning as much total variance as possible to difference among strata.

So each sampling unit is assigned to one and only stratum and a sampling unit cannot belong to more than one stratum.

Similarly, BC RISC (2002) states:

The precision of a population estimate can be improved by careful stratification of sampling units before the survey. This involves stratifying the units into categories of expected animal numbers or density, based on interpretation of existing information.

Again, each survey unit belongs to one and only one stratum, and each stratum is sampled separately.

Conceptually, an across-sampling-unit stratification takes the form:

The two colors represent different strata. While the strata are shown with contiguous sampling units, strata do not have to consist of contiguous units – all that is needed is that sampling units are grouped into more homogeneous sets and a separate sample is taken from each stratum.

3.2 Within sampling-unit stratification (domain estimation)

Heard et al (2008) introduced a different form of stratification - namely within-sampling-unit stratification (more properly called domain stratification).

To determine stratum-specific SU’s, a grid of approximately 9 km2 (3.2 × 2.8 km) cells was laid out over the study area, … All polygons within each grid cell were classified as S1, S2, or outside of the survey zone (land >1200 m elevation and large lakes). …adjacent cells were arbitrarily amalgamated until the sum of all the S1 polygon areas added up to >4 km2. The high population density SU was the set of all S1 polygons within that group of cells, and the low population density SU was the set of all S2 polygons within that group of cells.

To estimate moose numbers in the high population density stratum, a random sample of SUs was chosen from the entire study area and all moose were counted within all the S1 polygons in each selected SU. To estimate moose numbers in the low population density stratum, a random sub-sample of SUs from the first sample was selected, and moose were counted within all the S2 polygons in those SUs.

Now each sampling unit contains both domains (S1 and S2). The sampling design is two-phase method where a random sample of SU was first selected from the population. All selected SU had the number of moose observed on S1. A sub-sample of the first sample was then taken, and only on the sub-sample were moose observed on S2.

Conceptually, this survey looks like:

The rectangular objects represent the sampling units. Each sampling unit contains both domains indicated by different colors. Domain Stratum 1 (red) is measured on all selected sampling units. Domain Stratum 2 (blue) is only measured on a sub-sample of the first sample.

Domains within sampling units do not have to be the same size. In some cases, sampling units could be missing one or more of the domains. In some case, both domains are measured on all sampling units.

3.3 Combined designs

Across- and within-sampling-unit stratification can both occur in a design. For example, across-sampling-unit stratification may refer to geographical area; while within-unit stratification is used within one stratum and not the other.

Conceptually, you could have a design with both types of stratification:

Here the survey area was initially stratified into two across-sampling unit strata (green vs. blue/red). In the “green” across-sampling-unit stratum, a random sample of sampling-units was selected and measured. In the “blue/red” stratum, a sample of sampling-units was selected, and the “red” domain stratum measured on all selected units. A sub-sample of the selected units then was measured on the “blue” domain stratum.

3.4 CAUTIONS

As shown above, the two types of stratification are quite different and it is unfortunate that the term “stratification” is used for both types of designs. Traditionally, the term stratification is reserved for the among-sampling-unit strata where each sampling-unit can belong to one and only one stratum, and the term domain estimation is reserved for the within-sampling-unit stratification.

This has implications when using existing software. For example the MoosePopR() and SightabilityPopR() functions in the SightabilityModel package of R, assumes that strata are of the across-unit variety and is not designed to deal with the within-unit stratification.

Additional functions, MoosePopR_DomStrat() and SightabilityPopR_DomStrat() have been included in the package to (partially) deal with domain stratification. However, these functions do not account for the impact of the correlation among the measurements on the domains in the same sampling unit.

It is difficult to determine analytically the extent of the bias of the uncertainty caused by measuring multiple domains on the same sampling unit, but three cases can illustrate the potential magnitudes of the problem.

3.4.1 Perfectly positively correlated values in domains.

Suppose that each sampling unit is measured on both domains with exactly the same number of moose seen in each domain. The total number of sampling units is 100. For example, here is some “fake data”.

##    unit S1 S2
## 1     1  8  8
## 2     2 11 11
## 3     3 11 11
## 4     4 11 11
## 5     5 11 11
## 6     6 10 10
## 7     7  9  9
## 8     8  4  4
## 9     9  9  9
## 10   10  9  9

A simple mean/sampling unit estimator will be used, i.e., the total abundance is estimated as the mean number of moose per sampling unit \(\times\) the number of sampling units in the population.

The mean count/unit for both domains is is 9.3 (SE 0.68). The estimated abundance for each domain is 100x the mean count/unit or 930 (SE 68.39). Finally, the grand total would be twice this value since each domain has exactly the same values or 1860 with a naive standard error found by adding the variances of the two domains and then taking the sqrt giving an SE for the grand total of 96.72.

However, suppose we add the values in each unit across the domains. The mean per unit is now 18.6 (SE 1.37) (twice the mean per unit for each domain) and the estimated total abundance is exactly the same 1860 (SE 136.79) but the CORRECT standard error is larger (by a factor of \(\sqrt{2}\)) compared to the naive standard error

3.4.2 Perfectly negative correlated values in domains.

Similarly, if values in the two domains are negatively correlated, the naive standard errors formed by adding the variances from the two domain strata will be too large.

Again, consider a simple example:

##    unit S1 S2
## 1     1  8 12
## 2     2 11  9
## 3     3 11  9
## 4     4 11  9
## 5     5 11  9
## 6     6 10 10
## 7     7  9 11
## 8     8  4 16
## 9     9  9 11
## 10   10  9 11

Here the total number of moose in each unit is always 20, but split between the two domains.

The mean count/unit each domains is 9.3 (SE 0.68) for domain S1, and 10.7 (SE 0.68) for domain S2. The estimated abundance for each domain is 100x the mean count/unit or 930 (SE 68.39) for domain S1 and 1070 (SE 68.39) for domain S2. Finally, the grand total estimate is 2000 with a naive standard error found by adding the variances of the two domains and then taking the sqrt of 96.72.

However, suppose we add the values in each unit across the domains. The mean per unit is now 20 (SE 0) because each sampling unit has the same total count over both domains. The estimated total abundance is exactly the same, but the CORRECT standard error is now 0, much smaller than the naive standard error seen earlier.

3.4.3 Uncorrelated values in domains.

Finally, if values in the two domains are uncorrelated, then the error in the naive estimate of uncertainty for abundance should be small.

Again, consider a simple example:

##    unit S1 S2
## 1     1  8 12
## 2     2 11  8
## 3     3 11 10
## 4     4 11  8
## 5     5 11  7
## 6     6 10 10
## 7     7  9  8
## 8     8  4  9
## 9     9  9  7
## 10   10  9 13

Here the total number of moose in each domain in each unit varies independently from each other.

The mean count/unit each domains is is 9.3 (SE 0.68) for domain S1, and 9.2 (SE 0.65) for domain S2. The estimated abundance for each domain is 100x the mean count/unit or 930 (SE 68.39) for domain S1 and 920 (SE 64.64) for domain S2. Finally, the grand total estimate is 1850 with a naive standard error found by adding the variances of the two domains and then taking the sqrt of 94.1

However, suppose add the values across the domains in each unit. The mean per unit is now 18.5 (SE 0.83).. The estimated total abundance 1850 is exactly the same as the naive estimator, but the CORRECT standard error (SE 83.33) is now similar than the naive standard error seen previously. [On average, the two standard errors will be similar.]

3.4.4 Implication for moose surveys.

As shown later, the current moose surveys use a domain stratification (in part), but not all survey units are measured on both domains. This would attenuate the correlation in the estimates from the two domains (i.e., pull the correlation in the estimates towards 0) and so we hope that the error in the naive estimate of uncertainty is small, but it cannot be determined in advance.

If the sampling units measured on S2 were selected independently of those selected for S1, there would be no correlation in the estimates, and the naive approach to domain stratification will work perfectly fine.

4 Data structures that allow for both types of stratification

Careful data structures will be needed to ensure that both types of stratification can be dealt with when using MoosePopR_DomStrat() and SightabilityPopR().

We have created an Excel workbook with (fictitious) but realistic data as an illustration of how to analyze a domain stratified survey.

4.1 Stratum Totals

The raw data will need to include both the stratum and domains along with the total number of sampling units and the total area sampled.

If there is no domain stratification, this variable can be set to any arbitrary value, e.g., “ALL”. If there is no regular stratification, this variable can be set to any arbitrary values, e.g. “ALL”, but both domains must have the same value for the stratification.

If you have a stratum/domain that is censused at 100% sampling, it is best to create a “single” sampling unit that represents the entire study area.

Here is the information from the sample Excel file:

#dir(system.file("extdata", package = "SightabilityModel"))

SU.Totals <- readxl::read_excel(system.file("extdata", "ExampleDomainStratification.xlsx", 
                                            package = "SightabilityModel", mustWork=TRUE),
                                  sheet="SU-Totals",
                                  skip=6,.name_repair="universal")
SU.Totals
## # A tibble: 3 × 4
##   Stratum           Domain Total.SU Total.SU.Area.km2
##   <chr>             <chr>     <dbl>             <dbl>
## 1 Uplands           S1          619              3173
## 2 Uplands           S2          619              9056
## 3 RiverBottomCensus ALL           1               233

Note that for domain stratification, the number of sampling units in the population will be the same, but the total sampling area in the two domains may be different.

In this survey, there are two classical strata (the RiverBottomCensus - which was censused 100%; and the Uplands stratum). The Uplands stratum, was measured in two domains, S1 and S2.

4.2 Sampling unit information

For each sampling unit, the classical and domain stratification variable must be listed along with the sampled area for each unit. Note that for domain stratification, the same sampling unit identification may be used (this is helpful to identify which sampling units are measured on both domains). However, the sampling unit ids must be unique across classical strata. The sampling unit id for a 100% sampled (i.e., census) stratum should also be different than other sampling unit ids.

Here are the first few records of the example dataset:

SU.Selected <- readxl::read_excel(system.file("extdata", "ExampleDomainStratification.xlsx", 
                                            package = "SightabilityModel", mustWork=TRUE),
                                  sheet="SU-Selected",
                                  skip=6,.name_repair="universal")

head(SU.Selected)
## # A tibble: 6 × 4
##   Stratum Domain SamplingUnitID SurveyedArea
##   <chr>   <chr>           <dbl>        <dbl>
## 1 Uplands S1                515          6.5
## 2 Uplands S1                464          5.4
## 3 Uplands S1                492          5.1
## 4 Uplands S2                492         11.6
## 5 Uplands S1                374          6.3
## 6 Uplands S2                374         10.5
tail(SU.Selected)
## # A tibble: 6 × 4
##   Stratum           Domain SamplingUnitID SurveyedArea
##   <chr>             <chr>           <dbl>        <dbl>
## 1 Uplands           S2                615          5.3
## 2 Uplands           S1                 86          5.6
## 3 Uplands           S1                265          4.8
## 4 Uplands           S1                564          7.2
## 5 Uplands           S1                572          4.7
## 6 RiverBottomCensus ALL               338        213

Note that sample unit id 492 was sampled in both domains. The census stratum has a “dummy” unique sampling unit id. The area measured for a stratum that is censused, is the area sampled in the sampling unit.

4.3 Way point (group) information

Finally is the information on each group of moose identified on the sampling unit. Again, the stratum, domain, and sampling unit should match the the combination of same in the other tables. Additional covariates should be included that are used in the sightability model.

Here is an example

SU.WayPoint <- readxl::read_excel(system.file("extdata", "ExampleDomainStratification.xlsx", 
                                            package = "SightabilityModel", mustWork=TRUE),
                                  sheet="WayPointData",
                                  skip=6,.name_repair="universal")

if(any(is.na(SU.WayPoint)))stop("Missing values detected in WayPoint data - if null cells are 0, you need to do a replace here")


head(SU.WayPoint[,c("Stratum","Domain","SamplingUnitID","Waypoint","Bulls.Total","Cows.Total","Total.Count")])
## # A tibble: 6 × 7
##   Stratum Domain SamplingUnitID Waypoint Bulls.Total Cows.Total Total.Count
##   <chr>   <chr>           <dbl> <chr>          <dbl>      <dbl>       <dbl>
## 1 Uplands S1                515 364                0          1           1
## 2 Uplands S1                515 369                0          3           3
## 3 Uplands S1                515 368                2          0           2
## 4 Uplands S1                515 367                0          4           6
## 5 Uplands S1                515 366                0          0           0
## 6 Uplands S1                515 363                1          1           2
tail(SU.WayPoint[,c("Stratum","Domain","SamplingUnitID","Waypoint","Bulls.Total","Cows.Total","Total.Count")])
## # A tibble: 6 × 7
##   Stratum           Domain SamplingUnitID Waypoint Bulls.Total Cows.Total Total.Count
##   <chr>             <chr>           <dbl> <chr>          <dbl>      <dbl>       <dbl>
## 1 RiverBottomCensus ALL               338 300                3         17          21
## 2 RiverBottomCensus ALL               338 306                6          2           8
## 3 RiverBottomCensus ALL               338 308                2          1           3
## 4 RiverBottomCensus ALL               338 316                7          9          16
## 5 RiverBottomCensus ALL               338 235                0          5           5
## 6 RiverBottomCensus ALL               338 338                1          1           2

Each waypoint should be unique within the stratum/domain combination. Covariates for the sightability model (e.g., VegCoverClass need to be included here.)

If a sampling unit is surveyed and NO animals are seen, a “dummy” waypoint row should be included with all 0’s for the counts. The covariates for the sightability model can be set to arbitrary values since a count of 0 will be expanded to 0. Here are some such “dummy” waypoints

head(SU.WayPoint[ SU.WayPoint$Total.Count==0,
                  c("Stratum","Domain","SamplingUnitID","Waypoint","Bulls.Total","Cows.Total","Total.Count")])
## # A tibble: 6 × 7
##   Stratum Domain SamplingUnitID Waypoint Bulls.Total Cows.Total Total.Count
##   <chr>   <chr>           <dbl> <chr>          <dbl>      <dbl>       <dbl>
## 1 Uplands S1                515 366                0          0           0
## 2 Uplands S1                492 Inserted           0          0           0
## 3 Uplands S2                492 Inserted           0          0           0
## 4 Uplands S1                374 Inserted           0          0           0
## 5 Uplands S2                374 Inserted           0          0           0
## 6 Uplands S1                 95 Inserted           0          0           0

4.4 Sightability Model

The sightability model used to adjust the counts is also required. This include the “formula” for the model, its estimated coefficients (the beta terms), and the variance-covariance term of the estimated coefficients (the beta.cov terms).

The covariance matrix is given in row major order (https://en.wikipedia.org/wiki/Row-_and_column-major_order), i.e. from left to right and then top to bottom.

This same sightability model is used for all strata/domains.

This model used 5 cover classes and predicts the probability of detection and the sightability correction factor using the model

## ~VegCoverClass

The beta coefficients are:

##   beta0   beta1 
##  4.2138 -1.5847

The covariance matrix of the beta coefficients is:

##            [,1]       [,2]
## [1,]  0.7821634 -0.2820000
## [2,] -0.2820000  0.1114892

5 Example

An example using simulated (but realistic) data will be analyzed.

5.1 Survey Design

This survey is a combination of classical stratification and domain stratification. The two classical strata are:

The RiverBottom stratum is sampled at 100% (i.e., a census). The Upland stratum is sampled using a domain stratification.

Conceptually, this design is:

5.2 Data Values

5.2.1 Stratum Totals

The total number of sampling units and total area for classical and domain strata are:

Total number of sampling units and area in survey region
Stratum Domain Total SU Total Area (km2)
Uplands S1 619 3173
Uplands S2 619 9056
RiverBottomCensus ALL 1 233

Notice that for a complete census, you can “pretend” that there was but a single sampling unit.

5.2.2 Sampling Units Selected

All units were selected in the RiverBottom stratum. A random sample of units was selected from the Upland stratum to have domain S1 measured, and a sub-sample of these units was sampled to have domain S2 measured.

The number of survey units selected in each classical or domain stratum is:

Number of survey units selected in each domain/stratum
Stratum Domain n SU selected Surveyed Area
RiverBottomCensus ALL 1 213.0
Uplands S1 78 431.7
Uplands S2 25 162.6

Note that the sampling units selected in the S2 domain are a sub-sample of the sample units selected in the S1 domain in the Upland stratum.

5.2.3 WayPoint data

During the survey of a sampling unit (domain), groups of moose were seen at various way points. Don’t forget to insert “dummy” waypoint data with counts of 0 for any block that was surveyed, but no animals were seen. Any covariates used in the sightability model can be set to any arbitrary values since an inflation of 0 is still 0.

A summary of the total moose observed by stratum/domain is:

Summary of WayPoint data
Stratum Domain n WayPoints Total Count
RiverBottomCensus ALL 133 299
Uplands S1 144 217
Uplands S2 33 15

5.2.4 Sightability Model

The sightability model used is from Quayle et al (2001). The model coefficients and covariance matrix were extracted from this paper. Note that the sign of the terms reported in Quayle et al (2001) is reversed because Quayle et al (2001) modeled the probability of failure to detect a group.

This model used 5 cover classes and predicts the probability of detection and the sightability correction factor using the model

## ~VegCoverClass

The beta coefficients are:

##   beta0   beta1 
##  4.2138 -1.5847

The covariance matrix of the beta coefficients is:

##            [,1]       [,2]
## [1,]  0.7821634 -0.2820000
## [2,] -0.2820000  0.1114892

These can be used to estimate a sightability correction factor for each Vegetation Cover Class.

Estimated sightability correction factor for each vegetation cover class
Veg Cover Class Detection probability Sightability Correction Factor
1 0.933 1.06
2 0.740 1.33
3 0.368 2.64
4 0.107 8.17
5 0.024 29.08

Notice that the SCF is not simply 1/prob(detect) as explained in Fieberg (2012) and later in this document.

5.3 Uncorrected Estimation

5.3.1 Statistical theory used in the MoosePopR() family

A naive approach is to treat each stratum/domain as a separate “stratum” when using the MoosePopR() function. This has been coded in the MoosePopR_DomStrat() function.

Don’t forget to add “dummy” waypoint data for blocks that were surveyed but no moose groups were seen.

The estimates for each stratum/domain will be unbiased; the estimated total over all strata/domains are also unbiased, but the estimated standard error over all the strata is incorrect because it ignore the positive covariance between estimates in different domains caused by measuring some sampling units in both domains.

5.3.1.1 Density

The estimates of density computed using the MoosePopR() family are a mean-per-area for each stratum:

Suppose a sample of \(n_h\) blocks were surveyed in stratum \(h\) from a population of \(N_h\) blocks. For each block, the number of moose \(m_{hi}\) and the area of the block \(a_{hi}\) are obtained for block \(i\) in stratum \(h\). The estimated density of moose per unit area in stratum \(h\) is obtained as: \[\widehat{D}_h = \frac{\sum_i m_{hi}}{\sum_i a_{hi}}\] i.e., as the total number of moose in the surveyed blocks divided by the total survey area. This is a standard ratio estimator and the estimated standard error is found as using equation 6.9 of Cochran (1977). \[se(\widehat{D}_h)=\sqrt{(1-f_h)\frac{1}{n}\frac{1}{\overline{a_h}^2}\frac{\sum{(m_{hi}-\widehat{D}_{h}\times a_{hi})^2}}{n_h-1}}\] where

  • \(f_h\) is the sampling fraction for that stratum \(f_h = \frac{n_h}{N_h}\)
  • \(\overline{a_h}\) is the mean area per block in stratum \(h\) using either the mean of the observed data or the population mean

The expression in the summation can be expanded into individual parts as needed.

The estimator and its standard error can be automatically computed using the svyratio() function in the survey package (Lumley, 2019) of R.

The overall density for the number of moose is found as a weighted averaged of the stratum specific densities: \[\widehat{D}= \sum_h \frac{A_h}{\sum_h A_h}\widehat{D}_h\]

\[se(\widehat{D}) = \sqrt{\sum_h \left( \frac{A_h}{\sum_h A_h} \right)^2se(\widehat{D}_h)^2}\]

which is equivalent to the estimated total abundance (see below) divided by the total stratum area. This is known as a separate-ratio estimator of the total and is possible because the expansion factor going from density to abundance (the stratum area) is known for each stratum.

5.3.1.2 Abundance

The estimated total number of moose in stratum \(h\) (\(\widehat{M}_h\)) is found by multiplying the density estimate and its standard error by the stratum area \(A_h\)

\[\widehat{M}_h = \widehat{D}_h \times A_h\] \[se(\widehat{M}_h) = se(\widehat{D}_h) \times A_h\]

The total abundance of moose is found by summing the estimated abundances across strata:

\[\widehat{M} = \sum_h \widehat{M}_h\] and the standard error of the overall abundance is found as: \[se(\widehat{M}) = \sqrt{\sum_h se(\widehat{M}_h)^2}\] This is known as a separate-ratio estimator of the total and is possible because the expansion factor going from density to abundance (the stratum area) is known for each stratum.

5.3.1.3 Bull to cow and other ratios

Suppose a sample of \(n_h\) blocks were surveyed in stratum \(h\). For each block, the number of bulls \(b_{hi}\) and the number of cows \(c_{hi}\) are obtained for block \(i\) in stratum \(h\). The estimated bull to cow ratio in stratum \(h\) is obtained as: \[\widehat{BC}_h = \frac{\sum_i b_{hi}}{\sum_i c_{hi}}\] i.e., as the total number of bulls (\(\sum_i b_{hi}\)) in the surveyed blocks divided by the total number of cows (\(\sum_i c_{hi}\)) in the surveyed blocks. This is a standard ratio estimator and the estimated standard error is found as using equation 6.9 of Cochran (1977). The estimator and its standard error are automatically computed using the svyratio() function in the survey package (Lumley, 2019) of R.

If the number of bulls per 100 cows is wanted, simply multiply the estimate and its standard error by \(100\times\).

The estimation of the overall bull to cow ratio is computed using the double-ratio estimate (Cochran, 1977, Section 6.19) as \[\widehat{BC}_{\textit{double~ratio}}=\frac{\widehat{B}}{\widehat{C}}\] where \(\widehat{B}\) and \(\widehat{C}\) are the estimated total number of bull and cows individually computed. This is equivalent to the ratio of the overall density of bulls and cows. The above formula can be expanded to give

\[\widehat{BC}_{\textit{double~ratio}}=\frac{\sum_h \frac{\sum_i b_{hi}}{\sum_i a_{hi}} \times A_h} {\sum_h \frac{\sum_i c_{hi}}{\sum_i a_{hi}} \times A_h}\]

Its standard error is NOT directly available in the survey package but is readily computed by finding the estimates and the covariance of the estimates of number of bull and cows for each stratum and then using the delta method to find the overall ratio and standard error of the overall ratio.

5.3.2 Density estimates uncorrected for sightability

The UNCORRECTED estimates of density using MoosePopR_DomStrat() are:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, density = ~Total.Count, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated density from MoosePopR_DomStrata - uncorrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count Block.Area 299 213.0 1.404 0.000
Uplands..S1 Total.Count Block.Area 217 431.7 0.503 0.094
Uplands..S2 Total.Count Block.Area 15 162.6 0.092 0.041
.OVERALL NA NA 531 807.3 0.221 0.038

The standard error of the total over all strata/domains (last line in the tables above) may not be correct since the computations ignore the fact that several sampling units were measured on both domains.

5.3.3 Abundance estimates uncorrected for sightability

The UNCORRECTED estimates of abundance using MoosePopR_DomStrat() are:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, abundance = ~Total.Count, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance from MoosePopR_DomStrat - uncorrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count Block.Area 299 213.0 327 0
Uplands..S1 Total.Count Block.Area 217 431.7 1595 299
Uplands..S2 Total.Count Block.Area 15 162.6 835 374
.OVERALL NA NA 531 807.3 2757 479

The standard error of the total over all strata/domains (last line in the tables above) may not be correct since the computations ignore the fact that several sampling units were measured on both domains.

5.3.4 Estimate impact of correlation in estimates from two domains using a resampling approach

A correlation in the measurement on the two domains in some survey unit will cause estimate of density/abundance in the two domains to be correlated and so the estimated SE for the overall study area may not be accurate. The measurements in the two domains are positively/negatively correlated, then the estimated SE for the entire study area will be under/over estimated.

Here is a plot of the Total.Count observed on the S2 domain compared to the Total.Count on the S1 domain for those units when both domains are measured.

There is a modest correlation between the two counts, but only about 1/3 of the sampled survey units had both domains measured so the overall correlation in the estimates of density/abundance will be attenuated towards 0.

It is very difficult to determine, analytically, the extent of the bias in the SE of the estimates for the entire study area. Consequently, a resampling (bootstrap) approach will be used.

For each bootstrap estimate, the survey units where only the S1 domain was measured will be resampled; similarly the survey units where both domains are measured will also be resampled. The observed waypoint data will not be resampled within the selected survey units. It will be convenient to summarize the waypoint data to a single record for each survey unit prior to resampling.

The correlation among the estimates of density for the strata/domains are:

There is a mild correlation between the estimates of density in the two domain strata and the correlation between the estimates of density is attenuated towards 0 compared to the correlation in the raw count data.

Also notice that the estimate for the RiverBottom stratum with 100% sampling is fixed over all bootstrap samples.

The mean and sd of the bootstrap estimates of density, and the mean estimated standard error are:

Results from bootstrapping estimates of density
Num bootstrap samples Mean density estimate SD of density estimates Mean SE of density estimates Inflation due to domain stratification
500 0.222 0.039 0.037 1.05

We see that the actual SE for estimates of density needs to be inflated by about 5% to deal with the correlation in the strata/domain estimates caused by measuring sampling units in both domains.

A similar result will be obtained when looking at estimates of abundance. The correlation among the estimates of abundance for each stratum/domain are:

There appears to be a mild correlation among the estimates from the domain/strata and again the correlation between the estimates of abundance is attenuated towards 0 compared to the correlation in the raw count data.

We look at the mean and sd of the estimates of abundance and the mean reported SE for the estimates of abundance:

Results from bootstrapping estimates of abundance
Num bootstrap samples Mean density estimate SD of density estimates Mean SE of density estimates Inflation due to domain stratification
500 2745.3 501.1 462.4 1.08

We see that the actual SE for estimates of abundance needs to be inflated by about 8% to deal with the correlation in the strata/domain estimates caused by measuring sampling units in both domains.

5.3.4.1 Alternate estimator for S2 domain.

The estimator for density/abundance for the S2 domain ignores the fact that some sampling units were measured on both domains. This turns the design into a two-phase design (the sampling units selected for measurements on the S1 domain are sub-sampled and the sub-sample is also measured on the S2 domain.) There are several possible estimators that look at the relationship between the S1 and S2 measurements and uses that relationship to apply to the estimated density/abundance for the S1 domain. For example, suppose the S2 measurements are about 1/2 of the S1 measurement. Then a “better” (i.e., lower standard errors) estimator of the density/abundance for the S2 domain would to take the estimated density/abundance for the S1 domain, and multiply it by 1/2.

This has not yet been investigated

5.3.5 Bull:Cow ratio uncorrected for sightability

5.3.5.1 Number of bulls

We can estimate the number of bulls:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, abundance = ~Bulls.Total, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated number bulls from MoosePopR_DomStrat - uncorrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Bulls.Total Block.Area 94 213.0 103 0
Uplands..S1 Bulls.Total Block.Area 74 431.7 544 113
Uplands..S2 Bulls.Total Block.Area 2 162.6 111 73
.OVERALL NA NA 170 807.3 758 135

5.3.5.2 Number of cows

We can estimate the number of cows:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, abundance = ~Cows.Total, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated number cows from MoosePopR_DomStrat - uncorrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Cows.Total Block.Area 177 213.0 194 0
Uplands..S1 Cows.Total Block.Area 111 431.7 816 159
Uplands..S2 Cows.Total Block.Area 12 162.6 668 358
.OVERALL NA NA 300 807.3 1678 392

5.3.5.3 Bull per 100 cows ratio

We can use the above values to “manually” compute the bulls per 100 cows ratio. This gives a bull:100 cows ratio of 45.18, but no standard error.

But the ratio (and the standard error) can be found directly using the the MoosePopR() family of functions and gives:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, numerator = ~Bulls.100.Total, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated bulls to 100 cows from MoosePopR_DomStrat - uncorrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Bulls.100.Total Cows.Total 9400 177 53 0
Uplands..S1 Bulls.100.Total Cows.Total 7400 111 67 11
Uplands..S2 Bulls.100.Total Cows.Total 200 12 17 13
.OVERALL NA NA 17000 300 45 11

Again, because some sample units are measured on both domains, the reported SE may not be correct but now this is difficult to diagnose because not only do you have both domains measured on several sample units, you also have both bulls and cows measured in each domain as well. Bootstrapping may be the only option here to investigate the amount of bias in the reported SE for the overall study area.

5.4 Corrected for sightability estimation

5.4.1 Statistical theory used in the SightabilityPopR() family

As before, a naive approach is to treat each stratum/domain as a separate “stratum” when using the SightabilityPopR() family of functions.

Don’t forget to add “dummy” waypoint data for blocks that were surveyed but no moose groups were seen.

The estimates for each stratum/domain will be unbiased; the estimated total over all strata/domains are also unbiased, but the estimated standard error over all the strata is incorrect because it ignore the correlation between estimates in different domains caused by measuring some sampling units in both domains.

5.4.1.1 Abundance

The estimates of density computed using the SightabilityPopR() family of functions are a mean-per-unit estimator for each stratum, but corrected for sightability. The mean-per-unit estimator used by SightabilityPopR() family differs from the mean-per-area estimator used by the MoosePopR() family which can lead to cases where the estimated density/abundance from the SightabilityPopR() family (after correcting for sightability) is less than that estimated by the MoosePopR() family (see below) . Suppose a sample of \(n_h\) blocks were surveyed in stratum \(h\) out of \(N_h\) total number of blocks. For each block, the number of moose \(m_{hjk}\) are obtained for group \(k\) in block \(j\) in stratum \(h\). The estimated correction factor (for visibility of the group) is \(\widehat{\theta}_{hjk}\) which is computed give the vector of covariates (\(x_{hjk}\)) as (see page 4 of Fieberg, 2012)

\[\widehat{\theta}_{hjk}= 1+\exp{(-x_{hjk}^t \widehat{\beta}-x_{hjk}^t \Sigma x_{hjk} / 2)}\] Notice that the sightability correction factor is not simply \(1/\textit{probability of detection}\) or \(1+\exp{(-x_{hjk}^t \widehat{\beta})}\) because it includes a correction factor to account for the non-linear transformation from the logit to the probability scale and the uncertainty in the beta coefficients.

The estimated abundance of moose in stratum \(h\) is obtained as: \[\widehat{M}_h = N_h \times \frac{\sum_{jk} m_{hjk}\widehat{\theta}_{hjk}}{n_h}\] i.e., as the mean number of (sightability corrected) moose per surveyed block times the total number of blocks. The standard error of this estimate is complicated because of the need to account for the estimated sightability but the theory is covered in Steinhorst and Samuel (1989), Fieberg (2012) and Wong (1996) and implemented in the AerialSurvey (Unsworth, 1999) and SightabilityModel packages of R. The latter package corrects the variance computations used by Steinhorst and Samuel (1989) and Fieberg (2012).

The total abundance of moose is found by summing the estimated abundances across strata/domains:

\[\widehat{M} = \sum_h \widehat{M}_h\] The standard error is difficult to compute because the same sightability model is used across strata. This has been implemented in the SightabilityModel package (Fieberg, 2012) and the AerialSurvey (Unsworth, 1999) program.

Again, note that the SE will not have been adjusted for the correlation in the estimates across domains that are measured on the same sampling unit.

5.4.1.2 Density

The estimated density of moose in stratum \(h\) (\(\widehat{D}_h\)) is found by dividing the abundance estimate and its standard error by the stratum area \(A_h\)

\[\widehat{D}_h = \frac{\widehat{M{_h}}}{A_h} \] \[se(\widehat{M}_h) = \frac{se(\widehat{M}_h)}{A_h}\]

The overall density of moose and its standard error is found as estimated total abundance and its standard error divided by the total area over all strata.. \[\widehat{D}= \frac{ \widehat{M}}{\sum_h A_h}\] \[se(\widehat{D}) = \frac{se(\widehat{M})}{{\sum_h A_h}}\] Because the total area of all strata is known quantity, this is straightforward.

Again, note that the SE will not have been adjusted for the correlation in the estimates across domains that are measured on the same sampling unit.

5.4.1.3 Bull to Cow and other ratios

The ratio of bulls to cows within a stratum is computed as the estimated abundances in each stratum: \[\widehat{BC}_h = \frac{\widehat{Bulls}_h}{\widehat{Cows}_h}\] i.e., as the estimated total number of bulls in the stratum divided by the estimated total number of cows in the stratum. The standard error is difficult to compute because of the need to account for the correlation between the number of bulls and moose in individual groups and the the uncertainty of the correction factors but is given in Wong (1996).

Note that if the sightability is equal for all groups, then this common sightability correction factor would cancel everywhere and you would get the same results as the ratio estimator based on the observed groups only. In most cases, the impact of sightability will be small on the ratio estimator and its standard error.

If the number of bulls per 100 cows is wanted, simply multiply the estimate and its standard error by \(100\times\).

The estimation of the overall bull to cow ratio is computed as the ratios of their respective estimated abundances \[\widehat{BC}=\frac{\widehat{Bulls}}{\widehat{Cows}}\] Standard errors are computed as described in Wong (1996).

Again, note that the SE will not have been adjusted for the correlation in the estimates across domains that are measured on the same sampling unit.

5.4.2 Density estimates corrected for sightability.

The CORRECTED estimates of density using SightabilityPopR_DomStrat() are:

## Warning in SightabilityPopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated density from SightabilityPopR_DomStrat - corrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE SCF
RiverBottomCensus..ALL Total.Count NA 299 NA 1.415 0.074 1.01
Uplands..S1 Total.Count NA 217 NA 0.617 0.116 1.23
Uplands..S2 Total.Count NA 15 NA 0.046 0.020 0.50
.OVERALL NA NA 531 NA 0.217 0.033 0.98

5.4.3 Abundance estimates corrected for sightability.

The UNCORRECTED estimates of abundance using SightabilityPopR_DomStrat() are:

## Warning in SightabilityPopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance from SightabilityPopR_DomStrat - corrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE SCF
RiverBottomCensus..ALL Total.Count NA 299 NA 330 17 1.01
Uplands..S1 Total.Count NA 217 NA 1958 368 1.23
Uplands..S2 Total.Count NA 15 NA 414 179 0.50
.OVERALL NA NA 531 NA 2702 416 0.98

In all cases, the standard error of the total over all strata/domains (last line in the tables above) may not be correct since the computations ignore the fact that several sampling units were measured on both domains.

5.4.4 Bull:Cow ratio corrected for sightability

5.4.4.1 Number of bulls

We can estimate the number of bulls:

## Warning in SightabilityPopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated bulls from SightabilityPopR_DomStrat - corrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Bulls.Total NA 94 NA 104 7
Uplands..S1 Bulls.Total NA 74 NA 682 144
Uplands..S2 Bulls.Total NA 2 NA 53 36
.OVERALL NA NA 170 NA 838 150

5.4.4.2 Number of cows

We can estimate the number of cows:

## Warning in SightabilityPopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated cows from SightabilityPopR_DomStrat - corrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Cows.Total NA 177 NA 195 12
Uplands..S1 Cows.Total NA 111 NA 998 200
Uplands..S2 Cows.Total NA 12 NA 336 172
.OVERALL NA NA 300 NA 1529 268

5.4.4.3 Bulls per 100 cows ratio

We can compute the bulls to 100 cows ratio “manually”. This gives a bull:100 cows ratio of 54.84, but no standard error.

The SightabilityPopR_DomStrat() function can compute this directly and gives:

Estimated bulls to 100 cows from SightabilityPopR_DomStrat - corrected for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE SCF
RiverBottomCensus..ALL Bulls.100.Total Cows.Total 9400 177 53 4 1.00
Uplands..S1 Bulls.100.Total Cows.Total 7400 111 68 12 1.03
Uplands..S2 Bulls.100.Total Cows.Total 200 12 16 13 0.94
.OVERALL NA NA 17000 300 55 10 1.21

Again, because some sample units are measured on both domains, the reported SE may not be correct but now this is difficult to diagnose because not only do you have both domains measured on several sample units, you also have both bulls and cows measured in each domain as well. Bootstrapping may be the only option here to investigate the amount of bias in the reported SE for the overall study area.

Notice that the sightability correction factors will usually be close to 1 for these types of ratios because when moose appear in groups, the same sightability correction factor would apply to both the bulls

5.5 SCF < 1 - Why does this happen?

Note that some SCF appear to be < 1 in some strata/domain. This would seem to be logically impossible, but is an artefact of the different ways that abundance/density are computed in the MoosePopR() and SightabilityPopR() family of functions.

In the MoosePopR family, abundance is computed using a mean-per-area estimator based on the number of animals per km\(^2\) measured over all survey units in a strata/domain expanded by the total survey area.

In the SightabilityPopR() family, the mean-per-survey unit (regardless of area) is expanded by the number of survey units in the domain/stratum. So some large/small areas measured in some survey units with large/small number of moose can lead to quite different estimates of abundance.

For example, consider the data from the S2 domain with the SCF factor included

##     Total.Count VegCoverClass    detect      SCF
## 278           0             1 0.9327111 1.061181
## 279           2             2 0.7396981 1.334720
## 280           0             1 0.9327111 1.061181
## 281           0             1 0.9327111 1.061181
## 282           0             1 0.9327111 1.061181
## 283           0             1 0.9327111 1.061181
## 284           0             1 0.9327111 1.061181
## 285           0             1 0.9327111 1.061181
## 286           0             2 0.7396981 1.334720
## 287           2             1 0.9327111 1.061181
## 288           1             1 0.9327111 1.061181
## 289           0             1 0.9327111 1.061181
## 290           0             1 0.9327111 1.061181
## 291           0             1 0.9327111 1.061181
## 292           0             1 0.9327111 1.061181
## 293           0             1 0.9327111 1.061181
## 294           0             1 0.9327111 1.061181
## 295           0             1 0.9327111 1.061181
## 296           0             2 0.7396981 1.334720
## 297           0             2 0.7396981 1.334720
## 298           1             1 0.9327111 1.061181
## 299           0             2 0.7396981 1.334720
## 300           3             1 0.9327111 1.061181
## 301           1             1 0.9327111 1.061181
## 302           1             1 0.9327111 1.061181
## 303           1             1 0.9327111 1.061181
## 304           1             1 0.9327111 1.061181
## 305           1             1 0.9327111 1.061181
## 306           0             1 0.9327111 1.061181
## 307           0             1 0.9327111 1.061181
## 308           0             1 0.9327111 1.061181
## 309           1             2 0.7396981 1.334720
## 310           0             1 0.9327111 1.061181

Notice that the SCF is not simply \(1/\textit{probability of detection}\) as noted previously.

The following statistics are computed:

*** Uncorrected estimates using mean-per-area estimator 
    Total a nimals seen                    :  15 
    Total area measured                    :  162.6 
    Density - uncorrected - mean-per-area  :  0.09225092 
    Domain area                            :  9056 
    Abundance -uncorrected -mean-per-area  :  835.4244 


*** Corrected for sightability estimates using mean-per-unit estimator 
    Total animals seen (corrected)        :  16.73833 
    Number of sampling units              :  25 
    Mean corrected count/SFU              :  0.6695331 
    Number of SU in Domain                :  619 
    Abundance - corrected - mean-per-unit :  414.441 
    Domain area                           :  9056 
    Density - corrected - mean-per-unit   :  0.04576424 
    SCF - Density                         :  0.4960844 
    SCF - Abundance                       :  0.4960844 


*** Corrected for sightability estimates  using a mean-per-area estimator 
    Total animals seen (corrected)       :  16.73833 
    Total area measured                  :  162.6 
    Density - corrected - mean-per-area  :  0.1029417 
    Domain area                          :  9056 
    Abundance -corrected - mean-per-area :  932.2404 
    SCF - Density                        :  1.115888 
    SCF - Abundance                      :  1.115888 

We see that the estimate of density and abundance are SMALLER when computed using the mean-per-unit estimator (SightabilityPopR() method) than when using the mean-per-area estimator (MoosePopR() method). The sightability factors are <1 (!) which seems counter-intuitive.

This is an artefact of the data. If all the sampling units had the same measured area, the two estimators would be identical and the problem would not exist.

You could use the sightability corrected counts and compute a mean-per-area estimator for the density as well (third set of values above). Now the SCF are all >1.

It is not possible to compute an abundance estimator using a mean-per-area estimator using the SightabilityPopR() family of functions because it will try to expand the area of abundance by the sightability correction factor as well. Bummer.

5.6 Adjusting data for SCF using MoosePopR()

However, an approximation to the mean-per-area estimator using the sightability corrected counts can be done by MANUALLY inflating the response variable by the SCF for each group, and passing the corrected counts to the MoosePopR() family of functions. This will give the correct point estimates, but the SE will be incorrect because it has not accounted for neither the multiple measurements on the same sampling unit, nor for the uncertainty in the SCF (the latter is expected to be small).

The density estimates CORRECTED for the SCF
using a mean-per-area estimator using MoosePopR_DomStrat() are:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, density = ~Total.Count.C, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated density from MoosePopR_DomStrat - corrected for sightability by manually inflating counts by SCF
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE SCF
RiverBottomCensus..ALL Total.Count.C Block.Area 329.6 213.0 1.548 0.000 1.10
Uplands..S1 Total.Count.C Block.Area 246.7 431.7 0.571 0.105 1.14
Uplands..S2 Total.Count.C Block.Area 16.7 162.6 0.103 0.045 1.12
.OVERALL NA NA 593.0 807.3 0.249 0.042 1.13

The abundance estimates CORRECTED for the SCF using a mean-per-area estimator using MoosePopR_DomStrat() are:

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, abundance = ~Total.Count.C, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance from MoosePopR_DomStrat - corrected for sightability by manually inflating counts by SCF
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE SCF
RiverBottomCensus..ALL Total.Count.C Block.Area 329.6 213.0 361 0 1.10
Uplands..S1 Total.Count.C Block.Area 246.7 431.7 1813 332 1.14
Uplands..S2 Total.Count.C Block.Area 16.7 162.6 932 407 1.12
.OVERALL NA NA 593.0 807.3 3106 526 1.13

Now all of the SCF are greater than 1.

We can get an estimate of the revised SE that accounts for both the domain stratification and and the uncertainty in the sightability correction factors using bootstrapping in two steps.

First the correction to the SE for domain stratification only, uses the manually adjusted response variable and no sightability model.

set.seed(2343243)
bootres <- plyr::rlply(n.boot, 
                       MoosePopR_DomStrat_bootrep(
                          SU.Totals, 
                          SU.Selected, 
                          SU.WayPoint,
                          density=NULL, 
                          abundance=~Total.Count.C, 
                          numerator=NULL, 
                          denominator=NULL,
                          
                          stratum.total.blocks.var= "Total.SU"  # total number of blocks in the stratum      
                         ,stratum.total.area.var  ="Total.SU.Area.km2" # total area of blocks in the stratum
                         ,block.id.var            ="SamplingUnitID"
                         ,block.area.var          ="SurveyedArea")

         )
Results from bootstrapping estimates of abundance with manual adjustment for sightability
Num bootstrap samples Mean abundance estimate SD of abundance estimates Mean SE of abundance estimates Inflation due to domain stratification
500 3132.654 571.451 514.245 1.11

We see that the actual SE for estimates of abundance needs to be inflated by about 11% to deal with the correlation in the strata/domain estimates caused by measuring sampling units in both domains.

We now bootstrap both the domain stratification and the sightability correction factor by passing the sightability function estimates and the original data. The sightability function coefficients are varied according to the covariance matrix for the estimated beta terms in the sightability function.

set.seed(2343243) # use the same seed so that we get the same simulated data
bootres <- plyr::rlply(n.boot, 
                       MoosePopR_DomStrat_bootrep(
                         SU.Totals, 
                         SU.Selected, 
                         SU.WayPoint,
                         sight.model=sightability.model,
                         sight.beta =sightability.beta,
                         sight.beta.cov=sightability.beta.cov,
                         density=NULL, 
                         abundance=~Total.Count, 
                         numerator=NULL, 
                         denominator=NULL,
                         stratum.total.blocks.var= "Total.SU"  # total number of blocks in the stratum      
                        ,stratum.total.area.var  ="Total.SU.Area.km2" # total area of blocks in the stratum
                        ,block.id.var            ="SamplingUnitID"
                        ,block.area.var          ="SurveyedArea")

         )
Results from bootstrapping estimates of abundance when sightability model is also bootstrapped
Num bootstrap samples Mean abundance estimate SD of abundance estimates Mean SE of abundance estimates Inflation due to domain stratification
500 3175.457 597.293 523.288 1.14

We see that the actual SE for estimates of abundance needs to be inflated by about 14% to deal with domain stratification and with the uncertainty in the sightability model. The additional impact of the uncertainty in the sightability model on the SE for the estimated abundance is small.

6 Final estimates

Here are the final estimates of density and abundance for all moose, and the bull:100 cows ratio estimated using MoosePopR_DomStrat() (mean per area estimator) with appropriate corrections for multiple measurements on the same sampling units and the uncertainty in the sightability model.

In each section, we present the estimate/se without corrections for the sightability, the mean sightability correction factor, the corrected estimate/se after applying the sightability correction factor, and finally, the corrected SE accounting for the uncertainty in the sightability model and for the multiple measurements in multiple domains on the same unit.

6.1 Abundance of all moose - corrected for sightability and domain stratification

Estimated abundance using per-area (MoosePopR), corrected for sightability, and domain stratification
Uncorrected for SCF
Apply SCF
Corrected for SCF and domain
StratumDomain Est SE SCF Est SE SE inflate SE Conf level LCL UCL
.OVERALL 2757 479 1.13 3106 526 1.12 590 0.9 2135 4077
RiverBottomCensus..ALL 327 0 1.10 361 0 Inf 16 0.9 334 387
Uplands..S1 1595 299 1.14 1813 332 1.11 368 0.9 1208 2419
Uplands..S2 835 374 1.12 932 407 1.10 446 0.9 198 1666

6.2 Density of all moose - corrected for sightability and domain stratification

Estimated density using per-area (MoosePopR), corrected for sightability, and domain stratification
Uncorrected for SCF
Apply SCF
Corrected for SCF and domain
StratumDomain Est SE SCF Est SE SE inflate SE Conf level LCL UCL
.OVERALL 0 0 1.13 0 0 1.21 0 0.9 0 0
RiverBottomCensus..ALL 1 0 1.10 2 0 Inf 0 0.9 1 2
Uplands..S1 1 0 1.14 1 0 1.16 0 0.9 0 1
Uplands..S2 0 0 1.12 0 0 1.15 0 0.9 0 0

6.3 Bulls:100 Cows - corrected for sightability and domain stratification

Estimated bull.100 cows using per-area (MoosePopR), corrected for sightability, and domain stratification
Uncorrected for SCF
Apply SCF
Corrected for SCF and domain
StratumDomain Est SE SCF Est SE SE inflate SE Conf level LCL UCL
.OVERALL 45 11 1.01 46 11 0.95 11 0.9 28 63
RiverBottomCensus..ALL 53 0 1.00 53 0 Inf 0 0.9 53 54
Uplands..S1 67 11 1.03 68 12 1.08 13 0.9 47 89
Uplands..S2 17 13 0.94 16 13 2.01 25 0.9 -26 57

7 Summary and Recommendations

7.1 Survey Design

Both regular and domain stratification can be dealt with using the MoosePopR_DomStrat() and SightabilityPopR_DomStrat() functions which create “strata” based on a combination of stratum and domain values. However, the SE of the overall estimate does not account for the multiple domains measured on some sampling units – a bootstrap approach may be needed.

When using domain stratification, randomly select sampling units from which all domains are measured from ALL blacks (including those with no second domain).

Normally, the units that are sampled on the second domain, are a sub-sample from the sampling units selected to measure the first domain. However, you could randomly sample a set of units to measure the second domain independently of those measured for the first domain. This design may be a bit more costly to implement but then you avoid the problem of correlation in the estimates. Similarly, you could have sampling units measured on the second domain and not the first domain.

7.2 Data structures

All data table need to include variables to identify the regular and domain stratification that took place. A new Stratum.Domain variable that is a combination of the regular and domain strata would be used as the “stratum” variable in the analyses.

Sampling unit ids should be unique across regular strata, but the same across domain strata.

7.3 Include schematic of design in report

All reports should include a schematic of the sampling design so that the distinction between regular and domain stratification is clear.

7.4 Analysis

Here is a summary of the capabilities of MoosePopR_DomStrat() and SightabilityPopR_DomStrat()

Summary of capababilities of MoosePopR_DomStrat and SightabilityPopR_DomStrat
Estimator Item MoosePopR_DomStrat SightabilityPopR_DomStrat
Mean-per-unit
No correction for sightability yes-a yes-b
Correction for sightability yes-a, c yes
Regular or Domain stratification yes - a, e yes - e
Mean-per-area
No correction for sightability yes - a no
Correction for sightability yes - c, d no
Regular or Domain stratification yes - e yes - e
a Set area to 1; set total area to total number of survey units
b Set sightability model to ~1; set beta coefficient to 15; set beta vcov to 0
c SE will not include component due to uncertainty in sightability model
d Manually adjust response variable for SCF
e SE will not account for repeated sampling of same unit in Domain stratification

The MoosePopR() and SightabilityPopR() family of functions use different way to estimate density (mean-per-area vs. mean-per-unit) If all the areas surveyed in the survey units are the same, the two methods give identical results.

Examples of the output from the two functions follow.

7.4.1 Mean-per-unit estimates

For example, here are outputs from the mean-per-unit estimators for the total number of moose.

7.4.1.1 MoosePopR_DomStrat - no correction for sightability

The following table is the estimated abundance using a per-unit estimator without correction for sightability from MoosePopR_DomStrat().

## Warning in MoosePopR_DomStrat(temp.SU.Totals, temp.SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance - MoosePopR_DomStrat - mean-per-unit - no correction for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count Block.Area 299 1 299 0
Uplands..S1 Total.Count Block.Area 217 78 1722 320
Uplands..S2 Total.Count Block.Area 15 25 371 164
.OVERALL NA NA 531 104 2392 360

7.4.1.2 SightabilityPopR - no correction for sightability

The following table is the estimated abundance using a per-unit estimator without correction for sightability from SightabilityPopR().

## Warning in SightabilityPopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance - SightabilityPopR - mean-per-unit - no correction for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count NA 299 NA 299 0
Uplands..S1 Total.Count NA 217 NA 1722 320
Uplands..S2 Total.Count NA 15 NA 371 164
.OVERALL NA NA 531 NA 2393 360

7.4.1.3 MoosePopR_DomStrat - correction for sightability

The following table is the estimated abundance using a per-unit estimator with correction for sightability from MoosePopR_DomStrat().

## Warning in MoosePopR_DomStrat(temp.SU.Totals, temp.SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance - MoosePopR_DomStrat - mean-per-unit - correction for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count.C Block.Area 329.6 1 330 0
Uplands..S1 Total.Count.C Block.Area 246.7 78 1958 356
Uplands..S2 Total.Count.C Block.Area 16.7 25 414 178
.OVERALL NA NA 593.0 104 2702 398

The SE is not adjusted for the uncertainty in the correction model.

7.4.1.4 SightabilityPopR - correction for sightability

The following table is the estimated abundance using a per-unit estimator with correction for sightability from SightabilityPopR().

## Warning in SightabilityPopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance - SightabilityPopR - mean-per-unit - correction for sightability
Components of variance in estimates
Stratum Domain Num Var Var 1 total Est SE Total Sampling Sightability Model
RiverBottomCensus..ALL Total.Count 299 330 17 301 0 176 125
Uplands..S1 Total.Count 217 1958 368 135345 119859 7859 7627
Uplands..S2 Total.Count 15 414 179 32033 29629 2120 285
.OVERALL NA 531 2702 416 172960 149488 10155 13317

Notice that the point estimates from the mean-per-unit estimators when/when not corrected for sightability are the same between the MoosePopR() and SightabilityPopR() families (as they must be). The computed SE are different from the two functions when adjusting for sightability because the MoosePopR() family has no mechanism for accounting for the uncertainty in the beta estimates.

However, notice from the output from SightabilityPopR_DomStrat() that the largest component of variance in the estimates comes from the sampling variance and so the underreporting by MoosePopR_DomStrat() is small: 398 vs 416.

7.4.2 Mean-per-area estimates

An alternate estimator is the mean-per-area estimator which accounts for the different areas measured on the sampling unit. As noted in the Vignettes that ship with the SightabilityModel package, the mean-per-area estimators are expected to have better precision if there is moderately large correlation between the count and the area, i.e., larger areas then to have larger counts and vice versa.

It is not possible to compute the mean-per-area estimators with the SightabilityPopR() function.

7.4.2.1 MoosePopR_DomStrat - no correction for sightability

The following table is the estimated abundance using a per-area estimator without correction for sightability from MoosePopR_DomStrat().

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, abundance = ~Total.Count, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated abundance - MoosePopR_DomStrat - mean-per-unit - no correction for sightability
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count Block.Area 299 213.0 327 0
Uplands..S1 Total.Count Block.Area 217 431.7 1595 299
Uplands..S2 Total.Count Block.Area 15 162.6 835 374
.OVERALL NA NA 531 807.3 2757 479

7.4.2.2 MoosePopR_DomStrat - correction for sightability

The following table is the estimated abundance using a per-area estimator with correction for sightability from MoosePopR_DomStrat().

## Warning in MoosePopR_DomStrat(SU.Totals, SU.Selected, SU.WayPoint, abundance = ~Total.Count.C, : SE not adjusted for measurements on multiple domains on same sampling unit
Estimated density from MoosePopR_DomStrat - corrected for sightability by manually inflating counts by SCF
Stratum Domain Num Var Denom Var Var 1 total Var 2 total Est SE
RiverBottomCensus..ALL Total.Count.C Block.Area 329.6 213.0 361 0
Uplands..S1 Total.Count.C Block.Area 246.7 431.7 1813 332
Uplands..S2 Total.Count.C Block.Area 16.7 162.6 932 407
.OVERALL NA NA 593.0 807.3 3106 526

The SE for the mean-per-area corrected for sightability will be too small because the MoosePopR() family cannot adjust for uncertainty in the sightability estimates, but it is assumed that any adjustment will again be small.

7.4.2.3 SightabilityPopR - no correction for sightability

Not possible.

7.4.2.4 SightabilityPopR - correction for sightability

Not possible.

7.4.3 Regular and domain stratification

The analysis of moose population surveys that include both regular and domain stratification can be done using the MoosePopR_DomStrat() and SightabilityPopR() functions. Point estimates will be correct, but if domain stratification is used, the reported standard error for the overall density/abundance/ratios may not be correct because these do not account for the correlation in the estimates across domains caused by the same sampling units measured on multiple domains.

The advice of Heard et al (2001) on how to compute the variance of the overall estimator:

The total population estimate was then calculated as the sum of the corrected stratum- specific population estimates and its variance was the sum of the 2 stratum-specific variances. Overall population density was obtained by dividing the total population estimate by the area of both strata combined.

is incorrect.

Bootstrapping could be used to find the appropriate standard errors that account for the domain stratification.

7.4.4 Computing the sightability correction factor

The SCF in SightabilityPopR() is NOT computed as \(1/\textit{probability of detection}\); see Fieberg (2012) for details.

7.4.5 What to do in practice

I would recommend to start with the MoosePopR_DomStrat() function without corrections for sightability because most surveys will not have equal areas surveyed in all blocks.

Manually correct the response variables for the Sightability Correction Factor (SCF) to get estimates corrected for sightability, but be aware that the SE will be underreported. To get a “feel” for the amount of underreporting, examine the output from the SightabilityPopR() function to see what portion of the variance of the estimate is attributable to the sightability correction. Alternatively, a bootstrap approach can be used to estimate the inflation factor for the SE to account for the uncertainty in the sightability model. Future revisions to the SightabilityModel package may incorporate sightability corrections to MoosePopR_DomStrat() so this step may be redundant when these revisions are done.

Both regular and domain stratification can be used with both functions. SE under regular stratification are correct; SE from domain stratification may not be correct because of the multiple readings on some survey units. Unfortunately, it is not possible to “program” automatic detection and correction of SE for domain stratification. Bootstrapping may be advisable to correct the SE.

8 References

British Columbia Resources Information Standards Committee (BC RISC). 2002. Aerial-based inventory methods for selected ungulates: bison, mountain goat, mountain sheep, moose, elk, deer and caribou. Standards for Components of British Columbia’s Biodiversity No. 32. Version 2, BC Ministry of Sustainable Resource Management, Victoria, BC. 91 pp.

Gasaway, W.C., S.D. Dubois, D.J. Reed, and S.J. Harbo. 1986. Estimating moose population parameters from aerial surveys. Biological Papers No 22, University of Alaska, Fairbanks, Alaska.

Heard, D.C. A.B.D. Walker, J.B. Ayotte, and G.S. Watts. 2008. Using GIS to modify a stratified random block survey design for moose. Alces 44: 11-116.

Quayle, J.F. A.G. MacHutchon, and D.N. Jury. 2001. Modelling moose sightability in south- central British Columbia. Alces 37: 43-54.

Fieberg, J.R. (2012). Estimating Population Abundance Using Sightability Models: R SightabilityModel Package. Journal of Statistical Software, 51(9), 1-20. https://doi.org/10.18637/jss.v051.i09