% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/relative_site_uncertainty_scores.R
\name{relative_site_uncertainty_scores}
\alias{relative_site_uncertainty_scores}
\title{Relative site uncertainty scores}
\usage{
relative_site_uncertainty_scores(site_data, site_probability_columns)
}
\arguments{
\item{site_data}{\code{\link[sf:sf]{sf::sf()}} object with site data.}

\item{site_probability_columns}{\code{character} names of \code{numeric}
columns in the argument to \code{site_data} that contain modelled
probabilities of occupancy for each feature in each site.
Each column should correspond to a different feature, and contain
probability data (values between zero and one). No missing (\code{NA})
values are permitted in these columns.}
}
\value{
\code{numeric} \code{vector} of uncertainty scores. Note that
these values are automatically rescaled between 0.01 and 1.
}
\description{
Calculate scores to describe the overall uncertainty of modelled species'
occupancy predictions for each site. Sites with greater scores are associated
with greater uncertainty. Note that these scores are relative to one another,
so scores calculated using different sets of site data cannot be compared
to each other.
}
\details{
The relative site uncertainty scores are calculated as joint Shannon
entropy statistics. Since we assume that species occur independently of each
other, we can calculate these statistics separately for each species in each
site and then sum together the statistics for species in the same site:

\enumerate{
\item Let \eqn{J} denote the set of sites (indexed by \eqn{j}),
\eqn{I} denote the set of features (indexed by \eqn{i}), and
\eqn{x_{ij}} denote the modelled probability of feature \eqn{i \in I}
occurring in site \eqn{j \in J}.

\item Next, we will calculate the Shannon entropy statistic for each
species in each site:
\eqn{y_{ij} = - \big( x_{ij} \log_2 x_{ij} + (1 - x_{ij}) \log_2 (1 - x_{ij}) \big)}

\item Finally, we will sum the entropy statistics together for each site:
\eqn{s_{j} = \sum_{i \in I} y_{ij}}

}
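
As a worked illustration of the second step, consider a single feature with
a modelled probability of \eqn{x_{ij} = 0.5}. This is only a sketch of the
arithmetic, and it assumes the usual convention \eqn{0 \log_2 0 = 0}, which
is not stated above:

\deqn{y_{ij} = - \big( 0.5 \log_2 0.5 + 0.5 \log_2 0.5 \big) = - \big( -0.5 - 0.5 \big) = 1}

Under that convention, a probability of zero or one gives
\eqn{y_{ij} = 0}. A site where three features each have a probability of 0.5
would therefore have a sum of \eqn{s_{j} = 3} before any rescaling is applied.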
}
\examples{
# set seed for reproducibility
set.seed(123)

# simulate data for 3 features and 5 sites
x <- tibble::tibble(x = rnorm(5), y = rnorm(5),
                    p1 = c(0.5, 0, 1, 0, 1),
                    p2 = c(0.5, 0.5, 1, 0, 1),
                    p3 = c(0.5, 0.5, 0.5, 0, 1))
x <- sf::st_as_sf(x, coords = c("x", "y"))

# print data,
# we can see that site (row) 1 has the least certain predictions
# because all of its values are equal to 0.5
print(x)

# plot sites' occupancy probabilities
plot(x[, c("p1", "p2", "p3")], pch = 16, cex = 3)

# calculate scores
s <- relative_site_uncertainty_scores(x, c("p1", "p2", "p3"))

# print scores,
# we can see that site 1 has the highest uncertainty score
print(s)

# plot sites' uncertainty scores
x$s <- s
plot(x[, c("s")], pch = 16, cex = 3)
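
# as a hedged cross-check of the calculations described in Details, the raw
# (unscaled) entropy sums can be computed by hand. This is only a sketch:
# entropy_bits is a hypothetical helper (not part of the package), and it
# assumes that probabilities of exactly 0 or 1 contribute zero entropy.
# The rescaling to the 0.01-1 range is not reproduced here.
entropy_bits <- function(p) {
  # Shannon entropy (in bits) of each occupancy probability
  h <- -(p * log2(p) + (1 - p) * log2(1 - p))
  # 0 * log2(0) evaluates to NaN in R, so set these cases to zero
  h[p == 0 | p == 1] <- 0
  h
}

# same probabilities as the simulated data above (rows = sites)
p <- cbind(p1 = c(0.5, 0, 1, 0, 1),
           p2 = c(0.5, 0.5, 1, 0, 1),
           p3 = c(0.5, 0.5, 0.5, 0, 1))

# sum the per-feature entropies within each site
raw_scores <- rowSums(entropy_bits(p))

# print raw sums,
# these are 3, 2, 1, 0, 0 so site 1 is the most uncertain
print(raw_scores)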

}
