% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/foo.R
\name{binomial.logistic.MCML}
\alias{binomial.logistic.MCML}
\title{Monte Carlo Maximum Likelihood estimation for the binomial logistic model}
\usage{
binomial.logistic.MCML(formula, units.m, coords, data, ID.coords = NULL, par0,
  control.mcmc, kappa, fixed.rel.nugget = NULL, start.cov.pars,
  method = "BFGS", low.rank = FALSE, knots = NULL, messages = TRUE,
  plot.correlogram = TRUE)
}
\arguments{
\item{formula}{an object of class \code{\link{formula}} (or one that can be coerced to that class): a symbolic description of the model to be fitted.}

\item{units.m}{an object of class \code{\link{formula}} indicating the binomial denominators.}

\item{coords}{an object of class \code{\link{formula}} indicating the geographic coordinates.}

\item{data}{a data frame containing the variables in the model.}

\item{ID.coords}{vector of ID values for the unique set of spatial coordinates obtained from \code{\link{create.ID.coords}}. These must be provided if, for example, spatial random effects are defined at household level but some of the covariates are at individual level. \bold{Warning}: the household coordinates must all be distinct; if they are not, see \code{\link{jitterDupCoords}}. Default is \code{NULL}.}

\item{par0}{parameters of the importance sampling distribution: these should be given in the following order \code{c(beta,sigma2,phi,tau2)}, where \code{beta} are the regression coefficients, \code{sigma2} is the variance of the Gaussian process, \code{phi} is the scale parameter of the spatial correlation and \code{tau2} is the variance of the nugget effect (if included in the model).}

\item{control.mcmc}{output from \code{\link{control.mcmc.MCML}}.}

\item{kappa}{fixed value for the shape parameter of the Matern covariance function.}

\item{fixed.rel.nugget}{fixed value for the relative variance of the nugget effect; \code{fixed.rel.nugget=NULL} if this should be included in the estimation. Default is \code{fixed.rel.nugget=NULL}.}

\item{start.cov.pars}{a vector of length two with elements corresponding to the starting values of \code{phi} and the relative variance of the nugget effect \code{nu2}, respectively, that are used in the optimization algorithm. If \code{nu2} is fixed through \code{fixed.rel.nugget}, then \code{start.cov.pars} represents the starting value for \code{phi} only.}

\item{method}{method of optimization. If \code{method="BFGS"} then the \code{\link{maxBFGS}} function is used; if \code{method="nlminb"} then the \code{\link{nlminb}} function is used. Default is \code{method="BFGS"}.}

\item{low.rank}{logical; if \code{low.rank=TRUE} a low-rank approximation of the Gaussian spatial process is used when fitting the model. Default is \code{low.rank=FALSE}.}

\item{knots}{if \code{low.rank=TRUE}, \code{knots} is a matrix of spatial knots that are used in the low-rank approximation. Default is \code{knots=NULL}.}

\item{messages}{logical; if \code{messages=TRUE} then status messages are printed on the screen (or output device) while the function is running. Default is \code{messages=TRUE}.}

\item{plot.correlogram}{logical; if \code{plot.correlogram=TRUE} the autocorrelation plot of the samples of the random effect is displayed after completion of conditional simulation. Default is \code{plot.correlogram=TRUE}.}
}
\value{
An object of class "PrevMap".
The function \code{\link{summary.PrevMap}} is used to print a summary of the fitted model.
The object is a list with the following components:

\code{estimate}: estimates of the model parameters; use the function \code{\link{coef.PrevMap}} to obtain estimates of covariance parameters on the original scale.

\code{covariance}: covariance matrix of the MCML estimates.

\code{log.lik}: maximum value of the log-likelihood.

\code{y}: binomial observations.

\code{units.m}: binomial denominators.

\code{D}: matrix of covariates.

\code{coords}: matrix of the observed sampling locations.

\code{method}: method of optimization used.

\code{ID.coords}: set of ID values defined through the argument \code{ID.coords}.

\code{kappa}: fixed value of the shape parameter of the Matern function.

\code{knots}: matrix of the spatial knots used in the low-rank approximation.

\code{const.sigma2}: adjustment factor for \code{sigma2} in the low-rank approximation.

\code{h}: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm; see \code{\link{Laplace.sampling}}, or \code{\link{Laplace.sampling.lr}} if a low-rank approximation is used.

\code{samples}: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.

\code{fixed.rel.nugget}: fixed value for the relative variance of the nugget effect.

\code{call}: the matched call.
}
\description{
This function performs Monte Carlo maximum likelihood (MCML) estimation for the geostatistical binomial logistic model.
}
\details{
This function performs parameter estimation for a geostatistical binomial logistic model. Conditionally on a zero-mean stationary Gaussian process \eqn{S(x)} and mutually independent zero-mean Gaussian variables \eqn{Z} with variance \code{tau2}, the observations \code{y} are generated from a binomial distribution with probability \eqn{p} and binomial denominators \code{units.m}. A canonical logistic link is used, thus the linear predictor assumes the form
\deqn{\log(p/(1-p)) = d'\beta + S(x) + Z,}
where \eqn{d} is a vector of covariates with associated regression coefficients \eqn{\beta}. The Gaussian process \eqn{S(x)} has isotropic Matern covariance function (see \code{\link{matern}}) with variance \code{sigma2}, scale parameter \code{phi} and shape parameter \code{kappa}. 
In the \code{binomial.logistic.MCML} function, the shape parameter is treated as fixed. The relative variance of the nugget effect, \code{nu2=tau2/sigma2}, can also be fixed through the argument \code{fixed.rel.nugget}; if \code{fixed.rel.nugget=NULL}, then the relative variance of the nugget effect is also included in the estimation.

\bold{Monte Carlo maximum likelihood.}
The Monte Carlo maximum likelihood method uses conditional simulation from the distribution of the random effect \eqn{T(x) = d(x)'\beta+S(x)+Z} given the data \code{y}, in order to approximate the high-dimensional intractable integral given by the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization algorithm which uses analytic expressions for the computation of the gradient vector and Hessian matrix. The functions used for numerical optimization are \code{\link{maxBFGS}} (\code{method="BFGS"}), from the \pkg{maxLik} package, and \code{\link{nlminb}} (\code{method="nlminb"}).
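
As a sketch of the approximation described above, in notation introduced here for illustration rather than taken from the package: let \eqn{\theta} collect the model parameters, let \eqn{\theta_{0}} be the value supplied through \code{par0}, and let \eqn{T_{(1)},\dots,T_{(B)}} be the retained samples from the conditional distribution of \eqn{T} given \code{y} under \eqn{\theta_{0}}. Writing \eqn{f(T;\theta)} for the multivariate Gaussian density of \eqn{T}, the binomial density of \code{y} given \eqn{T} does not depend on \eqn{\theta} and cancels, so the likelihood ratio is a conditional expectation that can be approximated by a sample average,
\deqn{L(\theta)/L(\theta_{0}) = E[f(T;\theta)/f(T;\theta_{0}) | y] \approx \frac{1}{B}\sum_{j=1}^{B} f(T_{(j)};\theta)/f(T_{(j)};\theta_{0}).}
The right-hand side is the quantity that is maximized with respect to \eqn{\theta}; the approximation can be expected to be most reliable when \eqn{\theta} is close to \eqn{\theta_{0}}.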

\bold{Using a two-level model to include household-level and individual-level information.}
When analysing data from household surveys, some of the available information might be at household level (e.g. material of house, temperature) and some at individual level (e.g. age, gender). In this case, the Gaussian spatial process \eqn{S(x)} and the nugget effect \eqn{Z} are defined at household level in order to account for extra-binomial variation between and within households, respectively. The argument \code{ID.coords} is then used to link each individual-level record to its household location.

\bold{Low-rank approximation.}
In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process \eqn{S(x)} might be computationally beneficial. Let \eqn{(x_{1},\dots,x_{n})} and \eqn{(t_{1},\dots,t_{m})} denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then \eqn{S(x)} is approximated as \eqn{\sum_{i=1}^m K(\|x-t_{i}\|; \phi, \kappa)U_{i}}, where \eqn{U_{i}} are zero-mean mutually independent Gaussian variables with variance \code{sigma2} and \eqn{K(.;\phi, \kappa)} is the isotropic Matern kernel (see \code{\link{matern.kernel}}). Since the resulting approximation is only approximately stationary, the parameter \code{sigma2} is multiplied by a factor \code{const.sigma2} so as to obtain a value that is closer to the actual variance of \eqn{S(x)}.
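
As a short worked step that uses only the definitions above: because the \eqn{U_{i}} are mutually independent with variance \code{sigma2}, the variance of the approximating process at a location \eqn{x} is
\deqn{\mathrm{Var}[\sum_{i=1}^m K(\|x-t_{i}\|; \phi, \kappa)U_{i}] = \sigma^2 \sum_{i=1}^m K(\|x-t_{i}\|; \phi, \kappa)^2,}
which changes with \eqn{x} through the distances to the knots. A single rescaling of \code{sigma2} therefore has to summarise the sum on the right-hand side across locations, for example by averaging it over the sampling locations; that particular choice is an assumption made here for illustration and is not stated in this documentation.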
}
\examples{
set.seed(1234)
data(data_sim)
# Select a random subset of data_sim with n.subset (here 10) observations
n.subset <- 10
data_subset <- data_sim[sample(1:nrow(data_sim),n.subset),]

# Set the MCMC control parameters
control.mcmc <- control.mcmc.MCML(n.sim=1000,burnin=0,thin=1,
                           h=1.65/(n.subset^2/3))   

# Set the parameters of the importance sampling distribution,
# in the order c(beta, sigma2, phi); tau2 is omitted because fixed.rel.nugget=0 below
par0 <- c(0,1,0.15)

# Estimate the model parameters using MCML
fit.MCML <- binomial.logistic.MCML(y~1, units.m=~units.m, coords=~x1+x2,
                                   data=data_subset, control.mcmc=control.mcmc,
                                   fixed.rel.nugget=0, par0=par0, start.cov.pars=0.15,
                                   method="nlminb", kappa=2)
summary(fit.MCML,log.cov.pars=FALSE)
coef(fit.MCML)

}
\author{
Emanuele Giorgi \email{e.giorgi@lancaster.ac.uk}

Peter J. Diggle \email{p.diggle@lancaster.ac.uk}
}
\references{
Christensen, O. F. (2004). \emph{Monte Carlo maximum likelihood in model-based geostatistics.} Journal of Computational and Graphical Statistics 13, 702-718.

Higdon, D. (1998). \emph{A process-convolution approach to modeling temperatures in the North Atlantic Ocean.} Environmental and Ecological Statistics 5, 173-190.
}
\seealso{
\code{\link{Laplace.sampling}}, \code{\link{Laplace.sampling.lr}}, \code{\link{summary.PrevMap}}, \code{\link{coef.PrevMap}}, \code{\link{matern}}, \code{\link{matern.kernel}}, \code{\link{control.mcmc.MCML}}, \code{\link{create.ID.coords}}.
}

