% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ind_init.R
\name{ind_init}
\alias{ind_init}
\title{Initialization of indicator-pressure models}
\usage{
ind_init(ind_tbl, press_tbl, time, train = 0.9, random = FALSE)
}
\arguments{
\item{ind_tbl}{A data frame, matrix or tibble containing only the (numeric) IND
variables. A single indicator should be coerced into a data frame to keep the
indicator name. If kept as a vector, the default name will be \code{ind}.}

\item{press_tbl}{A data frame, matrix or tibble containing only the (numeric)
pressure variables. A single pressure should be coerced into a data frame to keep
the pressure name. If kept as a vector, the default name will be \code{press}.}

\item{time}{A vector containing the actual time steps (e.g. years); these should be
the same as in the IND and pressure data.}

\item{train}{The proportion of observations that should go into the training data
on which the GAMs are later fitted. Has to be a numeric value between 0 and 1;
the default is 0.9.}

\item{random}{logical; should the observations for the training data be randomly
chosen? Default is \code{FALSE}, so that the last time units (years) are chosen as test data.}
}
\value{
The function returns a \code{\link[tibble]{tibble}}, which is a trimmed-down version of
a \code{data.frame}, including the following elements:
\describe{
  \item{\code{id}}{Numerical IDs for the IND~press combinations.}
  \item{\code{ind}}{Indicator names.}
  \item{\code{press}}{Pressure names.}
  \item{\code{ind_train}}{A list-column with indicator values of the training data.}
  \item{\code{press_train}}{A list-column with pressure values of the training data.}
  \item{\code{time_train}}{A list-column with the time steps of the training data.}
  \item{\code{ind_test}}{A list-column with indicator values of the test data.}
  \item{\code{press_test}}{A list-column with pressure values of the test data.}
  \item{\code{time_test}}{A list-column with the time steps of the test data.}
  \item{\code{train_na}}{logical; indicates the joint missing values in the training
  IND and pressure data. That includes the original NAs as well as randomly selected
  test observations that are within the training period. This vector is needed later
  for the determination of temporal autocorrelation.}
}
}
\description{
\code{ind_init} combines the time vector and the indicator (IND) and pressure data into
one tibble with defined training and test observations. All INDs are combined
with all pressures provided as input.
}
\details{
\code{ind_init} combines every column in \code{ind_tbl} with every column in
\code{press_tbl} so that each row represents one IND~press combination. The input
data are split into a training and a test data set. The returned tibble is the basis
for all IND~pressure modeling functions.

If not all IND~pressure combinations should be modeled, the respective rows can
simply be removed from the output tibble. Alternatively, \code{ind_init} can be
applied multiple times to data subsets and the output tibbles merged afterwards,
e.g. with \code{\link[dplyr]{bind_rows}}.
}
\examples{
# Using the Baltic Sea demo data in this package
press_tbl <- press_ex[ ,-1] # excl. Year
ind_tbl <- ind_ex[ ,-1] # excl. Year
time <- ind_ex[ ,1]
# Randomly assign 50\% of the observations as training data and
# the other 50\% as test data
ind_init(ind_tbl, press_tbl, time, train = 0.5, random = TRUE)
# To keep the name when testing only one indicator and pressure, coerce both vectors
# to data frames
ind_init(ind_tbl = data.frame(MS = ind_tbl$MS), press_tbl = data.frame(Tsum = press_tbl$Tsum),
 time, train = .5, random = TRUE)
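# A minimal sketch (not part of the original examples) of the two ways described
# in the details section to model only some IND~pressure combinations. The object
# names init_tbl, init_sub, init_a, init_b and init_merged are illustrative only,
# and the second variant assumes that the dplyr package is installed.
init_tbl <- ind_init(ind_tbl, press_tbl, time)
# Variant 1: remove the unwanted rows, here every combination involving Tsum
init_sub <- init_tbl[init_tbl$press != "Tsum", ]
# Variant 2: apply ind_init() to data subsets and merge the output tibbles
# (MS with all pressures, all remaining indicators with Tsum only)
init_a <- ind_init(ind_tbl = data.frame(MS = ind_tbl$MS), press_tbl = press_tbl,
 time = time)
init_b <- ind_init(ind_tbl = ind_tbl[ , names(ind_tbl) != "MS", drop = FALSE],
 press_tbl = data.frame(Tsum = press_tbl$Tsum), time = time)
init_merged <- dplyr::bind_rows(init_a, init_b)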
}
\seealso{
\code{\link[tibble]{tibble}} and the \code{vignette("tibble")} for more
 information on tibbles

Other IND~pressure modeling functions: \code{\link{find_id}},
  \code{\link{model_gamm}}, \code{\link{model_gam}},
  \code{\link{plot_diagnostics}}, \code{\link{plot_model}},
  \code{\link{scoring}}, \code{\link{select_model}},
  \code{\link{test_interaction}}
}
