% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/h_selection.R
\name{select_h}
\alias{select_h}
\title{Select the value of the kernel tuning parameter}
\usage{
select_h(
  x,
  y = NULL,
  alternative = NULL,
  method = "subsampling",
  b = 0.8,
  B = 100,
  delta_dim = 1,
  delta = NULL,
  h_values = NULL,
  Nrep = 50,
  n_cores = 2,
  Quantile = 0.95,
  power.plot = TRUE
)
}
\arguments{
\item{x}{Data set of observations from X.}

\item{y}{Numeric matrix or vector of data values. Depending on the input
\code{y}, the selection of h is performed for the corresponding
test.
\itemize{
\item if \code{y} = NULL, the function performs the tests for normality on
\code{x}.
\item if \code{y} is a data matrix, with same dimensions of \code{x}, the
function performs the two-sample test between \code{x} and \code{y}.
\item if \code{y} is a numeric or factor vector, indicating the group
memberships for each observation, the function performs the k-sample
test.
}}

\item{alternative}{Family of alternative chosen for selecting h, between
"location", "scale" and "skewness".}

\item{method}{The method used for critical value estimation
("subsampling", "bootstrap", or "permutation").}

\item{b}{The size of the subsamples used in the subsampling algorithm.}

\item{B}{The number of iterations to use for critical value estimation,
B = 100 as default.}

\item{delta_dim}{Vector of coefficients of the alternative with respect to
each dimension.}

\item{delta}{Vector of parameter values indicating the chosen alternatives.}

\item{h_values}{Values of the tuning parameter used for the selection.}

\item{Nrep}{Number of bootstrap/permutation/subsampling replications.}

\item{n_cores}{Number of cores used to parallelize the h selection algorithm.
If this is not provided, the function will detect the
available cores.}

\item{Quantile}{The quantile to use for critical value estimation, 0.95 is
the default value.}

\item{power.plot}{Logical. If TRUE, the plot of power for the values in
h_values and delta is displayed.}
}
\value{
A list with the following components:
\itemize{
\item \code{h_sel} the selected value of tuning parameter h;
\item \code{power} matrix of power values computed for the considered
values of \code{delta} and \code{h_values};
\item \code{power.plot} power plots (if \code{power.plot} is \code{TRUE}).
}
}
\description{
This function computes the kernel bandwidth of the Gaussian kernel for the
normality, two-sample and k-sample kernel-based quadratic distance (KBQD)
tests.
}
\details{
The function selects the optimal value of the tuning parameter \eqn{h} of
the normal kernel function for the normality, two-sample and k-sample KBQD
tests. It performs a small simulation study, generating samples according to
the specified family of alternatives (\code{alternative}), for the chosen
values of \code{h_values} and \code{delta}.

We consider target alternatives \eqn{F_\delta(\hat{\mathbf{\mu}},
\hat{\mathbf{\Sigma}}, \hat{\mathbf{\lambda}})}, where
\eqn{\hat{\mathbf{\mu}}, \hat{\mathbf{\Sigma}}} and
\eqn{\hat{\mathbf{\lambda}}} indicate the location,
covariance and skewness parameter estimates from the pooled sample.
\itemize{
\item Compute the estimates of the mean \eqn{\hat{\mu}}, covariance matrix
\eqn{\hat{\Sigma}} and skewness \eqn{\hat{\lambda}} from the pooled sample.
\item Choose the family of alternatives \eqn{F_\delta = F_\delta(\hat{\mu}
,\hat{\Sigma}, \hat{\lambda})}. \cr \cr
\emph{For each value of \eqn{\delta} and \eqn{h}:}
\item Generate \eqn{\mathbf{X}_1,\ldots,\mathbf{X}_{k-1}  \sim F_0}, for
\eqn{\delta=0};
\item Generate \eqn{\mathbf{X}_k \sim F_\delta};
\item Compute the \eqn{k}-sample test statistic between \eqn{\mathbf{X}_1, 
\mathbf{X}_2, \ldots, \mathbf{X}_k} with kernel parameter \eqn{h};
\item Compute the power of the test. If it is greater than 0.5,
select \eqn{h} as the optimal value.
\item If no optimal value has been selected, choose the \eqn{h} that
corresponds to the maximum power.
}
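
The selection rule in the last two steps can be sketched in base R as
follows. This is only an illustration under stated assumptions, not the
package's internal code: \code{pick_h} is a hypothetical helper, the power
matrix is assumed to have one row per value of \eqn{\delta} and one column
per value of \eqn{h}, and the values are assumed to be scanned in the order
given.
\preformatted{
# Hypothetical helper, not part of the package: applies the rule to a matrix
# of estimated power values (rows: delta values, columns: h values).
pick_h <- function(power, h_values, threshold = 0.5) {
  for (i in seq_len(nrow(power))) {
    above <- which(power[i, ] > threshold)
    if (length(above) > 0) {
      # First h (in the given order) whose power exceeds the threshold
      return(h_values[above[1]])
    }
  }
  # No entry exceeds the threshold: fall back to the h with maximum power
  best <- arrayInd(which.max(power), dim(power))
  h_values[best[1, 2]]
}

# Made-up power values, for illustration only
power <- matrix(c(0.10, 0.20, 0.15,
                  0.30, 0.62, 0.70),
                nrow = 2, byrow = TRUE)
pick_h(power, h_values = c(0.6, 1, 1.4))
}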

The available alternatives are \cr
\emph{location} alternatives, \eqn{F_\delta =
SN_d(\hat{\mu} + \delta,\hat{\Sigma}, \hat{\lambda})}, with
\eqn{\delta = 0.2, 0.3, 0.4}; \cr
\emph{scale} alternatives,
\eqn{F_\delta = SN_d(\hat{\mu} ,\hat{\Sigma}*\delta, \hat{\lambda})},
with \eqn{\delta = 0.1, 0.3, 0.5}; \cr
\emph{skewness} alternatives,
\eqn{F_\delta = SN_d(\hat{\mu} ,\hat{\Sigma}, \hat{\lambda} + \delta)},
with \eqn{\delta = 0.2, 0.3, 0.6}. \cr
The values \eqn{h = 0.6, 1, 1.4, 1.8, 2.2} and \eqn{N=50} replications
(\code{Nrep}) are set as default values. \cr
The function \code{select_h()} allows the user to
set the values of \eqn{\delta} and \eqn{h} for a more extensive grid search.
We suggest using a more extensive grid when computational resources
permit.
}
\note{
Please be aware that the \code{select_h()} function may take a significant
amount of time to run, especially with larger datasets or when using a
larger number of values in \code{h_values} and \code{delta}. Consider
this when applying the function to large or complex data.
}
\examples{
# Select the value of h using the mid-power algorithm
\donttest{
# Two-sample setting: x and y are data matrices with the same dimensions
x <- matrix(rnorm(100), ncol = 2)
y <- matrix(rnorm(100), ncol = 2)
h_sel <- select_h(x, y, alternative = "skewness")
h_sel
}
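
# The calls below are sketches of the other documented uses, assuming the
# package is attached. The grid values and sample sizes are illustrative
# choices, not recommended settings, and larger grids take longer to run.
\donttest{
# Wider grid search over user-supplied values of delta and h
x <- matrix(rnorm(100), ncol = 2)
y <- matrix(rnorm(100), ncol = 2)
h_grid <- select_h(x, y, alternative = "location",
                   delta = c(0.1, 0.2, 0.3, 0.4),
                   h_values = seq(0.4, 2.8, by = 0.4),
                   power.plot = FALSE)
h_grid$h_sel
h_grid$power

# k-sample setting: y is a vector of group memberships, one per row of x
x3 <- matrix(rnorm(300), ncol = 2)
groups <- rep(c(1, 2, 3), each = 50)
h_k <- select_h(x3, groups, alternative = "location")
h_k$h_sel

# Normality setting: y is left as NULL
h_norm <- select_h(x3, alternative = "scale")
h_norm$h_sel
}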

}
\references{
Markatou, M. and Saraceno, G. (2024). “A Unified Framework for
Multivariate Two- and k-Sample Kernel-based Quadratic Distance
Goodness-of-Fit Tests.” \cr
https://doi.org/10.48550/arXiv.2407.16374

Saraceno, G., Markatou, M., Mukhopadhyay, R. and Golzy, M. (2024).
“Goodness-of-Fit and Clustering of Spherical Data: the QuadratiK package
in R and Python.” \cr
https://arxiv.org/abs/2402.02290
}
\seealso{
The function \code{select_h} is used in the \code{\link[=kb.test]{kb.test()}} function.
}
