% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/knockoff_filter.R
\name{knockoff.filter}
\alias{knockoff.filter}
\title{The Knockoff Filter}
\usage{
knockoff.filter(X, y, knockoffs = create.second_order,
  statistic = stat.glmnet_coefdiff, fdr = 0.1, offset = 1)
}
\arguments{
\item{X}{n-by-p matrix or data frame of predictors.}

\item{y}{response vector of length n.}

\item{knockoffs}{method used to construct knockoffs for the \eqn{X} variables.
It must be a function taking an n-by-p matrix as input and returning an n-by-p matrix of knockoff variables.
By default, approximate model-X Gaussian knockoffs are used.}

\item{statistic}{function used to compute the statistics that assess variable importance. By default,
the lasso coefficient-difference statistic with cross-validation (\code{stat.glmnet_coefdiff}) is used.
See the Details section for more information.}

\item{fdr}{target false discovery rate (default: 0.1).}

\item{offset}{either 0 or 1 (default: 1). This is the offset used to compute the rejection threshold on the
statistics. The value 1 yields a slightly more conservative procedure ("knockoffs+") that
controls the false discovery rate (FDR) according to the usual definition, 
while an offset of 0 controls a modified FDR.}
}
\value{
An object of class "knockoff.result". This object is a list 
 containing at least the following components:
 \item{X}{matrix of original variables}
 \item{Xk}{matrix of knockoff variables}
 \item{statistic}{computed test statistics}
 \item{threshold}{computed selection threshold}
 \item{selected}{named vector of selected variables}
}
\description{
This function runs the knockoff procedure from start to finish, selecting the variables
relevant for predicting the outcome of interest.
}
\details{
This function creates the knockoffs, computes the importance statistics, 
and selects variables. 
It is the main entry point for the knockoff package.



The parameter \code{knockoffs} controls how knockoff variables are created.
By default, the model-X scenario is assumed and a multivariate normal distribution 
is fitted to the original variables \eqn{X}. The estimated mean vector and the covariance 
matrix are used to generate second-order approximate Gaussian knockoffs.
In general, the function \code{knockoffs} should take an n-by-p matrix of
observed variables \eqn{X} as input and return an n-by-p matrix of knockoffs.
Two default functions for creating knockoffs are provided with this package.

In the model-X scenario, if the rows of \eqn{X} are distributed
as a multivariate Gaussian with known parameters, the function
\code{create.gaussian} can be used to generate Gaussian knockoffs,
as shown in the examples below.

In the fixed-X scenario, one can create the knockoffs using the function 
\code{create.fixed}. This requires \eqn{n \geq p} and it assumes 
that the response \eqn{Y} follows a homoscedastic linear regression model.

For more information about creating knockoffs, type \code{??create}.

The default importance statistic is \code{\link{stat.glmnet_coefdiff}}.
For a complete list of the statistics provided with this package, 
type \code{??stat}.

It is possible to provide custom functions for the knockoff constructions 
or the importance statistics. Some examples can be found in the vignette.
}
\examples{
p=200; n=100; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p \%in\% nonzero)
y = X \%*\% beta + rnorm(n)

# Basic usage with default arguments
result = knockoff.filter(X, y)
print(result$selected)
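
# A sketch of the three steps described in Details, run by hand.
# It assumes the package helper knockoff.threshold and mirrors the default
# arguments from the usage section; it illustrates the workflow and is not
# the internal implementation. Because the knockoffs and the cross-validation
# are random, the selection may differ from result$selected above.
Xk = create.second_order(X)                       # 1. create the knockoffs
W = stat.glmnet_coefdiff(X, Xk, y)                # 2. compute the importance statistics
thres = knockoff.threshold(W, fdr=0.1, offset=1)  # 3. compute the selection threshold
print(which(W >= thres))                          # indices of the selected variables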

# Advanced usage with custom arguments
knockoffs = function(X) create.gaussian(X, mu, Sigma)
k_stat = function(X, Xk, y) stat.glmnet_coefdiff(X, Xk, y, nfolds=5)
result = knockoff.filter(X, y, knockoffs=knockoffs, statistic=k_stat)
print(result$selected)
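
# Fixed-X usage: a sketch on separately simulated data satisfying n >= p,
# as required by create.fixed. The names X2, y2, beta2 and the sizes
# n2 = 200, p2 = 100 are arbitrary choices made for this illustration.
p2=100; n2=200
X2 = matrix(rnorm(n2*p2),n2)
beta2 = 3.5 * (1:p2 \%in\% sample(p2, 15))
y2 = X2 \%*\% beta2 + rnorm(n2)
result_fixed = knockoff.filter(X2, y2, knockoffs=create.fixed)
print(result_fixed$selected)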


}
\references{
Candes et al., Panning for Gold: Model-free Knockoffs for High-dimensional Controlled Variable Selection,
  arXiv:1610.02351 (2016).
  \href{https://web.stanford.edu/group/candes/knockoffs/index.html}{https://web.stanford.edu/group/candes/knockoffs/index.html}
  
  Barber and Candes,
  Controlling the false discovery rate via knockoffs. 
  Ann. Statist. 43 (2015), no. 5, 2055--2085.
  \href{https://projecteuclid.org/euclid.aos/1438606853}{https://projecteuclid.org/euclid.aos/1438606853}
}
