% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/single_marker_test.R
\name{lma}
\alias{lma}
\title{Single marker association analysis using linear models or linear mixed models}
\usage{
lma(
  y = NULL,
  X = NULL,
  W = NULL,
  Glist = NULL,
  fit = NULL,
  statistic = "mastor",
  ids = NULL,
  rsids = NULL,
  msize = 100,
  scale = TRUE
)
}
\arguments{
\item{y}{vector or matrix of phenotypes}

\item{X}{design matrix for factors modeled as fixed effects}

\item{W}{matrix of centered and scaled genotypes}

\item{Glist}{list of information about genotype matrix stored on disk}

\item{fit}{list of information about linear mixed model fit (output from greml)}

\item{statistic}{single marker test statistic used (currently based on the "mastor" statistic)}

\item{ids}{vector of individuals used in the analysis}

\item{rsids}{vector of marker rsids used in the analysis}

\item{msize}{number of genotype markers used for batch processing}

\item{scale}{logical; if TRUE, the genotypes have been scaled to mean zero and variance one}
}
\value{
Returns a data frame (if the number of traits is 1), otherwise a list, including
\item{coef}{single marker coefficients}
\item{se}{standard error of coefficients}
\item{stat}{single marker test statistic}
\item{p}{p-value}
}
\description{
The function lma performs single marker association analysis between genotype markers and the phenotype
either based on linear model analysis (LMA) or mixed linear model analysis (MLMA).

The basic MLMA approach involves 1) building a genetic relationship matrix (GRM) that models genome-wide
sample structure, 2) estimating the contribution of the GRM to phenotypic variance using a random effects model
(with or without additional fixed effects) and 3) computing association statistics that account for this component
of phenotypic variance.

MLMA is the method of choice when conducting association mapping in the presence of sample structure,
including geographic population structure, family relatedness and/or cryptic relatedness. MLMA methods help prevent
false positive associations and can increase power. The general recommendation when using MLMA is to exclude candidate
markers from the GRM. This can be efficiently implemented via a leave-one-chromosome-out analysis.
Further, it is recommended that analyses of randomly ascertained quantitative traits include all markers
(except for the candidate marker and markers in LD with the candidate marker) in the GRM, with three exceptions.
First, the set of markers included in the GRM can be pruned by LD to reduce running time (with association
statistics still computed for all markers). Second, genome-wide significant markers of large effect should be
conditioned out as fixed effects, or as an additional random effect if there is a large number of associated markers.
Third, when population stratification is less of a concern, it may be useful to use the top associated markers,
selected based on the global maximum of out-of-sample predictive accuracy.
}
\examples{

# Simulate data
W <- matrix(rnorm(1000000), ncol = 1000)
colnames(W) <- as.character(1:ncol(W))
rownames(W) <- as.character(1:nrow(W))
y <- rowSums(W[, 1:10]) + rowSums(W[, 501:510]) + rnorm(nrow(W))

# Create model
data <- data.frame(y = y, mu = 1)
fm <- y ~ 0 + mu
X <- model.matrix(fm, data = data)

# Linear model analyses and single marker association test
maLM <- lma(y = y, X = X, W = W)

head(maLM)
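
# Illustrative cross-check in base R (a sketch only, not part of lma itself):
# regress y on the first marker with lm() and inspect the marker effect.
# The object name 'chk' is introduced here purely for illustration. The result
# can be compared informally with the corresponding row of maLM; the two need
# not match exactly, since lma may centre and scale the genotypes differently.
chk <- summary(lm(y ~ W[, 1]))$coefficients
chk[2, ]  # estimate, standard error, t value and p-value for marker 1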

\donttest{
# Compute GRM
GRM <- grm(W = W)

# Estimate variance components using REML analysis
fit <- greml(y = y, X = X, GRM = list(GRM), verbose = TRUE)

# Single marker association test
maMLM <- lma(fit = fit, W = W)

head(maMLM)

}

}
\references{
Chen, W. M., & Abecasis, G. R. (2007). Family-based association tests for genomewide association scans. The American Journal of Human Genetics, 81(5), 913-926.

Loh, P. R., Tucker, G., Bulik-Sullivan, B. K., Vilhjalmsson, B. J., Finucane, H. K., Salem, R. M., ... & Patterson, N. (2015). Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nature genetics, 47(3), 284-290.

Kang, H. M., Sul, J. H., Zaitlen, N. A., Kong, S. Y., Freimer, N. B., Sabatti, C., & Eskin, E. (2010). Variance component model to account for sample structure in genome-wide association studies. Nature genetics, 42(4), 348-354.

Lippert, C., Listgarten, J., Liu, Y., Kadie, C. M., Davidson, R. I., & Heckerman, D. (2011). FaST linear mixed models for genome-wide association studies. Nature methods, 8(10), 833-835.

Listgarten, J., Lippert, C., Kadie, C. M., Davidson, R. I., Eskin, E., & Heckerman, D. (2012). Improved linear mixed models for genome-wide association studies. Nature methods, 9(6), 525-526.

Listgarten, J., Lippert, C., & Heckerman, D. (2013). FaST-LMM-Select for addressing confounding from spatial structure and rare variants. Nature genetics, 45(5), 470-471.

Lippert, C., Quon, G., Kang, E. Y., Kadie, C. M., Listgarten, J., & Heckerman, D. (2013). The benefits of selecting phenotype-specific variants for applications of mixed models in genomics. Scientific reports, 3.

Zhou, X., & Stephens, M. (2012). Genome-wide efficient mixed-model analysis for association studies. Nature genetics, 44(7), 821-824.

Svishcheva, G. R., Axenovich, T. I., Belonogova, N. M., van Duijn, C. M., & Aulchenko, Y. S. (2012). Rapid variance components-based method for whole-genome association analysis. Nature genetics, 44(10), 1166-1170.

Yang, J., Zaitlen, N. A., Goddard, M. E., Visscher, P. M., & Price, A. L. (2014). Advantages and pitfalls in the application of mixed-model association methods. Nature genetics, 46(2), 100-106.

Bulik-Sullivan, B. K., Loh, P. R., Finucane, H. K., Ripke, S., Yang, J., Patterson, N., ... & Schizophrenia Working Group of the Psychiatric Genomics Consortium. (2015). LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature genetics, 47(3), 291-295.
}
\author{
Peter Soerensen
}
