% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fullRepertoire.R
\name{fullRepertoire}
\alias{fullRepertoire}
\title{Simulates full heavy chain antibody repertoires for either humans or mice.}
\usage{
fullRepertoire(max.seq.num, max.timer, SHM.method, baseline.mut,
  SHM.branch.prob, SHM.branch.param, SHM.nuc.prob, species, VDJ.branch.prob,
  proportion.sampled, sample.time, max.tree.num, chain.type, vdj.model,
  vdj.insertion.mean, vdj.insertion.stdv)
}
\arguments{
\item{max.seq.num}{The maximum number of tips allowed at the end of the simulation. The simulation
will end when either this or the max.timer is reached. Note - this function does not take clonal
frequency into account. This parameter resembles the species richness, or the measure of unique sequences
in the repertoire.}

\item{max.timer}{The maximum number of time steps allowed during the simulation. The simulation
will end when either this or the max.seq.num is reached.}

\item{SHM.method}{The mode of SHM speciation events. Options are "poisson", "data", "motif", "wrc", and "all". Specifying
"poisson" will result in mutations that can occur anywhere in the heavy chain region, with each nucleotide having an equal probability
for a mutation event. Specifying "data" focuses mutation events during SHM in the CDR regions (based on IMGT), and
there will be an increased probability for transitions (and decreased probability for transversions). Specifying
"motif" will cause neighbor dependent mutations based on a mutational matrix from high throughput sequencing
data sets (Yaari et al., Frontiers in Immunology, 2013). "wrc" allows for only the WRC mutational hotspots
to be included (where W equals A or T and R equals A or G). Specifying "all" will use all four types of mutations
during SHM branching events, where the weights for each can be specified in the "SHM.nuc.prob" parameter.}

\item{baseline.mut}{Specifies the probability (gamma) for each nucleotide to be mutated between speciation events. These
mutations do not cause any branching events. This parameter gives each site a probability to be mutated
(in all current sequences) at each time step. Currently these are only Poisson distributed but future
releases will change it to allow for other mutation methods.}

\item{SHM.branch.prob}{Specifies the probability for a given sequence to undergo SHM events (thus, branching events).
The values supplied depend on the chosen branching distribution. For "identical" only one value
should be supplied. For "uniform", a vector of length 3 should be specified corresponding to n,min,max respectively
(stats::runif(n, min = 0, max = 1)). For "exponential", a single value controlling the rate parameter (from stats::rexp()) should be supplied. For "lognorm"
a vector of length two should be supplied, with the first value corresponding to meanlog and the second
corresponding to sdlog (from stats::rlnorm). Similarly, for "normal" distribution, two values corresponding to
the mean and standard deviation (respectively) should be supplied.}

\item{SHM.branch.param}{Describes the probability of undergoing SHM events. This parameter is responsible for
describing how likely each sequence will undergo branching events in the phylogeny. The following options are
possible: "identical", "uniform", "exponential" ("exp"), "lognormal" ("lognorm"), "normal" ("norm").}

\item{SHM.nuc.prob}{Specifies the rate at which nucleotides change during speciation (SHM) events. This parameter
depends on the type of mutation specified by SHM.method. For both "poisson" and "data", the input value determines the probability
for each site to mutate (the whole sequence for "poisson" and the CDRs for "data"). For either "motif" or "wrc", the number of
mutations per speciation event should be specified. Note that these are not probabilities, but the number of mutations
that can occur (if the mutation is present in the sequence). If "all" is specified, the input should be a vector
where the first element controls the poisson style mutations, second controls the "data", third controls the "motif"
and fourth controls the "wrc".}

\item{species}{Either "mus" for C57BL/6 germline genes or "hum" for human germline genes. These genes were
taken from IMGT. When more than one allele was present for a given gene, the first was used.}

\item{VDJ.branch.prob}{The probability of a new VDJ recombination event occurring. For the singleLineage function,
this will result in a branching event at the site of the unmutated germline. For the fullRepertoire function, this
will cause a new tree to begin.}

\item{proportion.sampled}{Value between 0 and 1 specifying the proportion of sequences to be sampled at each time point.
Specifying 1 indicates that all sequences will be recovered at each time point, whereas 0.5 will sample half of the
sequences.}

\item{sample.time}{Integer array indicating the time points at which sampling events should occur.}

\item{max.tree.num}{Integer value describing the maximum number of trees allowed
to generate the core sequences of the repertoire. Each of these trees is started
by an independent VDJ recombination event.}

\item{chain.type}{String determining whether heavy or light chain should
be simulated. Either "heavy" for heavy chains or "light" for light
chains. Heavy chains will have V-D-J recombination, whereas light chains
will have only V-J recombination.}

\item{vdj.model}{Specifies the model used to simulate V-D-J recombination. Can be
either "naive" or "data". "naive" is chain independent and does not differentiate
between different species. To rely on the default "experimental" options, this
should be "data" and the parameter vdj.insertion.mean should be "default". This will
allow for different mean additions for the VD and DJ junctions and will
differ depending on species.}

\item{vdj.insertion.mean}{Integer value describing the mean number of nucleotides to be inserted during
simulated V-D-J recombination events. If "default" is entered, the mean will be normally distributed.}

\item{vdj.insertion.stdv}{Integer value describing the standard deviation corresponding to insertions
of V-D-J recombination. No "default" parameter currently supported but will be updated
with future experimental data. This should be a number if using a custom distribution
for V-D-J recombination events, but can be "default" if using the "naive" vdj.model
or the "data", with vdj.insertion.mean set to "default".}
}
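\section{Sketch: branching distributions}{
The branching distributions listed above map onto standard draws from the stats package. The sketch below
only illustrates that mapping. The helper draw_branch_prob and its two arguments (dist, param) are hypothetical
names introduced here and are not part of this package; the package's internal implementation may differ.
\preformatted{
# Hypothetical helper (an assumption for illustration, not package code).
# dist  : name of the distribution, e.g. "uniform" or "lognorm"
# param : numeric value(s) for that distribution, as described above
draw_branch_prob <- function(dist, param) {
  switch(dist,
    identical   = param,
    uniform     = stats::runif(param[1], min = param[2], max = param[3]),
    exponential = ,
    exp         = stats::rexp(1, rate = param),
    lognormal   = ,
    lognorm     = stats::rlnorm(1, meanlog = param[1], sdlog = param[2]),
    normal      = ,
    norm        = stats::rnorm(1, mean = param[1], sd = param[2]),
    stop("unknown distribution: ", dist)
  )
}

set.seed(1)
draw_branch_prob("identical", 0.05)            # a single fixed value
draw_branch_prob("uniform", c(5, 0.01, 0.1))   # n = 5 draws between 0.01 and 0.1
draw_branch_prob("lognorm", c(-3, 0.5))        # meanlog = -3, sdlog = 0.5
}
Draws from "normal" or "lognormal" are not bounded to the interval from 0 to 1, so values intended as
probabilities may need to be checked by the caller.
}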
\value{
Returns a nested list. output[[1]][[1]] is an array of the simulated sequences, and
output[[2]][[1]] is an array of names corresponding to each sequence. For example, output[[2]][[1]][1]
is the name of the sequence corresponding to output[[1]][[1]][1]. The simulated tree is found in
output[[3]][[1]]. The length of the output list is determined by the number of sampling points.
Thus, if you have two sampling points, output[[4]][[1]] would be a character array holding the sequences
recovered at the first sampling point, with output[[5]][[1]] as a character array holding the corresponding names.
The sequences recovered at the second sampling point would be stored at output[[6]][[1]], with the names at
output[[7]][[1]]. This nested list was designed for full antibody repertoire simulations and thus may seem
unintuitive for the single lineage function. The first sequence and name correspond to the germline sequence
that served as the root of the tree. See the vignette for a comprehensive example.
}
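\section{Sketch: navigating the returned list}{
The indexing described in the Value section can be easier to follow on a small object. The sketch below builds
a mock list by hand with the same layout for a single sampling point. The sequences, names and tree
placeholder are invented for illustration and were not produced by this function. The count of sampling
points assumes the layout of three leading elements plus two elements per sampling point.
\preformatted{
# Mock object mimicking the documented layout (all values are made up).
mock <- list(
  list(c("CAGGTT", "CAGGTA", "CAGCTA")),   # [[1]][[1]] simulated sequences
  list(c("germline", "seq_1", "seq_2")),   # [[2]][[1]] matching names
  list("tree placeholder"),                # [[3]][[1]] simulated tree
  list(c("CAGGTA")),                       # [[4]][[1]] sequences, first sampling point
  list(c("seq_1"))                         # [[5]][[1]] names, first sampling point
)

# Pair each sequence with its name; the first entry is the germline root.
seqs <- stats::setNames(mock[[1]][[1]], mock[[2]][[1]])
seqs[["germline"]]

# Number of sampling points implied by the list length (assumed layout).
n_samples <- (length(mock) - 3) / 2

# Sequences and names recovered at sampling point k.
k <- 1
sampled <- stats::setNames(mock[[2 + 2 * k]][[1]], mock[[3 + 2 * k]][[1]])
}
}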
\description{
Simulates full heavy chain antibody repertoires for either humans or mice.
}
\seealso{
singleLineage
}
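\section{Sketch: locating WRC hotspots}{
The "wrc" option of SHM.method targets WRC motifs, where W is A or T and R is A or G. The sketch below shows
one way to locate such motifs in a nucleotide string with a base R regular expression. The helper find_wrc is
a hypothetical name introduced here; it is not part of this package and does not reproduce how the package
selects or mutates hotspots.
\preformatted{
# Hypothetical helper (an assumption for illustration, not package code).
# Returns the 1-based start positions of every WRC motif in seq.
find_wrc <- function(seq) {
  hits <- gregexpr("[AT][AG]C", toupper(seq))[[1]]
  pos <- as.integer(hits)
  pos[pos > 0]   # gregexpr reports -1 when there is no match
}

find_wrc("AACGGTGCTTT")   # motifs AAC and TGC
find_wrc("GGGGGG")        # no motif present
}
Two WRC motifs cannot overlap, because the C that ends one motif can never serve as the W or R of another,
so a single left-to-right scan finds all of them.
}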

