% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pmi_svd.R
\name{get_svd}
\alias{get_svd}
\title{Compute randomized singular value decomposition (rSVD)}
\usage{
get_svd(m_pmi, embedding_dim = 100, svd_rank = embedding_dim * 2)
}
\arguments{
\item{m_pmi}{Pointwise mutual information matrix.}

\item{embedding_dim}{Number of output embedding dimensions requested.}

\item{svd_rank}{Number of SVD dimensions to compute before filtering; defaults
to twice 'embedding_dim'.}
}
\value{
Rectangular matrix of SVD embeddings.
}
\description{
Randomized SVD is an efficient approximation of truncated SVD, in which only
the first principal components are returned. It is computed with the rsvd
package, whose author suggests that the number of dimensions requested k
should satisfy k < n / 4, where n is the number of features, for it to be
efficient; otherwise, either SVD or truncated SVD should be used instead.

When computing SVD on PMI, we only want to use the singular values
corresponding to the positive eigenvalues. We do not know beforehand how
many will have to be filtered out, so there are two parameters:
'embedding_dim' for the requested output dimension, and 'svd_rank' for the
actual SVD computation, which defaults to twice the requested dimension. A
warning may be thrown if 'svd_rank' needs to be manually increased.

Computation may be expensive, and manually optimizing the 'svd_rank'
parameter might save significant time.
}
\examples{
df_ehr = data.frame(Patient = c(1, 1, 2, 1, 2, 1, 1, 3, 4),
                    Month = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
                    Parent_Code = c('C1', 'C2', 'C2', 'C1', 'C1', 'C1',
                                    'C2', 'C3', 'C4'),
                    Count = 1:9)

spm_cooc = build_df_cooc(df_ehr)

m_pmi = get_pmi(spm_cooc)
m_svd = get_svd(m_pmi, embedding_dim = 2)
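
# A minimal sketch of choosing 'svd_rank' by hand, assuming 'm_pmi' has one
# column per feature. The names 'n_features' and 'k' are illustrative only.
n_features = ncol(m_pmi)

# Same value as the default for embedding_dim = 2 (embedding_dim * 2)
k = 2 * 2

# Guideline suggested by the rsvd author: k < n / 4. A toy matrix this small
# is not expected to meet it; the check is shown for illustration only.
k < n_features / 4

# Passing 'svd_rank' explicitly; with k = 4 this matches the default call above
m_svd_explicit = get_svd(m_pmi, embedding_dim = 2, svd_rank = k)
dim(m_svd_explicit)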

}
