% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/correlations.R
\name{corr}
\alias{corr}
\title{Correlation table}
\usage{
corr(
  df,
  method = "pearson",
  use = "pairwise.complete.obs",
  pvalue = FALSE,
  padjust = NULL,
  half = FALSE,
  dec = 6,
  ignore = NULL,
  dummy = TRUE,
  redundant = NULL,
  logs = FALSE,
  limit = 10,
  top = NA,
  ...
)
}
\arguments{
\item{df}{Dataframe. It may contain non-numerical columns: they will be
encoded (see \code{dummy}) or filtered out.}

\item{method}{Character. Any of: c("pearson", "kendall", "spearman").}

\item{use}{Character. Method for computing covariances in the presence
of missing values. Check \code{stats::cor} for options.}

\item{pvalue}{Boolean. If \code{TRUE}, returns a list with the correlations
and the statistical significance (p-value) of each one.}

\item{padjust}{Character. \code{NULL} to skip, or any of
\code{p.adjust.methods} to calculate adjusted p-values for multiple
comparisons using \code{p.adjust()}.}

\item{half}{Boolean. Return only half of the matrix? The redundant
symmetrical correlations will be \code{NA}.}

\item{dec}{Integer. Number of decimals to round correlations and p-values.}

\item{ignore}{Vector or character. Which column(s) should be ignored?}

\item{dummy}{Boolean. Should One Hot (Smart) Encoding (\code{ohse()})
be applied to categorical columns?}

\item{redundant}{Boolean. Should we keep redundant columns? For example,
if a column only has two different values, should we keep both new columns?
If set to \code{NULL}, only binary variables will drop their redundant
columns.}

\item{logs}{Boolean. Calculate log(x)+1 for numerical columns?}

\item{limit}{Integer. Limit one hot encoding to the n most frequent
values of each column. Set to \code{NA} to ignore argument.}

\item{top}{Integer. Select top N most relevant variables? Filtered
and sorted by mean of each variable's correlations.}

\item{...}{Additional parameters passed to \code{ohse}, \code{corr},
and/or \code{cor.test}.}
}
\value{
data.frame. Square (N x N) table holding the correlation between every
pair of columns/variables in \code{df}. Note that when using
\code{ohse()} you may get more dimensions.
}
\description{
This function correlates a whole dataframe, running one hot smart
encoding (\code{ohse}) to transform non-numerical features.
Note that it automatically drops columns with fewer than 3
non-missing values and warns the user.
}
\examples{
data(dft) # Titanic dataset
df <- dft[, 2:5]

# Correlation matrix (without redundancy)
corr(df, half = TRUE)

# Ignore specific column
corr(df, ignore = "Pclass")

# Calculate p-values as well
corr(df, pvalue = TRUE, limit = 1)
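
# Adjust p-values for multiple comparisons as well.
# A minimal sketch: "bonferroni" is only an illustrative choice here;
# any value listed in stats::p.adjust.methods should be accepted.
corr(df, pvalue = TRUE, padjust = "bonferroni", limit = 1)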

# Test a column with fewer than 3 non-missing values (dropped with a warning)
df$trash <- c(1, rep(NA, nrow(df) - 1))
# and another method...
corr(df, method = "spearman")
}
\seealso{
Other Calculus: 
\code{\link{dist2d}()},
\code{\link{model_metrics}()},
\code{\link{quants}()}

Other Correlations: 
\code{\link{corr_cross}()},
\code{\link{corr_var}()}
}
\concept{Calculus}
\concept{Correlations}
