% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/05_STS_BINNING.R
\name{sts.bin}
\alias{sts.bin}
\title{Four-stage monotonic binning procedure with statistical test correction}
\usage{
sts.bin(
  x,
  y,
  sc = c(NA, NaN, Inf),
  sc.method = "together",
  y.type = NA,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  p.val = 0.05,
  force.trend = NA
)
}
\arguments{
\item{x}{Numeric vector to be binned.}

\item{y}{Numeric target vector (binary or continuous).}

\item{sc}{Numeric vector of special case values. Default is \code{c(NA, NaN, Inf)}.
It is recommended to always keep the default values and add new ones if needed. If any of these values exist
in \code{x} but are not defined in the \code{sc} vector, the function will report an error.}

\item{sc.method}{Defines how special cases are treated: all together in one bin or in separate bins.
Possible values are \code{"together"} and \code{"separately"}.}

\item{y.type}{Type of \code{y}. Possible values are \code{"bina"} (binary) and \code{"cont"} (continuous).
If the default value (\code{NA}) is passed, the algorithm will identify whether \code{y} is a 0/1 or a continuous variable.}

\item{min.pct.obs}{Minimum percentage of observations per bin. Default is 0.05, with a minimum of 30 observations.}

\item{min.avg.rate}{Minimum average rate of \code{y} per bin. Default is 0.01, with a minimum of 1 bad case for a binary (0/1) \code{y}.}

\item{p.val}{Threshold for the p-value of the statistical test. Default is 0.05. For a binary target, a test of two
proportions is applied; for a continuous target, an independent two-sample t-test is applied.}

\item{force.trend}{Whether the expected trend should be forced. Possible values are \code{"i"} for an
increasing trend (\code{y} increases as \code{x} increases) and \code{"d"} for a decreasing trend
(\code{y} decreases as \code{x} increases). Default value is \code{NA}.
If the default value is passed, the trend is identified from the sign of the Spearman correlation
coefficient between \code{x} and \code{y} on complete cases.}
}
\value{
The command \code{sts.bin} generates a list of two objects. The first object, the data frame \code{summary.tbl},
presents a summary table of the final binning, while the second, \code{x.trans}, is a vector of discretized values.
If \code{x} or \code{y} has a single unique value among complete cases (cases other than special cases),
the function returns a data frame with information about that condition instead.
}
\description{
\code{sts.bin} implements an extension of the three-stage monotonic binning procedure (\code{\link{iso.bin}})
with a final step of iterative merging of adjacent bins based on
a statistical test.
}
\examples{
suppressMessages(library(monobin))
data(gcd)
#binary target
maturity.bin <- sts.bin(x = gcd$maturity, y = gcd$qual)
maturity.bin[[1]]
tapply(gcd$qual, maturity.bin[[2]], function(x) c(length(x), sum(x), mean(x)))
prop.test(x = c(sum(gcd$qual[maturity.bin[[2]] \%in\% "01 [4,8)"]),
                sum(gcd$qual[maturity.bin[[2]] \%in\% "02 [8,16)"])),
          n = c(length(gcd$qual[maturity.bin[[2]] \%in\% "01 [4,8)"]),
                length(gcd$qual[maturity.bin[[2]] \%in\% "02 [8,16)"])),
          alternative = "less",
          correct = FALSE)$p.value
#continuous target
age.bin <- sts.bin(x = gcd$age, y = gcd$qual, y.type = "cont")
age.bin[[1]]
t.test(x = gcd$qual[age.bin[[2]]\%in\%"01 [19,26)"], 
    y = gcd$qual[age.bin[[2]]\%in\%"02 [26,35)"],
    alternative = "greater")$p.value
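# Illustrative sketch, not part of the original examples: the same binary-target
# call with an explicitly forced increasing trend and a stricter p-value
# threshold. Only arguments documented above are used; the values "i" and 0.01
# are arbitrary choices for demonstration, and the object name is hypothetical.
maturity.bin.i <- sts.bin(x = gcd$maturity, y = gcd$qual,
                          force.trend = "i", p.val = 0.01)
maturity.bin.i[[1]]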

}
\seealso{
\code{\link{iso.bin}} for the three-stage monotonic binning procedure.
}
