% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/DROP.R
\name{DROP}
\alias{DROP}
\alias{DROP1}
\alias{DROP1.default}
\alias{DROP1.formula}
\alias{DROP2}
\alias{DROP2.default}
\alias{DROP2.formula}
\alias{DROP3}
\alias{DROP3.default}
\alias{DROP3.formula}
\title{Decremental Reduction Optimization Procedures}
\usage{
\method{DROP1}{formula}(formula, data, ...)

\method{DROP1}{default}(x, k = 1, classColumn = ncol(x), ...)

\method{DROP2}{formula}(formula, data, ...)

\method{DROP2}{default}(x, k = 1, classColumn = ncol(x), ...)

\method{DROP3}{formula}(formula, data, ...)

\method{DROP3}{default}(x, k = 1, classColumn = ncol(x), ...)
}
\arguments{
\item{formula}{A formula describing the classification variable and the attributes to be used.}

\item{data, x}{Data frame containing the training dataset to be filtered.}

\item{...}{Optional parameters to be passed to other methods.}

\item{k}{Number of nearest neighbors to be used.}

\item{classColumn}{Positive integer indicating the column that contains the
class labels (a factor). By default, the last column is used.}
}
\value{
An object of class \code{filter}, which is a list with seven components:
\itemize{
   \item \code{cleanData} is a data frame containing the filtered dataset.
   \item \code{remIdx} is a vector of integers indicating the indexes for
   removed instances (i.e. their row number with respect to the original data frame).
   \item \code{repIdx} is a vector of integers indicating the indexes for
   repaired/relabelled instances (i.e. their row number with respect to the original data frame).
   \item \code{repLab} is a factor containing the new labels for repaired instances.
   \item \code{parameters} is a list containing the argument values.
   \item \code{call} contains the original call to the filter.
   \item \code{extraInf} is a character string with additional
   information not covered by the previous items.
}
}
\description{
Similarity-based filters for removing label noise from a dataset as a
preprocessing step of classification. For more information, see the 'Details'
and 'References' sections.
}
\details{
\code{DROP1} goes over the dataset in the provided order, and removes those
instances whose removal does not decrease the accuracy of the 1-NN rule in
the remaining dataset.

\code{DROP2} introduces two modifications with respect to \code{DROP1}. First,
it changes the processing order: \code{DROP2} starts with the instances that are
furthest from their nearest "enemy" (two instances are "enemies" if they
belong to different classes). Second, \code{DROP2} removes an
instance if its removal does not decrease the accuracy of the 1-NN rule in
the \emph{original} dataset (rather than the \emph{remaining} dataset as in
\code{DROP1}).

\code{DROP3} is identical to \code{DROP2}, but it adds a preprocessing
step to clean the borders between classes. This step applies the
\code{\link{ENN}} method: any instance misclassified by its k nearest
neighbors is removed.
}
\examples{
# The next example is not run in order to save time
\dontrun{
data(iris)
trainData <- iris[c(1:20, 51:70, 101:120), ]
out1 <- DROP1(Species ~ Petal.Length + Petal.Width, data = trainData)
summary(out1, explicit = TRUE)
identical(out1$cleanData, trainData[setdiff(1:nrow(trainData), out1$remIdx), ])
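
# Hedged sketch of the default (non-formula) interface shown in 'Usage'.
# It assumes the class labels sit in the last column, which is what the
# default classColumn = ncol(x) expects; 'x' is just an illustrative name.
x <- trainData[, c("Petal.Length", "Petal.Width", "Species")]
out2 <- DROP2(x, k = 1, classColumn = ncol(x))
out3 <- DROP3(x, k = 1, classColumn = ncol(x))
# Compare how many instances each variant removes.
sapply(list(DROP1 = out1, DROP2 = out2, DROP3 = out3),
       function(o) length(o$remIdx))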
}
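
# Illustrative base-R sketch, not the package's internal code: the distance
# from each instance to its nearest "enemy", the quantity DROP2 is described
# as using to order instances. 'nearestEnemyDist' is a hypothetical helper,
# and Euclidean distance is an assumption made here.
nearestEnemyDist <- function(x, cl) {
  d <- as.matrix(dist(x))  # pairwise Euclidean distances
  # For each row, keep only instances of a different class and take the minimum
  vapply(seq_len(nrow(d)), function(i) min(d[i, cl != cl[i]]), numeric(1))
}
rows <- c(1:5, 51:55)
toy <- iris[rows, c("Petal.Length", "Petal.Width")]
ned <- nearestEnemyDist(toy, iris$Species[rows])
# Instances furthest from their nearest enemy come first
order(ned, decreasing = TRUE)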
}
\references{
Wilson D. R., Martinez T. R. (2000): Reduction techniques for
instance-based learning algorithms. \emph{Machine Learning}, 38(3), 257-286.

Wilson D. R., Martinez T. R. (1997, July): Instance pruning techniques. In
\emph{ICML} (Vol. 97, pp. 403-411).
}

