% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/Functions.R
\name{SelectnrClusters}
\alias{SelectnrClusters}
\title{Determines an optimal number of clusters based on silhouette widths}
\usage{
SelectnrClusters(List, type = c("data", "dist", "pam"),
  distmeasure = c("tanimoto", "tanimoto"), normalize = c(FALSE, FALSE),
  method = c(NULL, NULL), nrclusters = seq(5, 25, 1), names = NULL,
  StopRange = FALSE, plottype = "new", location = NULL)
}
\arguments{
\item{List}{A list of data matrices. The rows are assumed to correspond to the objects.}

\item{type}{Indicates whether the matrices provided in "List" are data matrices, distance
matrices or clustering results obtained from the data. If type="dist" the calculation of the distance
matrices is skipped and if type="pam" the single source clustering is skipped.
Type should be one of "data", "dist" or "pam".}

\item{distmeasure}{A vector of the distance measures to be used on each data matrix. Should be one of "tanimoto", "euclidean", "jaccard", "hamming". Defaults to c("tanimoto","tanimoto").}

\item{normalize}{Logical. Indicates whether to normalize the distance matrices or not, defaults to c(FALSE, FALSE) for two data sets. This is recommended if different distance types are used. More details on normalization in \code{Normalization}.}

\item{method}{A method of normalization. Should be one of "Quantile","Fisher-Yates", "standardize","Range" or any of the first letters of these names. Default is c(NULL,NULL) for two data sets.}

\item{nrclusters}{A sequence of numbers of clusters to evaluate. Default is the sequence from 5 to 25 in steps of 1.}

\item{names}{The labels to give to the elements in List. Default is NULL.}

\item{StopRange}{Logical. Indicates whether distance matrices with
values outside the range from zero to one should be left as they are. If
FALSE, range normalization is performed so that the values lie between zero
and one (see \code{Normalization}). This is recommended if different types of
data are used, so that these are comparable. If TRUE, the distance matrices
are not changed. Default is FALSE.}

\item{plottype}{Should be one of "pdf","new" or "sweave". If "pdf", a
location should be provided in "location" and the figure is saved there. If
"new" a new graphic device is opened and if "sweave", the figure is made
compatible to appear in a sweave or knitr document, i.e. no new device is
opened and the plot appears in the current device or document. Default is "new".}

\item{location}{If plottype is "pdf", a location should be provided in
"location" and the figure is saved there. Default is NULL.}
}
\value{
A plot is made showing the average silhouette widths of the
provided objects for each number of clusters. Further, a list with two
elements is returned: \item{Silhouette_Widths}{A data frame with the
silhouette widths for each object and the average silhouette widths per
number of clusters} \item{Optimal_Nr_of_CLusters}{The determined optimal
number of clusters}
}
\description{
The function \code{SelectnrClusters} determines an optimal number of
clusters by calculating silhouette widths for a sequence of numbers of clusters.
A more elaborate description is given in the next paragraph.

If the objects provided in List are data or distance matrices, clustering
around medoids is performed with the \code{pam} function of the
\pkg{cluster} package. From the resulting pam objects, the average silhouette
widths are retrieved. A silhouette width represents how well an object fits
in its current cluster. Values around one are an indication of an
appropriate clustering, while values around zero show that the object might
as well lie in the neighbouring cluster. The average silhouette width is a
measure of how tightly grouped the data are. It is computed for every
number of clusters and for every object provided in List. Then, for every
number of clusters, the average is taken over the provided objects. This results
in one average value per number of clusters. The number with the maximal
average silhouette width is chosen as the optimal number of clusters.
}
\examples{
\dontrun{
data(fingerprintMat)
data(targetMat)

L=list(fingerprintMat,targetMat)

NrClusters=SelectnrClusters(List=L,type="data",distmeasure=c("tanimoto",
"tanimoto"),nrclusters=seq(5,10),normalize=c(FALSE,FALSE),method=c(NULL,NULL),
names=c("FP","TP"),StopRange=FALSE,plottype="new",location=NULL)

NrClusters
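
# A minimal sketch of the idea described above for a single distance matrix.
# This is an illustration only, not the internal code of SelectnrClusters.
# The toy data and the names toy, d, ks and avgsil are assumptions made
# for this example.
library(cluster)
set.seed(1)
toy <- matrix(rnorm(200), nrow = 40)   # 40 objects in the rows, 5 variables
d <- dist(toy)                         # Euclidean distances between objects
ks <- 2:6                              # candidate numbers of clusters
# Run pam for each candidate and keep its average silhouette width
avgsil <- sapply(ks, function(k) pam(d, k = k, diss = TRUE)$silinfo$avg.width)
# The candidate with the maximal average silhouette width
ks[which.max(avgsil)]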
}
}
