% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/subtyping-omics-data.R
\name{SubtypingOmicsData}
\alias{SubtypingOmicsData}
\title{Subtyping multi-omics data}
\usage{
SubtypingOmicsData(
  dataList,
  kMax = 5,
  agreementCutoff = 0.5,
  ncore = 1,
  verbose = T,
  ...
)
}
\arguments{
\item{dataList}{a list of data matrices. Each matrix represents a data type where the rows are items and the columns are features. The matrices must have the same set of items.}

\item{kMax}{the maximum number of clusters. Default value is \code{5}.}

\item{agreementCutoff}{minimum agreement between data types for their partitionings to be considered consistent. Default value is \code{0.5}.}

\item{ncore}{Number of cores that the algorithm should use. Default value is \code{1}.}

\item{verbose}{set it to \code{TRUE} or \code{FALSE} to get more or less details respectively. Default value is \code{TRUE}.}

\item{...}{these arguments will be passed to the \code{PerturbationClustering} algorithm. See details for more information.}
}
\value{
\code{SubtypingOmicsData} returns a list with at least the following components:
\item{cluster1}{A vector of labels indicating the cluster to which each sample is allocated in Stage I}
\item{cluster2}{A vector of labels indicating the cluster to which each sample is allocated in Stage II}
\item{dataTypeResult}{A list of results for the individual data types. Each element of the list is the result of \code{PerturbationClustering} for the corresponding data matrix provided in \code{dataList}.}
}
\description{
Perform subtyping using multiple types of data.
}
\details{
\code{SubtypingOmicsData} implements multi-omics data subtyping based on the Perturbation clustering algorithm of Nguyen et al. (2017) and Nguyen et al. (2019).
The input is a list of data matrices where each matrix represents the molecular measurements of a data type. The input matrices must have the same number of rows, and the rows must correspond to the same set of items.
\code{SubtypingOmicsData} aims to find the optimal number of subtypes and the cluster membership of each sample from the integrated input data \code{dataList} through two processing stages:

1. Stage I: The algorithm first partitions each data type using the function \code{PerturbationClustering}.
It then merges the connectivities across data types into similarity matrices.
Both k-means and a similarity-based clustering algorithm (partitioning around medoids, \code{pam}) are used to partition the resulting similarity matrices.
The algorithm returns the partitioning that agrees the most with individual data types.\cr
2. Stage II: The algorithm attempts to split each discovered group if there is a strong agreement between data types,
or if the subtyping in Stage I is very unbalanced.
}
\examples{
\donttest{
# Load the kidney renal clear cell carcinoma (KIRC) data
data(KIRC)

# Perform subtyping on the multi-omics data
dataList <- list(as.matrix(KIRC$GE), as.matrix(KIRC$ME), as.matrix(KIRC$MI))
names(dataList) <- c("GE", "ME", "MI")
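
# A minimal sanity check (an illustrative sketch, not part of the package API):
# the input matrices are expected to have the same number of rows
stopifnot(length(unique(sapply(dataList, nrow))) == 1)
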
result <- SubtypingOmicsData(dataList = dataList)
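
# A short sketch of inspecting the returned list, assuming only the
# components documented in the Value section
table(result$cluster1)                   # group sizes after Stage I
table(result$cluster2)                   # group sizes after Stage II
table(result$cluster1, result$cluster2)  # how Stage II splits the Stage I groups
length(result$dataTypeResult)            # one result per input data matrix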

# Change the Perturbation clustering algorithm's arguments
result <- SubtypingOmicsData(
    dataList = dataList,
    clusteringMethod = "kmeans",
    clusteringOptions = list(nstart = 50)
)

# Plot the Kaplan-Meier curves and calculate Cox p-value
library(survival)
cluster1 <- result$cluster1
cluster2 <- result$cluster2
# Reuse Stage I colors for unchanged groups; give groups created in Stage II new colors
commonGroups <- intersect(unique(cluster2), unique(cluster1))
newGroups <- setdiff(unique(cluster2), unique(cluster1))
a <- commonGroups
names(a) <- commonGroups
a[as.character(newGroups)] <- seq_along(newGroups) + max(cluster1)
colors <- a[levels(factor(cluster2))]
coxFit <- coxph(
 Surv(time = Survival, event = Death) ~ as.factor(cluster2),
 data = KIRC$survival,
 ties = "exact"
)
mfit <- survfit(Surv(Survival, Death == 1) ~ as.factor(cluster2), data = KIRC$survival)
plot(
 mfit, col = colors,
 main = "Survival curves for KIRC, level 2",
 xlab = "Days", ylab = "Survival", lwd = 2
)
legend(
    "bottomright",
    legend = paste(
        "Cox p-value: ",
        round(summary(coxFit)$sctest[3], digits = 5),
        sep = ""
    )
)
legend(
    "bottomleft",
    fill = colors,
    legend = paste(
        "Group ",
        levels(factor(cluster2)), ": ", table(cluster2)[levels(factor(cluster2))],
        sep = ""
    )
)

}
}
\references{
1. H Nguyen, S Shrestha, S Draghici, T Nguyen. PINSPlus: a tool for tumor subtype discovery in integrated genomic data. Bioinformatics, 35(16):2843-2846, 2019.

2. T Nguyen, R Tagett, D Diaz, S Draghici. A novel method for data integration and disease subtyping. Genome Research, 27(12):2025-2039, 2017.

3. T Nguyen. Horizontal and vertical integration of bio-molecular data. PhD thesis, Wayne State University, 2017.
}
\seealso{
\code{\link{PerturbationClustering}}
}
