COSMONET 0.1.0
This tutorial is a guide on how to run and use the COSMONET R package
. It is available from http://bioinfo.na.iac.cnr.it/cosmonet/.
COSMONET
depends on different R packages such as APML0, survival, ggpurb, igraph, pheamap.
COSMONET
can be installed through the following commands in the R console.
# ## From the command line
# ### 1. Open a terminal window
# ### 2. Go to the directory that contains your package directory.
# ### 3. Install the package
# R CMD INSTALL COSMONET_0.1.tar.gz
#
# ## From local source file
# ### Go to the link http://bioinfo.na.iac.cnr.it/cosmonet/ and download the package from the Repository - R package
# install.packages("COSMONET_0.1.0.tar.gz", repos = NULL, type="source")
COSMONET
to investigate cancer dataThe aim of this section is to provide the users with a general view of the package, including the main functions, and to describe the package’s outputs. First, we upload the COSMONET
R package.
library(COSMONET)
For illustration purposes, we use two examples of cancer data that are included in our package. The data processed by COSMONET
are microarray from GEO database and RNA-seq data from GDC data portal. The aim of the study is to identify the genes and pathways that play a crucial role in the severity of the disease related to the survival of cancer patients.
Users can download the normalized data and survival information directly from the Repository of the website (Repository/Data/Download Data).
For microarray data, we consider expression measurements of two different breast cancer datasets from GEO database: * GSE2034 dataset as training set (Affymetrix Human Genome U133A Array); * GSE2990 dataset as testing set (Affymetrix Human Genome U133A Array).
RMA
(Robust Multichip Average) function from affy R package for within arrays normalization and normalizeBetweenArrays
function from limma R package for between arrays normalization are used for the normalization of each single dataset. The annotation is made by using biomart R package after removing redundant probe sets, i.e. probe sets that measure different regions of the same gene. Generally, the pre-processing of the data are done by the users. The summary data is showed in the table below:
GEO Accession | Platform | Nr. of genes | Nr. of samples | Survival data |
---|---|---|---|---|
GSE2034 | Affymetrix HG-U133A | 13229 | 286 | Relapse Free Survival (RFS) |
GSE2990 | Affymetrix HG-U133A | 13229 | 187 | Relapse Free Survival (RFS) |
In this study, the survival information is relapse-free survival, i.e., the length of time after primary treatment for a cancer ends that the patient survives without any signs or symptoms of that cancer. In other words, measure the relapse-free survival means to see how well a new treatment works. Also called DFS disease-free survival, and RFS.
To investigate the datasets, we first load the data in an R session and then, we perform the normalization between the training and testing set by using NormalizeBetweenData()
function.
# Load the normalized data
load("data/BreastData_Normalized.RData")
# Load survival information: RFS - time (months) and status (alive=0 or death=1)
load("data/BreastData_Survival.RData")
To incorporate the a-priori biological information in the penalized regression model, we need to create the network matrix. In our analysis, to construct the molecular graph, we use CreateNetworkMatrix()
function. The disease code associated with breast cancer is (DOID:1612.
# Load the tissue specific data
library(tictoc)
tic("Run time")
load("data/BreastData_TissueSpecificEdge.rda")
NetworkMatrix <- CreateNetworkMatrix(
x1.norm,
repositoryDisease,
diseaseID = "DOID:1612",
topGenes = 500,
tissueSpecificEdge,
output = TRUE,
message = TRUE
)
## [1] "Your list contains 500 genes. 500 out of 500 are used in the network."
## [1] "There are 43865 edges left."
toc()
## Run time: 297.62 sec elapsed
# Network matrix
Omega <- as.matrix(NetworkMatrix$Omega)
dim(Omega)
## [1] 13229 13229
In this section we present the application of the three screening techniques. We use the function ScreeningMethod()
to identify the screened genes.
# Selection of the BMD-screened genes
cancer.genes <- NetworkMatrix$DiseaseGenes[,2]
screenVars.BMD <- ScreeningMethod(
x1.norm,
cancer.genes,
screening="bmd",
ranking=NULL,
thresh=NULL
)
## [1] "Number of BMD-screened genes: 437"
# Ranking of genes that are highly correlated with the patient's survival
load("data/BreastData_RankingGenes.RData")
head(rankingGenes)
## original.index symbol marg.coeff beta pvals
## 1 2449 CHRNA9 10.437125 -10.437125 2.030443e-02
## 2 3012 CXCL6 9.836375 -9.836375 2.122943e-02
## 3 9444 RACGAP1 8.915556 8.915556 1.484894e-07
## 4 7834 NR2E3 8.896399 -8.896399 8.931928e-05
## 5 2494 CKAP2 8.294733 8.294733 5.502541e-07
## 6 7414 MZB1 8.282109 -8.282109 2.372354e-04
The ranking of genes is obtained applying MarginalCoxRanking()
function. Users can download the file of the ranked genes directly from the website (Repository/Data/Download Data).
# Plots
par(mfrow = c(1, 2))
plot(rankingGenes$marg.coeff, rankingGenes$pvals, col = "blue", pch = 20, cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
# Add the threshold line related to the 50th marginal coefficient (equal to 6.851106)
abline(v=rankingGenes[floor(dim(x1.norm)[1]/log(dim(x1.norm)[1])),3], col="red",lwd=2, lty=2)
plot(rankingGenes$pvals, 1:dim(x1.norm)[2], col = "blue", pch = 20, cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
# Add the threshold line equal to 50 (i.e., the first 50 genes)
abline(h=floor(dim(x1.norm)[1]/log(dim(x1.norm)[1])), col="red",lwd=2, lty=2)
Figure 1: Plots based on the \(p\)-values and marginal coefficients of the ranked cancer genes
These plots give an idea of the number of screened genes that can be involved in the analysis to give significance to the results (in the first plot the significant genes are positioned to the right of the dashed red line; in the second one the genes are are below the dashed red line). In particular, we set the threshold equal to \(\lfloor n/ log n \rfloor=50\), i.e., the 50 first genes as DAD-screened genes.
# Selection of the DAD-screened genes
screenVars.DAD <- ScreeningMethod(
x1.norm,
cancer.genes,
screening="dad",
ranking=rankingGenes,
thresh=floor(dim(x1.norm)[1]/log(dim(x1.norm)[1]))
)
## [1] "Number of DAD-screened genes: 50"
# Union of BMD- and DAD-screeened genes
screenVars.BMD.DAD <- ScreeningMethod(
x1.norm,
cancer.genes,
screening="bmd+dad",
ranking=rankingGenes,
thresh=floor(dim(x1.norm)[1]/log(dim(x1.norm)[1]))
)
## [1] "Number of BMD+DAD-screened genes: 549"
We consider as resulting set the union of the BMD and DAD-screened genes.
Now we are able to apply the penalized methods using the subset of screened variables in order to identify the active regression coefficients and the optimal cutoff (based on the prognostic index) on the training set. We can get reproducible results by fixing \(alpha\) (network influence) and the elements of each folds in the 5-fold cross validation. We perform the complete procedure applying CosmonetTraining()
and SelectOptimalCutoff()
functions.
# Set alpha and lambda parameters
alpha <- 0.5
# lambda <- 10^(seq(0,-2,-0.05))
# For reproducibility of the results
library(caret)
set.seed(2021)
flds <- createFolds(y1$time, k = 5, list = TRUE, returnTrain = FALSE)
foldids <- rep(1,dim(y1)[1])
foldids[flds$Fold2] = 2
foldids[flds$Fold3] = 3
foldids[flds$Fold4] = 4
foldids[flds$Fold5] = 5
# BMD-screening + variable selection
# pdf(file="results/Train_KM_BMD_median.pdf",width=15, height=7)
tic("Run time")
fitTraining.BMD <- CosmonetTraining(
10,
x1.norm,
y1,
screenVars.BMD,
family="Cox",
penalty="Net",
Omega=Omega,
alpha=alpha,
nlambda = 50,
foldid = foldids,
selOptLambda="min",
optCutpoint="median"
)
Figure 2: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 3: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
toc()
## Run time: 11.303 sec elapsed
# dev.off()
# Regression coefficients
beta.BMD <- fitTraining.BMD$beta
# Number of non-zero beta
sum(beta.BMD!=0)
## [1] 40
# write.table(beta.BMD, file = "results/beta_BMD_breast.txt", row.names = T, col.names = F)
# Optimal parameters
pars.BMD <- data.frame(alpha=0.5,opt.lambda=fitTraining.BMD$opt.lambda,opt.cutoff=fitTraining.BMD$opt.cutoff)
pars.BMD
## alpha opt.lambda opt.cutoff
## 1 0.5 0.11899 -7.3697
# Summary tables
df.train.BMD <- fitTraining.BMD$df
head(df.train.BMD)
## sample PI time status 50%
## 1 GSM37048 -8.358512 87 0 2
## 2 GSM36945 -8.352637 92 0 2
## 3 GSM36984 -8.342822 138 0 2
## 4 GSM36852 -8.267668 107 0 2
## 5 GSM36883 -8.254200 88 0 2
## 6 GSM36845 -8.223623 54 0 2
summary.BMD <- fitTraining.BMD$summary
summary.BMD
## Low.Risk High.Risk cutoff p.value
## 50% 143 143 -7.3697 2.82e-14
# write.table(summary.BMD, file = "results/summary_BMD_breast.txt", row.names = T, col.names = T)
# DAD-screening + variable selection
# pdf(file="results/Train_KM_DAD_survCutpoint.pdf",width=15, height=7)
tic("Run time")
fitTraining.DAD <- CosmonetTraining(
10,
x1.norm,
y1,
screenVars.DAD,
family="Cox",
penalty="Net",
Omega=Omega,
alpha=alpha,
nlambda = 50,
foldid = foldids,
selOptLambda="min",
optCutpoint="survCutpoint"
)
Figure 4: Training Phase (DAD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 5: Training Phase (DAD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
toc()
## Run time: 1.624 sec elapsed
# dev.off()
# Regression coefficients
beta.DAD <- fitTraining.DAD$beta
# Number of non-zero beta
sum(beta.DAD!=0)
## [1] 32
# write.table(beta.DAD, file = "results/beta_DAD_breast.txt", row.names = T, col.names = F)
# Optimal parameters
pars.DAD <- data.frame(alpha=0.5,opt.lambda=fitTraining.DAD$opt.lambda,opt.cutoff=fitTraining.DAD$opt.cutoff)
pars.DAD
## alpha opt.lambda opt.cutoff
## 1 0.5 0.04740321 -10.15184
# Summary tables
df.train.DAD <- fitTraining.DAD$df
head(df.train.DAD)
## sample PI time status groupRisk id
## 1 GSM36933 -13.83397 110 0 2 1
## 2 GSM36817 -13.68776 113 0 2 2
## 3 GSM36851 -13.37413 100 0 2 3
## 4 GSM36915 -13.31251 97 0 2 4
## 5 GSM36863 -13.23986 98 0 2 5
## 6 GSM36794 -13.04705 87 0 2 6
summary.DAD <- fitTraining.DAD$summary
summary.DAD
## cutpoint statistic
## PI -10.15184 10.04115
# write.table(summary.DAD, file = "results/summary_DAD_breast.txt", row.names = T, col.names = T)
# BMD+DAD-screening + variable selection
# pdf(file="results/Train_KM_BMD_DAD_median.pdf",width=15, height=7)
tic("Run time")
fitTraining.BMD.DAD <- CosmonetTraining(
10,
x1.norm,
y1,
screenVars.BMD.DAD,
family="Cox",
penalty="Net",
Omega=Omega,
alpha=alpha,
nlambda = 50,
foldid = foldids,
selOptLambda="min",
optCutpoint="median"
)
Figure 6: Training Phase (BMD+DAD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 7: Training Phase (BMD+DAD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
toc()
## Run time: 10.647 sec elapsed
# dev.off()
# Regression coefficients
beta.BMD.DAD <- fitTraining.BMD.DAD$beta
sum(beta.BMD.DAD!=0)
## [1] 67
# write.table(beta.BMD.DAD, file = "results/beta_BMD_DAD_breast.txt", row.names = T, col.names = F)
# Optimal parameters
pars.BMD.DAD <- data.frame(alpha=0.5,opt.lambda=fitTraining.BMD.DAD$opt.lambda,opt.cutoff=fitTraining.BMD.DAD$opt.cutoff)
pars.BMD.DAD
## alpha opt.lambda opt.cutoff
## 1 0.5 0.08331165 -23.89154
# Summary tables
df.train.BMD.DAD <- fitTraining.BMD.DAD$df
head(df.train.BMD.DAD)
## sample PI time status 50%
## 1 GSM36868 -26.13637 123 0 2
## 2 GSM36863 -26.03310 98 0 2
## 3 GSM37059 -25.95335 83 0 2
## 4 GSM36984 -25.90174 138 0 2
## 5 GSM36780 -25.86187 84 0 2
## 6 GSM36821 -25.83046 88 0 2
summary.BMD.DAD <- fitTraining.BMD.DAD$summary
summary.BMD.DAD
## Low.Risk High.Risk cutoff p.value
## 50% 143 143 -23.89154 0
# write.table(summary.BMD.DAD, file = "results/summary_BMD_DAD_breast.txt", row.names = T, col.names = T)
We perform the testing phase by using CosmonetTesting()
and ValidationTest()
functions to validate the survival model.
# Validation test for BMD-screening + variable selection
opt.cutoff.BMD <- fitTraining.BMD$opt.cutoff
# pdf(file="results/Test_KM_BMD_breast.pdf",width=15, height=7)
tic("Run time")
fitTesting.BMD <- CosmonetTesting(
x2.norm,
y2,
screenVars.BMD,
beta.BMD,
opt.cutoff.BMD
)
Figure 8: Testing Phese (BMD-screening): (A) Kaplan-Maier curves and risk table and (B) distribution of the risk score PI
toc()
## Run time: 0.812 sec elapsed
# dev.off()
# Summary table
df.test.BMD <- fitTesting.BMD$df
head(df.test.BMD)
## sample PI time status groupRisk
## 44 GSM65361 -8.539579 88.273973 0 2
## 182 GSM65875 -8.457235 107.243836 0 2
## 169 GSM65862 -8.424892 142.323288 0 2
## 41 GSM65358 -8.384166 66.969863 0 2
## 25 GSM65342 -8.350312 2.169863 0 2
## 185 GSM65878 -8.348892 8.810959 1 2
# Validation test for DAD-screening + variable selection
opt.cutoff.DAD <- fitTraining.DAD$opt.cutoff
# pdf(file="results/Test_KM_DAD_breast.pdf",width=15, height=7)
tic("Run time")
fitTesting.DAD <- CosmonetTesting(
x2.norm,
y2,
screenVars.DAD,
beta.DAD,
opt.cutoff.DAD
)
Figure 9: Testing Phase (DAD-screening): (A) Kaplan-Maier curves and risk table and (B) distribution of the risk score PI
toc()
## Run time: 0.79 sec elapsed
# dev.off()
df.test.DAD <- fitTesting.DAD$df
head(df.test.DAD)
## sample PI time status groupRisk
## 102 GSM65794 -12.51275 103.95616 0 2
## 107 GSM65799 -12.40898 87.94521 0 2
## 172 GSM65865 -12.40538 144.72329 0 2
## 150 GSM65843 -12.29438 84.23014 1 2
## 14 GSM65331 -11.85027 114.01644 0 2
## 178 GSM65871 -11.82383 104.84384 0 2
# Validation test for BMD+DAD-screening + variable selection
opt.cutoff.BMD.DAD <- fitTraining.BMD.DAD$opt.cutoff
# pdf(file="results/Test_KM_BMD_DAD_breast.pdf",width=15, height=7)
tic("Run time")
fitTesting.BMD.DAD <- CosmonetTesting(
x2.norm,
y2,
screenVars.BMD.DAD,
beta.BMD.DAD,
opt.cutoff.BMD.DAD
)
Figure 10: Testing Phase (BMD+DAD-screening): (A) Kaplan-Maier curves and risk table and (B) distribution of the risk score PI
toc()
## Run time: 0.775 sec elapsed
# dev.off()
df.test.BMD.DAD <- fitTesting.BMD.DAD$df
head(df.test.BMD.DAD)
## sample PI time status groupRisk
## 169 GSM65862 -26.11466 142.3233 0 2
## 172 GSM65865 -25.99728 144.7233 0 2
## 178 GSM65871 -25.71374 104.8438 0 2
## 171 GSM65864 -25.70033 120.5260 1 2
## 29 GSM65346 -25.55507 18.6411 0 2
## 161 GSM65854 -25.54970 147.1890 0 2
We use venn diagrams to show the relation between low-risk groups (or high-risk groups) of each screening method by considering the testing set. We use VennPlot()
function.
# Low-risk groups
lowRisk.BMD <- as.character(df.test.BMD$sample[which(df.test.BMD$groupRisk==2)])
lowRisk.DAD <- as.character(df.test.DAD$sample[which(df.test.DAD$groupRisk==2)])
lowRisk.BMD.DAD <- as.character(df.test.BMD.DAD$sample[which(df.test.BMD.DAD$groupRisk==2)])
# High-risk groups
highRisk.BMD <- as.character(df.test.BMD$sample[which(df.test.BMD$groupRisk==1)])
highRisk.DAD <- as.character(df.test.DAD$sample[which(df.test.DAD$groupRisk==1)])
highRisk.BMD.DAD <- as.character(df.test.BMD.DAD$sample[which(df.test.BMD.DAD$groupRisk==1)])
# Display Venn Diagram
vennPlot1 <- VennPlot(lowRisk.BMD, lowRisk.DAD, lowRisk.BMD.DAD, "LR-Group")
Figure 11: Venn diagramm of low-risk groups on testing set
## NULL
# Display Venn Diagram
vennPlot2 <- VennPlot(highRisk.BMD, highRisk.DAD, highRisk.BMD.DAD, "HR-Group")
Figure 12: Venn diagramm of high-risk groups on testing set
## NULL
We use HeatmapSurvGroup()
function to visualize the heat-map of the gene expression matrix on the training and testing set, respectively, by taking into account the screened genes and the prognostic information (high- and low-risk group).
Figure 13: Heat-maps of the BMD-genes on training and testing set
Figure 14: Heat-maps of the DAD-genes on training and testing set
Figure 15: Heat-maps of the BMD+DAD-genes on training and testing set
Figure 13-15 show the Z-score expression matrix of the BMD-, DAD- and BMD+DAD-genes in the training and testing, respectively. The genes in the horizontal direction are clustered in the same order in both sets. Z-scores of the gene expression measurements are used. Z-scores larger than 3.5 were set to 3.5 and Z-scores smaller than -3.5 were set to -3.5. Dark pink color indicates a high level of expression in breast cancer and green color indicates a low level of expression. The patients are divided in high-risk and low-risk groups (i.e., patients with bad and good prognosis).
We perform a correlation analysis on the training and testing set to further investigate the relation between prognostic indices PIs. We use corrPlot()
function.
# Renames columns
colnames(df.train.BMD)[5] <- "groupRisk.BMD"
colnames(df.train.DAD)[5] <- "groupRisk.DAD"
colnames(df.train.BMD.DAD)[5] <- "groupRisk.BMD.DAD"
# Correlation plots of prognostic index PI
corrPlot1 <- CorrPlot(df1 = df.train.BMD,
df2 = df.train.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.DAD",
method="spearman",
name1 = "BMD",
name2 = "DAD",
set = "Training set"
)
corrPlot2 <- CorrPlot(df1 = df.train.BMD,
df2 = df.train.BMD.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.BMD.DAD",
method="spearman",
name1 = "BMD",
name2 = "BMD.DAD",
set = "Training set"
)
# Visualization
ggarrange(corrPlot1,corrPlot2, nrow = 1, ncol = 2, common.legend = T, font.label=list(color="black",size=14))
Figure 16: PI-correlation plots on training set
# ggsave(file="results/corr_train_breast.pdf",width=15, height=7)
# Rename columns
colnames(df.test.BMD)[5] <- "groupRisk.BMD"
colnames(df.test.DAD)[5] <- "groupRisk.DAD"
colnames(df.test.BMD.DAD)[5] <- "groupRisk.BMD.DAD"
# Correlation plots of prognostic index PI
corrPlot1 <- CorrPlot(df1 = df.test.BMD,
df2 = df.test.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.DAD",
method="spearman",
name1 = "BMD",
name2 = "DAD",
set = "Testing set"
)
corrPlot2 <- CorrPlot(df1 = df.test.BMD,
df2 = df.test.BMD.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.BMD.DAD",
method="spearman",
name1 = "BMD",
name2 = "BMD.DAD",
set = "Testing set"
)
# Visualization
ggarrange(corrPlot1,corrPlot2, nrow = 1, ncol = 2, common.legend = T, font.label=list(color="black",size=14))
Figure 17: PI-correlation plots on testing set
# ggsave(file="results/corr_test_breast.pdf",width=15, height=7)
COSMONET
includes an RMarkdown script to generate an interactive dashboard of KEGG pathways networks. We report the network created running the proposed model using the BMD+DAD-screened genes. The dashboard contains two tabs, one for the not-isolated genes and one for the full network. Each node corresponds to a gene and the edges represent the KEGG pathways shared by the linked genes. Different node colors are used to show a correlation measure between the genes and the disease under investigation as follows:
Therefore, red nodes represent the BMD contribution to the signature and blue and white nodes the DAD contribution, not yet captured in the BMD list. Note that some of the red genes might also be retrieved from the data-driven screening. However, in this context, we want to underline and make sense of the novel information not yet considered.
We implemented the GenerateNetwork()
function to visualize the interactive networks. The colours of the nodes and the ID of the disease to investigate can be changed by the user passing the new values as function’s parameters in RUNCosmonetDashboard.Rmd
loaded in the same directory as the vignetta.
# Build a data frame composed by two columns: gene names and beta coefficients
ListOfGenes <- data.frame(Symbol=rownames(beta.BMD.DAD), coefficients=beta.BMD.DAD[,1])
rownames(ListOfGenes) <-1:dim(ListOfGenes)[1]
# write.table(ListOfGenes, "ListOfGenes.txt", sep = "\t", row.names = F, col.names = T)
tic("Run time")
networkBreast <- GenerateNetwork(
ListOfGenes = ListOfGenes,
header = TRUE,
diseaseID = "DOID:1612",
nodesCols = c('#EFCFD4','#D9E3FC','#FFFFFF')
)
toc()
The function will automatically open a new html page to visualise the dashboard. An example of the final dashboard is shown in the figure below (and it can be accessed at the following link).
knitr::include_graphics("COSMONET_NETWORK.pdf")
Figure 18: Subnetwork related to the application of the BMD+DAD-screening and network method
The GenerateNetwork()
function can also be used as a standalone tool to create a KEGG pathways network using an external list of genes and beta coefficients provided by the user.
For RNASeq data we consider TCGA-LAUD (Lung adenocarcinoma) data from the GDC Data Portal. The data are first downloaded from LinkedOmics portal where they are already pre-processed and normalized (Illumina HiSeq platform, Gene-level, RPKM). Then, we split randomly the entire dataset between training and testing sets (\(50\%\) train and \(50\%\) test) in order to perform the between normalization. We use splitData()
and NormalizeBetweenData()
functions.
The summary of the data is showed in the table below:
GDC Accession | Platform | Nr. of genes | Nr. of samples | Survival data |
---|---|---|---|---|
TCGA-LAUD | Illumina HiSeq | 19988 | 492 | Overall Survival |
TCGA-LAUD - train | Illumina HiSeq | 19988 | 246 | Overall Survival |
TCGA-LAUD - test | Illumina HiSeq | 19988 | 246 | Overall Survival |
Users can download the data directly from the website (Repository/Data/Download Data).
# Go to the directory that contains the data
# Load normalized data
load("data/LungData_Normalized.RData")
# Load survival information: RFS - time (months) and status (alive=0 or death=1)
load("data/LungData_Survival.RData")
As before, to include the a-priori biological network in the model, we build the network matrix using CreateNetworkMatrix()
. The disease code is DOID:1324.
# Load tissue specific data
tic("Run time")
load("data/LungData_TissueSpecificEdge.rda")
NetworkMatrix <- CreateNetworkMatrix(
x1.norm,
repositoryDisease,
diseaseID = "DOID:1324",
topGenes = 500,
tissueSpecificEdge,
output = TRUE,
message = TRUE
)
## [1] "Your list contains 499 genes. 499 out of 499 are used in the network."
## [1] "There are 26310 edges left."
toc()
## Run time: 408.493 sec elapsed
# Network matrix
Omega <- as.matrix(NetworkMatrix$Omega)
dim(Omega)
## [1] 19988 19988
In this section, we illustrate the application of the variable screening approaches. We use ScreeningMethod()
function.
# Selection of the BMD-screened genes
cancer.genes <- NetworkMatrix$DiseaseGenes[,2]
screenVars.BMD <- ScreeningMethod(
x1.norm,
cancer.genes,
screening="bmd",
ranking=NULL,
thresh=NULL
)
## [1] "Number of BMD-screened genes: 482"
# Ranking of genes that are highly correlated with the patient's survival
load("data/LungData_RankingGenes.RData")
head(rankingGenes)
## original.index symbol marg.coeff beta pvals
## 1 11841 NOBOX 155.24804 -155.24804 0.9953518
## 2 12291 OR2A12 84.26860 -84.26860 0.9953640
## 3 12319 OR2J2 84.10697 -84.10697 0.9950270
## 4 2263 C20orf185 83.42011 -83.42011 0.9961920
## 5 9797 LOC285370 75.07351 -75.07351 0.9957503
## 6 12387 OR52B4 67.00438 -67.00438 0.9945741
Users can download the file of the ranked genes directly from the website (Repository/Data/Download Data).
# Plots
par(mfrow = c(1, 2))
plot(rankingGenes$marg.coeff, rankingGenes$pvals, col = "blue", pch = 20, cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
# Add the threshold line related to the 50th marginal coefficient (equal to 6.851106)
abline(v=rankingGenes[floor(dim(x1.norm)[1]/log(dim(x1.norm)[1])),3], col="red",lwd=2, lty=2)
index <- colnames(x1.norm)%in%rankingGenes$symbol
plot(rankingGenes$pvals[index], 1:dim(x1.norm[,index])[2], col = "blue", pch = 20, cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
# Add the threshold line equal to 50 (i.e., the first 50 genes)
abline(h=floor(dim(x1.norm[,index])[1]/log(dim(x1.norm[,index])[1])), col="red",lwd=2, lty=2)
Figure 19: Plots based on the \(p\)-values and marginal coefficients of the ranked cancer genes
For this esxample, we set the threshold equal to 70. In other words, we consider the first 70 genes as DAD-screened genes.
# Selection of the DAD-screened genes
screenVars.DAD <- ScreeningMethod(
x1.norm,
cancer.genes,
screening="dad",
ranking=rankingGenes,
thresh= 70 # floor(dim(x1.norm)[1]/log(dim(x1.norm)[1]))
)
## [1] "Number of DAD-screened genes: 70"
# Union of BMD- and DAD-screeened genes
screenVars.BMD.DAD <- ScreeningMethod(
x1.norm,
cancer.genes,
screening="bmd+dad",
ranking=rankingGenes,
thresh= 70 # floor(dim(x1.norm)[1]/log(dim(x1.norm)[1]))
)
## [1] "Number of BMD+DAD-screened genes: 567"
We consider as resulting set the union of the BMD and DAD-screened genes.
In this section, we perform the variable selection setting \(\alpha=0.1\).
# Set alpha and lambda parameters
alpha <- 0.1
# Perform five-fold cross-validation to control the randomness for reproducibility of the results
library(caret)
set.seed(2021)
flds <- createFolds(y1$time, k = 5, list = TRUE, returnTrain = FALSE)
foldids <- rep(1,dim(y1)[1])
foldids[flds$Fold2] = 2
foldids[flds$Fold3] = 3
foldids[flds$Fold4] = 4
foldids[flds$Fold5] = 5
# BMD-screening + variable selection
# pdf(file="results/Train_KM_BMD_median.pdf",width=15, height=7)
tic("Run time")
fitTraining.BMD <- CosmonetTraining(
10,
x1.norm,
y1,
screenVars.BMD,
family="Cox",
penalty="Net",
Omega=Omega,
alpha=alpha,
nlambda = 50,
foldid = foldids,
selOptLambda="min",
optCutpoint="median"
)
Figure 20: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 21: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
toc()
## Run time: 6.928 sec elapsed
# dev.off()
# Regression coefficients
beta.BMD <- fitTraining.BMD$beta
# Number of non-zero beta
sum(beta.BMD!=0)
## [1] 25
# write.table(beta.BMD, "results1/beta_BMD_lung.txt", row.names = T, col.names = F)
# Optimal parameters
pars.BMD <- data.frame(alpha=0.1,opt.lambda=fitTraining.BMD$opt.lambda,opt.cutoff=fitTraining.BMD$opt.cutoff)
pars.BMD
## alpha opt.lambda opt.cutoff
## 1 0.1 0.7717434 2.379353
# Summary tables
df.train.BMD <- fitTraining.BMD$df
head(df.train.BMD)
## sample PI time status 50%
## 1 TCGA.86.A4P8 1.804026 805 0 2
## 2 TCGA.75.6206 1.928616 2590 0 2
## 3 TCGA.S2.AA1A 1.935255 513 0 2
## 4 TCGA.97.8552 1.956918 626 0 2
## 5 TCGA.MP.A4TH 1.966103 741 0 2
## 6 TCGA.55.8206 1.966330 888 0 2
summary.BMD <- fitTraining.BMD$summary
summary.BMD
## Low.Risk High.Risk cutoff p.value
## 50% 123 123 2.379353 0.0001381
# write.table(summary.BMD, "results1/summary_BMD_lung_median.txt", row.names = T, col.names = T)
# DAD-screening + variable selection
# pdf(file="results/Train_KM_DAD_survCutpoint.pdf",width=15, height=7)
tic("Run time")
fitTraining.DAD <- CosmonetTraining(
10,
x1.norm,
y1,
screenVars.DAD,
family="Cox",
penalty="Net",
Omega=Omega,
alpha=alpha,
nlambda = 50,
foldid = foldids,
selOptLambda="min",
optCutpoint="survCutpoint"
)
Figure 22: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 23: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
toc()
## Run time: 2.08 sec elapsed
# dev.off()
# Regression coefficients
beta.DAD <- fitTraining.DAD$beta
# Number of non-zero beta
sum(beta.DAD!=0)
## [1] 59
# write.table(beta.DAD, "results1/beta_DAD_lung.txt", row.names = T, col.names = F)
# Optimal parameters
pars.DAD <- data.frame(alpha=0.1,opt.lambda=fitTraining.DAD$opt.lambda,opt.cutoff=fitTraining.DAD$opt.cutoff)
pars.DAD
## alpha opt.lambda opt.cutoff
## 1 0.1 0.04620194 5.65117
# Summary tables
df.train.DAD <- fitTraining.DAD$df
head(df.train.DAD)
## sample PI time status groupRisk id
## 1 TCGA.NJ.A4YQ -0.7283181 1432 0 2 1
## 2 TCGA.55.6987 0.4168007 2137 0 2 2
## 3 TCGA.78.7163 1.4052194 7248 0 2 3
## 4 TCGA.38.4626 1.6160155 3674 0 2 4
## 5 TCGA.55.6969 2.0784045 1239 0 2 5
## 6 TCGA.75.6206 2.2249807 2590 0 2 6
summary.DAD <- fitTraining.DAD$summary
summary.DAD
## cutpoint statistic
## PI 5.65117 9.767933
# write.table(summary.DAD, "results1/summary_DAD_lung_survCutpoint.txt", row.names = T, col.names = T)
# (BMD+DAD-screening + variable selection
# pdf(file="results/Train_KM_BMD_DAD_minPValue.pdf",width=15, height=7)
tic("Run time")
fitTraining.BMD.DAD <- CosmonetTraining(
10,
x1.norm,
y1,
screenVars.BMD.DAD,
family="Cox",
penalty="Net",
Omega=Omega,
alpha=alpha,
nlambda = 50,
foldid = foldids,
selOptLambda="min",
optCutpoint="minPValue"
)
Figure 24: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 25: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 26: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 27: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 28: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 29: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 30: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 31: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 32: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 33: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 34: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 35: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 36: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
Figure 37: Training Phase (BMD-screening): cross-validation plot , Kaplan-Meier curves and distribution plot
toc()
## Run time: 20.772 sec elapsed
# dev.off()
beta.BMD.DAD <- fitTraining.BMD.DAD$beta
sum(beta.BMD.DAD!=0)
## [1] 54
# write.table(beta.BMD.DAD, "results1/beta_BMD_DAD_lung.txt", row.names = T, col.names = F)
# Optimal parameters
pars.BMD.DAD <- data.frame(alpha=0.1,opt.lambda=fitTraining.BMD.DAD$opt.lambda,opt.cutoff=fitTraining.BMD.DAD$opt.cutoff)
pars.BMD.DAD
## alpha opt.lambda opt.cutoff
## 1 0.1 0.531967 3.071756
# Summary tables
df.train.BMD.DAD <- fitTraining.BMD.DAD$df
head(df.train.BMD.DAD)
## sample PI time status 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75%
## 1 TCGA.55.8097 1.871829 476 0 2 2 2 2 2 2 2 2 2 2 2
## 2 TCGA.86.A4P8 1.883202 805 0 2 2 2 2 2 2 2 2 2 2 2
## 3 TCGA.55.8206 1.971239 888 0 2 2 2 2 2 2 2 2 2 2 2
## 4 TCGA.50.8457 1.993326 1125 0 2 2 2 2 2 2 2 2 2 2 2
## 5 TCGA.86.8056 2.016668 139 0 2 2 2 2 2 2 2 2 2 2 2
## 6 TCGA.MP.A4TH 2.018201 741 0 2 2 2 2 2 2 2 2 2 2 2
## 80%
## 1 2
## 2 2
## 3 2
## 4 2
## 5 2
## 6 2
summary.BMD.DAD <- fitTraining.BMD.DAD$summary
summary.BMD.DAD
## Low.Risk High.Risk cutoff p.value
## 25% 62 184 2.559594 4.306e-08
## 30% 74 172 2.611505 2.410e-09
## 35% 86 160 2.657183 2.095e-10
## 40% 98 148 2.720584 4.211e-11
## 45% 111 135 2.776697 2.694e-10
## 50% 123 123 2.823742 5.474e-10
## 55% 135 111 2.883202 1.031e-11
## 60% 147 99 2.967608 4.441e-16
## 65% 160 86 3.034267 3.331e-15
## 70% 172 74 3.071756 1.110e-16
## 75% 184 62 3.169746 0.000e+00
## 80% 196 50 3.225793 2.220e-16
# write.table(summary.BMD.DAD, "results1/summary_BMD_DAD_lung_minPValue.txt", row.names = T, col.names = T)
We perform the survival analysis using CosmonetTesting()
.
# Validation test for BMD-screening + variable selection
opt.cutoff.BMD <- fitTraining.BMD$opt.cutoff
# pdf(file="results/Test_KM_BMD_lung_median.pdf",width=15, height=7)
tic("Run time")
fitTesting.BMD <- CosmonetTesting(
x2.norm,
y2,
screenVars.BMD,
beta.BMD,
opt.cutoff.BMD
)
Figure 38: Testing Phese (BMD-screening): (A) Kaplan-Maier curves and risk table and (B) distribution of the risk score PI
toc()
## Run time: 0.879 sec elapsed
# dev.off()
# Summary table
df.test.BMD <- fitTesting.BMD$df
head(df.test.BMD)
## sample PI time status groupRisk
## 29 TCGA.44.6148 1.792995 704 0 2
## 52 TCGA.49.AARR 1.883773 4992 0 2
## 64 TCGA.50.5942 1.885215 1847 0 2
## 228 TCGA.L4.A4E6 1.902645 435 0 2
## 192 TCGA.91.8497 1.908321 434 1 2
## 111 TCGA.55.8512 1.920400 607 1 2
# Validation test for DAD-screening + variable selection
opt.cutoff.DAD <- fitTraining.DAD$opt.cutoff
# pdf(file="results/Test_KM_DAD_lung_survCutpoint.pdf",width=15, height=7)
tic("Run time")
fitTesting.DAD <- CosmonetTesting(
x2.norm,
y2,
screenVars.DAD,
beta.DAD,
opt.cutoff.DAD
)
Figure 39: Testing Phase (DAD-screening): (A) Kaplan-Maier curves and risk table and (B) distribution of the risk score PI
toc()
## Run time: 0.868 sec elapsed
# dev.off()
df.test.DAD <- fitTesting.DAD$df
head(df.test.DAD)
## sample PI time status groupRisk
## 65 TCGA.50.5946 -9.1809947 1617 0 2
## 115 TCGA.55.A490 -8.9224818 99 1 2
## 233 TCGA.L9.A7SV -4.8845769 565 0 2
## 236 TCGA.MN.A4N4 -0.8783555 1175 0 2
## 70 TCGA.50.7109 -0.5137000 308 1 2
## 80 TCGA.55.6968 0.3502241 1293 1 2
# Validation test for BMD+DAD-screening + variable selection
opt.cutoff.BMD.DAD <- fitTraining.BMD.DAD$opt.cutoff
# pdf(file="results/Test_KM_BMD_DAD_lung_minPValue.pdf",width=15, height=7)
tic("Run time")
fitTesting.BMD.DAD <- CosmonetTesting(
x2.norm,
y2,
screenVars.BMD.DAD,
beta.BMD.DAD,
opt.cutoff.BMD.DAD
)
Figure 40: Testing Phase (BMD+DAD-screening): (A) Kaplan-Maier curves and risk table and (B) distribution of the risk score PI
toc()
## Run time: 0.977 sec elapsed
# dev.off()
df.test.BMD.DAD <- fitTesting.BMD.DAD$df
head(df.test.BMD.DAD)
## sample PI time status groupRisk
## 233 TCGA.L9.A7SV 1.679135 565 0 2
## 64 TCGA.50.5942 1.860430 1847 0 2
## 224 TCGA.99.AA5R 1.958567 658 0 2
## 228 TCGA.L4.A4E6 1.969429 435 0 2
## 26 TCGA.44.5645 1.984508 852 0 2
## 218 TCGA.97.A4M2 2.043961 624 0 2
We use VennPlot()
function to show the relation between low-risk groups (or high-risk groups).
# High-risk groups
highRisk.BMD <- as.character(df.test.BMD$sample[which(df.test.BMD$groupRisk==1)])
highRisk.DAD <- as.character(df.test.DAD$sample[which(df.test.DAD$groupRisk==1)])
highRisk.BMD.DAD <- as.character(df.test.BMD.DAD$sample[which(df.test.BMD.DAD$groupRisk==1)])
# Low-risk groups
lowRisk.BMD <- as.character(df.test.BMD$sample[which(df.test.BMD$groupRisk==2)])
lowRisk.DAD <- as.character(df.test.DAD$sample[which(df.test.DAD$groupRisk==2)])
lowRisk.BMD.DAD <- as.character(df.test.BMD.DAD$sample[which(df.test.BMD.DAD$groupRisk==2)])
# Display Venn Diagram
vennPlot1 <- VennPlot(lowRisk.BMD, lowRisk.DAD, lowRisk.BMD.DAD, "LR-Group")
Figure 41: Venn diagramm of low-risk groups
## NULL
# Display Venn Diagram
vennPlot2 <- VennPlot(highRisk.BMD, highRisk.DAD, highRisk.BMD.DAD, "HR-Group")
Figure 42: Venn diagramm of high-risk groups
## NULL
We use HeatmapSurvGroup()
function to visualize the heat-map of the gene expression matrix on the training and testing dataset.
Figure 43: Heat-maps of the BMD-genes on training and testing set
Figure 44: Heat-maps of the DAD-genes on training and testing set
Figure 45: Heat-maps of the BMD+DAD-genes on training and testing set
Figure 42-44 show the Z-score expression matrix of the BMD-, DAD- and BMD+DAD-genes in the training and testing, respectively.
We use corrPlot()
function to perform the PI-correlation analysis on the training and testing set.
# Renames columns
colnames(df.train.BMD)[5] <- "groupRisk.BMD"
colnames(df.train.DAD)[5] <- "groupRisk.DAD"
colnames(df.train.BMD.DAD)[5] <- "groupRisk.BMD.DAD"
# Correlation plots of prognostic index PI
corrPlot1 <- CorrPlot(df1 = df.train.BMD,
df2 = df.train.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.DAD",
method="spearman",
name1 = "BMD",
name2 = "DAD",
set = "Training set"
)
corrPlot2 <- CorrPlot(df1 = df.train.BMD,
df2 = df.train.BMD.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.BMD.DAD",
method="spearman",
name1 = "BMD",
name2 = "BMD.DAD",
set = "Training set"
)
# Visualization
ggarrange(corrPlot1,corrPlot2, nrow = 1, ncol = 2, common.legend = T)
Figure 46: PI-correlation plots on training set
# ggsave(file="results/corr_train_lung.pdf",width=15, height=7)
# Rename columns
colnames(df.test.BMD)[5] <- "groupRisk.BMD"
colnames(df.test.DAD)[5] <- "groupRisk.DAD"
colnames(df.test.BMD.DAD)[5] <- "groupRisk.BMD.DAD"
# Correlation plots of prognostic index PI
corrPlot1 <- CorrPlot(df1 = df.test.BMD,
df2 = df.test.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.DAD",
method="spearman",
name1 = "BMD",
name2 = "DAD",
set = "Testing set"
)
corrPlot2 <-CorrPlot(df1 = df.test.BMD,
df2 = df.test.BMD.DAD,
groupRisk1 = "groupRisk.BMD",
groupRisk2 = "groupRisk.BMD.DAD",
method="spearman",
name1 = "BMD",
name2 = "BMD.DAD",
set = "Testing set"
)
# Visualization
ggarrange(corrPlot1,corrPlot2, nrow = 1, ncol = 2, common.legend = T)
Figure 47: PI-correlation plots on testing set
ggsave(file="results/corr_test_lung.pdf",width=15, height=7)
Finally, the function GenerateNetwork()
can be used to visualise the KEGG pathways shared between the gene signatures selected by the screening methods.
The example below generates the dashboard to investigate the KEGG pathways for the genes selected by the BMD+DAD screening.
# # Build a data frame composed by two columns: gene names and beta coefficients
# ListOfGenes <- data.frame(Symbol=rownames(beta.BMD.DAD), coefficients=beta.BMD.DAD[,1])
# rownames(ListOfGenes) <-1:dim(ListOfGenes)[1]
# # write.table(ListOfGenes, "ListOfGenes.txt", sep = "\t", row.names = F, col.names = T)
#
# tic("Run time")
# networkLung <- GenerateNetwork(
# ListOfGenes = ListOfGenes,
# header = TRUE,
# diseaseID = "DOID:1324",
# nodesCols = c('#EFCFD4','#D9E3FC','#FFFFFF')
# )
# toc()
COSMONET
is a powerful package that can be used to identify new cancer biomarkers through screening and Cox-network approaches. Here, we presented two examples to show simple applications of the package. As reported in Iuliano et al., 2018, the framework implemented in COSMONET
provides many more advantages than the ones shown above. Among others: (i) the screening-network procedures outperforms classical penalized Cox-regression methods, allowing to select sub-network signatures, (ii) the DAD-screening has good performance in absence of any previous information but it is sub optimal with respect to the BMD-screening, and (iii) BMD+DAD screening improves both BMD and DAD the performance of COSMONET
. These results may improve prediction accuracy and guide clinicians in choosing specific therapies for better survival and in avoiding unnecessary treatments.
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] visNetwork_2.1.0 randomcoloR_1.1.0.1 graph_1.70.0
## [4] BiocGenerics_0.38.0 RColorBrewer_1.1-2 patchwork_1.1.1
## [7] pheatmap_1.0.12 ggplotify_0.1.0 VennDiagram_1.7.1
## [10] futile.logger_1.4.3 miscset_1.1.0 gridExtra_2.3
## [13] survminer_0.4.9 ggpubr_0.4.0 survival_3.2-13
## [16] pbmcapply_1.5.0 plyr_1.8.6 APML0_0.10
## [19] Matrix_1.3-4 caret_6.0-90 lattice_0.20-45
## [22] ggplot2_3.3.5 magic_1.5-9 abind_1.4-5
## [25] igraph_1.2.9 biomaRt_2.48.3 tictoc_1.0.1
## [28] COSMONET_0.1.0 BiocStyle_2.20.2
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.2 tidyselect_1.1.1 htmlwidgets_1.5.4
## [4] RSQLite_2.2.8 AnnotationDbi_1.54.1 flexdashboard_0.5.2
## [7] Rtsne_0.15 pROC_1.18.0 devtools_2.4.3
## [10] maxstat_0.7-25 munsell_0.5.0 codetools_0.2-18
## [13] future_1.23.0 withr_2.4.3 colorspace_2.0-2
## [16] Biobase_2.52.0 filelock_1.0.2 highr_0.9
## [19] knitr_1.36 rstudioapi_0.13 stats4_4.1.1
## [22] ggsignif_0.6.3 listenv_0.8.0 labeling_0.4.2
## [25] GenomeInfoDbData_1.2.6 KMsurv_0.1-5 bit64_4.0.5
## [28] farver_2.1.0 rprojroot_2.0.2 parallelly_1.29.0
## [31] vctrs_0.3.8 generics_0.1.1 lambda.r_1.2.4
## [34] ipred_0.9-12 xfun_0.28 BiocFileCache_2.0.0
## [37] R6_2.5.1 markdown_1.1 GenomeInfoDb_1.28.4
## [40] bitops_1.0-7 cachem_1.0.6 gridGraphics_0.5-1
## [43] assertthat_0.2.1 scales_1.1.1 nnet_7.3-16
## [46] gtable_0.3.0 globals_0.14.0 processx_3.5.2
## [49] timeDate_3043.102 rlang_0.4.12 splines_4.1.1
## [52] rstatix_0.7.0 ModelMetrics_1.2.2.2 broom_0.7.10
## [55] BiocManager_1.30.16 yaml_2.2.1 reshape2_1.4.4
## [58] backports_1.4.0 gridtext_0.1.4 tools_4.1.1
## [61] lava_1.6.10 usethis_2.1.3 bookdown_0.24
## [64] ellipsis_0.3.2 jquerylib_0.1.4 sessioninfo_1.2.1
## [67] Rcpp_1.0.7 progress_1.2.2 zlibbioc_1.38.0
## [70] purrr_0.3.4 RCurl_1.98-1.5 ps_1.6.0
## [73] prettyunits_1.1.1 rpart_4.1-15 cowplot_1.1.1
## [76] S4Vectors_0.30.2 zoo_1.8-9 cluster_2.1.2
## [79] exactRankTests_0.8-34 fs_1.5.1 magrittr_2.0.1
## [82] data.table_1.14.2 futile.options_1.0.1 mvtnorm_1.1-3
## [85] pkgload_1.2.4 hms_1.1.1 evaluate_0.14
## [88] xtable_1.8-4 XML_3.99-0.8 IRanges_2.26.0
## [91] testthat_3.1.1 compiler_4.1.1 tibble_3.1.6
## [94] V8_3.6.0 crayon_1.4.2 htmltools_0.5.2
## [97] mgcv_1.8-38 tidyr_1.1.4 ggtext_0.1.1
## [100] lubridate_1.8.0 DBI_1.1.1 formatR_1.11
## [103] dbplyr_2.1.1 MASS_7.3-54 rappdirs_0.3.3
## [106] car_3.0-12 cli_3.1.0 gower_0.2.2
## [109] pkgconfig_2.0.3 km.ci_0.5-2 recipes_0.1.17
## [112] xml2_1.3.3 foreach_1.5.1 bslib_0.3.1
## [115] XVector_0.32.0 prodlim_2019.11.13 yulab.utils_0.0.4
## [118] stringr_1.4.0 callr_3.7.0 digest_0.6.29
## [121] Biostrings_2.60.2 rmarkdown_2.11 survMisc_0.5.5
## [124] curl_4.3.2 lifecycle_1.0.1 nlme_3.1-153
## [127] jsonlite_1.7.2 carData_3.0-4 desc_1.4.0
## [130] fansi_0.5.0 pillar_1.6.4 KEGGREST_1.32.0
## [133] fastmap_1.1.0 httr_1.4.2 pkgbuild_1.2.1
## [136] glue_1.5.1 remotes_2.4.2 png_0.1-7
## [139] iterators_1.0.13 bit_4.0.4 class_7.3-19
## [142] stringi_1.7.6 sass_0.4.0 blob_1.2.2
## [145] memoise_2.0.1 dplyr_1.0.7 future.apply_1.8.1