R/utils.R
showProbPlot.Rd
Show distribution plots of the cell proportions generated by
generateBulkCellMatrix. These proportions determine the fraction of each cell
type used when simulating pseudo-bulk RNA-Seq samples. There are 6 subsets of
proportions, each generated by a different approach, and they can be
visualized in three ways: box plots, violin plots and line plots. You can also
plot the probabilities based on the number of different cell types present in
the samples by setting type.plot = 'ncelltypes'.
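To make concrete what these plots summarize, here is a minimal base R sketch on a toy proportion matrix. It does not call the package: the sampler below is hypothetical, and the names `props`, `keep` and `n.types` are illustrative only. Rows are assumed to sum to 100, matching the percentage scale of the `from`/`to` columns used in the example's `prob.design`.

```r
set.seed(1)
n.samples <- 20
cell.types <- paste0("CellType", 1:4)

# Toy weights for each sample and cell type (hypothetical, not the package's sampler)
weights <- matrix(
  runif(n.samples * length(cell.types)),
  nrow = n.samples,
  dimnames = list(paste0("Bulk", seq_len(n.samples)), cell.types)
)

# Randomly drop some cell types so that samples differ in composition
keep <- matrix(
  rbinom(length(weights), size = 1, prob = 0.7),
  nrow = n.samples
)
keep[rowSums(keep) == 0, 1] <- 1 # guarantee at least one cell type per sample

# Rescale each row so that the proportions sum to 100
props <- weights * keep
props <- props / rowSums(props) * 100

# Per-cell-type quantiles: the summary a box plot would display
apply(props, 2, quantile)

# Number of distinct cell types present in each sample:
# the quantity the 'ncelltypes' view is organized around
n.types <- rowSums(props > 0)
table(n.types)
```

The box, violin and line views are different renderings of the per-cell-type distributions in `props`, while the last two lines group samples by how many cell types they contain.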
showProbPlot(object, type.data, set, type.plot = "boxplot")
object: DigitalDLSorter object with a prob.cell.types slot that contains a plot slot.
type.data: Subset of data to show: 'train' or 'test'.
set: Integer determining which of the 6 different subsets to display.
type.plot: Character determining which type of visualization to display. It can be 'boxplot', 'violinplot', 'linesplot' or 'ncelltypes'. See Description for more information.
A ggplot object.
These plots are for diagnostic purposes only, which is why they are generated without any user-supplied parameters.
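Because the return value is a ggplot object, it can be stored, extended with further layers, or restyled like any other ggplot. The sketch below illustrates this on a stand-in plot rather than on the output of showProbPlot: the data frame `toy` is made-up data, and ggplot2 is assumed to be installed.

```r
library(ggplot2)

set.seed(1)
# Toy long-format proportions (hypothetical data, not produced by the package)
toy <- data.frame(
  CellType = rep(paste0("CellType", 1:4), each = 25),
  Prob = runif(100, min = 0, max = 100)
)

# Stand-in for a box plot of cell type proportions
p <- ggplot(toy, aes(x = CellType, y = Prob)) +
  geom_boxplot()

# A ggplot object can be extended after it has been created
p2 <- p +
  ggtitle("Toy cell type proportions") +
  theme_minimal()
p2
```

The same `+` syntax should apply to a plot returned by showProbPlot, for example to add a title before including the figure in a report.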
# simulating data
set.seed(123) # reproducibility
sce <- SingleCellExperiment::SingleCellExperiment(
  assays = list(
    counts = matrix(
      # note: the 100 draws are recycled to fill the 40 x 30 matrix
      rpois(100, lambda = 5), nrow = 40, ncol = 30,
      dimnames = list(paste0("Gene", seq(40)), paste0("RHC", seq(30)))
    )
  ),
  colData = data.frame(
    Cell_ID = paste0("RHC", seq(30)),
    Cell_Type = sample(x = paste0("CellType", seq(4)), size = 30,
                       replace = TRUE)
  ),
  rowData = data.frame(
    Gene_ID = paste0("Gene", seq(40))
  )
)
DDLS <- createDDLSobject(
  sc.data = sce,
  sc.cell.ID.column = "Cell_ID",
  sc.gene.ID.column = "Gene_ID",
  sc.filt.genes.cluster = FALSE,
  sc.log.FC = FALSE
)
#> === Bulk RNA-seq data not provided
#> === Processing single-cell data
#> - Filtering features:
#> - Selected features: 40
#> - Discarded features: 0
#>
#> === No mitochondrial genes were found by using ^mt- as regrex
#>
#> === Final number of dimensions for further analyses: 40
probMatrix <- data.frame(
  Cell_Type = paste0("CellType", seq(4)),
  from = c(1, 1, 1, 30),
  to = c(15, 15, 50, 70)
)
DDLS <- generateBulkCellMatrix(
  object = DDLS,
  cell.ID.column = "Cell_ID",
  cell.type.column = "Cell_Type",
  prob.design = probMatrix,
  num.bulk.samples = 60
)
#>
#> === The number of bulk RNA-Seq samples that will be generated is equal to 60
#>
#> === Training set cells by type:
#> - CellType1: 5
#> - CellType2: 6
#> - CellType3: 6
#> - CellType4: 5
#> === Test set cells by type:
#> - CellType1: 2
#> - CellType2: 2
#> - CellType3: 2
#> - CellType4: 2
#> === Probability matrix for training data:
#> - Bulk RNA-Seq samples: 45
#> - Cell types: 4
#> === Probability matrix for test data:
#> - Bulk RNA-Seq samples: 15
#> - Cell types: 4
#> DONE
lapply(
  X = 1:6, FUN = function(x) {
    showProbPlot(
      DDLS,
      type.data = "train",
      set = x,
      type.plot = "boxplot"
    )
  }
)
#> [[1]]
#>
#> [[2]]
#>
#> [[3]]
#>
#> [[4]]
#>
#> [[5]]
#>
#> [[6]]
#>