Calculate gradients of predicted cell types/loss function with respect to input features for interpreting trained deconvolution models

This function enables users to gain insights into the interpretability of the deconvolution model. It calculates the gradients of classes/loss function with respect to the input features used in training. These numeric values are calculated per gene and cell type in pure mixed transcriptional profiles, providing information on the extent to which each feature influences the model's prediction of cell proportions for each cell type.

Usage

interGradientsDL(
  object,
  method = "class",
  normalize = TRUE,
  scaling = "standardize",
  verbose = TRUE
)

Arguments

object: SpatialDDLS object containing a trained deconvolution model (trained.model slot) and pure mixed transcriptional profiles (mixed.profiles slot).
method: Method to calculate gradients with respect to inputs. It can be 'class' (gradients of predicted classes w.r.t. inputs), 'loss' (gradients of loss w.r.t. inputs) or 'both'.
normalize: Whether to normalize data using logCPM (TRUE by default). This parameter is only considered when the method used to simulate the mixed transcriptional profiles (simMixedProfiles function) was "AddRawCount". Otherwise, data were already normalized. This parameter should be set according to the transformation used to train the model.
scaling: How to scale data. It can be: "standardize" (values are centered around the mean with a unit standard deviation), "rescale" (values are shifted and rescaled so that they end up ranging between 0 and 1, by default) or "none" (no scaling is performed). This parameter should be set according to the transformation used to train the model.
verbose: Show informative messages during the execution (TRUE by default).

Value

Object containing gradients in the interpret.gradients slot of the DeconvDLModel object (trained.model slot).

Details

Gradients of classes / loss function with respect to the input features are calculated exclusively using pure mixed transcriptional profiles composed of a single cell type. Consequently, these numbers can be interpreted as the extent to which each feature is being used to predict each cell type proportion. Gradients are calculated at the sample level for each gene, but only mean gradients by cell type are reported. For additional details, see Mañanes et al., 2023.

Examples

# \donttest{
set.seed(123)
sce <- SingleCellExperiment::SingleCellExperiment(
  assays = list(
    counts = matrix(
      rpois(30, lambda = 5), nrow = 15, ncol = 10,
      dimnames = list(paste0("Gene", seq(15)), paste0("RHC", seq(10)))
    )
  ),
  colData = data.frame(
    Cell_ID = paste0("RHC", seq(10)),
    Cell_Type = sample(x = paste0("CellType", seq(2)), size = 10,
                       replace = TRUE)
  ),
  rowData = data.frame(
    Gene_ID = paste0("Gene", seq(15))
  )
)
SDDLS <- createSpatialDDLSobject(
  sc.data = sce,
  sc.cell.ID.column = "Cell_ID",
  sc.gene.ID.column = "Gene_ID",
  sc.filt.genes.cluster = FALSE
)
#> === Spatial transcriptomics data not provided
#> === Processing single-cell data
#>       - Filtering features:
#>          - Selected features: 15
#>          - Discarded features: 0
#> 
#> === No mitochondrial genes were found by using ^mt- as regrex
#> 
#> === Final number of dimensions for further analyses: 15
SDDLS <- genMixedCellProp(
  object = SDDLS,
  cell.ID.column = "Cell_ID",
  cell.type.column = "Cell_Type",
  num.sim.spots = 50,
  train.freq.cells = 2/3,
  train.freq.spots = 2/3,
  verbose = TRUE
)
#> 
#> === The number of mixed profiles that will be generated is equal to 50
#> 
#> === Training set cells by type:
#>     - CellType1: 4
#>     - CellType2: 3
#> === Test set cells by type:
#>     - CellType1: 2
#>     - CellType2: 1
#> === Probability matrix for training data:
#>     - Mixed spots: 34
#>     - Cell types: 2
#> === Probability matrix for test data:
#>     - Mixed spots: 16
#>     - Cell types: 2
#> DONE
SDDLS <- simMixedProfiles(SDDLS)
#> === Setting parallel environment to 1 thread(s)
#> 
#> === Generating train mixed profiles:
#> 
#> === Generating test mixed profiles:
#> 
#> DONE
SDDLS <- trainDeconvModel(
  object = SDDLS,
  batch.size = 12,
  num.epochs = 5
)
#> === Training and test from stored data
#>     Using only simulated mixed samples
#>     Using only simulated mixed samples
#> Model: "SpatialDDLS"
#> _____________________________________________________________________
#> Layer (type)                   Output Shape               Param #    
#> =====================================================================
#> Dense1 (Dense)                 (None, 200)                3200       
#> _____________________________________________________________________
#> BatchNormalization1 (BatchNorm (None, 200)                800        
#> _____________________________________________________________________
#> Activation1 (Activation)       (None, 200)                0          
#> _____________________________________________________________________
#> Dropout1 (Dropout)             (None, 200)                0          
#> _____________________________________________________________________
#> Dense2 (Dense)                 (None, 200)                40200      
#> _____________________________________________________________________
#> BatchNormalization2 (BatchNorm (None, 200)                800        
#> _____________________________________________________________________
#> Activation2 (Activation)       (None, 200)                0          
#> _____________________________________________________________________
#> Dropout2 (Dropout)             (None, 200)                0          
#> _____________________________________________________________________
#> Dense3 (Dense)                 (None, 2)                  402        
#> _____________________________________________________________________
#> BatchNormalization3 (BatchNorm (None, 2)                  8          
#> _____________________________________________________________________
#> ActivationSoftmax (Activation) (None, 2)                  0          
#> =====================================================================
#> Total params: 45,410
#> Trainable params: 44,606
#> Non-trainable params: 804
#> _____________________________________________________________________
#> 
#> === Training DNN with 34 samples:
#> 
#> === Evaluating DNN in test data (16 samples)
#>    - loss: NaN
#>    - accuracy: 0.5
#>    - mean_absolute_error: NaN
#>    - categorical_accuracy: 0.5
#> 
#> === Generating prediction results using test data
#> DONE
## calculating gradients
SDDLS <- interGradientsDL(SDDLS)
# }