Run the quanTIseq algorithm — run_quantiseq • quantiseqr

Use quanTIseq to deconvolute a gene expression matrix.

run_quantiseq(
  expression_data,
  signature_matrix = "TIL10",
  is_arraydata = FALSE,
  is_tumordata = FALSE,
  scale_mRNA = TRUE,
  method = "lsei",
  column = "gene_symbol",
  rm_genes = NULL,
  return_se = is(expression_data, "SummarizedExperiment")
)

Arguments

expression_data

The gene expression information, containing the TPM values for the measured features. Can be provided as

a simple gene expression matrix, or a data frame (with HGNC gene symbols as row names and sample identifiers as column names)
an ExpressionSet object (from the Biobase package), where the HGNC gene symbols are provided in a column of the fData slot - that is specified by the column parameter below
a SummarizedExperiment object, or any of the derivative classes (e.g. DESeq2's DESeqDataSet), in which the assay (default: "abundance") is containing the TPMs as expected. Internally, quantiseqr handles the conversion to an object which is used in the deconvolution procedure.

signature_matrix

Character string, specifying the name of the signature matrix. At the moment, only the original TIL10 signature can be selected.

is_arraydata

Logical value. Should be set to TRUE if the expression data are originating from microarray data. For RNA-seq data, this has to be FALSE (default value). If set to TRUE, the rmgenes parameter (see below) is set to "none".

is_tumordata

Logical value. Should be set to TRUE if the expression data is from tumor samples. Default: FALSE (e.g. for RNA-seq from blood samples)

scale_mRNA

Logical value. If set to FALSE, it disables the correction of cell-type-specific mRNA content bias. Default: TRUE

method

Character string, defining the deconvolution method to be used: lsei for constrained least squares regression, hampel, huber, or bisquare for robust regression with Huber, Hampel, or Tukey bisquare estimators, respectively. Default: lsei.

column

Character, specifies which column in the fData slot (for the ExpressionSet object) contains the information of the HGNC gene symbol identifiers

rm_genes

Character vector, specifying which genes have to be excluded from the deconvolution analysis. It can be provided as

a vector of gene symbols (contained in the expression_data)
a single string among the choices of "none" (no genes are removed) and "default" (a list of genes with noisy expression RNA-seq data is removed, as explained in the quanTIseq paper). Default: "default" for RNA-seq data, "none" for microarrays.

return_se

Logical value, controls the format of how the quantification is returned. If providing a SummarizedExperiment, it can simply extend its colData component, without the need to create a separate data frame as output.

Value

A data.frame containing the quantifications of the cell type proportions, or alternatively, if providing expression_data as SummarizedExperiment and setting return_se to TRUE, a SummarizedExperiment with the quantifications included by expanding the colData slot of the original object

Details

The values contained in the expression_data need to be provided as TPM values, as this is the format also used to store the TIL10 signature, upon which quanTIseq builds to perform the immune cell type deconvolution. Expression data should not be provided in logarithmic scale.

If providing the expression_data as a SummarizedExperiment/DESeqDataSet object, it might be beneficial that this has been created via tximport - if this is the case, the assay named "abundance" will be automatically created upon importing the transcript quantification results.

References

F. Finotello, C. Mayer, C. Plattner, G. Laschober, D. Rieder, H. Hackl, A. Krogsdam, Z. Loncova, W. Posch, D. Wilflingseder, S. Sopper, M. Jsselsteijn, T. P. Brouwer, D. Johnsons, Y. Xu, Y. Wang, M. E. Sanders, M. V. Estrada, P. Ericsson-Gonzalez, P. Charoentong, J. Balko, N. F. d. C. C. de Miranda, Z. Trajanoski. "Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data". Genome Medicine 2019;11(1):34. doi: 10.1186/s13073-019-0638-6.

C. Plattner, F. Finotello, D. Rieder. "Chapter Ten - Deconvoluting tumor-infiltrating immune cells from RNA-seq data using quanTIseq". Methods in Enzymology, 2020. doi: 10.1016/bs.mie.2019.05.056.

Examples


data(dataset_racle)
dim(dataset_racle$expr_mat)
#> [1] 32467     4
res_quantiseq_run <- quantiseqr::run_quantiseq(
  expression_data = dataset_racle$expr_mat,
  signature_matrix = "TIL10",
  is_arraydata = FALSE,
  is_tumordata = TRUE,
  scale_mRNA = TRUE
)
#> 
#> Running quanTIseq deconvolution module
#> Gene expression normalization and re-annotation (arrays: FALSE)
#> Removing 17 noisy genes
#> Removing 15 genes with high expression in tumors
#> Signature genes found in data set: 135/138 (97.83%)
#> Mixture deconvolution (method: lsei)
#> Deconvolution successful!

# using a SummarizedExperiment object
library("SummarizedExperiment")
se_racle <- SummarizedExperiment(
  assays = List(
    abundance = dataset_racle$expr_mat
  ),
  colData = DataFrame(
    SampleName = colnames(dataset_racle$expr_mat)
  )
)

res_run_SE <- quantiseqr::run_quantiseq(
    expression_data = se_racle,
    signature_matrix = "TIL10",
    is_arraydata = FALSE,
    is_tumordata = TRUE,
    scale_mRNA = TRUE
)
#> 
#> Running quanTIseq deconvolution module
#> Gene expression normalization and re-annotation (arrays: FALSE)
#> Removing 17 noisy genes
#> Removing 15 genes with high expression in tumors
#> Signature genes found in data set: 135/138 (97.83%)
#> Mixture deconvolution (method: lsei)
#> Deconvolution successful!