Extracts the genes with the highest loadings for each principal component, and
performs functional enrichment analysis on them using routines and algorithms from
the topGO
package
pca2go(
se,
pca_ngenes = 10000,
annotation = NULL,
inputType = "geneSymbol",
organism = "Mm",
ensToGeneSymbol = FALSE,
loadings_ngenes = 500,
background_genes = NULL,
scale = FALSE,
return_ranked_gene_loadings = FALSE,
annopkg = NULL,
...
)
A DESeqTransform()
object, with data in assay(se)
,
produced for example by either rlog()
or
varianceStabilizingTransformation()
Number of genes to use for the PCA
A data.frame
object, with row.names as gene identifiers (e.g. ENSEMBL ids)
and a column, gene_name
, containing e.g. HGNC-based gene symbols
Input format type of the gene identifiers. Will be used by the routines of topGO
Character abbreviation for the species, using org.XX.eg.db
for annotation
Logical, whether to expect ENSEMBL gene identifiers, to convert to gene symbols
with the annotation
provided
Number of genes to extract the loadings (in each direction)
Which genes to consider as background.
Logical, defaults to FALSE, scale values for the PCA
Logical, defaults to FALSE. If TRUE, simply returns a list containing the top ranked genes with hi loadings in each PC and in each direction
String containing the name of the organism annotation package. Can be used to
override the organism
parameter, e.g. in case of alternative identifiers used
in the annotation package (Arabidopsis with TAIR)
Further parameters to be passed to the topGO routine
A nested list object containing for each principal component the terms enriched
in each direction. This object is to be thought in combination with the displaying feature
of the main pcaExplorer()
function
library("airway")
library("DESeq2")
data("airway", package = "airway")
airway
#> class: RangedSummarizedExperiment
#> dim: 63677 8
#> metadata(1): ''
#> assays(1): counts
#> rownames(63677): ENSG00000000003 ENSG00000000005 ... ENSG00000273492
#> ENSG00000273493
#> rowData names(10): gene_id gene_name ... seq_coord_system symbol
#> colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
#> colData names(9): SampleName cell ... Sample BioSample
dds_airway <- DESeqDataSet(airway, design= ~ cell + dex)
if (FALSE) { # \dontrun{
rld_airway <- rlogTransformation(dds_airway)
# constructing the annotation object
anno_df <- data.frame(gene_id = rownames(dds_airway),
stringsAsFactors = FALSE)
library("AnnotationDbi")
library("org.Hs.eg.db")
anno_df$gene_name <- mapIds(org.Hs.eg.db,
keys = anno_df$gene_id,
column = "SYMBOL",
keytype = "ENSEMBL",
multiVals = "first")
rownames(anno_df) <- anno_df$gene_id
bg_ids <- rownames(dds_airway)[rowSums(counts(dds_airway)) > 0]
library(topGO)
pca2go_airway <- pca2go(rld_airway,
annotation = anno_df,
organism = "Hs",
ensToGeneSymbol = TRUE,
background_genes = bg_ids)
} # }