Distill enrichment results — distill

Distill the main topics from the enrichment results, based on the graph obtained by constructing an enrichment map.

distill_enrichment(
  res_enrich,
  res_de,
  annotation_obj,
  gtl = NULL,
  n_gs = nrow(res_enrich),
  cluster_fun = "cluster_markov"
)

Arguments

res_enrich: A data.frame object, storing the result of the functional enrichment analysis.
res_de: A DESeqResults object, as commonly used in the DESeq2 framework.
annotation_obj: A data.frame object, containing two columns, gene_id with a set of unambiguous identifiers (e.g. ENSEMBL ids) and gene_name, containing e.g. HGNC-based gene symbols.
gtl: A GeneTonic-list object, containing in its slots the arguments specified above: dds, res_de, res_enrich, and annotation_obj. The elements of the list must be named after the content they hold.
n_gs: Integer value, corresponding to the maximal number of gene sets to be used.
cluster_fun: Character, referring to the name of the function used for the community detection in the enrichment map graph. Can be one of "cluster_markov", "cluster_louvain", or "cluster_walktrap", as they all return a communities object.
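
As a rough illustration of what the cluster_fun argument expects, the sketch below runs two of the listed igraph functions on a toy graph and checks that each returns a communities object. The toy graph and the variable names are assumptions made for illustration and are not part of GeneTonic; "cluster_markov" is left out so that the sketch depends on igraph only.

```r
library("igraph")

# Toy graph (illustrative only): two triangles joined by a single edge
g <- graph_from_literal(A-B, B-C, A-C, D-E, E-F, D-F, C-D)

# Resolve each clustering function from its name, as a character argument allows
fun_names <- c("cluster_louvain", "cluster_walktrap")
results <- lapply(fun_names, function(fn) {
  cluster_fun <- match.fun(fn)
  cluster_fun(g)
})
names(results) <- fun_names

# Each result is a communities object with one membership value per vertex
vapply(results, function(x) inherits(x, "communities"), logical(1))
lapply(results, membership)
```

Any function passed by name this way needs to return an object from which a per-vertex membership can be extracted, which is what the three listed options have in common.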

Value

A list containing three objects:

the distilled table of enrichment, distilled_table, where the new meta-genesets are identified and defined, specifying e.g. the names of each component and the genes associated with them.
the distilled graph for the enrichment map, distilled_em, with the information on the membership.
the original res_enrich, augmented with the membership information related to the meta-genesets.
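
To make the idea of a meta-geneset more concrete, here is a minimal sketch that groups a few gene sets by a community membership vector and collects the genes of each group. The gene set names, the genes, and the membership values are made up for illustration, and the column names only loosely mirror those of distilled_table; this is not the implementation used by distill_enrichment().

```r
# Hypothetical gene sets with their member genes (illustrative values only)
gs_genes <- list(
  gs_A = c("GENE1", "GENE2"),
  gs_B = c("GENE2", "GENE3"),
  gs_C = c("GENE4", "GENE5")
)

# Hypothetical community membership, e.g. as derived from a clustering result
gs_membership <- c(gs_A = 1, gs_B = 1, gs_C = 2)

# For each cluster, count its gene sets and take the union of their genes
clusters <- sort(unique(gs_membership))
meta_gs <- data.frame(
  cluster = clusters,
  n_gs = vapply(clusters, function(cl) sum(gs_membership == cl), integer(1)),
  genes = vapply(clusters, function(cl) {
    members <- names(gs_membership)[gs_membership == cl]
    paste(sort(unique(unlist(gs_genes[members]))), collapse = ",")
  }, character(1)),
  stringsAsFactors = FALSE
)
meta_gs
```

Gene sets that share a cluster end up in the same row, so overlapping genes (GENE2 in this toy case) are listed only once per meta-geneset.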

Examples

library("macrophage")
library("DESeq2")
library("org.Hs.eg.db")
library("AnnotationDbi")
library("GeneTonic")

# dds object
data("gse", package = "macrophage")
dds_macrophage <- DESeqDataSet(gse, design = ~ line + condition)
#> using counts and average transcript lengths from tximeta
rownames(dds_macrophage) <- substr(rownames(dds_macrophage), 1, 15)
dds_macrophage <- estimateSizeFactors(dds_macrophage)
#> using 'avgTxLength' from assays(dds), correcting for library size

# annotation object
anno_df <- data.frame(
  gene_id = rownames(dds_macrophage),
  gene_name = mapIds(org.Hs.eg.db,
    keys = rownames(dds_macrophage),
    column = "SYMBOL",
    keytype = "ENSEMBL"
  ),
  stringsAsFactors = FALSE,
  row.names = rownames(dds_macrophage)
)
#> 'select()' returned 1:many mapping between keys and columns

# res object
data(res_de_macrophage, package = "GeneTonic")
res_de <- res_macrophage_IFNg_vs_naive

# res_enrich object
data(res_enrich_macrophage, package = "GeneTonic")
res_enrich <- shake_topGOtableResult(topgoDE_macrophage_IFNg_vs_naive)
#> Found 500 gene sets in `topGOtableResult` object.
#> Converting for usage in GeneTonic...
res_enrich <- get_aggrscores(res_enrich, res_de, anno_df)

distilled <- distill_enrichment(res_enrich,
  res_de,
  anno_df,
  n_gs = 100,
  cluster_fun = "cluster_markov"
)
colnames(distilled$distilled_table)
#> [1] "metags_cluster"    "metags_n_gs"       "metags_genes"     
#> [4] "metags_n_genes"    "metags_gsidlist"   "metags_gsdesclist"
#> [7] "metags_msgs"       "metags_mcgs"      
distilled$distilled_em
#> IGRAPH ef98073 UN-- 100 553 -- 
#> + attr: name (v/c), size (v/n), original_size (v/n), color_by_variable
#> | (v/n), color.background (v/c), color.highlight (v/c), color.hover
#> | (v/c), color.border (v/c), membership (v/n), color (v/n), width
#> | (e/n), color (e/c)
#> + edges from ef98073 (vertex names):
#> [1] adaptive immune response--interferon-gamma-mediated signaling pathway                                      
#> [2] adaptive immune response--antigen processing and presentation of endogenous peptide antigen via MHC class I
#> [3] adaptive immune response--positive regulation of T cell mediated cytotoxicity                              
#> + ... omitted several edges