Extract functional terms enriched in the DE genes, based on goseq

A wrapper for extracting functional GO terms enriched in a list of (DE) genes, based on the algorithm and the implementation in the goseq package

goseqTable(
  de.genes,
  assayed.genes,
  genome = "hg38",
  id = "ensGene",
  testCats = c("GO:BP", "GO:MF", "GO:CC"),
  FDR_GO_cutoff = 1,
  nTop = 200,
  orgDbPkg = "org.Hs.eg.db",
  addGeneToTerms = TRUE
)

Arguments

de.genes: A vector of (differentially expressed) genes
assayed.genes: A vector of background genes, e.g. all (expressed) genes in the assays
genome: A string identifying the genome that genes refer to, as in the goseq::goseq() function
id: A string identifying the gene identifier used by genes, as in the goseq::goseq() function
testCats: A vector specifying which categories to test for over representation amongst DE genes - can be any combination of "GO:CC", "GO:BP", "GO:MF" & "KEGG"
FDR_GO_cutoff: Numeric value for subsetting the results
nTop: Number of categories to extract, and optionally process for adding genes to the respective terms
orgDbPkg: Character string, named as the org.XX.eg.db package which should be available in Bioconductor
addGeneToTerms: Logical, whether to add a column with all genes annotated to each GO term

Value

A table containing the computed GO Terms and related enrichment scores

Details

Note: the feature length retrieval is based on the goseq::goseq() function, and requires that the corresponding TxDb packages are installed and available

Examples


library("airway")
data("airway", package = "airway")
airway
#> class: RangedSummarizedExperiment 
#> dim: 63677 8 
#> metadata(1): ''
#> assays(1): counts
#> rownames(63677): ENSG00000000003 ENSG00000000005 ... ENSG00000273492
#>   ENSG00000273493
#> rowData names(10): gene_id gene_name ... seq_coord_system symbol
#> colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
#> colData names(9): SampleName cell ... Sample BioSample
dds_airway <- DESeq2::DESeqDataSetFromMatrix(assay(airway),
  colData = colData(airway),
  design = ~ cell + dex
)
dds_airway <- DESeq2::DESeq(dds_airway)
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
res_airway <- DESeq2::results(dds_airway)

res_subset <- mosdef::deresult_to_df(res_airway)[1:100, ]
myde <- res_subset$id
myassayed <- rownames(res_airway)
if (FALSE) { # \dontrun{
mygo <- goseqTable(myde,
  myassayed,
  testCats = "GO:BP",
  addGeneToTerms = FALSE
)
head(mygo)
} # }