Original source: https://yulab-smu.github.io/clusterProfiler-book/chapter12.html
The enrichplot package implements several visualization methods to help interpret enrichment results. It supports visualizing enrichment results obtained from DOSE (Yu et al. 2015), clusterProfiler (???), ReactomePA (Yu and He 2016) and meshes. Both over representation analysis (ORA) and gene set enrichment analysis (GSEA) are supported.
Many of these visualization methods were first implemented in DOSE and later rewritten from scratch using ggplot2. If you want to use the old methods, you can use the doseplot package.
Bar plot is the most widely used method to visualize enriched terms. It depicts the enrichment scores (e.g. p values) and gene count or ratio as bar height and color.
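The plotting calls in this chapter operate on an enrichment result object. As a minimal sketch, assuming the example data geneList shipped with DOSE and an over representation analysis run with enrichDGN (the absolute fold-change cutoff of 2 and showCategory = 20 are illustrative choices, not requirements), a bar plot could be produced like this:

```r
library(DOSE)
library(enrichplot)

# Example ranked gene list shipped with DOSE (named numeric vector of fold changes)
data(geneList)

# Assumption for illustration: treat genes with |fold change| > 2 as the genes of interest
de <- names(geneList)[abs(geneList) > 2]

# Over representation analysis; the result is reused as `edo` in later examples
edo <- enrichDGN(de)

# Bar plot of the 20 most significant terms
p <- barplot(edo, showCategory = 20)
print(p)
```

The object name edo matches the one used by the later examples in this chapter.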
Dot plot is similar to bar plot, with the added capability of encoding another score as dot size.
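library(DOSE)        # gseNCG()
library(enrichplot)  # dotplot()
library(ggplot2)     # ggtitle()
library(cowplot)     # plot_grid()
edo2 <- gseNCG(geneList, nPerm=10000)
p1 <- dotplot(edo, showCategory=30) + ggtitle("dotplot for ORA")   # edo: the ORA result
p2 <- dotplot(edo2, showCategory=30) + ggtitle("dotplot for GSEA")
plot_grid(p1, p2, ncol=2)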
Both barplot and dotplot display only the most significant enriched terms, while users may want to know which genes are involved in these terms. To account for the biological complexity that a gene may belong to multiple annotation categories, and to provide information on numeric changes if available, we developed the cnetplot function to extract these complex associations. cnetplot depicts the linkages between genes and biological concepts (e.g. GO terms or KEGG pathways) as a network. GSEA results are also supported, with only core enriched genes displayed.
## convert gene ID to Symbol
edox <- setReadable(edo, 'org.Hs.eg.db', 'ENTREZID')
cnetplot(edox, foldChange=geneList)
## categorySize can be scaled by 'pvalue' or 'geneNum'
cnetplot(edox, categorySize="pvalue", foldChange=geneList)
The heatplot is similar to cnetplot, but displays the relationships as a heatmap. The gene-concept network may become too complicated if the user wants to show a large number of significant terms. The heatplot simplifies the result and makes expression patterns easier to identify.
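To make the heatmap idea concrete, the following base R sketch builds the kind of term-by-gene matrix such a plot displays. The term names, gene symbols and fold-change values are made up for illustration, and this is not the heatplot implementation itself.

```r
# Toy input (hypothetical): enriched terms with their member genes
term_genes <- list(
  termA = c("TP53", "BRCA1", "MYC"),
  termB = c("BRCA1", "MYC"),
  termC = c("EGFR")
)

# Hypothetical fold changes, named by gene
fold_change <- c(TP53 = 2.1, BRCA1 = -1.4, MYC = 3.0, EGFR = -0.7)

genes <- sort(unique(unlist(term_genes)))

# Rows are terms, columns are genes; a cell holds the gene's fold change
# if the gene belongs to the term, and NA otherwise
mat <- t(sapply(term_genes, function(g) {
  ifelse(genes %in% g, fold_change[genes], NA)
}))
colnames(mat) <- genes

print(mat)
```

Reading along a column shows every term a gene belongs to, which is the same information a gene-concept network encodes as edges.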
Enrichment map organizes enriched terms into a network with edges connecting overlapping gene sets. In this way, mutually overlapping gene sets tend to cluster together, making it easy to identify functional modules.
The emapplot function supports results obtained from the hypergeometric test and from gene set enrichment analysis.
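The notion of edges connecting overlapping gene sets can be sketched in base R. The example below scores the overlap with the Jaccard coefficient and keeps pairs above a cutoff of 0.2. Both the choice of similarity measure and the cutoff are assumptions made for illustration, since the text above does not state how emapplot measures overlap, and the gene sets are toy data.

```r
# Toy gene sets (hypothetical)
gene_sets <- list(
  setA = c("g1", "g2", "g3", "g4"),
  setB = c("g3", "g4", "g5"),
  setC = c("g6", "g7")
)

# Jaccard coefficient: shared genes divided by all genes in either set
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

n <- length(gene_sets)
sim <- matrix(0, n, n, dimnames = list(names(gene_sets), names(gene_sets)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- jaccard(gene_sets[[i]], gene_sets[[j]])
  }
}

# Keep each unordered pair once (upper triangle) if it passes the assumed cutoff
edges <- which(sim > 0.2 & upper.tri(sim), arr.ind = TRUE)

print(sim)
print(edges)
```

Here only setA and setB share genes, so they are the only pair joined by an edge; setC stays isolated.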
The upsetplot is an alternative to cnetplot for visualizing the complex association between genes and gene sets. It emphasizes the gene overlap among different gene sets.
The ridgeplot visualizes expression distributions of core enriched genes for GSEA enriched categories. It helps users interpret up/down-regulated pathways.
Running score and preranked list are traditional methods for visualizing GSEA results. The enrichplot package supports both of them to visualize the distribution of the gene set and the enrichment score.
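For readers unfamiliar with the running score, here is a base R sketch of the classic weighted running-sum statistic used in GSEA. It is written from the standard definition and is not taken from the enrichplot source. The function name running_score and the toy ranked list are hypothetical.

```r
# Running enrichment score along a ranked gene list.
# ranked:   named numeric vector of ranking metrics (e.g. fold changes)
# gene_set: character vector of gene names in the set
# exponent: weight applied to the ranking metric of each hit
running_score <- function(ranked, gene_set, exponent = 1) {
  ranked <- sort(ranked, decreasing = TRUE)
  hit <- names(ranked) %in% gene_set
  w <- abs(ranked)^exponent
  # Fraction of the total hit weight accumulated up to each position
  p_hit <- cumsum(ifelse(hit, w, 0)) / sum(w[hit])
  # Fraction of non-member genes seen up to each position
  p_miss <- cumsum(!hit) / sum(!hit)
  p_hit - p_miss
}

# Toy ranked list (hypothetical values)
ranked <- c(a = 3, b = 2, c = 1, d = -1, e = -2, f = -3)
rs <- running_score(ranked, gene_set = c("a", "b"))

# The enrichment score is the running score with the largest absolute value
es <- rs[which.max(abs(rs))]

print(rs)
print(es)
```

The running score rises whenever a gene in the set is encountered and falls otherwise, so a set concentrated at the top of the list peaks early, while one concentrated at the bottom reaches its most negative value late.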
Another method to plot GSEA results is the gseaplot2 function.
The gseaplot2 function also supports displaying multiple gene sets on the same figure.
Users can also display the p-value table on the plot via the pvalue_table parameter:
gseaplot2(edo2, geneSetID = 1:3, pvalue_table = TRUE, color = c("#E495A5", "#86B875", "#7DB0DD"), ES_geom = "dot")
Users can specify the subplots parameter to display only a subset of the plots.
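As a sketch of how subplots might be used, the example below recreates a GSEA result with gseNCG on the DOSE example data and then requests one or two panels for the first gene set. The panel numbering in the comments (1 for the running score, 2 for the gene positions, 3 for the ranked metric) is an assumption about the panel order, and the seed is set only to make the permutation-based result repeatable.

```r
library(DOSE)
library(enrichplot)

# Example ranked gene list shipped with DOSE
data(geneList)

set.seed(123)
# GSEA against Network of Cancer Gene sets
edo2 <- gseNCG(geneList)

# Assumed panel order: 1 = running score, 2 = gene positions, 3 = ranked metric
p_top <- gseaplot2(edo2, geneSetID = 1, subplots = 1)
p_two <- gseaplot2(edo2, geneSetID = 1, subplots = 1:2)

print(p_top)
print(p_two)
```

Restricting the panels can help when several gene sets are stacked in one figure and the full three-panel layout becomes crowded.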
The gsearank function plots the ranked list of genes belonging to a specific gene set.
Multiple gene sets can be aligned using cowplot:
library(ggplot2)
library(cowplot)
pp <- lapply(1:3, function(i) {
    anno <- edo2[i, c("NES", "pvalue", "p.adjust")]
    lab <- paste0(names(anno), "=", round(anno, 3), collapse="\n")
    gsearank(edo2, i, edo2[i, 2]) + xlab(NULL) + ylab(NULL) +
        annotate("text", 0, edo2[i, "enrichmentScore"] * .9,
                 label = lab, hjust=0, vjust=0)
})
plot_grid(plotlist=pp, ncol=1)
One of the problems of enrichment analysis is to find pathways for further investigation. Here, we provide the pmcplot function to plot the trend in the number or proportion of publications, based on query results from PubMed Central. Of course, users can use pmcplot in other scenarios. Any text that can be queried on PMC is valid input for pmcplot.
terms <- edo$Description[1:3]
p <- pmcplot(terms, 2010:2017)
p2 <- pmcplot(terms, 2010:2017, proportion=FALSE)
plot_grid(p, p2, ncol=2)
The goplot function accepts the output of enrichGO and visualizes the GO graph induced by the enriched terms.
To view a KEGG pathway, users can use the browseKEGG function, which opens a web browser and highlights the enriched genes.
clusterProfiler users can also use the pathview function from the pathview package (Luo and Brouwer 2013) to visualize KEGG pathways.
The following example illustrates how to visualize the “hsa04110” pathway, which was enriched in our previous analysis.
library("pathview")
hsa04110 <- pathview(gene.data  = geneList,
                     pathway.id = "hsa04110",
                     species    = "hsa",
                     limit      = list(gene=max(abs(geneList)), cpd=1))
For further information, please refer to the vignette of pathview (Luo and Brouwer 2013).