class: center, middle, title-slide #
iSEE
: interactive and reproducible exploration and visualization of genomics data ### Federico Marini (
marinif@uni-mainz.de
) ### 2019/07/11 useR! 2019 - Toulouse
#useR2019
--- class: center <!-- Submitted abstract: --> <!-- iSEE: interactive and reproducible exploration and visualization of genomics data --> <!-- Federico Marini 1, 2, @ , Charlotte Soneson 5, 4, 3 , Kevin Rue-Albrecht 6 , Aaron Lun 7 --> <!-- 1 : Center for Thrombosis and Hemostasis Mainz (CTH), Mainz --> <!-- 2 : Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz --> <!-- 5 : Friedrich Miescher Institute for Biomedical Research --> <!-- 4 : SIB Swiss Institute of Bioinformatics --> <!-- 3 : Institute of Molecular Life Sciences, University of Zurich --> <!-- 6 : Kennedy Institute of Rheumatology, University of Oxford, Headington, Oxford --> <!-- 7 : Cancer Research UK Cambridge Institute, University of Cambridge --> <!-- Data exploration is crucial in the comprehension of large biological datasets, generated by high-throughput assays such as sequencing, with interactivity as key aspect to generate insightful outputs. Most existing tools for intuitive and interactive visualization are limited to specific assays or analyses, and lack support for reproducible analysis. --> <!-- Sparked from a Bioconductor community-driven effort, we have built a general-purpose tool, iSEE - Interactive SummarizedExperiment Explorer, designed for interactive exploration of any experimental data which can be stored in a SummarizedExperiment object, i.e. an integrative data container for storing matrices of assays and tables of associated metadata. --> <!-- iSEE (https://bioconductor.org/packages/iSEE/) is implemented in R and Shiny, and is compatible with many existing R/Bioconductor packages for high-throughput biological data. 
--> <!-- Essential features include: --> <!-- - A highly customizable interface with different panel types, simultaneously viewing and linking panels to each other --> <!-- - Automatic tracking of the exact R code generating all visible plots for full reproducibility --> <!-- - Interactive tours to showcase datasets and findings --> <!-- - Extendable analyses with custom panel types --> <!-- - Seamless deployment as an online companion browser for collaborations and publications. --> <!-- Subject : : abstract for oral presentation --> <!-- Topics : reproducibility --> <!-- Keywords : visualization ; interactive ; Bioconductor ; genomics ; transcriptomics ; exploration ; shiny --> <!-- TODO: --> <!-- new points to touch --> <!-- figure from Rob --> <!-- better lead in for the memes --> <!-- tweets appreciating iSEE to Drake --> <!-- plug for the shiny contest! --> <!-- focus more on the "philosophy" as well --> <!-- excellent slides: https://docs.google.com/presentation/d/1z_ycM7Rzb7DWgoWCwOTyelaTIkD7DqP8AYuNm6RVOrQ/edit#slide=id.g4fc057d4ed_0_51 --> # `Sys.getenv("USER")` I'm **Federico Marini**, Virchow Fellow @CTH Mainz/IMBEI -- I like platelets (and their transcriptome), and you should as well. -- <a href="mailto:marinif@uni-mainz.de">
`marinif@uni-mainz.de`</a><br> <a href="https://federicomarini.github.io">
`federicomarini.github.io`</a><br> <a href="http://twitter.com/FedeBioinfo">
`@FedeBioinfo`</a><br> <a href="http://github.com/federicomarini">
`@federicomarini`</a><br><br> <a href="http://www.imbei.de">
CTH/IMBEI (Mainz, Germany)</a> You can find this presentation here: [`https://federicomarini.github.io/useR2019/`](https://federicomarini.github.io/useR2019/)<br> <!--
[`@FedeBioinfo`](https://twitter.com/FedeBioinfo) -->

<p align="center">
<img src="images/qrcode_user2019.png" alt="" height="170"/>
</p>

---

# Transcriptomics at a glance

- High-dimensional snapshot of the transcriptomic activity of all RNA species in a sample

<!-- - Bulk or single cell? -->

--

**Data**: genes `\(\times\)` samples (e.g. bulk/single cells)

**Objectives**:

- Compare abundances to discover differentially expressed genes, gene signatures, etc.
  - features associated with phenotypic differences
- Identify cell subpopulations, describe developmental trajectories, study noise in transcriptional regulation

--

**Challenges**

- Large sets
- Heterogeneous sets
- Sparse sets
- Proper analysis tools _should_ combine interactivity and reproducibility
  - Marini and Binder (2016) - [`10.18547/gcb.2017.vol3.iss1.e39`](https://doi.org/10.18547/gcb.2017.vol3.iss1.e39)
  - Marini and Binder (2019) - [`10.1186/s12859-019-2879-1`](https://doi.org/10.1186/s12859-019-2879-1)
  - Joe Cheng's keynote earlier today

---
background-image: url("images/console_logcounts_sparse.png")
background-size: contain
background-position: 50% 50%
class: middle, center

# Exploration and visualization

Effective and efficient methods are key to delivering...

--
better **quality assessment** --
better **generation of research hypotheses** --
better **representation of the results** --
better **communication** of findings

---

# `SummarizedExperiment`s

<p align="center">
<img src="images/sce_class.png" alt="" height="450"/>
</p>

<!-- Thank you Martin Morgan!!! -->
<!-- or the one from the OSCA paper? -->
<!-- It can store [**RNA-seq**|DNA methylation|Hi-C|Microarray|Mass cytometry|SNPs] data -->

If you are into single cell data, check out [`https://osca.bioconductor.org`](https://osca.bioconductor.org)!

---
background-image: url("images/marie.png")
background-size: cover
background-position: 50% 50%
class: center, bottom, inverse

--

# Does the exploration of your data spark joy?

---

# Looking for a silver bullet

Data exploration is crucial:

- No general-purpose tool: existing ones are limited to specific assay types or analysis steps
- Existing tools lack support for reproducibility while staying intuitive and usable

--

Joint work with Aaron Lun, Charlotte Soneson, Kevin Rue-Albrecht initiated at [#EuroBioc2017](https://twitter.com/hashtag/EuroBioC2017)

<p align="center">
<img src="images/lun_aaron_web.jpg" alt="" width="200"/>
<img src="images/twit_charlotte.jpg" alt="" width="200"/>
<img src="images/twit_kev.jpg" alt="" width="200"/>
</p>

--

"We could have an interactive SummarizedExperiment Explorer tool..."

<!-- Visualize my data in any (precomputed) reduced dimension space. -->
<!-- Color the data points with any experimental covariate (e.g. batch). -->
<!-- Color the data points with any expression data. -->
<!-- Select data points in a plot, and highlight them in another. -->
<!-- Visualize the distribution of any assay or metadata. -->
<!-- Visualize the correlation between gene A and gene B, specifically in "this" or "that" cluster -->
<!-- Fully empower the data generators - and get lazy!
--> --- class: animated, fadeIn # Hello `iSEE` <p align="center"> <img src="images/iSEE.png" width="370"/> </p> * [`https://f1000research.com/articles/7-741/v1`](https://f1000research.com/articles/7-741/v1), live apps inside * Available in Bioconductor [`http://bioconductor.org/packages/iSEE/`](http://bioconductor.org/packages/iSEE/) <!-- .pull-left[ --> <!-- <img src="images/ss_bioc_isee.png" alt="" width="500"/> --> <!-- ] --> <!-- .pull-right[ --> <!-- <img src="images/ss_isee.png" alt="" width="500"/> --> <!-- ] --> <!-- </br> --> <!-- Available in Bioconductor </br> --> <!-- ... or as devel version at [`https://github.com/csoneson/iSEE`](https://github.com/csoneson/iSEE) --> --- class: center, middle # `iSEE(sce)` --- # `iSEE` in action: the *Tabula Muris* dataset .pull-left[ <p align="center"> <img src="images/iSEE.png" width="170"/> </p> ] .pull-right[ <img src="images/tabulamuris.png" alt="" width="500"/> ] -- Preprocessing details + `iSEE` configuration for this set can be found at [`https://github.com/federicomarini/iSEE_instances`](https://github.com/federicomarini/iSEE_instances/tree/master/iSEE_tabulamuris) - 20 organs and tissues, from 8 different mice - 23036 genes - 43598 cells `\(\rightarrow\)` >1 billion data points! <!-- and that is the smaller set --> <!-- - First steps: `example(iSEE,ask = FALSE)` to explore the `allen` dataset --> <!-- - ... or start [`https://marionilab.cruk.cam.ac.uk/iSEE_pbmc4k/`](https://marionilab.cruk.cam.ac.uk/iSEE_pbmc4k/) --> <!-- <iframe width="720" height="400" src="https://www.youtube.com/embed/wpu9daTE4ok" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe> --> <!-- Touch upon: --> <!-- input: SummarizedExperiment --> <!-- panel types --> <!-- add/remove plots --> <!-- link plots --> <!-- show code tracker --> <!-- show tour --> <!-- voice control!! 
--> --- background-image: url("images/tm_isee_allpanels.png") background-size: contain background-position: 50% 50% class: center, bottom # A tour of the panels --- background-image: url("images/ss_reddim.png") background-position: 50% 50% background-size: contain class: center, animated, zoomInLeft --- background-image: url("images/tm_isee_allpanels.png") background-size: contain background-position: 50% 50% class: center, animated, fadeIn --- background-image: url("images/ss_featassay.png") background-size: contain background-position: 50% 50% class: center, animated, zoomIn --- background-image: url("images/tm_isee_allpanels.png") background-size: contain background-position: 50% 50% class: center, animated, fadeIn --- background-image: url("images/ss_samplesassay.png") background-size: contain background-position: 50% 50% class: center, animated, zoomInRight --- background-image: url("images/tm_isee_allpanels.png") background-size: contain background-position: 50% 50% class: center, animated, fadeIn --- background-image: url("images/ss_coldata_rowdata.png") background-size: contain background-position: 50% 50% class: center, animated, zoomInUp --- background-image: url("images/tm_isee_allpanels.png") background-size: contain background-position: 50% 50% class: center, animated, fadeIn --- background-image: url("images/ss_tables.png") background-size: contain background-position: 50% 50% class: center, animated, zoomInUp --- background-image: url("images/tm_isee_allpanels.png") background-size: contain background-position: 50% 50% class: center, animated, fadeIn --- background-image: url("images/gif_bsb2.gif") background-size: contain background-position: 50% 50% class: center, bottom, inverse, fadeIn # Flexibility and customizability --- background-image: url("images/gif_reordering.gif") background-size: contain background-position: 50% 50% class: center, bottom # Reordering the panels --- background-image: url("images/tm_isee_reorderedpanels.png") 
background-size: contain background-position: 50% 25% class: center --- background-image: url("images/gif_colorbygene.gif") background-size: contain background-position: 50% 50% class: center, bottom # Augmenting the observed data with biological knowledge --- background-image: url("images/gif_linkedtotable.gif") background-size: contain background-position: 50% 50% class: center, bottom # Linking the panels --- background-image: url("images/gif_customde.gif") background-size: contain background-position: 50% 50% class: center, bottom # Custom panels --- background-image: url("images/meme_owl.jpg") background-size: contain background-position: 50% 50% class: center, bottom # Working in a reproducible way --- background-image: url("images/brit_bored.gif") background-size: cover class: center, bottom, inverse # "Oh, code..." --- background-image: url("images/gif_reprocode.gif") background-size: contain background-position: 50% 50% class: center, bottom # Code access for full reproducibility --- background-image: url("images/brit_dance.gif") background-size: cover class: center, bottom, inverse -- # `knit` me baby one more time --- background-image: url("images/meme_gandalf.jpg") background-size: contain class: center, bottom --- background-image: url("images/gif_toursteps.gif") background-size: contain background-position: 50% 50% class: center, bottom # Interactive tours: an efficient means to communicate <!-- Markers: Pax9, Olig1, Cd68 --> <!-- # Some feedback from users - early on --> <!-- <p align="center"> --> <!-- <img src="images/tweet_saskia.png" alt="" height="400"/> --> <!-- </p> --> <!-- [`https://twitter.com/trashystats/status/1007061299568578561`](https://twitter.com/trashystats/status/1007061299568578561) --> <!-- --- --> <!-- # Some feedback from users - now --> <!-- <p align="center"> --> <!-- <img src="images/tweet_rob.png" alt="" height="400"/> --> <!-- </p> --> <!-- 
[`https://twitter.com/robamezquita/status/1102612120527548416`](https://twitter.com/robamezquita/status/1102612120527548416) -->
<!-- --- -->
<!-- --- -->
<!-- # Some feedback from users -->
<!-- <p align="center"> -->
<!-- <img src="images/meme_drake_isee.jpg" height="450"/> -->
<!-- </p> -->
<!-- `not sure this one can be trusted` -->

---
background-image: url("images/iSEE.png")
background-size: 200px
background-position: 93% 10%

# A winner of the first Shiny Contest

<p align="center">
<img src="images/shiny_contest_winner.gif" alt="" height="400"/>
</p>

[`https://blog.rstudio.com/2019/04/05/first-shiny-contest-winners`](https://blog.rstudio.com/2019/04/05/first-shiny-contest-winners)

---

# Are we reinventing the wheel?

[`https://github.com/federicomarini/awesome-expression-browser`](https://github.com/federicomarini/awesome-expression-browser)

--

<p align="center">
<img src="images/meme_batman.jpg" alt="" height="450"/>
</p>

---

## `iSEE(sce, voice = TRUE)`

<iframe width="900" height="500" src="https://www.youtube-nocookie.com/embed/0crFZLwAJOE" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

<!-- Easter eggs included! -->

---
background-image: url("images/iSEE.png")
background-size: 200px
background-position: 90% 10%

# Summary

- **Key features** for optimal exploration:
  - flexibility and customizability
  - linked information across plots
  - guided showcase/usage
  - effective communication
  - reproducibility (for self & for others!)
  - voice control (fun & accessibility)

--

- **Got data?** Accompany your publications with a live browser!

--

- **Outlook**: new features + steps towards multi-omics

--

> See first, think later, then test. But always see first. <br>
> Otherwise you will only see what you were expecting. Most scientists forget that.
>
> <footer>--- Douglas Adams</footer>

---
background-image: url("images/iSEE.png")
background-size: 200px
background-position: 90% 10%

# Summary

- **Key features** for optimal exploration:
  - flexibility and customizability
  - linked information across plots
  - guided showcase/usage
  - effective communication
  - reproducibility (for self & for others!)
  - voice control (fun & accessibility)
- **Got data?** Accompany your publications with a live browser!
- **Outlook**: new features + steps towards multi-omics

> ~~See~~ `iSEE` first, think later, then test. But always ~~see~~ `iSEE` first.
> Otherwise you will only see what you were expecting. Most scientists forget that.
>
> <footer>--- Douglas Adams</footer>

---
background-image: url("images/iSEE.png")
background-size: 200px
background-position: 90% 10%

# Links

<!-- a.k.a. the `iSEE`verse -->

- **[`http://bioconductor.org/packages/iSEE/`](http://bioconductor.org/packages/iSEE/)**
- [`https://github.com/csoneson/iSEE`](https://github.com/csoneson/iSEE)
- [`https://github.com/LTLA/iSEE2018`](https://github.com/LTLA/iSEE2018)
- [`https://github.com/kevinrue/iSEE_custom`](https://github.com/kevinrue/iSEE_custom)
- [`https://github.com/federicomarini/iSEE_instances`](https://github.com/federicomarini/iSEE_instances)
- **[`https://federicomarini.github.io/useR2019/`](https://federicomarini.github.io/useR2019/)**

--

### Acknowledgements

- Center for Thrombosis and Hemostasis (CTH), Mainz (Virchow Fellowship)
- IMBEI - Biostatistics & Bioinformatics division
- Charlotte Soneson, Aaron Lun, Kevin Rue-Albrecht (the `iSEE` team)

--

### ... thank you for your attention!

<code>marinif@uni-mainz.de</code> -
[`@FedeBioinfo`](https://twitter.com/FedeBioinfo)

---

<!-- empty page -->

---

# `iSEE` in action - a few lines of code

```r
# in Bioc since 3.7
install.packages("BiocManager")
BiocManager::install(c("iSEE", "scRNAseq", "scater"))

# note: from Bioconductor 3.10 on, use ReprocessedAllenData() and
# logNormCounts() in place of data("allen") and normalize()
library("scRNAseq")
data("allen")
library("scater")
sce <- as(allen, "SingleCellExperiment")
counts(sce) <- assay(sce, "tophat_counts")
sce <- normalize(sce)
sce <- runPCA(sce)
sce <- runTSNE(sce)

*library("iSEE")
*iSEE(sce)

# couple of genes to check: (Zeisel, Science 2015;
# Tasic, Nature Neuroscience 2016)
# Tbr1 (TF required for the final differentiation of
#       cortical projection neurons);
# Snap25 (pan-neuronal);
# Rorb (mostly L4 and L5a);
# Foxp2 (L6)
```

---

# `iSEE` in action - the PBMC4k dataset

<iframe width="900" height="500" src="https://www.youtube.com/embed/wpu9daTE4ok" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>
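
---

# `iSEE` in action - sketching a tour

The interactive tours shown earlier can be written down in a few lines. This is a minimal sketch, not the configuration used in this talk. It assumes that the `tour` argument of `iSEE()` takes a `data.frame` with an `element` column (CSS selectors) and an `intro` column (the text of each step). The selectors and texts are made up for illustration, and panel element IDs differ between `iSEE` versions, so check them against the running app.

```r
# Hypothetical tour steps: selectors and texts are illustrative only.
# Assumption: panel IDs follow the "<PanelType><index>" pattern of recent
# iSEE releases; older releases used different IDs.
tour <- data.frame(
  element = c("#Welcome", "#ReducedDimensionPlot1", "#FeatureAssayPlot1"),
  intro = c(
    "Welcome to this guided tour of the dataset.",
    "Each point in this reduced dimension plot is one cell.",
    "This panel shows the expression of a single gene across cells."
  ),
  stringsAsFactors = FALSE
)

# Launch the app only in an interactive session, when iSEE is installed
# and an `sce` object (e.g. from the previous slide) is available.
if (interactive() && requireNamespace("iSEE", quietly = TRUE) && exists("sce")) {
  app <- iSEE::iSEE(sce, tour = tour)
  shiny::runApp(app)
}
```

Keeping such a table in a script next to the `sce` object means the guided walkthrough is versioned and shared together with the data.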