ICB Seminar


Overcoming challenges in the integrative analysis of epigenomic data

Ph.D. Markus List, Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
www: https://biomedical-big-data.de e-mail: markus.list@wzw.tum.de

(i) While large amounts of epigenomic data are publicly available, their retrieval in a form suitable for downstream analysis is a bottleneck in current research. In a typical analysis, users are required to download huge files that span the entire genome, even if they are only interested in a small subset (e.g. promoter regions) or an aggregation thereof. Moreover, complex operations on genome-level data are not always feasible on a local computer due to resource limitations.

The DeepBlue Epigenomic Data Server and its accompanying R/Bioconductor package mitigate this issue by providing a powerful interface and API for filtering, transforming, and aggregating data from several epigenomic consortia on the server side. This makes DeepBlue an ideal resource for bioinformaticians who seek to integrate up-to-date epigenomics resources into their workflows.
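To make "filtering and aggregating" concrete, the following is a minimal local sketch of the kind of operation that would otherwise run on the server: averaging a genome-wide signal over a small set of regions of interest. It does not use the DeepBlue API; the function names, the tuple layout `(chrom, start, end, value)`, and the toy coordinates are all assumptions made for illustration.

```python
from statistics import mean

def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end, ...) overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def aggregate_over_regions(signal, regions):
    """Mean signal value per region; None where no signal interval overlaps."""
    result = {}
    for region in regions:
        hits = [s[3] for s in signal if overlaps(s, region)]
        result[region] = mean(hits) if hits else None
    return result

# Hypothetical genome-wide signal: (chrom, start, end, value)
signal = [
    ("chr1", 0, 100, 2.0),
    ("chr1", 100, 200, 4.0),
    ("chr1", 500, 600, 10.0),
    ("chr2", 0, 100, 7.0),
]
# Hypothetical regions of interest, e.g. promoters: (chrom, start, end)
promoters = [("chr1", 50, 150), ("chr2", 200, 300)]

summary = aggregate_over_regions(signal, promoters)
```

Only the two summary values would need to leave the server in this scenario, rather than the full signal track.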

(ii) General issues in using omics data in integrative analysis are batch effects, large cell type heterogeneity, and low replicate numbers. While batch effect adjustment methods exist, it is often unclear whether their application improves data quality.

To address this issue, we developed a method that borrows information from the Cell Ontology to assess whether batch adjustment leads to better agreement between the observed pairwise similarity of samples and the similarity of their cell types inferred from the ontology. A comparison of state-of-the-art batch effect adjustment methods suggests that batch effects in heterogeneous datasets with low replicate numbers cannot be adequately adjusted.
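The evaluation idea can be sketched as follows: compute a pairwise sample similarity matrix from the data, compare it with a reference similarity matrix derived from the ontology, and check whether the agreement increases after batch adjustment. This is a toy illustration under stated assumptions, not the published method: the binary same-cell-type matrix stands in for a graded Cell Ontology similarity, per-batch mean centering stands in for a real adjustment method, and all names and data are hypothetical.

```python
import numpy as np

def ontology_agreement(X, onto_sim):
    """Pearson correlation between observed and ontology-derived pairwise similarities."""
    iu = np.triu_indices(X.shape[0], k=1)   # each sample pair once
    observed = np.corrcoef(X)[iu]           # sample-by-sample correlation
    return float(np.corrcoef(observed, onto_sim[iu])[0, 1])

def center_per_batch(X, batches):
    """Stand-in adjustment: subtract the per-feature mean within each batch."""
    X_adj = X.astype(float)
    for b in np.unique(batches):
        mask = batches == b
        X_adj[mask] -= X_adj[mask].mean(axis=0)
    return X_adj

# Toy data: rows are samples, columns are features.
cell_types = np.array(["A", "A", "B", "B"])
batches = np.array([1, 2, 1, 2])
X = np.array([
    [1, 1, 1, 0, 0, 0],   # type A, batch 1
    [1, 4, 1, 3, 0, 3],   # type A, batch 2 (shifted by a batch effect)
    [0, 0, 0, 1, 1, 1],   # type B, batch 1
    [0, 3, 0, 4, 1, 4],   # type B, batch 2 (shifted by a batch effect)
], dtype=float)

# Assumed reference similarity: 1 for the same cell type, 0 otherwise.
onto_sim = (cell_types[:, None] == cell_types[None, :]).astype(float)

before = ontology_agreement(X, onto_sim)
after = ontology_agreement(center_per_batch(X, batches), onto_sim)
```

In this balanced toy design the adjustment raises the agreement score. With confounded batches or very few replicates the same comparison may show no improvement, which is the situation the abstract describes.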


References:
Albrecht, F., List, M., Bock, C. and Lengauer, T. (2016) DeepBlue epigenomic data server: programmatic data retrieval and analysis of epigenome region sets. Nucleic Acids Research, doi:10.1093/nar/gkw211
Albrecht, F.*, List, M.*, Bock, C. and Lengauer, T. (2017) DeepBlueR: large-scale epigenomic analysis in R. Bioinformatics, doi:10.1093/bioinformatics/btx099
Schmidt, F.*, List, M.*, Cukuroglu, E., Köhler, S., Göke, J. and Schulz, M. H. (2018) An ontology-based method for assessing batch effect adjustment approaches in heterogeneous datasets. Bioinformatics, 34(17), i908-i916, doi:10.1093/bioinformatics/bty553
*joint first authors
