Network-based interactive navigation and analysis of large biological datasets

Abstract

Over the last decade, advances in high-throughput technologies have resulted in a flood of new biological data, and individual samples can reach terabyte size. While potential applications are broad, ranging from biotechnology to medicine, the analysis of these datasets poses massive challenges. To make use of the terabytes of data produced, the datasets need to be integrated, mapped onto existing biological knowledge, and explored by experts.

We present UniPAX and BiNA, a scalable system for the integration and analysis of high-throughput data (genomics, transcriptomics, proteomics, and metabolomics) in a network context. A central data warehouse holds the core dataset. A flexible middleware executes custom queries on this dataset and communicates with our visual analytics tool BiNA, the Biological Network Analyzer. We demonstrate how the combination of these tools permits efficient analysis of large-scale datasets for medical applications.

Citation

[GKN+15] Gerasch, A., Küntzer, J., Niermann, P., Stöckel, S., Kaufmann, M., Kohlbacher, O., Lenhof, H.-P.: Network-based interactive navigation and analysis of large biological datasets. it - Information Technology, Volume 57, Issue 1, Pages 37–48, January 2015. ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1076