PDB-Hadoop: Parallelising user applications on the protein databank using Apache Hadoop

Jamie AlNasir, Hugh Shanahan

Research output: Contribution to conference › Poster › peer-review

Abstract

We present a framework that runs protein structure analysis tools in parallel on the entire Protein Databank (PDB), or on subsets of it, using the Apache Hadoop platform. Our design enables structural biologists to use the Hadoop platform without having to write Map-Reduce code explicitly. It scales easily and uses a mapper architecture that can function on a stand-alone basis or be extended to include further Map-Reduce operations.
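
The idea of a stand-alone mapper that wraps an existing analysis tool can be illustrated with a minimal Python sketch in the style of Hadoop Streaming, where a mapper reads input lines and emits tab-separated key/value records. This is not the poster's implementation: the function names `run_tool`, `map_records` and `main`, the one-PDB-identifier-per-line input format, and the convention of passing the identifier as the tool's last argument are all assumptions made for illustration.

```python
import subprocess
import sys
from typing import Callable, Iterable, Iterator, Sequence


def run_tool(command: Sequence[str], pdb_id: str) -> str:
    """Run an external analysis command (hypothetical) on one PDB entry.

    Assumption: the tool accepts the PDB identifier as its last argument
    and writes its result to standard output.
    """
    result = subprocess.run(
        list(command) + [pdb_id],
        capture_output=True,
        text=True,
        check=False,
    )
    # Collapse whitespace so the output fits in one tab-separated record.
    return " ".join(result.stdout.split())


def map_records(lines: Iterable[str], analyse: Callable[[str], str]) -> Iterator[str]:
    """Map each non-empty input line (a PDB identifier) to 'id<TAB>result'."""
    for line in lines:
        pdb_id = line.strip()
        if not pdb_id:
            continue  # skip blank lines in the input split
        yield f"{pdb_id}\t{analyse(pdb_id)}"


def main() -> None:
    """Entry point when used as a streaming mapper: IDs on stdin, records on stdout."""
    tool = sys.argv[1:] or ["echo"]  # placeholder command, an assumption
    for record in map_records(sys.stdin, lambda pdb_id: run_tool(tool, pdb_id)):
        print(record)
```

Because the mapper emits one self-contained record per structure, it could run as a map-only job, or its output could be fed to a later reduce step that aggregates results across structures.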
Original language: English
Publication status: Published - Jul 2015
Event: 3DSig Structural Bioinformatics and Computational Biophysics 2015 - Dublin, Ireland
Duration: 10 Jul 2015 - 11 Jul 2015

Conference

Conference: 3DSig Structural Bioinformatics and Computational Biophysics 2015
Country/Territory: Ireland
City: Dublin
Period: 10/07/15 - 11/07/15
