PDB-Hadoop: Parallelising user applications on the protein databank using Apache Hadoop

Jamie AlNasir; Hugh Shanahan

Research output: Contribution to conference › Poster › peer-review

Abstract

We present a framework that facilitates the parallel execution of protein structure analysis tools on the entire Protein Databank (PDB), or on subsets of it, using the Apache Hadoop platform. Our design enables structural biologists to use the Hadoop platform without having to write Map-Reduce code explicitly. It is easily scalable and uses a mapper architecture that functions on a stand-alone basis or can be extended to include further Map-Reduce operations.
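As a rough illustration of the mapper idea described above, the following is a minimal sketch and not the authors' implementation. It assumes a Hadoop Streaming style contract, in which a mapper reads one record per line on standard input and writes tab-separated key/value pairs to standard output. The function names (`run_tool`, `map_records`), the use of file paths as input records, and the way the tool command is passed are all hypothetical choices made for this example.

```python
import shlex
import subprocess
import sys
from typing import Iterable, Iterator, Sequence, Tuple


def run_tool(command: Sequence[str], pdb_path: str) -> str:
    """Run a user-supplied analysis tool (hypothetical) on one structure file."""
    result = subprocess.run(
        [*command, pdb_path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # Report the failure as the value instead of aborting the whole task.
        return f"ERROR exit={result.returncode}"
    # Collapse whitespace so the output fits a single key/value record.
    return " ".join(result.stdout.split())


def map_records(
    lines: Iterable[str], command: Sequence[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (structure path, tool output) for every non-empty input line."""
    for line in lines:
        pdb_path = line.strip()
        if not pdb_path:
            continue
        yield pdb_path, run_tool(command, pdb_path)


def main(command_line: str) -> None:
    """Act as a streaming mapper: stdin lines in, key<TAB>value lines out."""
    command = shlex.split(command_line)
    for key, value in map_records(sys.stdin, command):
        print(f"{key}\t{value}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Only act as a mapper when a tool command is supplied as the first argument.
    main(sys.argv[1])
```

Under these assumptions the user supplies only the command line of an existing tool, and the wrapper handles the record format. With no reducer the script works as a stand-alone map step, and a reducer could later aggregate the emitted values per key.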

Original language	English
Publication status	Published - Jul 2015
Event	3DSig Structural Bioinformatics and Computational Biophysics 2015 - Dublin, Ireland
Duration	10 Jul 2015 → 11 Jul 2015

Conference

Conference	3DSig Structural Bioinformatics and Computational Biophysics 2015
Country/Territory	Ireland
City	Dublin
Period	10/07/15 → 11/07/15

Access to Document

Final Manuscript: Final published version, 119 KB. Licence: CC BY

Cite this

@conference{36e385c21f7a46aebc67632202913892,
title = "PDB-Hadoop: Parallelising user applications on the protein databank using Apache Hadoop",
abstract = "We present a framework that facilitates parallel execution of protein structure analysis tools to be carried out on the entire (or subsets of) the Protein Databank (PDB) using the Apache Hadoop platform. Our design enables structural biologists to use the Hadoop platform without having to explicitly write Map-Reduce code. It is easily scalable and uses a mapper architecture that functions on a stand-alone basis or can be extended to include further Map-Reduce operations.",
author = "Jamie AlNasir and Hugh Shanahan",
year = "2015",
month = jul,
language = "English",
note = "3DSig Structural Bioinformatics and Computational Biophysics 2015 ; Conference date: 10-07-2015 Through 11-07-2015",
}

TY  - CONF
T1  - PDB-Hadoop
T2  - 3DSig Structural Bioinformatics and Computational Biophysics 2015
AU  - AlNasir, Jamie
AU  - Shanahan, Hugh
PY  - 2015/7
Y1  - 2015/7
N2  - We present a framework that facilitates parallel execution of protein structure analysis tools to be carried out on the entire (or subsets of) the Protein Databank (PDB) using the Apache Hadoop platform. Our design enables structural biologists to use the Hadoop platform without having to explicitly write Map-Reduce code. It is easily scalable and uses a mapper architecture that functions on a stand-alone basis or can be extended to include further Map-Reduce operations.
AB  - We present a framework that facilitates parallel execution of protein structure analysis tools to be carried out on the entire (or subsets of) the Protein Databank (PDB) using the Apache Hadoop platform. Our design enables structural biologists to use the Hadoop platform without having to explicitly write Map-Reduce code. It is easily scalable and uses a mapper architecture that functions on a stand-alone basis or can be extended to include further Map-Reduce operations.
M3  - Poster
Y2  - 10 July 2015 through 11 July 2015
ER  -