Reengineering High-throughput Molecular Datasets for Scalable Clustering Using MapReduce

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems, 2012

Abstract

We propose a linear clustering approach for large datasets of molecular geometries produced by high-throughput molecular dynamics simulations (e.g., protein folding and protein-ligand docking simulations). To this end, we transform each three-dimensional (3D) molecular conformation into a single point in 3D space, reducing the space complexity while still encoding the molecular similarities and geometries. We assign an identifier to each 3D point mapping a docked ligand, generate a tree from the whole ...

Roger Armen hasn't uploaded this paper.
