Big Data issues in Computational Chemistry

Abstract:

Digital data have become a torrent engulfing every area of business, science and engineering. In the age of Big Data, deriving value and insight from large amounts of data using rich analytics has become an important differentiating capability for competitiveness, success and leadership in every field. Scientists and engineers in many different domains are increasingly clamouring for mechanisms to manage and analyse the massive quantities of information now available, in order to obtain new answers and extract maximum value from it. Computational modelling and simulation is a central technology in many of these domains. Molecular Dynamics (MD) is a computational simulation technique that describes the physical forces and movements of interacting microscopic elements such as atoms and molecules. MD has important applications in chemistry, biotechnology, the pharmaceutical industry, energy, climate and materials science, among others. Advanced MD algorithms include not only Molecular Mechanics (MM) but also Quantum Mechanics (QM) approaches, raising important big data challenges that are still unresolved. MD simulations run an iterative process that generates large amounts of streaming data. Current software technology is far from being able to manage, analyze and visualize the extremely large and complex data sets generated by important molecular processes. This paper analyzes the current big data limits in the Computational Chemistry field, especially in MD processes. To overcome these challenges, this work provides guidance for future research, including advances in scalable algorithms for data analysis, dynamic query technology, data models and storage strategies, parallel execution, I/O optimization, and interactive visual exploration and analysis of MD data.

Published in: 2014 International Conference on Future Internet of Things and Cloud

Date of Conference: 27-29 August 2014

Date Added to IEEE Xplore: 15 December 2014

Electronic ISBN:978-1-4799-4357-9

DOI: 10.1109/FiCloud.2014.69

Conference Location: Barcelona, Spain

I. Introduction

The concept of Big Data mainly refers to data that exceeds the processing capacity of conventional database systems: the data is too big, moves too fast and/or does not fit classical database architectures [1]. To address these new challenges, research and innovation in elastic, parallel and scalable algorithms is necessary [2]. Computational modelling and simulation are central to numerous scientific and engineering domains and are a good example of Big Data generation and analysis [3]. Basic simulation data is often 4D (three spatial dimensions and time), but additional variable types, such as vector or tensor fields, multiple variables, multiple spatial scales, parameter studies, and uncertainty analysis, can increase the dimensionality. Workflows and systems for interacting with, storing, managing, visualizing and analysing this data are already at the breaking point [4]. As computations grow in complexity and fidelity and run on larger computers and clusters, analysing the data they generate will become more challenging still [5].
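To give a rough sense of how quickly 4D simulation data grows, the following minimal sketch estimates the uncompressed size of an MD trajectory. It is an illustration only: the function name `trajectory_bytes`, the system size, the number of saved frames and the single-precision storage are all hypothetical assumptions, not figures taken from the paper.

```python
def trajectory_bytes(n_atoms, n_frames, values_per_atom=3, bytes_per_value=4):
    """Estimate the uncompressed size of a trajectory in bytes.

    values_per_atom=3 stores only x, y, z positions; larger values model
    extra per-atom variables (for example velocities or forces).
    bytes_per_value=4 assumes single-precision floats.
    """
    return n_atoms * n_frames * values_per_atom * bytes_per_value


# Hypothetical system: one million atoms, one million saved frames.
N_ATOMS = 1_000_000
N_FRAMES = 1_000_000

positions_only = trajectory_bytes(N_ATOMS, N_FRAMES)
with_velocities = trajectory_bytes(N_ATOMS, N_FRAMES, values_per_atom=6)

print(f"positions only:         {positions_only / 1e12:.0f} TB")
print(f"positions + velocities: {with_velocities / 1e12:.0f} TB")
```

Under these assumptions, each additional per-atom vector field adds the same volume again as the positions, which is one way to read the remark that extra variable types increase the dimensionality of the data.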

References

[1] R. Casado, "The three generations of Big Data processing," in Big Data Spain, 2013.

[2] A. G. W. Paper, "'Big Data:' Big Challenge, Big Opportunity," pp. 1-6.

[3] H. Ode, M. Nakashima, S. Kitamura, W. Sugiura, and H. Sato, "Molecular Dynamics Simulation in Virus Research," Front. Microbiol., vol. 3, 2012.

[4] Y. Demchenko, Z. Zhao, P. Grosso, A. Wibisono, C. De Laat, and N. E. Group, "Big Data Challenges for e-Science Infrastructure."

[5] I. O'Reilly, Big Data Now: 2012 Edition. O'Reilly Media, 2012.

[6] D. Frenkel, B. Smit, and M. A. Ratner, "Understanding Molecular Simulation: From Algorithms to Applications," Phys. Today, vol. 50, no. 7, p. 66, 1997.

[7] P.-E. Bernard, T. Gautier, and D. Trystram, "Large scale simulation of parallel molecular dynamics," in Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999, pp. 638-644.

[8] M. Zhou, U. States, J. Grimmer, G. King, and Q. S. Science, "The Age of Big Data," 2012.

[9] B. C. Gibb, "Big (chemistry) data," Nat. Chem., vol. 5, no. 4, pp. 248-249, Apr. 2013.

[10] T. Tiankai, C. A. Rendleman, D. W. Borhani, R. O. Dror, J. Gullingsrud, M. O. Jensen, J. L. Klepeis, P. Maragakis, P. Miller, K. A. Stafford, D. E. Shaw, and T. Tu, "A scalable parallel framework for analyzing terascale molecular dynamics simulation trajectories," in High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008. International Conference for, 2008, no. 1, pp. 1-12.

[11] S. Jiao, C. He, Y. Dou, and H. Tang, "Molecular dynamics simulation: Implementation and optimization based on Hadoop," in 2012 8th International Conference on Natural Computation, 2012, pp. 1203-1207.

[12] N. Allsopp, G. Ruocco, and A. Fratalocchi, "Molecular dynamics beyond the limits: massive scaling on 72 racks of a BlueGene/P and supercooled glass transition of a 1 billion particles system," Cogn. Sci., vol. cond-mat.s, no. 8, p. 14, Apr. 2011.

[13] S. Kumar, V. Pascucci, V. Vishwanath, P. Carns, M. Hereld, R. Latham, T. Peterka, M. E. Papka, and R. Ross, "Towards parallel access of multi-dimensional, multi-resolution scientific data," Petascale Data Storage Work. PDSW 2010 5th, vol. 1, no. c, pp. 1-5, 2010.

[14] Y. Duan, C. Wu, S. Chowdhury, M. C. Lee, G. Xiong, W. Zhang, R. Yang, P. Cieplak, R. Luo, T. Lee, J. Caldwell, J. Wang, and P. Kollman, "A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations," J. Comput. Chem., vol. 24, no. 16, pp. 1999-2012, 2003.

[15] F. Dehne and H. Zaboli, "Parallel Real-Time OLAP on Multi-core Processors," in 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), 2012, pp. 588-594.

[16] J. E. Stone, K. L. Vandivort, and K. Schulten, "GPU-accelerated molecular visualization on petascale supercomputing platforms," in Proceedings of the 8th International Workshop on Ultrascale Visualization-UltraVis '13, 2013, pp. 1-8.

[17] P. J. Sadalage and M. Fowler, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence [Paperback]. 2012.

[18] T. Estrada, B. Zhang, P. Cicotti, R. S. Armen, and M. Taufer, "A scalable and accurate method for classifying protein-ligand binding geometries using a MapReduce approach," Comput. Biol. Med., vol. 42, no. 7, pp. 758-771, Jul. 2012.

[19] "Quantum molecular modeling with simulated annealing-A distributed processing and visualization application," in Proceedings SUPERCOMPUTING '90, pp. 816-825.
