Big Data issues in Computational Chemistry | IEEE Conference Publication | IEEE Xplore

Abstract:

Digital data have become a torrent engulfing every area of business, science and engineering. In the age of Big Data, deriving value and insights from large amounts of data using rich analytics becomes an important differentiating capability for competitiveness, success and leadership in every field. Scientists and engineers in many different domains are increasingly clamouring for mechanisms to manage and analyse the massive quantities of information now available, in order to obtain new answers and extract maximum value from it. Computational modelling and simulation is a central technology in many of these domains. Molecular Dynamics (MD) is a computational simulation technique that describes the physical forces and movements of interacting microscopic elements such as atoms and molecules. MD has important applications in chemistry, biotechnology, the pharmaceutical industry, energy, climate and materials science, among other fields. Advanced MD algorithms include not only Molecular Mechanics (MM) but also Quantum Mechanics (QM) approaches, raising important big data challenges that are still to be resolved. MD simulations perform an iterative process that generates large amounts of data as a continuous stream. Current software technology is far from being able to manage, analyze and visualize the extremely large and complex data sets generated by simulations of important molecular processes. This paper analyzes the current big data limits in the Computational Chemistry field, especially in MD processes. To overcome these challenges, this work provides guidance for future research, including advances in scalable algorithms for data analysis, dynamic query technology, data models and storage strategies, parallel execution, I/O optimization, and interactive visual exploration and analysis of MD data.
Date of Conference: 27-29 August 2014
Date Added to IEEE Xplore: 15 December 2014
Electronic ISBN: 978-1-4799-4357-9
Conference Location: Barcelona, Spain

I. Introduction

The concept of Big Data mainly refers to data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast and/or does not fit in classical database-based architectures [1]. To address these new challenges, research and innovation on elastic, parallel and scalable algorithms is necessary [2]. Computational modelling and simulation are central to numerous scientific and engineering domains, and are a good example of Big Data generation and analysis [3]. Basic simulation data is often 4D (three spatial dimensions and time), but additional variable types, such as vector or tensor fields, multiple variables, multiple spatial scales, parameter studies, and uncertainty analysis, can increase the dimensionality. Workflows and systems for interacting with, storing, managing, visualizing and analysing this data are already at the breaking point [4]. And as computations grow in complexity and fidelity and run on larger computers and clusters, the analysis of the data they generate will become more challenging still [5].
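
As a rough illustration of how 4D simulation data grows once additional per-atom variables are stored, the following back-of-the-envelope sketch estimates the raw, uncompressed size of an MD trajectory. The atom count, frame count and 4-byte precision are hypothetical values chosen only for illustration; they are not figures from the paper, and the function name is likewise an assumption.

```python
def trajectory_bytes(n_atoms, n_frames, values_per_atom=3, bytes_per_value=4):
    """Raw size in bytes of a trajectory that stores `values_per_atom`
    numbers for every atom in every saved frame (no compression)."""
    return n_atoms * n_frames * values_per_atom * bytes_per_value


# Hypothetical system (assumption): 1,000,000 atoms, 100,000 saved frames,
# positions stored as three 4-byte floats per atom.
positions_only = trajectory_bytes(1_000_000, 100_000)

# Also storing velocities and forces adds two more 3-vectors per atom,
# so nine values per atom instead of three.
with_vel_forces = trajectory_bytes(1_000_000, 100_000, values_per_atom=9)

print(f"positions only: {positions_only / 1e12:.1f} TB")
print(f"positions + velocities + forces: {with_vel_forces / 1e12:.1f} TB")
```

Under these assumed numbers, a single run already reaches the terabyte range, and each extra vector field, parameter study or replica multiplies that figure.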
