Results

Large-scale HPC applications are highly data-intensive, with significant time spent in I/O operations. Many large-scale scientific applications do not adequately optimize their I/O operations, leading to poor overall performance. In this work, we have developed two main strategies for providing fast I/O throughput for an important climate modeling application, namely the Regional Ocean Modeling System ...
GCNs have been increasingly used for the classification of brain functional networks to aid early prediction of neurodegenerative diseases. It is important to analyze the performance and capabilities of these GCNs for the classification and obtain insights into how and when the GCNs can be used. In this work, we perform detailed analyses of the performance of GCNs for the classification of brain fun...
The Preconditioned Conjugate Gradient (PCG) method is a widely used iterative method for solving large linear systems of equations. Pipelined variants of PCG present independent computations in the PCG method and overlap these computations with non-blocking allreduces. We have developed a novel pipelined PCG algorithm called PIPE-sCG (Pipelined s-step Conjugate Gradient) that provides a large overlap ...
The Preconditioned Conjugate Gradient (PCG) method has been one of the most widely used methods for solving linear systems of equations for sparse problems. Pipelined PCG (PIPECG) attempts to eliminate the dependencies in the computations in the PCG algorithm and overlap non-dependent computations by reorganizing the traditional PCG code and using non-blocking allreduces. We have developed a novel pipeline...
K-Nearest Neighbor (k-NN) search is one of the most commonly used approaches for similarity search. It finds extensive applications in machine learning and data mining. This era of big data warrants efficiently scaling k-NN search algorithms for billion-scale datasets with high dimensionality. In this paper, we propose a solution towards this end where we use vantage point trees for partitioning t...
Community detection is an important problem that is widely applied for finding cluster patterns in brain, social, biological and many other kinds of networks. In this work, we propose a divide-and-conquer community detection algorithm for hybrid CPU-GPU systems. The graph representing a network is partitioned among the CPU and GPU devices of a node, and independent community detection using Louvai...
Knowledge Graph Embedding (KGE) is used to represent the entities and relations of a KG in a low-dimensional vector space. KGE can then be used in a downstream task such as entity classification, link prediction and knowledge base completion. Training on large KG datasets takes a considerable amount of time. This work proposes three strategies which lead to faster training in a distributed setting. ...
High performance grid computing is a key enabler of large scale collaborative computational science. With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time. In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in sp...
Deep and shallow convection calculations take significant time in atmosphere models. These calculations also present significant load imbalances due to varying cloud covers over different regions of the grid. In this work, we accelerate these calculations on Intel® Xeon Phi™ Coprocessor Systems. By employing dynamic scheduling in OpenMP, we demonstrate large reductions in load imbalance and abo...
Supercomputers have batch queues to which parallel jobs with specific requirements are submitted. Commercial schedulers come with various configurable parameters for the queues, which can be adjusted based on the requirements of the system. The employed configuration affects both system utilization and job response times. Oftentimes, choosing an optimal configuration with good performance is not s...
Accelerators and co-processors are widely prevalent and have been used to provide high performance for many scientific applications. Intel® Xeon Phi™ coprocessors have been gaining ground in providing speedups for advanced scientific applications. However, the use and demonstration of these coprocessors for climate modeling are limited. In this work, we have developed a comprehensive set of novel te...
High performance grid computing is a key enabler of large scale collaborative computational science. With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time. In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in s...
Performance predictions for large problem sizes and processor counts using a limited number of small-scale runs are useful for a variety of purposes, including scalability projections, and help minimize the time taken to construct training data for building performance models. In this paper, we present a prediction framework that matches execution signatures for performance predictions of HPC applications u...
Exascale systems of the future are predicted to have mean time between failures (MTBF) of less than one hour. At such low MTBFs, employing periodic checkpointing alone will result in low efficiency because of the high number of application failures resulting in a large amount of lost work due to rollbacks. In such scenarios, it is highly necessary to have proactive fault tolerance mechanisms that ca...
In this paper, we discuss the acceleration of a climate model known as the Community Earth System Model (CESM). The use of Graphics Processing Units (GPUs) to accelerate scientific applications that are computationally intensive is well known. This work attempts to extract the performance of GPUs to enable faster execution of CESM and obtain better model throughput. We focus on two major routines t...
Homology computations form an important step in topological data analysis that helps to identify connected components, holes, and voids in multi-dimensional data. Our work focuses on algorithms for homology computations of large simplicial complexes on multicore machines and on GPUs. This paper presents two parallel algorithms to compute homology. A core component of both algorithms is the algebra...
Many meteorological phenomena occur at different locations simultaneously. These phenomena vary temporally and spatially. It is essential to track these multiple phenomena for accurate weather prediction. Efficient analysis requires high-resolution simulations, which can be conducted by introducing finer-resolution nested simulations (nests) at the locations of these phenomena. Simultaneous tracking ...
Accurate and timely prediction of weather phenomena, such as hurricanes and flash floods, requires high-fidelity, compute-intensive simulations of multiple finer regions of interest within a coarse simulation domain. Current weather applications execute these nested simulations sequentially using all the available processors, which is sub-optimal due to their sub-linear scalability. In this work, we...
Critical applications like cyclone tracking and earthquake modeling require simultaneous high-performance simulations and online visualization for timely analysis. Faster simulations and simultaneous visualization enable scientists to provide real-time guidance to decision makers. In this work, we have developed an integrated user-driven and automated steering framework that simultaneously performs n...
Critical climate applications like cyclone tracking and earthquake modeling require high-performance simulations and online visualization performed simultaneously with the simulations for timely analysis. Remote visualization of critical climate events enables joint analysis by the geographically distributed climate science community. However, resource constraints including limited storage and slow ne...
A phylogenetic or evolutionary tree is constructed from a set of species or DNA sequences and depicts the relatedness between the sequences. Predictions of future sequences in a phylogenetic tree are important for a variety of applications including drug discovery, pharmaceutical research and disease control. In this work, we predict future DNA sequences in a phylogenetic tree using cellular autom...
The challenge for the development of next-generation software is the successful management of the complex computational environment while delivering to the scientist the full power of flexible compositions of the available algorithmic alternatives. Self-adapting numerical software (SANS) systems are intended to meet this significant challenge. The process of arriving at an efficient numerical solu...
Modeling the performance behavior of parallel applications to predict their execution times for larger problem sizes and numbers of processors has been an active area of research for several years. The existing curve fitting strategies for performance modeling utilize data from experiments that are conducted under uniform loading conditions. Hence, the accuracy of these models degr...
Due to the importance of collective communications in scientific parallel applications, many strategies have been devised for optimizing collective communications for different kinds of parallel environments. There has been an increasing interest in evolving efficient broadcast algorithms for computational grids. In this paper, we present application-oriented adaptive techniques that take into accou...
At least three factors in the existing migration frameworks make them less suitable for Grid systems, especially when the goal is to improve the response times for individual applications. These factors are the separate policies for suspension and migration of executing applications employed by these migration frameworks, the use of pre-defined conditions for suspension and migration and the lack of...