Kaushik Kandadi Suresh - IEEE Xplore Author Profile

Showing 1-13 of 13 results

Results

Modern SmartNICs are capable of performing both computation and communication operations. In this context, past work on accelerating HPC/DL applications has manually selected certain computational phases to offload to the SmartNICs. In this work, we identify Vector Multiply-Adds (VMA), Distributed Dot Products (DDOT), and Sparse Matrix-Vector Multiplication (Matvec) as three fundamental op...
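For context, a minimal host-side sketch of the DDOT pattern named above: each rank computes a partial dot product over its local slice and the partials are combined with MPI_Allreduce. This is an illustrative baseline, not the SmartNIC-offloaded design described in the paper.

/* Distributed dot product (DDOT): local multiply-add loop followed by a
 * global sum. Illustrative baseline only, not the offloaded design. */
#include <mpi.h>

double ddot(const double *x, const double *y, int n, MPI_Comm comm)
{
    double local = 0.0, global = 0.0;
    for (int i = 0; i < n; i++)          /* local partial dot product */
        local += x[i] * y[i];
    /* combine partial sums across all ranks */
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
    return global;
}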
One-sided communication is one of several approaches for data transfer in High-Performance Computing (HPC) applications. One-sided operations place less demand on parallel programming libraries and do not require HPC hardware to issue acknowledgments of successful data transfer. Thanks to its inherently non-blocking nature, one-sided communication is also useful for improving overlap between...
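For context, a minimal sketch of MPI one-sided communication: rank 0 writes directly into rank 1's exposed memory window with MPI_Put inside a fence epoch, and the target posts no matching receive. The window size and payload are illustrative.

/* One-sided transfer: rank 0 puts a value into rank 1's window. */
#include <mpi.h>

void put_example(MPI_Comm comm)
{
    int rank;
    double buf[8] = {0};
    MPI_Win win;

    MPI_Comm_rank(comm, &rank);
    /* every rank exposes buf as a remotely accessible memory window */
    MPI_Win_create(buf, sizeof(buf), sizeof(double),
                   MPI_INFO_NULL, comm, &win);

    MPI_Win_fence(0, win);               /* open the access epoch */
    if (rank == 0) {
        double payload = 3.14;
        /* one-sided, non-blocking write into rank 1's window at offset 0 */
        MPI_Put(&payload, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);               /* close the epoch: put is complete */

    MPI_Win_free(&win);
}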
Modern multi/many-core processors in HPC systems have hundreds of cores with deep memory hierarchies. HPC applications running at high core counts often experience contention between processes/threads on shared resources such as caches, leading to degraded performance. This is especially true for dense collective patterns, such as MPI_Alltoall, that have many concurrent memory transactions. The orderi...
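For context, a minimal sketch of the dense MPI_Alltoall exchange referred to above: every rank sends one block to, and receives one block from, every other rank. Any contention-aware ordering of the underlying memory transactions happens inside the MPI library, not in application code; buffer sizes and contents here are illustrative.

/* Dense all-to-all exchange of 'block' ints per rank pair. */
#include <mpi.h>
#include <stdlib.h>

void alltoall_example(MPI_Comm comm, int block)
{
    int size;
    MPI_Comm_size(comm, &size);

    int *sendbuf = malloc((size_t)size * block * sizeof(int));
    int *recvbuf = malloc((size_t)size * block * sizeof(int));
    for (int i = 0; i < size * block; i++)
        sendbuf[i] = i;                  /* arbitrary payload */

    MPI_Alltoall(sendbuf, block, MPI_INT,
                 recvbuf, block, MPI_INT, comm);

    free(sendbuf);
    free(recvbuf);
}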
Over the past several years, Smart Network Interface Cards (SmartNICs) have rapidly grown in popularity. In particular, NVIDIA’s BlueField line of SmartNICs has proven effective across a wide variety of uses: offloading communication in High-Performance Computing (HPC) applications, accelerating stages of the Deep Learning (DL) pipeline, and serving the Datacenter/virtualization workloads for which it was especially designed. Th...
The Message-Passing Interface (MPI) provides convenient abstractions such as MPI_Allreduce for inter-process collective reduction operations. With the advent of deep learning and large-scale HPC systems, it is increasingly important to optimize the latency of the MPI_Allreduce operation for large messages. Due to the amount of compute and communication involved in MPI_Allreduce, it is beneficial to off...
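For context, a minimal sketch of the large-message MPI_Allreduce targeted here: an in-place sum over a multi-megabyte buffer of the kind produced by deep-learning gradient exchanges. The buffer size and contents are illustrative.

/* Large-message in-place allreduce: each rank's buffer is both input
 * and output. */
#include <mpi.h>
#include <stdlib.h>

void large_allreduce(MPI_Comm comm)
{
    const int count = 1 << 22;           /* ~4M doubles, roughly 32 MB */
    double *buf = malloc(count * sizeof(double));
    for (int i = 0; i < count; i++)
        buf[i] = 1.0;                    /* placeholder contribution */

    MPI_Allreduce(MPI_IN_PLACE, buf, count, MPI_DOUBLE, MPI_SUM, comm);

    free(buf);
}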
Many High-Performance Computing (HPC) clusters around the world use some variation of InfiniBand interconnects, all of which are driven by the “Verbs” API. Verbs provides a fast, efficient, and developer-friendly way of passing data buffers between nodes over the interconnect(s). In recent years, the MLX5-DV (Direct Verbs) API has emerged as a method of providing mechanism...
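For context, a minimal Verbs sketch assuming an already connected queue pair: register a buffer, post a send work request, and poll the completion queue. The protection domain, QP, and CQ setup (and the out-of-band connection exchange) that real Verbs programs require are omitted.

/* Register a buffer and post a send on a pre-connected QP. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

int post_send(struct ibv_pd *pd, struct ibv_qp *qp, struct ibv_cq *cq,
              char *buf, size_t len)
{
    /* register the buffer so the HCA can access it */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.opcode     = IBV_WR_SEND;
    wr.send_flags = IBV_SEND_SIGNALED;   /* request a completion entry */

    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* busy-poll the completion queue for the send completion */
    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)
        ;
    ibv_dereg_mr(mr);
    return wc.status == IBV_WC_SUCCESS ? 0 : -1;
}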
Smart Network Interface Cards (SmartNICs) such as NVIDIA’s BlueField Data Processing Units (DPUs) provide advanced networking capabilities and processor cores, enabling the offload of complex operations away from the host. In the context of MPI, prior work has explored the use of DPUs to offload non-blocking collective operations. The limitations of current state-of-the-art approaches are twofold:...
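For context, a minimal sketch of the non-blocking collective pattern that such DPU offload targets: start the reduction, overlap it with independent host compute, then complete it. The offload itself is internal to the MPI library and is not visible at the application level; the compute stub is a placeholder.

/* Overlap a non-blocking allreduce with independent host work. */
#include <mpi.h>

static void do_independent_compute(void)
{
    /* placeholder for application work that overlaps the collective */
}

void overlap_example(double *buf, int count, MPI_Comm comm)
{
    MPI_Request req;
    MPI_Iallreduce(MPI_IN_PLACE, buf, count, MPI_DOUBLE, MPI_SUM,
                   comm, &req);

    do_independent_compute();

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* reduction result now in buf */
}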
The importance of graphics processing units (GPUs) in accelerating HPC applications is evident from the fact that a large number of supercomputing clusters are GPU-enabled. Many of these HPC applications use the Message Passing Interface (MPI) as their programming model. These MPI applications frequently exchange data that is non-contiguous in GPU memory. MPI provides derived datatypes (DDTs) to represen...
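For context, a minimal sketch of describing non-contiguous data with an MPI derived datatype: a strided matrix column expressed with MPI_Type_vector and sent in a single call. The strided-column layout is an illustrative case, not the specific layouts studied in the paper; GPU-resident buffers follow the same pattern with a GPU-aware MPI library.

/* Send one column of a row-major rows x cols matrix as a DDT. */
#include <mpi.h>

void send_column(const double *matrix, int rows, int cols,
                 int col, int dest, MPI_Comm comm)
{
    MPI_Datatype column;
    /* rows blocks of 1 double, separated by a stride of cols doubles */
    MPI_Type_vector(rows, 1, cols, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    MPI_Send(&matrix[col], 1, column, dest, 0, comm);

    MPI_Type_free(&column);
}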
Graphics Processing Units (GPUs) have become ubiquitous in today’s supercomputing clusters, primarily because of their high compute capability and power efficiency. Message Passing Interface (MPI) is a widely adopted programming model for large-scale GPU-based applications used in such clusters. Modern GPU-based systems have multiple Host Channel Adapters (HCAs). In the past, scientists have leveraged multi-HCA systems to ...
The importance of GPUs in accelerating HPC applications is evident from the fact that a large number of supercomputing clusters are GPU-enabled. Many of these HPC applications use MPI as their programming model. These MPI applications often exchange data that is non-contiguous in GPU memory. MPI provides Derived Datatypes (DDTs) to represent such data. In the past, researchers have proposed sol...
Modern MPI-based scientific applications frequently use derived datatypes (DDTs) for inter-process communication. Designing scalable solutions capable of dynamically adapting to the complex communication requirements posed by DDT-based applications brings forth several new challenges. In this work, we address these challenges and propose solutions to efficiently improve the performance of...
The Message-Passing Interface (MPI) is the de-facto standard for designing and executing applications on massively parallel hardware. MPI collectives provide a convenient abstraction for multiple processes/threads to communicate with one another. Mellanox's HDR InfiniBand switches provide Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) capabilities to offload collective communica...
Message Passing Interface (MPI) is a very popular parallel programming model for developing parallel scientific applications. The complexity of the data handled by scientific applications often results in its placement in non-contiguous locations in memory. To handle such complex, non-contiguous data, domain scientists often use the user-defined datatypes supported by the MPI standa...
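For context, a minimal sketch of a user-defined MPI datatype for irregularly placed data: MPI_Type_indexed describes variable-length blocks at arbitrary displacements so they can be communicated in one call without manual packing. The block lengths and displacements below are purely illustrative.

/* Describe and send three irregularly placed blocks of doubles. */
#include <mpi.h>

void send_indexed(const double *base, int dest, MPI_Comm comm)
{
    /* blocks of 2, 1, and 4 doubles at displacements 0, 5, and 9 */
    int blocklens[3] = {2, 1, 4};
    int displs[3]    = {0, 5, 9};

    MPI_Datatype irregular;
    MPI_Type_indexed(3, blocklens, displs, MPI_DOUBLE, &irregular);
    MPI_Type_commit(&irregular);

    MPI_Send(base, 1, irregular, dest, 0, comm);

    MPI_Type_free(&irregular);
}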