Abstract:
Modern processors place multiple cores on a chip to overcome power consumption and heat dissipation issues. As more compute cores become available on a single node, node-local communication is expected to play an increasingly important role in the overall performance of parallel applications such as MPI applications. It is therefore crucial to optimize the intra-node communication paths used by MPI libraries. In this paper, we propose a novel design of a kernel extension, called LiMIC2, for high-performance MPI intra-node communication on multi-core systems. LiMIC2 minimizes communication overhead by implementing lightweight primitives, and it provides portability across different interconnects and flexibility for performance optimization. Our performance evaluation indicates that LiMIC2 can attain 80% lower latency and more than a threefold improvement in bandwidth. The experimental results also show that LiMIC2 can deliver bidirectional bandwidth greater than 11 GB/s.
Published in: 2007 IEEE International Conference on Cluster Computing
Date of Conference: 17-20 September 2007
Date Added to IEEE Xplore: 19 September 2008