iFDK: A Scalable Framework for Instant High-Resolution Image Reconstruction


Abstract:

Computed Tomography (CT) is a widely used technology that requires compute-intensive algorithms for image reconstruction. We propose a novel back-projection algorithm that reduces the projection computation cost to 1/6 of the standard algorithm. We also propose an efficient implementation that takes advantage of the heterogeneity of GPU-accelerated systems by overlapping the filtering and back-projection stages on CPUs and GPUs, respectively. Finally, we propose a distributed framework for high-resolution image reconstruction on state-of-the-art GPU-accelerated supercomputers. The framework relies on an elaborate interleaving of MPI collective communication steps to achieve scalable communication. Evaluation on a single Tesla V100 GPU demonstrates that our back-projection kernel performs up to 1.6× faster than the standard FDK implementation. We also demonstrate the scalability and instantaneous CT capability of the distributed framework by using up to 2,048 V100 GPUs to solve 4K and 8K problems within 30 seconds and 2 minutes, respectively (including I/O).
Date of Conference: 17-22 November 2019
Date Added to IEEE Xplore: 04 March 2025
Conference Location: Denver, CO, USA

1 Introduction

High-resolution Computed Tomography (CT) is used in a wide variety of fields, e.g. medical diagnosis, non-invasive inspection [62], and reverse engineering [17], [50]. Over the past decades, the size of a single three-dimensional (3D) volume generated by CT systems has increased from hundreds of megabytes (typical volume sizes of $256^3$ and $512^3$) to several gigabytes (i.e. $2048^3$ and $4096^3$) [7], [42], [66]. The increased demand for rapid tomographic reconstruction and the associated high computational cost have attracted considerable attention and effort from the HPC community [8], [11], [19], [25], [28], [47], [54], [55], [66], [68], [76]. As illustrated in [48], the FDK algorithm is widely regarded as the primary method to reconstruct 3D images (or volumes) from projections, i.e. X-ray images. (Feldkamp, Davis, and Kress [23] presented a convolution-backprojection formulation, known as the FDK algorithm, for CT image reconstruction in 1984. FDK is also known as the Filtered Back Projection (FBP) algorithm.) The FDK algorithm includes a filtering stage (also known as convolution) and a back-projection stage. The computational complexities of those two stages are and , respectively. Researchers are increasingly relying on the latest accelerators to improve the computational performance of FDK, e.g. Application Specific Integrated Circuits (ASICs) [72], Field-Programmable Gate Arrays (FPGAs) [16], [27], [64], [75], Digital Signal Processors (DSPs) [37], Intel Xeon Phi [53], multi-core CPUs [68], and Graphics Processing Units (GPUs) [51], [73], [77], [78]. This paper focuses on GPU-accelerated supercomputers for two reasons. First, GPUs are the dominant platform for tomographic image reconstruction [20], [28], [33], [55], [59], [74]. Second, GPU-accelerated supercomputers are increasingly gaining ground in top-tier HPC systems.
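
As a rough illustration of how the filtering and back-projection stages scale, the sketch below counts operations under simplifying assumptions made here for illustration only: filtering is charged as one FFT-style pass of about n_u·log2(n_u) operations per detector row, and back-projection as one accumulation per voxel per projection. The function name `fdk_stage_costs` and its parameters are hypothetical, and the numbers are back-of-envelope estimates, not figures taken from the paper.

```python
import math


def fdk_stage_costs(n_u, n_v, n_p, n_x, n_y, n_z):
    """Return rough operation counts (filtering, back_projection).

    n_u, n_v      : detector width and height in pixels (assumed)
    n_p           : number of projections (assumed)
    n_x, n_y, n_z : output volume dimensions in voxels (assumed)
    """
    # Filtering: each of the n_v rows in each of the n_p projections is
    # charged one FFT-style pass of about n_u * log2(n_u) operations.
    filtering = n_p * n_v * n_u * math.log2(n_u)
    # Back-projection: every voxel accumulates one contribution
    # from every projection.
    back_projection = n_x * n_y * n_z * n_p
    return filtering, back_projection


if __name__ == "__main__":
    # Cubic problems: N projections of N x N pixels into an N^3 volume.
    for n in (256, 512, 2048, 4096):
        filt, bp = fdk_stage_costs(n, n, n, n, n, n)
        print(f"N={n}: back-projection / filtering ~ {bp / filt:.1f}x")
```

Under these assumptions the ratio of back-projection work to filtering work for a cubic problem is N / log2(N), so it widens as the resolution increases.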

References

1. Michael D Abràmoff, Paulo J Magalhães and Sunanda J Ram, "Image processing with ImageJ", Biophotonics International, vol. 11, no. 7, pp. 36-42, 2004.
2. K. Aditya Mohan, S. V. Venkatakrishnan, J. W. Gibbs, E. B. Gulsoy, X. Xiao, M. De Graef, et al., "TIMBIR: A Method for Time-Space Reconstruction From Interlaced Views", IEEE Transactions on Computational Imaging, vol. 1, no. 2, pp. 96-111, June 2015.
3. Anders H Andersen and Avinash C Kak, "Simultaneous algebraic reconstruction technique (SART): a superior implementation of the ART algorithm", Ultrasonic Imaging, vol. 6, no. 1, pp. 81-94, 1984.
4. George B Arfken and Hans J Weber, Mathematical methods for physicists, 1999.
5. Navid Asadizanjani, Sina Shahbazmohamadi, Mark Tehranipoor and Domenic Forte, "Non-destructive pcb reverse engineering using x-ray micro computed tomography", 41st International symposium for testing and failure analysis, pp. 1-5, 2015.
6. Ammar Ahmad Awan, Ching-Hsiang Chu, Hari Subramoni, Xiaoyi Lu and Dhabaleswar K Panda, "OC-DNN: Exploiting Advanced Unified Memory Capabilities in CUDA 9 and Volta GPUs for Out-of-Core DNN Training", 2018 IEEE 25th International Conference on High Performance Computing (HiPC), pp. 143-152, 2018.
7. Benjamin Betz, Steffen Kieß, Michael Krumm, Gunnar Knupe, Tsegaye Eshete and Sven Simon, Efficient Data Structures for the Fast 3D Reconstruction of Voxel Volumes with Inhomogeneous Spatial Resolution, 2018.
8. Tekin Bicer, Doga Gursoy, Raj Kettimuthu, Francesco De Carlo, Gagan Agrawal and Ian Foster, Rapid Tomographic Image Reconstruction via Large-Scale Parallelization, vol. 9233, pp. 289-302, 2015.
9. Tekin Bicer, Doga Gursoy, Rajkumar Kettimuthu, Francesco De Carlo, Gagan Agrawal and Ian T Foster, "Rapid tomographic image reconstruction via large-scale parallelization", European Conference on Parallel Processing, pp. 289-302, 2015.
10. Javier Garcia Blas, Monica Abella, Florin Isaila, Jesus Carretero and Manuel Desco, "Surfing the optimization space of a multiple-GPU parallel implementation of a X-ray tomography reconstruction algorithm", Journal of Systems and Software, vol. 95, pp. 166-175, 2014.
11. Javier Garcia Blas, Florin Isaila, Monica Abella, Jesus Carretero, Ernesto Liria and Manuel Desco, "Parallel Implementation of a X-ray Tomography Reconstruction Algorithm Based on MPI and CUDA", Proceedings of the 20th European MPI Users' Group Meeting (EuroMPI ’13), pp. 217-222, 2013.
12. Ronald Newbold Bracewell, Two-dimensional imaging, Prentice Hall Englewood Cliffs, vol. 247, 1995.
13. E Oran Brigham, The fast Fourier transform and its applications, NJ:Prentice Hall Englewood Cliffs, vol. 448, 1988.
14. Peng Chen, Mohamed Wahib, Shinichiro Takizawa, Ryousei Takano and Satoshi Matsuoka, "Efficient Algorithms for the Summed Area Tables Primitive on GPUs", 2018 IEEE International Conference on Cluster Computing (CLUSTER), pp. 482-493, 2018.
15. Paul Cockshott and Kenneth Renfrew, SIMD programming manual for Linux and Windows, Springer Science & Business Media, 2013.
16. Srdjan Coric, Miriam Leeser, Eric Miller and Marc Trepanier, "Parallel-beam backprojection: an FPGA implementation optimized for medical imaging", Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays, pp. 217-226, 2002.
17. Industrial Computed Tomography, 2019, [online] Available: https://www.gom.com/metrology-systems/gom-ct.html.
18. "CUDA Toolkit Documentation", NVIDIA Developer Zone, 2019, [online] Available: http://docs.nvidia.com/cuda/index.html.
19. Jingyu Cui, Guillem Pratx, Bowen Meng and Craig S Levin, "Distributed MLEM: An iterative tomographic image reconstruction algorithm for distributed memory architectures", IEEE Transactions on Medical Imaging, vol. 32, no. 5, pp. 957-967, 2013.
20. Philippe Despres and Xun Jia, "A review of GPU-based medical image reconstruction", Physica Medica, vol. 42, pp. 76-92, 2017.
21. Harry E. Martz, Clint M. Logan, Daniel J. Schneberk and Peter J. Shull, X-ray imaging: fundamentals industrial techniques and applications, Boca Raton:CRC Press, Taylor & Francis Group, 2017.
22. Liyong Fang, Hui Li, Jinping Bai and Bailin Li, "Application of industrial CT in reverse engineering technology", High Power Laser and Particle Beams, vol. 25, no. 7, pp. 1620-1624, 2013.
23. LA Feldkamp, LC Davis and JW Kress, "Practical cone-beam algorithm", JOSA A, vol. 1, no. 6, pp. 612-619, 1984.
24. Richard Gordon, Robert Bender and Gabor T Herman, "Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography", Journal of Theoretical Biology, vol. 29, no. 3, pp. 471-481, 1970.
25. Jens Gregor, "Distributed CPU multi-core implementation of SIRT with vectorized matrix kernel for micro-CT", Proceedings of the 11th Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (2011), 2011.
26. Devin Held, Analysis of 3D Cone-Beam CT Image Reconstruction Performance on a FPGA, 2016.
27. I. Henry and Ming Chen, An FPGA Architecture for Real-Time 3-D Tomographic Reconstruction, Ph.D. Dissertation, 2012.
28. M. Hidayetoglu, C. Pearson, I. El Hajj, L. Gurel, W. C. Chew and W. Hwu, "A Fast and Massively-Parallel Inverse Solver for Multiple-Scattering Tomographic Image Reconstruction", 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 64-74, 2018.
29. Akira Hirakimoto, "Microfocus X-ray computed tomography and it's industrial applications", Analytical Sciences/Supplements Proceedings of IUPAC International Congress on Analytical Sciences 2001 (ICAS 2001), pp. i123-i125, 2002.
30. Johannes Hofmann, Jan Treibig, Georg Hager and Gerhard Wellein, "Comparing the performance of different x86 SIMD instruction sets for a medical imaging application on modern multi-and manycore chips", Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing, pp. 57-64, 2014.
