Enrique S. Quintana-Ortí - IEEE Xplore Author Profile

Showing 1-25 of 92 results


Vision Transformers have demonstrated outstanding performance in Computer Vision tasks. Nevertheless, this superior performance for large models comes at the expense of increasing memory usage for storing the parameters and intermediate activations. To accelerate model inference, in this work we develop and evaluate integer and mixed-precision kernels in Triton for the efficient execution of two f...
Incorporating deep learning (DL) technologies into the edge is crucial for improving the security, privacy, and energy efficiency of the Internet of Things (IoT). In this scenario, the limitations of edge devices in terms of power dissipation, memory capacity, and processing power require a careful selection and optimization of algorithms for IoT DL applications. Along this line, our work focuses on th...
Photovoltaic systems are being used in almost every field, such as smart cities, Internet of Things paradigms, or remote Wireless Sensor Networks. In Internet of Things paradigms deployed in natural environments, energy harvesting technology is crucial to power the devices. For the energy management system, it is important to predict how much energy can be harvested from the environment. In this wor...
Global Navigation Satellite System (GNSS) is widely used today for both positioning and timing purposes. Many distinct receiver chips are available off-the-shelf, each tailored to match various applications’ requirements. Being implemented as Application-Specific Integrated Circuits, these chips provide good performance and low energy consumption but must be treated as "black boxes" by customers. ...
The remarkable positive impact of Deep Neural Networks on many Artificial Intelligence (AI) tasks has led to the development of various high performance algorithms as well as specialized processors and accelerators. In this paper we address this scenario by demonstrating that the principles underlying the modern realization of the general matrix multiplication (GEMM) in conventional processor arch...
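The loop structure below is a minimal, scalar sketch of the cache-blocking principle behind modern GEMM realizations (in the spirit of the GotoBLAS/BLIS family this line of work builds on); the block sizes and the inner micro-kernel are illustrative placeholders, not the tuned kernels of an actual library.

```c
/* Sketch of the cache-blocked GEMM loop nest: C += A * B, row-major.
 * The three outer loops partition the operands into blocks sized to
 * fit the cache hierarchy; the inner three loops stand in for the
 * packed, SIMD micro-kernel of a real library. MC/NC/KC are
 * illustrative, untuned block sizes. The caller initializes C. */
#include <stddef.h>

enum { MC = 64, NC = 64, KC = 64 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

void gemm_blocked(size_t m, size_t n, size_t k,
                  const float *A, const float *B, float *C)
{
    for (size_t jc = 0; jc < n; jc += NC)           /* columns of B, C */
        for (size_t pc = 0; pc < k; pc += KC)       /* shared k dim    */
            for (size_t ic = 0; ic < m; ic += MC) { /* rows of A, C    */
                size_t nb = min_sz(NC, n - jc);
                size_t kb = min_sz(KC, k - pc);
                size_t mb = min_sz(MC, m - ic);
                /* micro-kernel stand-in: (mb x kb) times (kb x nb) */
                for (size_t i = 0; i < mb; i++)
                    for (size_t p = 0; p < kb; p++) {
                        float a = A[(ic + i) * k + (pc + p)];
                        for (size_t j = 0; j < nb; j++)
                            C[(ic + i) * n + (jc + j)] +=
                                a * B[(pc + p) * n + (jc + j)];
                    }
            }
}
```

Production implementations additionally pack the A and B blocks into contiguous buffers and replace the inner loops with an architecture-specific vectorized micro-kernel.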
The convolution operator is a crucial kernel for many computer vision and signal processing applications that rely on deep learning (DL) technologies. As such, the efficient implementation of this operator has received considerable attention in the past few years for a fair range of processor architectures. In this paper, we follow the technology trend toward integrating long SIMD (single instruct...
We address the efficient design and implementation of dense matrix factorizations and inversion (DMFI) on modern multicore processors with several NUMA (non-uniform memory access) nodes. Our approach enhances the DMFI routines with a look-ahead strategy, in order to overcome the “panel factorization bottleneck”. In addition, it exploits both hybrid task- and loop-level parallelizations while takin...
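The paper's NUMA-aware scheme is not reproduced here, but the following self-contained sketch shows how task-level parallelism yields look-ahead: expressing a blocked Cholesky factorization as OpenMP tasks with dependences lets the runtime begin factorizing the next diagonal block while trailing updates of the current step are still in flight. The block layout, block sizes, and unblocked kernels are simplified assumptions for illustration only.

```c
/* Task-parallel blocked Cholesky (A = L L^T, lower triangle, no
 * pivoting; A assumed symmetric positive definite and stored block
 * by block, row-major inside each B x B block). The OpenMP task
 * dependences let the runtime start potrf of step k+1 while trailing
 * updates of step k are still running: look-ahead without explicit
 * scheduling code. Kernels are unblocked and unoptimized on purpose. */
#include <math.h>
#include <stddef.h>

#define NB 8   /* blocks per dimension (illustrative) */
#define B  64  /* block size (illustrative)           */
#define BLK(A, i, j) ((A) + ((size_t)(i) * NB + (j)) * B * B)

static void potrf(double *a) {                 /* a := chol(a), lower */
    for (int j = 0; j < B; j++) {
        for (int p = 0; p < j; p++) a[j*B+j] -= a[j*B+p] * a[j*B+p];
        a[j*B+j] = sqrt(a[j*B+j]);
        for (int i = j + 1; i < B; i++) {
            for (int p = 0; p < j; p++) a[i*B+j] -= a[i*B+p] * a[j*B+p];
            a[i*B+j] /= a[j*B+j];
        }
    }
}
static void trsm(const double *l, double *b) { /* b := b * l^{-T}     */
    for (int i = 0; i < B; i++)
        for (int j = 0; j < B; j++) {
            for (int p = 0; p < j; p++) b[i*B+j] -= b[i*B+p] * l[j*B+p];
            b[i*B+j] /= l[j*B+j];
        }
}
static void gemm_nt(const double *a, const double *b, double *c) {
    for (int i = 0; i < B; i++)                /* c := c - a * b^T    */
        for (int j = 0; j < B; j++)
            for (int p = 0; p < B; p++) c[i*B+j] -= a[i*B+p] * b[j*B+p];
}

void chol_tasks(double *A) {
    #pragma omp parallel
    #pragma omp single
    for (int k = 0; k < NB; k++) {
        #pragma omp task depend(inout: BLK(A,k,k)[0])
        potrf(BLK(A, k, k));
        for (int i = k + 1; i < NB; i++) {
            #pragma omp task depend(in: BLK(A,k,k)[0]) \
                             depend(inout: BLK(A,i,k)[0])
            trsm(BLK(A, k, k), BLK(A, i, k));
        }
        for (int i = k + 1; i < NB; i++) {
            #pragma omp task depend(in: BLK(A,i,k)[0]) \
                             depend(inout: BLK(A,i,i)[0])
            gemm_nt(BLK(A, i, k), BLK(A, i, k), BLK(A, i, i)); /* syrk */
            for (int j = k + 1; j < i; j++) {
                #pragma omp task depend(in: BLK(A,i,k)[0], BLK(A,j,k)[0]) \
                                 depend(inout: BLK(A,i,j)[0])
                gemm_nt(BLK(A, i, k), BLK(A, j, k), BLK(A, i, j));
            }
        }
    }
}
```

The diagonal-block update touches the (unused) upper triangle as well, which keeps the kernel set down to three routines at the cost of some redundant flops.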
We take a step toward developing high-performance codes for the convolution, based on the Winograd transformation, that are easy to customize for different processor architectures. In our approach, portability is improved by introducing vector intrinsics that exploit the SIMD (single-instruction, multiple-data) capabilities of current proce...
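The vectorized, customizable kernels from the paper are not shown here; as a small scalar illustration of why the Winograd transformation pays off, the classic F(2,3) variant computes two outputs of a 3-tap filter with 4 data-dependent multiplications instead of the 6 required by direct evaluation (the filter-side factors are normally precomputed once per filter).

```c
/* Winograd F(2,3): two outputs of a 3-tap filter g applied to four
 * inputs d, using 4 data multiplications instead of 6. 2-D
 * convolutions nest this idea along both axes, and production
 * kernels vectorize the transforms with SIMD intrinsics. */
void winograd_f23(const float d[4], const float g[3], float y[2])
{
    /* (g0+g1+g2)/2 and (g0-g1+g2)/2 would be precomputed per filter */
    float m1 = (d[0] - d[2]) * g[0];
    float m2 = (d[1] + d[2]) * 0.5f * (g[0] + g[1] + g[2]);
    float m3 = (d[2] - d[1]) * 0.5f * (g[0] - g[1] + g[2]);
    float m4 = (d[1] - d[3]) * g[2];
    y[0] = m1 + m2 + m3;   /* equals d0*g0 + d1*g1 + d2*g2 */
    y[1] = m2 - m3 - m4;   /* equals d1*g0 + d2*g1 + d3*g2 */
}
```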
The efforts of the scientific community and hardware vendors to develop and optimize linear algebra codes have historically led to highly-tuned libraries, carefully adapted to the underlying processor architecture, with excellent (near-peak) performance. These optimization efforts, however, are commonly focused on obtaining the best performance possible when the involved operands are large and “sq...
We propose a hybrid parallelization scheme for matrix inversion on multicore processors that combines a look-ahead technique to extract task-parallelism, at a high level, with loop-level parallelism to ensure an efficient utilization of the processor memory subsystem. As a result, our scheme outperforms the conventional approach for dense linear algebra operations, which simply extracts parallelis...
We present a multi-threaded implementation of the matrix multiplication for deep learning on ARM multicore processors. Following standard practice for inference with convolutional neural networks, our GEMM kernel operates with 16-bit integer arithmetic, yielding significant performance acceleration and cutting the memory requirements with respect to IEEE (floating point) single precision by half, ...
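The ARM-specific multi-threaded kernel itself is not reproduced here; the scalar sketch below only captures the arithmetic idea described above, namely 16-bit integer operands with 32-bit accumulation so that intermediate products cannot overflow. Blocking, packing, threading, and NEON vectorization are omitted.

```c
/* Sketch of integer GEMM for quantized inference: 16-bit operands,
 * 32-bit accumulators. Row-major, C = A * B with A (m x k) and
 * B (k x n). Real kernels block for cache and use SIMD instead of
 * this plain triple loop. */
#include <stdint.h>
#include <stddef.h>

void gemm_i16(size_t m, size_t n, size_t k,
              const int16_t *A, const int16_t *B, int32_t *C)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++) {
            int32_t acc = 0;   /* widen to avoid overflow */
            for (size_t p = 0; p < k; p++)
                acc += (int32_t)A[i * k + p] * (int32_t)B[p * n + j];
            C[i * n + j] = acc;
        }
}
```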
We perform a theoretical analysis comparing the scalability of data versus model parallelism, applied to the distributed training of deep convolutional neural networks (CNNs), along five axes: batch size, node (floating-point) arithmetic performance, node memory bandwidth, network link bandwidth, and cluster dimension. Our study relies on analytical performance models that can be configured to rep...
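The paper's configurable models are not reproduced here; as a generic first-order illustration of the kind of analytical model involved, the time per training step under data parallelism with a ring all-reduce is often approximated as below, where all symbols (P nodes, global batch size b, per-sample cost W, node arithmetic rate F, model size M, link bandwidth β) are illustrative assumptions.

```latex
% Illustrative first-order model (not the paper's exact formulation):
% the second term is the standard ring all-reduce cost for the
% gradient exchange of a model of M bytes over links of beta bytes/s.
T_{\mathrm{step}}(P) \;\approx\;
  \underbrace{\frac{b\,W}{P\,F}}_{\text{computation}}
  \;+\;
  \underbrace{\frac{2\,(P-1)}{P}\,\frac{M}{\beta}}_{\text{gradient all-reduce}}
```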
Training deep neural networks is a costly procedure, often performed via sophisticated deep learning frameworks on clusters of computers. As faster processor technologies are integrated into these cluster facilities (e.g., NVIDIA’s graphics accelerators or Google’s tensor processing units), the communication component of the training process rapidly becomes a performance bottleneck. In this paper,...
Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible additional development effort this solution imposes has constrained its adoption by the scientific programming community. In this work, we present DMRlib, a library designed...
The considerable impact of Convolutional Neural Networks on many Artificial Intelligence tasks has led to the development of various high performance algorithms for the convolution operator present in this type of networks. One of these approaches leverages the IM2COL transform followed by a general matrix multiplication (GEMM) in order to take advantage of the highly optimized realizations of the...
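As a minimal sketch of the IM2COL idea for a single input channel, unit stride, and no padding (real implementations handle all of these), each kh × kw window of the image becomes one column of a matrix, after which the convolution reduces to a single GEMM against the flattened filters.

```c
/* IM2COL for one H x W channel, unit stride, no padding: window
 * (i,j) offsets index the rows of the output matrix, output pixels
 * index its columns, so col is (kh*kw) x (Ho*Wo), row-major. The
 * convolution is then a GEMM: filters (as rows) times col. */
#include <stddef.h>

void im2col(const float *img, size_t H, size_t W,
            size_t kh, size_t kw, float *col)
{
    size_t Ho = H - kh + 1, Wo = W - kw + 1;
    for (size_t i = 0; i < kh; i++)
        for (size_t j = 0; j < kw; j++)          /* row of col matrix */
            for (size_t y = 0; y < Ho; y++)
                for (size_t x = 0; x < Wo; x++)  /* output pixel      */
                    col[(i * kw + j) * (Ho * Wo) + (y * Wo + x)] =
                        img[(y + i) * W + (x + j)];
}
```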
In this paper, we describe and evaluate an extension of the CHAMELEON library to operate with hierarchical matrices (H-Matrices) and hierarchical arithmetic (H-Arithmetic), producing efficient solvers for linear systems arising in Boundary Element Methods (BEM). Our approach builds upon an open-source H-Matrices library from Airbus, named HMAT-OSS, that collects sequential numerical kernels for b...
With the appearance of multi-/many-core machines, applications and runtime systems have evolved in order to exploit the new on-node concurrency brought by new software paradigms. POSIX threads (Pthreads) was widely adopted for that purpose and remains the most used threading solution on current hardware. Lightweight thread (LWT) libraries emerged as an alternative offering lighter mechanisms...
The solution of sparse triangular linear systems is often the most time-consuming stage of preconditioned iterative methods to solve general sparse linear systems, where it has to be applied several times for the same sparse matrix. For this reason, its computational performance has a strong impact on a wide range of scientific and engineering applications, which has motivated the study of its eff...
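A minimal sequential reference for the operation in question, forward substitution with a sparse lower-triangular matrix in CSR format, makes the cross-row dependence that complicates parallelization explicit. The convention that the diagonal is the last entry of each row is an illustrative storage assumption.

```c
/* Forward substitution for L * x = b with L sparse lower-triangular
 * in CSR; each row is assumed sorted with the diagonal entry last.
 * Row i reads x[j] for j < i, so the loop carries a dependence that
 * parallel solvers must analyze (e.g., via level sets). */
#include <stddef.h>

void sptrsv_csr_lower(size_t n, const size_t *rowptr,
                      const size_t *colidx, const double *val,
                      const double *b, double *x)
{
    for (size_t i = 0; i < n; i++) {
        double s = b[i];
        size_t end = rowptr[i + 1] - 1;   /* last entry = diagonal */
        for (size_t p = rowptr[i]; p < end; p++)
            s -= val[p] * x[colidx[p]];
        x[i] = s / val[end];
    }
}
```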
We analyze the asymptotic performance of the training process of deep neural networks (NN) on clusters in order to determine its scalability. For this purpose, i) we assume a data parallel implementation of the training algorithm, which distributes the batches among the cluster nodes and replicates the model; ii) we leverage the roofline model to inspect the performance at the node level, taking i...
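For reference, the roofline model bounds the attainable node-level performance by the peak arithmetic rate and by the memory bandwidth scaled by the arithmetic intensity of the computation.

```latex
% Standard roofline bound: attainable performance at arithmetic
% intensity I (flops per byte of memory traffic) on a node with
% peak rate F_peak and memory bandwidth beta.
P(I) \;=\; \min\bigl( F_{\mathrm{peak}},\; \beta \cdot I \bigr)
```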
We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target the scenario where two thread teams are created/activated during the factorization, with each team in charge of performing an independent task/branch of execution. The first tec...
The solution of sparse linear systems of large dimension is an important stage in problems that span a diverse range of applications. For this reason, a number of iterative solvers have been developed, among which ILUPACK integrates an inverse-based multilevel ILU preconditioner with appealing numerical properties. In this work we extend the iterative methods available in ILUPACK. Concretely, we dev...
We address the acceleration of the PageRank algorithm for web information retrieval on graphics processing units (GPUs) via a modular precision framework that adapts the data format in memory to the numerical requirements as the iteration converges. In detail, we abandon the IEEE 754 single- and double-precision number representation formats, employed in the standard implementation of PageRank, ...
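The modular-precision machinery is not reproduced here; the sketch below is one plain power-iteration sweep of PageRank over a CSR matrix, kept entirely in double for clarity, with a comment marking where an adaptive scheme could narrow the in-memory format of the vector. The storage layout and the damping factor are illustrative assumptions.

```c
/* One PageRank power-iteration sweep: x_next = (1-d)/n + d * P * x,
 * with the transition matrix P stored row-wise in CSR. An adaptive-
 * precision scheme would keep x in a narrower in-memory format early
 * on and widen it as the iteration converges; here it stays double. */
#include <stddef.h>

void pagerank_sweep(size_t n, const size_t *rowptr,
                    const size_t *colidx, const double *val,
                    double d,                       /* e.g., 0.85   */
                    const double *x, double *x_next)
{
    for (size_t i = 0; i < n; i++) {
        double s = 0.0;
        for (size_t p = rowptr[i]; p < rowptr[i + 1]; p++)
            s += val[p] * x[colidx[p]];  /* sparse row times vector */
        x_next[i] = (1.0 - d) / (double)n + d * s;
    }
}
```

The sweep is repeated, swapping x and x_next, until the difference between successive iterates falls below a tolerance.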
We revisit an alternative representation to the compact WY transform for the accumulation (blocking) of Householder reflectors that exhibits the same numerical stability and is composed of efficient computational kernels from Level-3 Basic Linear Algebra Subprograms (BLAS) in contrast with the Level-2 BLAS that are utilized for the construction of the conventional compact WY representation. For th...
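For context, the conventional compact WY representation mentioned above accumulates k Householder reflectors H_j = I − τ_j v_j v_jᵀ as follows; the paper's alternative representation is not reproduced here.

```latex
% Conventional compact WY accumulation of k Householder reflectors
% H_j = I - \tau_j v_j v_j^T (cf. LAPACK's larft):
Q \;=\; H_1 H_2 \cdots H_k \;=\; I - V\,T\,V^{T},
\qquad V = [\,v_1 \mid \cdots \mid v_k\,],
% with T upper triangular, built one column at a time -- the
% Level-2 BLAS step that motivates the alternative representation:
T_{jj} = \tau_j, \qquad
T_{1:j-1,\,j} \;=\; -\tau_j\,
  T_{1:j-1,\,1:j-1}\,\bigl(V_{:,\,1:j-1}\bigr)^{T} v_j .
```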
Numerous signal processing applications are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness constraints for user interactivity and, at the same time, must be optimized for energy efficiency. The increasingly heterogeneous power-versus-performance profile of modern hardware introduces new opportunities for energy savings as well as ch...
A large number of scientific and engineering problems currently require the solution of large and sparse linear systems of equations. In previous work, we applied a GPU accelerator to the solution of sparse linear systems of moderate dimension via ILUPACK, showing important reductions in the execution time while maintaining the quality of the solution. Unfortunately, the use of GPUs attached ...