Field-programmable custom computing machines (FCCMs) are normally composed of a host processor coupled as tightly as possible to accelerating hardware. Designing an FCCM normally follows a standard co-design flow: analyze the computational bottlenecks, implement the expensive parts of the algorithm in hardware, and leave the rest in software. Reference software often exists to illustrate the operation of an algorithm. Whilst this reference software is not normally engineered for high throughput, end users will often demand high performance without being prepared to do significant re-engineering of the code. Motion JPEG2000 [1] is an excellent example of this situation. Reference software is freely available for JPEG2000 which, on a modern processor, can encode a 24-bit color image in around 8 seconds [2]. Using this code to encode a 1-minute video clip at 50 fps would take over 6 hours, which would rapidly become tedious.
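The encoding-time estimate above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: the variable names are illustrative, and it assumes every frame costs the quoted 8 seconds with no per-clip overhead.

```python
# Back-of-envelope check of the encoding time quoted in the text.
# Assumption: every frame takes the same ~8 s as a single still image.
SECONDS_PER_FRAME = 8      # reference encoder, one 24-bit color image
FRAMES_PER_SECOND = 50     # video frame rate
CLIP_SECONDS = 60          # one-minute clip

frames = FRAMES_PER_SECOND * CLIP_SECONDS       # frames in the clip
total_seconds = frames * SECONDS_PER_FRAME      # sequential encode time
total_hours = total_seconds / 3600

print(f"{frames} frames, {total_seconds} s, about {total_hours:.2f} hours")
```

Under these assumptions the clip contains 3,000 frames and needs 24,000 seconds of sequential encoding, which is consistent with the "over 6 hours" figure.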
Abstract:
The prevalence of software reference code motivates investigation into efficient implementations of software architectures on field-programmable devices. Modern FPGAs allow designers to generate multi-processor architectures that exactly match the processing needs of the algorithm. This paper describes an architecture supporting the single program multiple data model of parallel processing, and presents results taken from a parallel implementation of the JPEG2000 encoding algorithm and Mandelbrot set generation.
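As a rough illustration of the single program multiple data (SPMD) model mentioned in the abstract, the sketch below runs one worker routine under several ranks, each handling an interleaved subset of image rows of a Mandelbrot set. It is a software-only sketch and not the paper's architecture: the function names, the row-interleaved partitioning, the image dimensions, and the sequential loop standing in for parallel processors are all assumptions made for illustration.

```python
# Minimal SPMD-style sketch: every "processor" runs the same worker code
# and uses its rank to select its share of the data (interleaved rows).
# All names and parameters here are hypothetical, chosen for illustration.

def mandelbrot_iterations(c, max_iter):
    """Count iterations of z = z*z + c before |z| exceeds 2."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter

def spmd_worker(rank, nprocs, width, height, max_iter):
    """Same program for every rank; the rank picks rows rank, rank+nprocs, ..."""
    rows = {}
    for y in range(rank, height, nprocs):
        im = -1.0 + 2.0 * y / (height - 1)
        row = []
        for x in range(width):
            re = -2.0 + 3.0 * x / (width - 1)
            row.append(mandelbrot_iterations(complex(re, im), max_iter))
        rows[y] = row
    return rows

def render(nprocs, width=32, height=16, max_iter=50):
    """Run every rank (sequentially here) and gather rows into one image."""
    image = [None] * height
    for rank in range(nprocs):
        partial = spmd_worker(rank, nprocs, width, height, max_iter)
        for y, row in partial.items():
            image[y] = row
    return image

if __name__ == "__main__":
    # The result should not depend on how many ranks share the work.
    print(render(1) == render(4))
```

Because each rank writes a disjoint set of rows, the workers need no communication until the final gather, which is what makes this kind of workload a natural fit for the SPMD model.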
Date of Conference: 20-23 April 2004
Date Added to IEEE Xplore: 13 December 2004
Print ISBN: 0-7695-2230-0