Efficient Hardware Design Of Iterative Stencil Loops

Rana, V., Beretta, I., Bruschi, F., Nacci, A. A., Atienza, D. and Sciuto, D. 2016. Efficient Hardware Design Of Iterative Stencil Loops. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 35 (12), pp. 2018-2031. https://doi.org/10.1109/TCAD.2016.2545408

Title: Efficient Hardware Design Of Iterative Stencil Loops
Authors: Rana, V., Beretta, I., Bruschi, F., Nacci, A. A., Atienza, D. and Sciuto, D.
Abstract

A large number of algorithms for multidimensional signal processing and scientific computation come in the form of iterative stencil loops (ISLs), whose data dependencies span multiple iterations. Because of their complex inner structure, automatic hardware acceleration of such algorithms is traditionally considered a difficult task.
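
As an illustration of the ISL pattern described above, here is a minimal software sketch. It is not taken from the paper: the four-neighbour averaging stencil and the function names `stencil_step` and `iterative_stencil_loop` are assumptions chosen for the example. It shows how each sweep reads the results of the previous sweep, which is the cross-iteration dependency that makes these loops hard to accelerate automatically.

```python
def stencil_step(grid):
    """One sweep: each interior cell becomes the mean of its four neighbours.

    The averaging stencil is a hypothetical choice for illustration only.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # boundary cells are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return new


def iterative_stencil_loop(grid, iterations):
    """Apply the stencil repeatedly; sweep t+1 reads the output of sweep t."""
    for _ in range(iterations):
        grid = stencil_step(grid)
    return grid
```

After k sweeps, a cell depends on inputs up to k cells away, so the dependency region grows with the iteration count.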

In this paper, we introduce an automatic design flow that identifies, in a wide family of bidimensional data processing algorithms, sub-portions that exhibit a kind of parallelism close to that of ISLs; these are mapped onto a space of highly optimized ad-hoc architectures, which is efficiently explored to identify the best implementations with respect to both area and throughput. Experimental results show that the proposed methodology generates circuits whose performance is comparable to that of manually optimized solutions, and orders of magnitude higher than that of circuits generated by commercial HLS tools.
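
The abstract does not say how the architecture space is explored. As a generic sketch of what "best with respect to both area and throughput" can mean, the following keeps only Pareto-optimal candidates, that is, those for which no other candidate is at least as good on both metrics and strictly better on one. The function name `pareto_front` and the tuple layout are assumptions for illustration, not the paper's method.

```python
def pareto_front(candidates):
    """Return candidates not dominated in (lower area, higher throughput).

    Each candidate is a hypothetical (name, area, throughput) tuple.
    """
    front = []
    for name, area, thr in candidates:
        dominated = any(
            a <= area and t >= thr and (a < area or t > thr)
            for _, a, t in candidates
        )
        if not dominated:
            front.append((name, area, thr))
    return front
```

With made-up inputs such as `[("A", 100, 10), ("B", 200, 30), ("C", 250, 25)]`, candidate C would be dropped because B uses less area and delivers more throughput.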

Keywords: High-level synthesis, Optimisation, FPGA, Dataflow computing, Embedded systems, Multimedia processing, Iterative functions
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Journal citation: 35 (12), pp. 2018-2031
ISSN: 0278-0070, 1937-4151
Year: 2016
Publisher: IEEE
Accepted author manuscript
Digital Object Identifier (DOI): https://doi.org/10.1109/TCAD.2016.2545408
Publication dates
Published online: 22 Mar 2016
Published: 22 Mar 2016

Related outputs

Parallelizing the Chambolle Algorithm for Performance-Optimized Mapping on FPGA Devices
Beretta, I., Rana, V., Akin, A., Nacci, A. A., Sciuto, D. and Atienza, D. 2016. Parallelizing the Chambolle Algorithm for Performance-Optimized Mapping on FPGA Devices. ACM Transactions on Embedded Computing Systems (TECS). 15 (3), Article No. 44. https://doi.org/10.1145/2851497

Permalink - https://westminsterresearch.westminster.ac.uk/item/9yx06/efficient-hardware-design-of-iterative-stencil-loops

