|Chapter title||Improving the performance of Morton layout by array alignment and loop unrolling: reducing the price of naivety|
|Authors||Thiyagalingam, J., Beckmann, O. and Kelly, P.H.J.|
Hierarchically-blocked non-linear storage layouts, such as the Morton ordering, have been proposed as a compromise between row-major and column-major for two-dimensional arrays. Morton layout offers some spatial locality whether traversed row-wise or column-wise. The goal of this paper is to make this an attractive compromise, offering close to the performance of row-major traversal of row-major layout, while avoiding the pathological behaviour of column-major traversal. We explore how spatial locality of Morton layout depends on the alignment of the array's base address, and how unrolling has to be aligned to reduce address calculation overhead. We conclude with extensive experimental results using five common processors and a small suite of benchmark kernels.
|Book title||Languages and Compilers for Parallel Computing: 16th International Workshop, LCPC 2003, College Station, TX, USA, October 2-4, 2003: revised papers|
|Place of publication||London, UK|
|Series||Lecture notes in computer science|
|Digital Object Identifier (DOI)||doi:10.1007/b95707|
|Journal citation||(2958), pp. 241-257|
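
The abstract's claim that Morton layout offers some spatial locality in both traversal directions can be illustrated with a small sketch of Morton address calculation by bit interleaving. This is not taken from the paper: the function name `morton_index`, the `bits` parameter, and the choice to place column bits in the even positions and row bits in the odd positions are assumptions made for illustration, and the authors' convention may differ.

```python
def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Return the Morton (Z-order) offset of element (row, col).

    Assumed convention (hypothetical, not confirmed by the paper):
    bit b of col goes to position 2*b, bit b of row to position 2*b + 1.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)      # column bits: even positions
        index |= ((row >> b) & 1) << (2 * b + 1)  # row bits: odd positions
    return index


def row_major_index(row: int, col: int, n: int) -> int:
    """Offset of (row, col) in a conventional n-by-n row-major array."""
    return row * n + col


if __name__ == "__main__":
    n = 4
    # Offsets touched when walking along row 0 and down column 0.
    print("Morton, row 0:      ", [morton_index(0, c) for c in range(n)])
    print("Morton, column 0:   ", [morton_index(r, 0) for r in range(n)])
    print("Row-major, row 0:   ", [row_major_index(0, c, n) for c in range(n)])
    print("Row-major, column 0:", [row_major_index(r, 0, n) for r in range(n)])
```

In a row-major 4-by-4 array, walking down a column strides by a full row each step, whereas under this Morton convention both the row walk and the column walk revisit small blocks of nearby offsets. The loop over `bits` also hints at why address calculation overhead matters: a naive per-element interleave is far costlier than a row-major multiply-add.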