IXPUG banner image

Vectorization is increasingly important to achieve high performance on modern hardware with SIMD instructions. Assembly of matrices and vectors in the finite element method, which is characterized by iterating a local assembly kernel over unstructured meshes, poses challenges to effective vectorization. Maintaining a user-friendly high-level interface with a suitable degree of abstraction while generating efficient, vectorized code for the finite element method is a challenge for numerical software systems and libraries. In this talk, we study the cross-element vectorization in the finite framework Firedrake and demonstrate the efficacy of such an approach by evaluating a wide range of matrix-free operators spanning different polynomial degrees and discretizations on two recent Intel CPUs using three mainstream compilers. Our experiments show that cross-element vectorization achieves 30% of theoretical peak performance for many examples of practical significance, and exceeds 50% for cases with high arithmetic intensities, with consistent speed-up over vectorization restricted to the local assembly kernels.

Event Name

IXPUG Webinar Series



Video Name

Webinar recording