
The Non-Equilibrium Green’s Function (NEGF) method is widely used in nanoscience and nanotechnology to predict carrier transport in electronic device channels whose sizes fall in the quantum regime. This work explores how much the performance of NEGF computations, whose core numerical step is a recursive process of matrix-matrix multiplication, can be improved by exploiting the unique features of manycore computing. The major techniques adopted for performance enhancement are data restructuring, matrix tiling, thread scheduling, and offload computing, and we present an in-depth discussion of why they are critical to fully exploiting the power of manycore computing hardware, including Intel Xeon Phi Knights Landing systems and NVIDIA general-purpose graphics processing unit (GPU) devices. The performance of the optimized algorithm has been tested on a single computing node, where the host is a Xeon Phi 7210 equipped with two NVIDIA Quadro GV100 GPU devices. The target structure of the NEGF simulations is a [100] silicon nanowire that consists of 100K atoms, involving a 1000K×1000K complex Hamiltonian matrix. Through rigorous benchmark tests, we show that, with the optimization techniques explained in detail, the workload can be accelerated by up to a factor of ~20 compared to the unoptimized case.
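As a rough illustration of the matrix-tiling idea named in the abstract, the sketch below multiplies two small complex matrices tile by tile, so that each inner loop nest works on a sub-block that would fit in fast memory on real hardware. It is not the authors' implementation: the function names `tiled_matmul` and `naive_matmul` and the `tile` parameter are hypothetical, the matrices are plain Python lists, and a production code would use an optimized language and library on the Xeon Phi or GPU instead.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply square complex matrices a and b (lists of lists) tile by tile.

    `tile` is a hypothetical block edge length; on real hardware it would be
    tuned to the cache or shared-memory size.
    """
    n = len(a)
    c = [[0j] * n for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of the result
        for j0 in range(0, n, tile):      # tile column of the result
            for k0 in range(0, n, tile):  # tiles along the inner dimension
                # Multiply one tile of a by one tile of b and accumulate.
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, n)):
                        aik = a[i][k]
                        for j in range(j0, min(j0 + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c


def naive_matmul(a, b):
    """Reference triple-loop product used only to check the tiled version."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    a = [[1 + 1j, 2, 0], [0, 1j, 3], [4, 0, 1 - 1j]]
    b = [[2, 0, 1j], [1, 1 + 1j, 0], [0, 3, 1]]
    # The tiled and reference products should agree for any tile size.
    print(tiled_matmul(a, b, tile=2) == naive_matmul(a, b))
```

The tiled loop order changes only which elements are touched together, not the arithmetic, so the result matches the reference product; the benefit on manycore hardware comes from data reuse within each tile.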

Event Name

IXPUG Workshop at HPC Asia 2021


Non-Equilibrium Green's Function (NEGF), Recursive Green's Function (RGF), MPI, OpenMP, Blocked Matrix Multiplication, Thread-scheduling

Video Name