
The Energy Exascale Earth System Model (E3SM) is one of the top users of resources at NERSC. The Model for Prediction Across Scales - Ocean Core (MPAS-O) is a significant component of E3SM, comprising 800,000 lines of Fortran written by 50 contributors. When MPAS-O is migrated from Edison, the previous-generation NERSC production system based on Ivy Bridge processors, to the newer Knights Landing-based Cori system, severe performance loss and scaling bottlenecks result. Performance analysis ruled out a number of possible causes of this effect, including load imbalance, cache behavior, and vectorization efficiency. The overwhelming contributor to MPAS performance loss on Xeon Phi systems was found to be a lower bound on the number of simulation cells mapped to an MPI rank, combined with MPAS framework overhead caused by serialized thread structure. Two framework optimizations, which remove excessive thread barriers and recycle communication data structures, have been incorporated into the E3SM master codebase, yielding a 15% speed improvement when running MPAS-O at production scale on Xeon Phi processors.

Event Name

IXPUG Annual Fall Conference 2018


MPI, Climate and weather