Resources

We have collected presentations from IXPUG workshops, annual meetings, and BOF sessions, and made them accessible here to view or download. You may search by event, keyword, science domain or author’s name. The database will be updated as new talks are made available.

Search Results: Showing 1 - 10 of 209 Results

IXPUG May'17 meeting Jul 13, 2018

Welcome and Opening Remarks

Keyword(s): ixpug

Author : Jim Jeffers

IXPUG ISC17 Jul 13, 2018

NERSC has partnered with over 20 representative application developer teams to evaluate and optimize their workloads on the Intel Xeon Phi Knights Landing processor. In this paper, we present a summary of this two-year effort and the lessons we learned in the process. We analyze the overall performance improvements of these codes, quantifying the impact of both Xeon Phi architectural features and code optimization on application performance. We show average application speedups of greater than 3x on KNL and a KNL vs. Haswell node performance advantage across the applications.

Keyword(s): ixpug

Author : Richard Gerber, NERSC

IXPUG ISC17 Jul 13, 2018

Welcome and Intro

Keyword(s): ixpug

Author : IXPUG User Group

IXPUG BoF SC16 Jul 13, 2018

To achieve the best performance and energy efficiency, HPC applications may require tuning on new architectures like Knights Landing, Intel’s 2nd generation Xeon Phi processor. Furthermore, applications with varying characteristics might benefit from different configurations and tuning. To study the different types of applications, we select a molecular dynamics kernel, LeanMD, and a Stencil3D code from the Charm++ benchmark suite, which are representative of compute-intensive and communication-intensive HPC benchmarks, respectively. We analyze these applications on different KNL configurations, namely MCDRAM usage mode for a given cluster mode, and perform within-node optimizations, specifically the use of hyperthreading and enabling CPU affinity. With this tuning of applications based on their characteristics, we show a performance improvement of up to 1.8X when enabling hyperthreading and energy savings of nearly 10%.

Keyword(s): ixpug

Author : Kavitha Chandrasekar, Laxmikant V. Kale

IXPUG BoF SC16 Jul 13, 2018

Deep Neural Networks (DNNs) have gained significant importance in recent years. The computational demands of DNNs are increasing due to more complicated networks and bigger datasets. Deep learning has entered the HPC era, and new approaches are required to efficiently utilize massively parallel computing resources. The layers in a DNN have different types of time-dominant computational kernels. These kernels exert pressure on different hardware resources based on their computation type and their input/output size. DNN architecture varies significantly across applications, making automated understanding of DNN performance a necessity. The Roofline model is an excellent tool for understanding the kernels' bottlenecks and the hardware utilization efficiency of the computational kernels. We present a performance engineering add-on to the well-known Caffe DNN framework that uses the hardware performance counters available in contemporary processors to automatically generate the Roofline model and other useful measurements for each layer in the DNN, without adding significant runtime overhead. We show performance results of various DNN architectures on Intel Knights Landing many-core processors.

Keyword(s): ixpug

Author : Jack Deslippe, Tareq Malas, Thorsten Kurth

IXPUG BoF SC16 Jul 13, 2018

In the present work, optimization of the PCG (Preconditioned CG) method has been conducted on a KNL cluster using OpenMP. The target application is a 3D static linear-elastic problem in solid mechanics, which is based on GeoFEM/Cube. We introduce the calculation-communication overlapping technique with dynamic scheduling for the SpMV routine in a PCG iterative calculation. As the KNL cluster, we use the Oakforest-PACS system, which was introduced by JCAHPC (Joint Center for Advanced HPC) under the collaboration between ITC, University of Tokyo and CCS, University of Tsukuba, with a 68-core KNL for each node. We investigate the performance under various configurations of the memory mode and clustering mode, such as Flat+Quadrant, Cache+Quadrant, Flat+SNC-4, and Cache+SNC-4, using 32 nodes with MCDRAM only. In the current results, we observed the best performance with 64 threads per MPI process in the Flat+Quadrant mode, and dynamic scheduling for OpenMP is effective in such a high-thread-count configuration.

Keyword(s): ixpug

Author : Kengo Nakajima, Toshihiro Hanawa, Satoshi Ohshima

IXPUG BoF SC16 Jul 13, 2018

This talk will look at using different clustering modes and MCDRAM configurations for several important applications. While Quadrant mode seemed to be the best option early in our use of the KNLs, we are now seeing that SNC2 and SNC4 seem to deliver better performance. The difference in performance is extremely application-dependent, which the talk will try to quantify. Use of MCDRAM is also application-dependent. While the easiest approach is either using MCDRAM as cache or, if the application is small enough, running the application completely out of MCDRAM as a memory space, there are examples where splitting the MCDRAM between Cache and Flat has some advantages.

Keyword(s): ixpug

Author : John M Levesque

IXPUG BoF SC16 Jul 13, 2018

  • Knights Landing has a lot of possible configurations
  • {Flat, Cache, Hybrid} x {All2All, Hemisphere, Quad, SNC-2, SNC-4}
  • Impact of each upon performance will be application-specific
  • Solution: engage with the IXPUG community!
  • Learn from experiences with similar codes
  • Get help with performance analysis/debugging
  • Share ideas/source code/libraries for MCDRAM usage

Keyword(s): ixpug

Author : John Pennycook

IXPUG Annual Meeting 2016 Jul 13, 2018

IXPUG Working Group

Keyword(s): ixpug

Author : Georg Zitzlsberger, John Pennycook, Michael Lysaght

IXPUG Annual Meeting 2016 Jul 13, 2018

Performance Tuning on KNL with Intel Parallel Studio XE 2017 Cluster Edition

Keyword(s): ixpug

Author : James Tullos