Workshop and BOF at ISC



KNL Access and Reporting of KNL Results

Please check out how to get access to KNLs

  • Software Developer Platforms available for purchase - see on the resources page of  Click here for a 1-page summary.

  • Limited time on a remote KNL.  Please complete the form by clicking here.  Requests for compute time of 4 hours or less, and work toward submissions to ISC, will be given higher priority.  Once an access decision is made, you'll be notified by email.  You will need an NDA in place to be granted access.

We'll help you get your results data reviewed by Intel for any public release before ISC. You may collect and share KNL data at ISC. You are allowed to submit NDA data to EasyChair without additional review.


TACC KNL Tutorial 

Date: June 19, 2016

Time: 8:00am – 12:00pm

Location: Frankfurt Hotel Marriott (Hamburger Allee 2 Frankfurt Hessen 60486 Germany)

Registration: See

The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, in partnership with Intel Corporation, is proud to announce a special tutorial event as part of the Intel Xeon Phi User Group (IXPUG) Meeting at ISC16 in Frankfurt, Germany. This event will provide hands-on experience with the latest Intel product, the second-generation Intel Xeon Phi, also known as Knights Landing (KNL). Attendees of this half-day event will be some of the first researchers in the world to have access to the KNL platform.

Hands-on exercises will be executed on the Stampede system at TACC. Stampede has been updated with KNL processors, which will be made available to the general public later this year.

This tutorial will focus on the use of reports and directives to improve vectorization and the implementation of proper memory access, and showcase new Intel VTune Amplifier XE capabilities that allow for in-depth memory access analysis and hybrid code profiling.

The first forty registered attendees through the door will also receive a copy of the new book: Intel® Xeon Phi Processor High Performance Programming, Knights Landing Edition, by Jim Jeffers, James Reinders, and Avinash Sodani.

**NOTE: This training is being offered at the Frankfurt Marriott, rather than at the ISC Conference Location.** 



IXPUG Birds of a Feather: 
BOF 13: Gearing Up Application Performance for Intel Xeon Phi (KNL) Supercomputers

Location: Frankfurt am Main, Germany
Date: Wednesday, June 22, 2016, 8:30-9:30 am (Frankfurt time)
Venue: Substanz 1+2, Forum


  • Please start making submissions now.  Reviewers will start reviewing by May 16.  Late submissions will be considered.  Talks will be selected on June 3.
  • Submission URL: EasyChair IXPUG Workshop ISC16.  Please select BoF only, in the Topics selection box.
  • Presentation template: IXPUG PowerPoint template - we've found that those who include content in this template have the smoothest reviews

KNL Demo Hardware

During BOF 13 and the following BOF 14, KNL demo hardware will be on display in the room to show performance for selected benchmark cases.


| Time | Topic | Presentation |
|---|---|---|
| 08:30-08:35 | Opening | Welcome (Richard Gerber, President of IXPUG) |
| 08:35-08:40 | Welcome | Overview Intel Xeon Phi - Knights Landing (Avinash Sodani, Intel Xeon Phi KNL Architect) |
| 08:40-08:50 | IXPUG WGs | IXPUG Working Discussion (M. Lysaght, ICHEC) |
| 08:50-08:55 | | The Vectorization Working Group (see part 2) (G. Zitzlsberger, Intel) |
| 08:55-09:10 | Lightning Talks | Particle-in-Cell Plasma Simulation on KNC & First Results on KNL (I. Meyerov, Univ. Nizhni Novgorod) |
| | | Improving the performance of a Gadget kernel on many-core systems – from KNC to KNL (L. Iapichino, F. Baruffa, LRZ) |
| | | Modernisation of the AVBP code for KNL. Performance and optimisation tips using Intel advisor (G. Staffelbach, CERFACS) |
| 09:10-09:30 | Tools Talks & Open Discussion | Profiling Tools for KNL (C. Rosales-Fernandez, A. Gómez-Iglesias, TACC) |
| | | Cray Tools for KNL (St. Andersson, Cray) |
| | | Open Discussion (including having Intel & Cray tool/compiler people there) |
| 09:30 | Closing | Final Remarks (IXPUG) |



IXPUG Workshop
"Application Performance on Intel Xeon Phi –
Being Prepared for KNL and Beyond"

Location: Frankfurt am Main, Germany
Date: Thursday, June 23, 2016, 8:30am-6:00pm
Venue: Marriott Frankfurt Hotel

The workshop will bring together software developers and technology experts to share challenges, experiences and best-practice methods for the optimization of HPC workloads on the Intel Xeon Phi. The workshop will cover application performance and scalability challenges at all levels - from single processor, to moderately-scaled cluster, up to large HPC configurations with many Xeon Phi devices.

The keynote will present recent information about the KNL processor. The submitted talks cover optimization and scalability topics in real-world HPC applications, e.g. data layouts and code restructuring for efficient SIMD operation, work distribution and thread management. Aspects related to KNL features (e.g. high-bandwidth memory) are of particular interest. The usability of tools for development, debugging and performance analysis will be covered.

A planned panel session provides an opportunity to discuss optimization strategies for Phi and to provide feedback to the toolchain developers.

Important Note: The workshop is held in conjunction with ISC'16 in Frankfurt (Main). To attend the workshop, you must register for ISC Workshops (registration has been open since March 1). More information is available on the ISC'16 conference site.

Call for Papers

Many-core and multicore processor technologies have put more demand on memory and interconnects. These processor technologies also require a more detailed understanding of the interaction between the memory, interconnect and SIMD units for users to design optimal algorithms at scale on a node, as well as at scale on a system level.

The IXPUG workshop is about sharing ideas, implementations, and experiences that will help users take advantage of new technologies such as AVX512 operations, high-bandwidth memory (HBM) and OmniPath. These architectural advances in Vectorization, Memory, and Communications on the Intel Xeon Phi platform will help boost adoption of many-core architecture in HPC as well as other computational spaces.

In the workshop you will experience an open forum with fellow application programmers, Intel Phi architecture designers, and compiler and tool experts.

In addition to the technical paper presentations, the program will include a morning keynote on Intel microprocessors and an afternoon presentation on memory performance, and will conclude with a panel discussion.

IXPUG welcomes paper submissions on innovative work from KNC and KNL users in academia, industry and government labs, describing original discoveries and experiences that will promote and prescribe efficient use of many-core and multicore systems.

Topics of interest include (but are not limited to):

  • Vectorization: Data layout in cache for efficient SIMD operations, SIMD directives and operations, and 2-core tiling with 2D interconnected mesh
  • Memory: Data layout in memory for efficient access (data preconditioning), access latency concerns (prefetch, streams, costs for HBM), partitioning of DDR and HBM for applications (memory policies)
  • Communication, including early experiences with OmniPath
  • Thread and Process Management: Process and thread affinity issues, SMT (simultaneous multi-threading, in core), balancing processes and threads
  • Programming Models: OpenMP 4.x, hStreams, using MPI 3 on Xeon Phi, hybrid programming (MPI/OpenMP, others)
  • Algorithms and Methods, including scalable and vectorizable algorithms
  • Software Environments and Tools
  • Benchmarking & Profiling Tools
  • Visualization

Paper Submission

Paper Format: Papers will be published through Springer, using the Springer template (see Information for Authors below). The template also provides consistency for the reviewers in layout, page limit and font size.

  • PDF files only, generated from LaTeX or MS Word documents.
  • Page limits: minimum 4, maximum 10 pages, excluding the reference list and acknowledgements.
  • Submission URL: EasyChair IXPUG Workshop ISC16.  Please select workshop, under topics.

The authors of the best-scored papers are invited by the Program Committee for publication in the ISC’16 Workshop volume published within the Springer LNCS series.

Information for Authors

General information for authors can be found in Springer's LNCS “Information for Authors of Computer Science Publications”. It includes links to MS Word templates. For LaTeX, a simplified LNCS template and Quick Start can be found in the LNCS Repository on GitHub.

Presentation Submission

A limited number of presentations will be accepted on late-breaking technology and on subjects that may contain less content than a paper (for a lightning talk).
Make sure you check the "presentation" box in the submission form.

The authors of the best-scored papers are invited by the Program Committee for presentation in the ISC’16 Workshop.


Important Due Dates

Extended Abstract Submission                               15 April 2016 AoE
Full Paper Submission (extended)                           3 May 2016 AoE (extended from 29 April 2016)
Reviews start                                              4 May 2016
Paper Acceptance Notification                              13 May 2016 (postponed from 11 May 2016)
Presentation Agenda Finalized (presenter notifications)    27 May 2016
Workshop Day                                               23 June 2016
Camera-ready papers due                                    30 June 2016

Reviewers are expected to judge submissions based on what was available at the time reviews were assigned (May 3).  Subsequent updates to content may or may not be considered by the program committee as part of the selection decision.  We encourage authors to use the time up until presentation and camera-ready copy to provide the highest-quality product.

Review Process

All submitted papers will be reviewed. We apply a standard single-blind review process, i.e., the authors will be known to reviewers. All submissions within the scope of the workshop will be peer-reviewed and will need to demonstrate quality of the results, originality and new insights, technical strength, and correctness. The submitted papers may not be published in or be in preparation for other conferences, workshops or journals.  


| Time | Session | Presentation |
|---|---|---|
| 09:00-09:15 | Opening | Richard Gerber (President of IXPUG, NERSC): Welcome - The Intel Xeon Phi User's Group |
| 09:15-10:00 | Keynote I | Avinash Sodani (Intel): Knights Landing Intel® Xeon Phi™ CPU: Path to Parallelism with General Purpose Programming |
| 10:00-10:30 | Platform Evaluation | A. Gómez-Iglesias, K. Milfeld, J. Cazes, C. Rosales-Fernandez, L. Koesterke and L. Huang (TACC): A comparative study of application performance and scalability on the Intel Knights Landing processor |
| 10:30-11:00 | | D. Doerfler, J. Deslippe, S. Williams, L. Oliker, B. Cook, Th. Kurth, M. Lobet, T. Malas, J.-L. Vay and H. Vincenti (NERSC): Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor |
| 11:00-11:30 | | Break |
| 11:30-12:00 | Applications I | J. Deslippe, F. H. Da Jornada, D. Vigil-Fowler, K. Raman, R. Sasanka, St. G. Louie, N. Wichmann and T. Barnes (NERSC/UCB/Intel/Cray/NREL/LBNL): Optimizing Excited-State Electronic-Structure Codes for Intel Knights Landing: a Case Study on the BerkeleyGW Software |
| 12:00-12:30 | | Th. Kurth, D. Kalamkar, B. Joo and K. Vaidyanathan (NERSC/LBNL/JLAB/Intel): Optimizing Dirac Wilson Operator and linear solvers for Intel KNL |
| 12:30-13:00 | | B. Cook, P. Maris, M. Shao, N. Wichmann, M. Wagner, J. O’neill, T. Phung and G. Bansal (NERSC/ISU/LBNL/Cray/Intel): High performance optimizations for nuclear physics code MFDn on KNL |
| 13:00-14:00 | | Lunch |
| 14:00-14:45 | Keynote II | John McCalpin (TACC): Trends in System Cost and Performance Balances and Implications for Future HPC Systems |
| 14:45-15:15 | General Techniques | O. Krzikalla, F. Wende and M. Höhnerbach (TUD/ZIB/RWTHA): Dynamic SIMD Vector Lane Scheduling |
| 15:15-15:45 | | C.J. Newburn, J. Sukha, I. Sharapov, A. D. Ngyuen and C.-C. Miao (Intel): Suitability Assessment for Many-Core Targets |
| 15:45-16:00 | Lightning Talks I | N. Agrawal, P. Edwards, R. Ojha, A. Pandey and P. Pawar (TATA/Intel): Performance Optimization of OpenFOAM on KNL |
| | | M. Noack (ZIB): AVX512 vs AVX2 on KNL |
| 16:00-16:30 | | Break |
| 16:30-17:00 | Applications II | T. Malas, Th. Kurth and J. Deslippe (LBNL): Optimization of the matrix-vector products of an IDR Krylov iterative solver for the Intel KNL manycore processor |
| 17:00-17:30 | | A. Walden, S. Khan, B. Joo, D. Ranjan and M. Zubair (ODU/JLAB): Optimizing a Multiple Right-hand Side Dslash Kernel for Intel Xeon Phi |
| 17:30-17:50 | Lightning Talks II | R. Li, St. Gottlieb, C. Detar, D. Tousaint, A. Jha, D. Kalamkar, B. Joo and D. Doerfler (IU/Intel/UU/UA/JLAB/LBL): Porting the MIMD Lattice Computation (MILC) Code to the Intel Xeon Phi Knights Landing Processor |
| | | M. Lysaght, S. Delaney and G. Civario (ICHEC/Tullow Oil): Early Performance of Seismic Imaging Kernels on Intel Knights Landing |
| | | M. Lobet, J.-L. Vay, H. Vincenti, A. Bhagatwala, R. Lehe and J. Deslippe (LBL): PICSAR: a high-performance library for Particle-In-Cell codes optimized for Intel Xeon Phi KNL architectures |
| | | K. Milfeld (TACC): OpenMP Affinity.. ..On KNL |
| 17:50-18:00 | Closing | Discussion and Final Remarks (IXPUG) |


Program Committee

Damian Alvarez-Mallon, Forschungszentrum Jülich
Ryan Coleman, Sandia National Lab
Douglas Doerfler, NERSC/Berkeley Lab
Richard Gerber, NERSC/Berkeley Lab
Antonio Gomez, TACC
Simon Hammond, Sandia National Lab
Rahul Hardikar, Indian Institute of Science
Helen He, NERSC/Berkeley Lab
Dave M Hiatt, big denominator
Michael Klemm, Intel Corp.
Lars Koesterke, TACC
Rakesh Krishnaiyer, Intel Corp.
Olli-Pekka Lehto, CSC - IT Center for Science Ltd.
John Linford, ParaTools, Inc.
Simon McIntosh-Smith, Bristol Univ.
John Michalakes, NREL
Kent Milfeld, TACC
Chris J Newburn, Intel Corp.
Dmitry Prohorov, Intel Corp.
Karthik Raman, Intel Corp.
Carlos Rosales, TACC
Hideki Saito, Intel Corp.
Abhinav Sarje, Berkeley Lab
Thomas Steinke, Zuse Institute Berlin
Estella Suarez, Forschungszentrum Jülich
Srinath Vadlamani, ParaTools, Inc.
Jerome Vienne, TACC

Organizers and Contacts

Please contact one of the organizers if you have any questions.

  • Richard A. Gerber
    National Energy Research Scientific Computing Center, Lawrence Berkeley National Lab. (NERSC)
  • Kent Milfeld
    Texas Advanced Computing Center (TACC)
  • Chris J. Newburn
    Intel Corp.
  • Thomas Steinke
    Zuse Institute Berlin (ZIB)