IXPUG Workshop
"Application Performance on Intel Xeon Phi –
Being Prepared for KNL and Beyond"

Location: Frankfurt am Main, Germany
Date: Thursday, June 23, 2016, 8:30am-6:00pm
Venue: Marriott Frankfurt Hotel

The workshop will bring together software developers and technology experts to share challenges, experiences and best-practice methods for the optimization of HPC workloads on the Intel Xeon Phi. The workshop will cover application performance and scalability challenges at all levels - from single processor, to moderately-scaled cluster, up to large HPC configurations with many Xeon Phi devices.

The keynote will present recent information about the KNL processor. The submitted talks cover optimization and scalability topics in real-world HPC applications, e.g. data layouts and code restructuring for efficient SIMD operation, work distribution and thread management. Aspects related to KNL features (e.g. high-bandwidth memory) are of particular interest. The usability of tools for development, debugging and performance analysis will be covered.

A planned panel session provides an opportunity to discuss optimization strategies for Phi and to provide feedback to the toolchain developers.

Important Note: The workshop is held in conjunction with ISC'16 in Frankfurt (Main). To attend the workshop, you must register for ISC Workshops (registration has been open since March 1). More information is available on the ISC'16 conference site.

Program

Start	End	Session	Presentation
09:00	09:15	Opening	Richard Gerber (President of IXPUG, NERSC): Welcome - The Intel Xeon Phi User's Group
09:15	10:00	Keynote I	Avinash Sodani (Intel): Knights Landing Intel® Xeon Phi™ CPU: Path to Parallelism with General Purpose Programming
10:00	10:30	Platform Evaluation	A. Gómez-Iglesias, K. Milfeld, J. Cazes, C. Rosales-Fernandez, L. Koesterke and L. Huang (TACC): A comparative study of application performance and scalability on the Intel Knights Landing processor
10:30	11:00	Platform Evaluation	D. Doerfler, J. Deslippe, S. Williams, L. Oliker, B. Cook, Th. Kurth, M. Lobet, T. Malas, J.-L. Vay and H. Vincenti (NERSC): Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor
11:00	11:30		Break
11:30	12:00	Applications I	J. Deslippe, F. H. Da Jornada, D. Vigil-Fowler, K. Raman, R. Sasanka, St. G. Louie, N. Wichmann and T. Barnes (NERSC/UCB/Intel/Cray/NREL/LBNL): Optimizing Excited-State Electronic-Structure Codes for Intel Knights Landing: a Case Study on the BerkeleyGW Software
12:00	12:30		Th. Kurth, D. Kalamkar, B. Joo and K. Vaidyanathan (NERSC/LBNL/JLAB/Intel): Optimizing Dirac Wilson Operator and linear solvers for Intel KNL
12:30	13:00		B. Cook, P. Maris, M. Shao, N. Wichmann, M. Wagner, J. O’neill, T. Phung and G. Bansal (NERSC/ISU/LBNL/Cray/Intel): High performance optimizations for nuclear physics code MFDn on KNL
13:00	14:00		Lunch
14:00	14:45	Keynote II	John McCalpin (TACC): Trends in System Cost and Performance Balances and Implications for Future HPC Systems
14:45	15:15	General Techniques	O. Krzikalla, F. Wende and M. Höhnerbach (TUD/ZIB/RWTHA): Dynamic SIMD Vector Lane Scheduling
15:15	15:45	General Techniques	C.J. Newburn, J. Sukha, I. Sharapov, A. D. Ngyuen and C.-C. Miao (Intel): Suitability Assessment for Many-Core Targets
15:45	16:00	Lightning Talks I	N. Agrawal, P. Edwards, R. Ojha, A. Pandey and P. Pawar (TATA/Intel): Performance Optimization of OpenFOAM on KNL
15:45	16:00	Lightning Talks I	M. Noack (ZIB): AVX512 vs AVX2 on KNL
16:00	16:30		Break
16:30	17:00	Applications II	T. Malas, Th. Kurth and J. Deslippe (LBNL): Optimization of the matrix-vector products of an IDR Krylov iterative solver for the Intel KNL manycore processor
17:00	17:30	Applications II	A. Walden, S. Khan, B. Joo, D. Ranjan and M. Zubair (ODU/JLAB): Optimizing a Multiple Right-hand Side Dslash Kernel for Intel Xeon Phi
17:30	17:50	Lightning Talks II	R. Li, St. Gottlieb, C. Detar, D. Tousaint, A. Jha, D. Kalamkar, B. Joo and D. Doerfler (IU/Intel/UU/UA/JLAB/LBL): Porting the MIMD Lattice Computation (MILC) Code to the Intel Xeon Phi Knights Landing Processor
			M. Lysaght, S. Delaney and G. Civario (ICHEC/Tullow Oil): Early Performance of Seismic Imaging Kernels on Intel Knights Landing
			M. Lobet, J.-L. Vay, H. Vincenti, A. Bhagatwala, R. Lehe and J. Deslippe (LBL): PICSAR: a high-performance library for Particle-In-Cell codes optimized for Intel Xeon Phi KNL architectures
			K. Milfeld (TACC): OpenMP Affinity.. ..On KNL
17:50	18:00	Closing	Discussion and Final Remarks (IXPUG)

Program Committee

Damian Alvarez-Mallon, Forschungszentrum Jülich
Ryan Coleman, Sandia National Lab

Douglas Doerfler, NERSC/Berkeley Lab
Richard Gerber, NERSC/Berkeley Lab
Antonio Gomez, TACC
Simon Hammond, Sandia National Lab
Rahul Hardikar, Indian Institute of Science
Helen He, NERSC/Berkeley Lab
Dave M Hiatt, big denominator
Michael Klemm, Intel Corp.

Lars Koesterke, TACC
Rakesh Krishnaiyer, Intel Corp.
Olli-Pekka Lehto, CSC - IT Center for Science Ltd.
John Linford, ParaTools, Inc.
Simon McIntosh-Smith, Bristol Univ.
John Michalakes, NREL
Kent Milfeld, TACC
Chris J Newburn, Intel Corp.
Dmitry Prohorov, Intel Corp.
Karthik Raman, Intel Corp.
Carlos Rosales, TACC
Hideki Saito, Intel Corp.
Abhinav Sarje, Berkeley Lab
Thomas Steinke, Zuse Institute Berlin
Estella Suarez, Forschungszentrum Jülich
Srinath Vadlamani, ParaTools, Inc.
Jerome Vienne, TACC

Organizers and Contacts

Please contact one of the organizers if you have any questions.

Richard A. Gerber
National Energy Research Scientific Computing Center, Lawrence Berkeley National Lab. (NERSC)
Kent Milfeld
Texas Advanced Computing Center (TACC)
Chris J. Newburn
Intel Corp.
Thomas Steinke
Zuse Institute Berlin (ZIB)

Call for Papers

Many-core and multicore processor technologies have put more demand on memory and interconnects. These processor technologies also require a more detailed understanding of the interaction between the memory, interconnect and SIMD units for users to design optimal algorithms at scale on a node, as well as at scale on a system level.

The IXPUG workshop is about sharing ideas, implementations, and experiences that will help users take advantage of new technologies such as AVX512 operations, high-bandwidth memory (HBM) and OmniPath. These architectural advances in vectorization, memory, and communications on the Intel Xeon Phi platform will help boost adoption of many-core architectures in HPC as well as in other computational spaces.

The workshop offers an open forum with fellow application programmers, Intel Xeon Phi architecture designers, and compiler and tool experts.

In addition to the technical paper presentations, the program will include a morning keynote on Intel microprocessors and an afternoon presentation on memory performance, and will conclude with a panel discussion.

IXPUG welcomes paper submissions on innovative work from KNC and KNL users in academia, industry and government labs, describing original discoveries and experiences that will promote and prescribe efficient use of many-core and multicore systems.

Topics of interest include (but are not limited to):

Vectorization: Data layout in cache for efficient SIMD operations, SIMD directives and operations, and 2-core tiling with 2D interconnected mesh
Memory: Data layout in memory for efficient access (data preconditioning), access latency concerns (prefetch, streams, costs for HBM), partitioning of DDR and HBM for applications (memory policies)
Communication, including early experiences with OmniPath
Thread and Process Management: Process and thread affinity issues, SMT (simultaneous multi-threading, in core), balancing processes and threads
Programming Models: OpenMP 4.x, hStreams, using MPI 3 on Xeon Phi, hybrid programming (MPI/OpenMP, others)
Algorithms and Methods, including scalable and vectorizable algorithms
Software Environments and Tools
Benchmarking & Profiling Tools
Visualization

Paper Submission

Paper Format: Papers will be published through Springer, using their template (see Information for Authors below). The template also gives reviewers consistency in layout, page limit and font size.

PDF files only, generated from LaTeX or MS Word documents.
Page limits: minimum 4, maximum 10 pages, excluding the reference list and acknowledgements.
Submission URL: EasyChair IXPUG Workshop ISC16. Please select "workshop" under topics.

The authors of the best-scored papers will be invited by the Program Committee to publish in the ISC’16 Workshop volume, which appears in the Springer LNCS series.

Information for Authors

General information for authors can be found in Springer's LNCS “Information for Authors of Computer Science Publications”. It includes links to MS Word templates. For LaTeX, a simplified LNCS template and Quick Start can be found in the LNCS Repository on GitHub.

Presentation Submission

A limited number of presentations will be accepted in the area of late-breaking technology and subjects that may contain less content than a paper (for a lightning talk).
Make sure you check the "presentation" box in the submission form.

Presentation Format: IXPUG PowerPoint template (with suggestions; specific format not required)
Page limits: 10 pages for lightning talks; 20 pages for late-breaking talks
Submission URL: EasyChair IXPUG Workshop ISC16

The authors of the best-scored papers will be invited by the Program Committee to present at the ISC’16 Workshop.

Important Due Dates

Extended Abstract Submission		15 April 2016 AoE
Full Paper Submission		3 May 2016 AoE (extended from 29 April 2016)
Reviews start		4 May 2016
Paper Acceptance Notification		13 May 2016 (postponed from 11 May 2016)
Presentation Agenda Finalized (presenter notifications)		27 May 2016
Workshop Day		23 June 2016
Camera Ready papers due		30 June 2016

Reviewers are expected to judge submissions based on what was available at the time reviews were assigned (May 3). Subsequent updates to content may or may not be considered by the program committee as part of the selection decision. We encourage authors to use the time remaining before the presentation and camera-ready deadlines to produce the highest-quality result.

Review Process

All submitted papers will be reviewed. We apply a standard single-blind review process, i.e., the authors will be known to reviewers. All submissions within the scope of the workshop will be peer-reviewed and will need to demonstrate quality of the results, originality and new insights, technical strength, and correctness. The submitted papers may not be published in or be in preparation for other conferences, workshops or journals.
