IXPUG Working Groups

The IXPUG working group provides a virtual way for IXPUG members to meet more regularly between the yearly face-to-face meetings. The working group was started to foster greater collaboration and knowledge sharing on topics of particular interest, and it is a great way to get involved in the IXPUG community.

The goals of the working group are as follows:

  1. Direct IXPUG discussions to what is most relevant to the community.
  2. Disseminate results and techniques.
  3. Assist the community with performance debugging/troubleshooting.
  4. Provide a forum for collaboration between IXPUG members and Intel engineers.
  5. Help the community to prepare for upcoming IXPUG events.

How to Join

The working group is open to anyone who wishes to join.

Meetings are held on the second Thursday of every month at 08:00 AM PST, using GoToMeeting.

To receive updates and calendar reminders, subscribe to the working group mailing list. Please note that you must register for an account on the IXPUG website in order to subscribe.

Scheduling a Meeting

If you are interested in attending or leading a discussion session focused on a particular topic, or would like to present your work to the group, please let us know by posting in the working group discussion forum or by contacting the working group organizer by email.


Upcoming Meetings

Date: March 8, 2018 (8:00 am PST)
Title: Compiler Prefetching for Knights Landing
Author(s): Rakesh Krishnaiyer, Intel Corporation
Description: We will cover some of the recent changes in the compiler-based prefetching (for Knights Landing and Skylake) and provide tips on how to tune for performance using compiler prefetching options, pragmas and prefetch intrinsics.

Date: May 10, 2018 (8:00 am PST)
Title: High Productivity Languages
Author(s): Rollin Thomas, NERSC; Sergey Maidanov, Intel Corporation
Description: This talk will cover the challenges of numerical analysis and simulations at scale. Tools such as Python, which are often used for prototyping, are not designed to scale to large problems. As a result, organizations need a dedicated team that takes a prototype created by research scientists and deploys it in a production environment.

A new approach is required to address both the scalability and the productivity aspects of applied science, one that combines the best of two distinct worlds: HPC and databases.

Starting with a brief overview of scalability aspects with respect to modern hardware architecture, we will characterize what the problem at scale is, its inherent characteristics, and how these map onto software design choices. We will also discuss selected experimental/observational science applications making use of Python at the National Energy Research Scientific Computing Center (NERSC), and what NERSC has done in partnership with the Intel Python Team to help application developers improve performance while retaining scientist/developer productivity.


Previous Meetings

Date: January 11, 2018
Title: Vectorization of Inclusive/Exclusive Compiler 19.0
Author(s): Nikolay Panchenko, Intel Corporation
Description: We propose a new OpenMP syntax to support inclusive and exclusive scan patterns. In computer science, this pattern is also known as a prefix sum or cumulative sum. The proposal defines several new constructs to support inclusive and exclusive scans through OpenMP, and defines the semantics of these constructs and the possible combinations of parallelization and vectorization. In the 18.0 compiler, three new experimental OpenMP SIMD features were added: vectorization of loops with breaks, syntax for compress/expand patterns, and syntax for the histogram pattern.
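
For readers unfamiliar with the scan pattern named in this abstract, the difference between the inclusive and exclusive variants can be sketched as follows. This is plain serial C++ for illustration only; it does not reproduce the OpenMP syntax proposed in the talk, and the function names `inclusive_scan_sum` and `exclusive_scan_sum` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Inclusive scan (hypothetical helper): out[i] = in[0] + ... + in[i].
std::vector<int> inclusive_scan_sum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        running += in[i];  // update the running sum first...
        out[i] = running;  // ...then store it, so element i is included
    }
    return out;
}

// Exclusive scan (hypothetical helper): out[i] = in[0] + ... + in[i-1],
// with out[0] equal to the identity value 0.
std::vector<int> exclusive_scan_sum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;  // store the sum of all earlier elements...
        running += in[i];  // ...then fold in element i for the next step
    }
    return out;
}
```

For the input {1, 2, 3, 4}, the inclusive variant yields {1, 3, 6, 10} and the exclusive variant yields {0, 1, 3, 6}. Each iteration depends on the running sum from the previous one, and this loop-carried dependence is what makes the pattern hard to parallelize or vectorize without dedicated support. The C++17 standard library offers `std::inclusive_scan` and `std::exclusive_scan` in `<numeric>` for the same computation.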

Date: February 8, 2018 (8:00 am PST)
Title: Threading Building Blocks (TBB) Flow Graph: Expressing and Analyzing Dependencies in Your C++ Application (see archived webcast)
Author(s): Pablo Reble, Intel Corporation
Description: Developing for heterogeneous systems is challenging because applications may be composed of many layers of parallelism and employ a diverse set of programming models or libraries. This session focuses on Flow Graph, an extension to the Threading Building Blocks (TBB) interface that can be used as a coordination layer for heterogeneity that retains optimization opportunities and composes with existing models. This extension assists in expressing complex synchronization and communication patterns and in balancing load between CPUs, GPUs, and FPGAs.

Because a Flow Graph can express complex interactions, we use Intel Advisor’s Flow Graph Analyzer (FGA), which has been released as a Technology Preview in Parallel Studio XE 2018, to visualize interactions in a graph and map the application structure to performance data. Finally, we validate this approach by presenting use cases of applications using Flow Graph.



For more information about previous meetings, please refer to the minutes.