Success Story: Running AVBP Industrial code on Arm architectures

Highlights:

  • Keywords:
    • performance portability
    • HPC on the cloud
    • Arm HPC
  • Industry sector: high-performance computing, simulation and engineering
  • Key codes used: AVBP
Normalized timing at the node level for the explosion simulation using AVBP (AWS Graviton2 results as reference)

Organisations & Codes Involved:

Cerfacs is a basic and applied research center located in Toulouse, France, specialized in modeling and numerical simulation. Through its facilities and expertise in High-Performance Computing, Cerfacs deals with major scientific and technical research problems of public and industrial interest.


Arm Ltd. is a British semiconductor and software design company based in Cambridge, England. Its primary business is in the design of processors and software development tools.

Scientific Challenge:

With the diversification of the microprocessor catalogue for High-Performance Computing systems, porting software to Arm-based architectures and evaluating its performance has become an imperative step for code developers. Real application benchmarks covering everything from single-core performance to multi-node scalability remain elusive. Given the myriad of Arm flavours available, a comprehensive real-case benchmark would give developers and users a first look at what to expect from future European Processor Initiative (EPI) and Arm-based leadership-class systems.

Solution:

In collaboration with Arm Ltd., Cerfacs has performed a first benchmark using the AVBP code, a state-of-the-art Navier-Stokes solver for reactive compressible flows on unstructured grids, written in Fortran and parallelized with MPI.

The dataset used for testing is the EXCELLERAT explosion use case “MASRI”, which simulates the explosion of a propane mixture in a confined domain. The benchmark compared the state-of-the-art Arm implementations Graviton2 (from Amazon Web Services [AWS]) and Ampere (from Arm Ltd.) with Intel Ice Lake and AMD Epyc Rome processors in terms of intra-node core performance and inter-node efficiency. Strong scaling was measured up to 2,048 cores.
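
As an illustration of how a strong-scaling measurement of this kind is usually summarised, the following minimal Python sketch computes speedup and parallel efficiency relative to the smallest run. The function name `strong_scaling`, the core counts per step, and all timing values are assumptions made for the example; they are placeholders and not measured AVBP results.

```python
def strong_scaling(cores, times):
    """Return (cores, speedup, efficiency) tuples relative to the first run.

    Strong scaling keeps the problem size fixed while the core count grows,
    so the ideal speedup equals the ratio of core counts.
    """
    base_cores, base_time = cores[0], times[0]
    rows = []
    for n, t in zip(cores, times):
        speedup = base_time / t      # how much faster than the baseline run
        ideal = n / base_cores       # speedup expected from perfect scaling
        rows.append((n, speedup, speedup / ideal))
    return rows


# Hypothetical wall-clock times in seconds (placeholders, NOT benchmark data).
# The core counts assume 64 cores per node, from 1 node up to 32 nodes.
cores = [64, 128, 256, 512, 1024, 2048]
times = [1000.0, 510.0, 262.0, 136.0, 72.0, 40.0]

for n, speedup, efficiency in strong_scaling(cores, times):
    print(f"{n:5d} cores  speedup {speedup:6.2f}  efficiency {efficiency:5.2f}")
```

An efficiency close to 1.0 at the largest core count indicates that adding nodes still pays off for the fixed problem size.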

Scientific impact of this result:

Currently, the High-Performance Computing landscape is dominated by x86 architectures on the CPU side (Intel or AMD) and by NVIDIA on the GPU side. Some fringe examples, such as Fujitsu’s A64FX, have shown the considerable potential of Arm architectures to combine performance and energy efficiency. The European Processor Initiative aims to produce a home-grown processor that could compete with these giants, and Arm is one of the flavours being investigated.

With this work, we demonstrate the portability and parallel efficiency of the AVBP code on the Arm-based Graviton2 (from AWS) and Ampere (from Arm) architectures, as well as their competitiveness with current state-of-the-art Intel (Ice Lake) and AMD (Epyc Rome) processors. Furthermore, this work was performed using AWS services, showcasing the viability of this platform for High-Performance Computing.

These results show that Arm-based architectures can compete favourably with standard processors at the node level and offer on-par strong scaling performance up to 32 compute nodes.

Benefits for further research:

  • Portability of the scientific code
  • Transparent compute workflow independent of the architecture for the user
  • HPC as a service viability for industrial users

Products/Services:

  • Lessons learned on portability and optimisation on Arm-based systems
  • Lessons learned on HPC cloud services from AWS
  • Cost/performance analysis of HPC on AWS systems

Unique Value of Each Service:

This work offers a unique first glance at the performance of a state-of-the-art code on this new architecture, including on a cloud-based HPC service. It shows the viability of both the architecture and the cloud system, and also showcases the readiness of the AVBP solver to tackle these systems if and when they become available to its users.

Multi-node normalized timing on the explosion simulation, up to 32 AWS EC2 instances (lower is better, AWS Graviton2 as reference)
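
For readers unfamiliar with normalized timings, the short Python sketch below shows one way such values can be derived: each platform's wall-clock time is divided by the time of the reference platform, so the reference reads 1.0 and lower values are faster. The helper `normalize_timings`, the platform labels, and the numbers are hypothetical and do not reproduce the measurements of this study.

```python
def normalize_timings(timings, reference):
    """Divide every timing by the reference platform's timing.

    The reference platform maps to 1.0; values below 1.0 are faster
    than the reference, values above 1.0 are slower.
    """
    ref_time = timings[reference]
    return {name: t / ref_time for name, t in timings.items()}


# Hypothetical wall-clock times in seconds (placeholders, NOT benchmark data).
timings = {
    "Graviton2": 100.0,   # reference platform
    "platform_A": 80.0,   # hypothetical faster platform
    "platform_B": 125.0,  # hypothetical slower platform
}

for name, value in normalize_timings(timings, "Graviton2").items():
    print(f"{name:12s} {value:.2f}")
```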

If you have any questions related to this success story, please register on our Service Portal and send a request on the “my projects” page.