HPC Service Overview (2024)

The foundation for scientific modelling, numerical experiments, machine learning and scientific 'big data' applications.

HPC Introduction

Installation of the current high-performance computer system, named "foote" in honour of the American scientist Eunice Newton Foote, was completed at the institute in October 2024, following an EU-wide competitive bidding and selection process (a competitive dialogue with a prior call for competition, "Wettbewerblicher Dialog mit Teilnahmewettbewerb") conducted in 2022. The principal contractor is pro-com DATENSYSTEME GmbH, with the main components produced and delivered by Lenovo, NVIDIA and IBM. The direct water-cooling infrastructure for the system is provided by Waning Anlagenbau GmbH & Co. KG. The system is funded by the Land Brandenburg.

The HPC service of the institute is available in principle to all scientists of the institute and to external scientists affiliated with the institute through cooperation agreements. Registration with IT-Services is required prior to accessing the system.

HPC Highlights

  • AMD EPYC 9554 processors with boost frequencies of up to 3.75 GHz and 6 GByte of DDR5 memory per core,
  • NVIDIA H100 GPU co-processors for the use and development of machine learning applications (see the sketch after this list),
  • IBM Storage Scale high-performance parallel file system with 8 PByte capacity and up to 160 GByte/s read/write bandwidth,
  • IBM Storage Protect tape storage with hierarchical storage management (HSM),
  • NVIDIA non-blocking high-performance NDR InfiniBand network,
  • Direct water-cooled processors and memory, with the waste heat used to heat the office building(s) during the winter season.
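
For the GPU nodes, the following minimal Python sketch illustrates how a machine learning job might verify which accelerators are visible to it. It assumes a PyTorch installation (for example provided via the module system) and is an illustrative example rather than an official usage recipe:

    # Minimal sketch (assumption: PyTorch is available, e.g. via the module
    # system): list the CUDA devices visible to the current job.  On the GPU
    # nodes this is expected to report NVIDIA H100 devices; on the CPU-only
    # nodes no CUDA device will be found.
    import torch

    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(i)
            print(f"GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB")
    else:
        print("No CUDA device visible - this is probably a CPU-only node.")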

HPC Basic Metrics

HPC Hardware

  • 240 Lenovo SD665 V3 direct water-cooled computer systems for batch processing, with a total of 31,744 AMD EPYC 9554 Genoa processor cores at a base clock of 3.1 GHz and 768 GByte of dual-rank DDR5 memory at 4800 MHz per node.

  • 12 Lenovo SD665-N V3 direct water-cooled computer systems with a total of 1,536 additional AMD EPYC 9554 Genoa processor cores at a base clock of 3.1 GHz, 1.5 TByte of dual-rank DDR5 memory at 4800 MHz per node, and a total of 48 NVIDIA H100 Hopper Tensor Core GPUs.

  • NVIDIA NDR InfiniBand high performance data network (400 Gbps per port).

  • More than 8 PByte of file system capacity, based on six IBM ESS-3500 systems.

  • 2 IBM TS4500 tape libraries.
  • 16 additional support servers:
    • 4 interactive terminal server computers,
    • 4 control server computers,
    • 4 backup server computers,
    • 2 file system control server computers,
    • 2 NFS / CIFS file system data export server computers.

HPC System Software

  • Operating System: Red Hat Enterprise Linux (RHEL) Server
  • Batch Queue and Resource Management: Slurm (see the sketch after this list)
  • Cluster Administration: Confluent
  • Parallel Filesystem: IBM Storage Scale
  • Backup, Archiving and HSM: IBM Storage Protect
  • Software Package Management: Environment Modules
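
As a rough illustration of how the batch system and the software environment fit together, the following Python sketch reads the standard Slurm environment variables inside a batch job and sizes a worker pool accordingly. Partition, account and module names on "foote" are site-specific and therefore not shown; the sketch relies only on generic Slurm variables and the Python standard library:

    # Minimal sketch: inside a Slurm batch job, read the resources that Slurm
    # granted from its environment variables and size a process pool to match.
    # SLURM_CPUS_PER_TASK is only set when --cpus-per-task was requested, so a
    # fallback of one worker is used outside of a Slurm allocation.
    import os
    from multiprocessing import Pool

    def work(item):
        return item * item

    if __name__ == "__main__":
        n_workers = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))
        job_id = os.environ.get("SLURM_JOB_ID", "interactive")
        print(f"Job {job_id}: running {n_workers} worker process(es)")
        with Pool(processes=n_workers) as pool:
            print(pool.map(work, range(16)))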