HPC Services Overview

Numerical experiments, machine learning and scientific ‘big’ data applications are the foundations of research at the institute, and the most important tool for all three is the high-performance computer (HPC) system. The current hpc2024 system is the eighth generation of this type of machine installed and operated at the institute.

HPC Introduction

Installation of the current high-performance computer system, named "foote" in honour of the American scientist Eunice Newton Foote, was finalized at the institute in October 2024 after an EU-wide competitive bidding and selection process (competitive dialogue with a call for participation; Wettbewerblicher Dialog mit Teilnahmewettbewerb) conducted in 2022. The principal contractor is pro-com DATENSYSTEME GmbH, with main components produced and delivered by Lenovo, NVIDIA and IBM. The direct water-cooling infrastructure for this system is provided by Waning Anlagenbau GmbH & Co. KG. The system is funded by the federal state of Brandenburg (Land Brandenburg).

The HPC service of the institute is available in principle to all scientists of the institute and to external scientists affiliated with the institute through co-operation agreements. Registration with IT-Services is required before accessing the system.

HPC Highlights

  • AMD EPYC 9554 processors with scalar frequencies of up to 3.75 GHz and 6 GByte DDR5 memory per core,
  • NVIDIA H100 graphical co-processors for use and development of machine learning applications,
  • IBM Storage Scale high performance parallel file system with 8 PByte capacity and up to 160 GBps read/write bandwidth,
  • IBM Storage Protect tape storage with hierarchical storage management (HSM),
  • NVIDIA non-blocking high-performance NDR InfiniBand network,
  • Direct water-cooled processors and memory with waste heat used to heat office building(s) during the winter season.

HPC Basic Metrics

HPC Hardware

  • 240 Lenovo SD665 V3 direct water-cooled computer systems with a total of 31,744 AMD EPYC 9554 Genoa processor cores at a base clock of 3.1 GHz for batch processing and 768 GByte dual-ranked DDR5 memory at 4800 MHz.

  • 12 Lenovo SD665-N V3 direct water-cooled computer systems with a total of 1536 additional AMD EPYC 9554 Genoa processor cores at a base clock of 3.1 GHz, 1.5 TByte dual-ranked DDR5 memory at 4800 MHz and a total of 48 NVIDIA H100 Hopper tensor core GPUs.

  • NVIDIA NDR InfiniBand high performance data network (400 Gbps per port).

  • 8+ PByte file system capacity, based on six IBM ESS-3500 systems.

  • 2 IBM TS4500 tape libraries.
  • 16 additional support servers:
    • 4 interactive terminal server computers
    • 4 control server computers
    • 4 backup server computers
    • 2 file system control server computers
    • 2 NFS / CIFS file system data export server computers.
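
The per-core and per-node figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official specification: it assumes that each node carries two 64-core EPYC 9554 sockets and that the quoted memory sizes (768 GByte and 1.5 TByte) are per node. Neither assumption is stated explicitly in the list.

```python
# Figures quoted in the hardware list.
GPU_NODES = 12
GPUS_TOTAL = 48
BATCH_NODE_MEMORY_GBYTE = 768
GPU_NODE_MEMORY_GBYTE = 1536  # 1.5 TByte

# Assumptions (not stated in the text): dual-socket nodes with
# 64 cores per EPYC 9554 socket, and memory sizes given per node.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 64


def derived_metrics():
    """Return per-node and per-core figures derived from the list."""
    cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    return {
        "cores_per_node": cores_per_node,
        "gpu_node_cores": GPU_NODES * cores_per_node,
        "gpus_per_gpu_node": GPUS_TOTAL // GPU_NODES,
        "batch_mem_per_core_gbyte": BATCH_NODE_MEMORY_GBYTE / cores_per_node,
        "gpu_mem_per_core_gbyte": GPU_NODE_MEMORY_GBYTE / cores_per_node,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value}")
```

Under these assumptions the GPU partition comes out at 1536 cores and the batch nodes at 6 GByte per core, which agrees with the figures given in the highlights and in the hardware list.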

HPC System Software

  • Operating System: RHEL Linux Server
  • Batch Queue and Resource Management: Slurm
  • Cluster Administration: Confluent
  • Parallel Filesystem: IBM Storage Scale
  • Backup, Archiving and HSM: IBM Storage Protect
  • Software Package Management: Environment Modules
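
Because batch work is managed by Slurm and software is provided through Environment Modules, a typical job combines `#SBATCH` directives with `module load` lines. The sketch below only renders such a script as text; it is an illustration under assumptions, not the institute's documented workflow. The helper `render_batch_script`, the module name `example-compiler` and the program `./my_app` are hypothetical, and whether `--gres=gpu:N` is the right way to request GPUs depends on the local Slurm configuration.

```python
def render_batch_script(job_name, nodes, tasks_per_node, walltime,
                        modules, command, gpus_per_node=0):
    """Render a Slurm batch script as a string (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={walltime}",
    ]
    if gpus_per_node > 0:
        # Assumption: GPUs are exposed as a generic resource named "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus_per_node}")
    lines.append("")
    # Start from a clean environment, then load the requested modules.
    lines.append("module purge")
    for module in modules:
        lines.append(f"module load {module}")
    lines.append("")
    # Launch the program on the allocated resources.
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    script = render_batch_script(
        job_name="demo",
        nodes=2,
        tasks_per_node=128,
        walltime="01:00:00",
        modules=["example-compiler"],  # hypothetical module name
        command="./my_app",            # hypothetical program
    )
    print(script)
```

The rendered text could be saved to a file and submitted with `sbatch`; the actual partition names, limits and available modules have to be taken from the system itself.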
