HPC Introduction
Installation of the current high-performance computer system, named "foote" in honour of the American scientist Eunice Newton Foote, was finalized at the institute in October 2024, following an EU-wide competitive bidding and selection process (a competitive dialogue with prior participation competition, "Wettbewerblicher Dialog mit Teilnahmewettbewerb") conducted in 2022. The principal contractor is pro-com DATENSYSTEME GmbH, with the main components produced and delivered by Lenovo, NVIDIA and IBM. The direct water-cooling infrastructure for this system is provided by Waning Anlagenbau GmbH & Co. KG. The system is funded by the Land Brandenburg.
The HPC service of the institute is available in principle to all scientists of the institute and to external scientists affiliated with the institute through cooperation agreements. Registration with IT-Services is required prior to accessing the system.
HPC Highlights
- AMD EPYC 9554 processors with scalar frequencies of up to 3.75 GHz and 6 GByte DDR5 memory per core,
- NVIDIA H100 GPU accelerators for the development and execution of machine learning applications,
- IBM Storage Scale high-performance parallel file system with 8 PByte capacity and up to 160 GByte/s read/write bandwidth,
- IBM Storage Protect tape storage with hierarchical storage management (HSM),
- NVIDIA non-blocking high-performance NDR InfiniBand network,
- Direct water-cooled processors and memory, with the waste heat reused to warm the office building(s) during the winter season.
HPC Basic Metrics
HPC Hardware
- 240 Lenovo SD665 V3 direct water-cooled computer systems with a total of 31,744 AMD EPYC 9554 Genoa processor cores at a base clock of 3.1 GHz for batch processing and 768 GByte of dual-rank DDR5 memory at 4800 MHz per node.
- 12 Lenovo SD665-N V3 direct water-cooled computer systems with a total of 1536 additional AMD EPYC 9554 Genoa processor cores at a base clock of 3.1 GHz, 1.5 TByte of dual-rank DDR5 memory at 4800 MHz per node and a total of 48 NVIDIA H100 Hopper tensor core GPUs (an example of how these resources can be queried follows this list).
- NVIDIA NDR InfiniBand high performance data network (400 Gbps per port).
- 8+ PByte file system capacity, based on six IBM ESS 3500 systems.
- 2 IBM TS4500 tape libraries.
- 16 additional support servers:
  - 4 interactive terminal servers
  - 4 control servers
  - 4 backup servers
  - 2 file system control servers
  - 2 NFS/CIFS file system data export servers
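
How these resources are exposed to users can be checked from a login session with the standard Slurm query tools, as referenced in the hardware list above. The following is a minimal sketch; the node name is a placeholder and the actual partition layout is defined by the site.

```bash
# Overview of partitions and node states (partition names are site-specific).
sinfo

# Per-node summary: node name, CPU count, memory (MB) and generic resources
# (GRES) -- the H100 GPUs on the SD665-N V3 nodes appear under GRES.
sinfo -N -o "%N %c %m %G"

# Detailed attributes of a single node; "node001" is a placeholder name.
scontrol show node node001
```
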
HPC System Software
- Operating System: Red Hat Enterprise Linux (RHEL)
- Batch Queue and Resource Management: Slurm (an example batch job is sketched after this list)
- Cluster Administration: Confluent
- Parallel Filesystem: IBM Storage Scale
- Backup, Archiving and HSM: IBM Storage Protect
- Software Package Management: Environment Modules
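
Slurm and Environment Modules, listed above, are normally used together in a batch script. The following is a minimal, illustrative sketch only: the partition name, GPU GRES type, module names and executable are assumptions, not the institute's actual configuration.

```bash
#!/bin/bash
# Illustrative batch job -- all names below (partition, GRES type, modules,
# executable) are placeholders, not actual site configuration.
#SBATCH --job-name=example
#SBATCH --partition=gpu              # hypothetical partition name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=16
#SBATCH --gres=gpu:h100:4            # request four H100 GPUs on the node
#SBATCH --time=02:00:00

# Load software through Environment Modules (module names are assumptions).
module purge
module load gcc openmpi cuda

# Launch the (placeholder) application under Slurm's process manager.
srun ./my_application
```

Such a script would be submitted with `sbatch` and its status checked with `squeue -u $USER`; the actual partitions, accounts and available modules are defined by the site.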