Universität Stuttgart
Institut für Wasserbau - IWS

BW-Grid - Lehrstuhl für Hydromechanik und Hydrosystemmodellierung


General Information about the LH2-part of the BW-Grid System

Information about usage can be found on the internal wiki pages of the LH2.

The LH2 part of the BW-Grid cluster has the following properties:
  • 70 compute nodes, each with two sockets (560 cores in total)
  • InfiniBand interconnect
  • 2.8 GHz Xeon "Harpertown" processors with 4 cores each
  • 16 GB memory per compute node, 1333 MHz FSB
  • 96-port non-blocking InfiniBand switch
  • Frontend node with two Harpertown processors, 24 GB memory, 1333 MHz FSB
  • all built into three racks with IBM BladeCenter chassis
  • 100 TB global disk
  • Operating system: Scientific Linux 5.0 on the Intel-based nodes
  • Operating system: Fedora Core 8 on the Cell-based nodes
  • Batch system: Torque/Maui
  • GPFS
  • OpenMPI
  • Compilers: Intel, GCC, Java
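Since the batch system is Torque/Maui and MPI jobs run via OpenMPI, a submission script for a 64-core job might look like the following sketch. Queue settings, walltime, and the executable name are illustrative placeholders, not taken from the cluster documentation:

```shell
#!/bin/bash
# Hypothetical Torque job script for the LH2 BW-Grid nodes.
# Walltime and executable name are illustrative placeholders.
#PBS -N mufte_co2            # job name
#PBS -l nodes=8:ppn=8        # 8 nodes x 8 cores = 64 MPI ranks
#PBS -l walltime=02:00:00    # wall-clock limit
#PBS -j oe                   # merge stdout and stderr

cd "$PBS_O_WORKDIR"          # start in the submission directory

# Launch one MPI rank per allocated core via OpenMPI.
mpirun -np 64 ./mufte_ug input.dat
```

Submission would then be `qsub job.sh`; the `nodes=8:ppn=8` request matches the 8 cores per node described above.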

For general information on the BW-Grid see the HLRS wiki pages.

Figure 1 shows a sketch of the BW-Grid at the HLRS (the system is 1.86 m high). The three rightmost racks belong to the LH2 (frontend with BladeCenters).

Figure 2 shows an overview sketch of the system.

Figure 3 shows the rear view of a single rack.

Figure 4 shows the front view of a BladeCenter. Each BladeCenter holds 14 blades, each blade has two processors, and each processor has 4 cores (quad-core).

Figure 5 shows the speedup of the cluster with increasing number of cores and increasing problem size (a MUFTE-UG 2p2cni CO2-storage example is used).
Note that the scaling behaviour is also software- and problem-dependent (here MUFTE-UG, CO2 storage).
One can see that the theoretical speedup (on x cores the job runs x times faster than a single-core job) differs from the real speedup: on 64 cores, for example, the job runs (only) 53 times faster (for 0.246 million unknowns). With increasing problem size, the speedup increases for the same number of cores. The number of nodes is the number of cores divided by 8 (all cores are used on each allocated node). For the theory on speedup calculations see Wikipedia.
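The speedup quoted above is simply the single-core runtime divided by the runtime on p cores. A minimal sketch; the serial time of 106 hours is an invented example, and only the 64-core speedup factor of 53 comes from the measurements described here:

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Parallel speedup S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

# Hypothetical timings, chosen so that the 64-core speedup matches the
# factor of 53 reported for the 0.246-million-unknown problem.
t1, t64 = 106.0, 2.0  # hours on 1 core and on 64 cores (illustrative)
print(speedup(t1, t64))  # 53.0
```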

Figure 6 shows the efficiency of the parallel computation.
For the theory on efficiency calculations see Wikipedia.
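Parallel efficiency is the speedup divided by the number of cores used, so a value of 1.0 means ideal scaling. For the 64-core run quoted above (speedup 53 at 0.246 million unknowns) this gives roughly 0.83:

```python
def efficiency(speedup: float, cores: int) -> float:
    """Parallel efficiency E(p) = S(p) / p (1.0 = ideal scaling)."""
    return speedup / cores

# 64-core speedup of 53 for the 0.246-million-unknown problem:
print(round(efficiency(53.0, 64), 3))  # 0.828
```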

Figure 7 shows the total CPU time used.
The statistics above are based on these numbers. Note that the problem with 0.63 million unknowns needs at least 4 cores to run (due to memory constraints). Therefore, the speedup and efficiency of a 4-core job with 0.63 million unknowns were assumed to equal the speedup and efficiency for the problem size of 0.246 million unknowns.
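Total CPU time is the wall-clock time multiplied by the number of allocated cores; comparing it across core counts makes the parallel overhead visible. A sketch using the same hypothetical timings as the speedup example (only the speedup factor of 53 is from the measurements):

```python
def cpu_time(cores: int, wall_time: float) -> float:
    """Total CPU time consumed = cores * wall-clock time."""
    return cores * wall_time

# Hypothetical 1-core and 64-core runs (106 h serial, 2 h on 64 cores,
# i.e. a speedup of 53 as reported for 0.246 million unknowns):
print(cpu_time(1, 106.0))  # 106.0 CPU-hours
print(cpu_time(64, 2.0))   # 128.0 CPU-hours: the extra 22 h is overhead
```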

For further questions please contact Michelle Hartnick or Andreas Kopp.