Omni XcalableMP Compiler
The Omni XcalableMP compiler is a source-to-source compiler that translates XMP/C or XMP/Fortran code into parallel code that uses the XcalableMP runtime library. The parallel code is then compiled by the machine's native compiler (e.g., Cray, PGI, Intel, or gcc). The Omni XcalableMP compiler supports most of the latest XcalableMP specification. For details on the implementation status, see docs/STATUS-XMP.md and docs/STATUS-CAF.md.
Performance
Benchmarks written in XcalableMP are available HERE.
The K computer [Excel][AI]
Hardware :
- CPU : SPARC64 VIIIfx 2.0 GHz, 8 Cores, 128 GFlops
- Memory : DDR3 SDRAM 16 GB, 64 GB/s
- Network : Torus fusion six-dimensional mesh/torus network, 5 GB/s x 10
- Omni XcalableMP Compiler : 0.9.0-alpha
- Compiler : Fujitsu C/Fortran Compiler K-1.2.0-15
- Library : Fujitsu MPI K-1.2.0-15, Fujitsu SSLII K-1.2.0-15, FFTE-6.0
IBM BlueGene/Q in KEK [Excel][AI]
Hardware :
- CPU : Power BQC 1.6 GHz, 16 Cores, 204.8 GFlops
- Memory : DDR3 SDRAM 16 GB, 42.6 GB/s
- Network : 5D torus topology + external link, each 2 GB/s send + 2 GB/s receive
- Omni XcalableMP Compiler : 0.9.0
- Compiler : IBM XL C++ Compiler, IBM XL Fortran Compiler
- Library : GASNet-1.22.4, FFTE-6.0
HITACHI SR16000 model M1 in KEK [Excel]
Hardware :
- CPU : POWER7 3.83 GHz, 32 Cores, 980.48 GFlops
- Memory : DDR3 SuperNOVA buffered DIMM 256 GB, 512 GB/s
- Network : 96 GB/s, two-way communication
- Omni XcalableMP Compiler : 0.9.0
- Compiler : IBM XL C++ Compiler, IBM XL Fortran Compiler
- Library : IBM ESSL for AIX, IBM Parallel ESSL for AIX, Netlib BLAS, FFTE-6.0
Benchmark | Performance on a single node |
---|---|
FFT | 19.45 GFlops |
HIMENO | 68.55 GFlops |
HPL | 406.88 GFlops |
STREAM | 249.36 GB/s |
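
As a rough reading of the table above, each result can be compared with the hardware peak listed for this machine. The sketch below assumes that the 980.48 GFlops and 512 GB/s figures are per-node peaks, and that STREAM is best compared against memory bandwidth rather than floating-point peak. The names `PEAK_GFLOPS`, `PEAK_GBS`, and `fraction_of_peak` are illustrative only and are not part of Omni XcalableMP.

```python
# Peaks copied from the hardware list above (assumed here to be per-node values).
PEAK_GFLOPS = 980.48
PEAK_GBS = 512.0

# Single-node results from the table above: (measured value, peak to compare against).
results = {
    "FFT": (19.45, PEAK_GFLOPS),
    "HIMENO": (68.55, PEAK_GFLOPS),
    "HPL": (406.88, PEAK_GFLOPS),
    "STREAM": (249.36, PEAK_GBS),
}


def fraction_of_peak(measured, peak):
    """Return the measured performance as a fraction of the given peak."""
    return measured / peak


fractions = {name: fraction_of_peak(v, p) for name, (v, p) in results.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of peak")
```

Under these assumptions, HPL reaches roughly 41% of the floating-point peak and STREAM roughly 49% of the memory bandwidth.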
HITACHI SR16000 model M1 (PLASMA SIMULATOR) in NIFS [Excel][AI]
Hardware :
- CPU : POWER7 3.83 GHz, 32 Cores, 980.48 GFlops
- Memory : DDR3 SuperNOVA buffered DIMM 128 GB, 512 GB/s
- Network : 96 GB/s, two-way communication
- Omni XcalableMP Compiler : 0.9.0
- Compiler : IBM XL C++ Compiler, IBM XL Fortran Compiler
- Library : IBM ESSL for AIX, IBM Parallel ESSL for AIX, Netlib BLAS, FFTE-6.0
Benchmark | Performance on a single node | Performance on two nodes |
---|---|---|
FFT | 5.49 GFlops | 11.13 GFlops |
HPL | 394.71 GFlops | 706.84 GFlops |
STREAM | 247.88 GB/s | 495.36 GB/s |
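
The one-node and two-node columns above can be turned into a speedup and a parallel efficiency. This is a minimal sketch that uses only the numbers in the table; the helper name `scaling` is hypothetical, and efficiency is defined here as speedup divided by the number of nodes.

```python
# (single-node, two-node) results copied from the table above.
results = {
    "FFT": (5.49, 11.13),       # GFlops
    "HPL": (394.71, 706.84),    # GFlops
    "STREAM": (247.88, 495.36), # GB/s
}


def scaling(one_node, multi_node, n_nodes=2):
    """Return (speedup, parallel efficiency) relative to the single-node result."""
    speedup = multi_node / one_node
    return speedup, speedup / n_nodes


scaling_table = {name: scaling(a, b) for name, (a, b) in results.items()}

for name, (speedup, efficiency) in scaling_table.items():
    print(f"{name}: speedup {speedup:.2f}x, efficiency {efficiency:.1%}")
```

By this measure FFT and STREAM scale almost linearly from one node to two, while HPL reaches a speedup of about 1.79.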