
Compilers and interpreters

Compilation consists of translating source code, written in a language that humans can understand, into a language that processors can execute.
There are many programming languages; we have installed the software for the languages our users work with. If necessary, we can add others.

1. Intel Software

Intel delivers a comprehensive software package, oneAPI, which includes compilers (Fortran, C, C++), an MPI implementation, other libraries for parallel computing, and efficient profiling tools. We install several versions each year. These products are accessible through modules whose names are of the form intel/<product>/20xx.version.number.

1.1. Intel Fortran Compiler

Since the introduction of the oneAPI bundle, the Fortran compiler is called ifx. The previous compiler, ifort, is still supported and in some cases (complex numbers, AVX512 usage, etc.) it may deliver better performance than ifx.
Here are suggested option sequences (most are compatible with the C/C++ compilers):

debugging: -O0 -g -traceback -check all -warn all -fpe0 -ftz -ftrapuv -debug all -fp-stack-check -init=arrays,snan
optimization: <ARCH> [-O2|-O3] -g -traceback -implicitnone -ftz -qopt-prefetch -unroll-aggressive
profiling (GNU gprof): optimization + -pg
analysis: optimization + -qopt-report=<level> -qopt-report-phase=<step>
vectorization: optimization (at least -O2) + -qopt-zmm-usage=high (only for -xCORE-AVX512)
vectorization/analysis: optimization + -qopt-report=5 -qopt-report-phase=vec
parallelization by OpenMP: optimization + -qopenmp
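
As a minimal sketch of how the sequences above combine, the following shell fragment stores them in variables and prints the resulting ifx command lines instead of running them. The file names (my_program.f90 and the output names), the helper name ifx_command, and the choice of -xCORE-AVX2 with -O2 are illustrative assumptions; adapt them to your code and target server.

```shell
#!/bin/sh
# Sketch only: assemble the suggested Intel option sequences into command lines.
# ARCH, the -O level and the file names are illustrative assumptions.
ARCH="-xCORE-AVX2"
DEBUG_FLAGS="-O0 -g -traceback -check all -warn all -fpe0 -ftz -ftrapuv -debug all -fp-stack-check -init=arrays,snan"
# -qopt-prefetch is the current spelling of the prefetch option.
OPT_FLAGS="$ARCH -O2 -g -traceback -implicitnone -ftz -qopt-prefetch -unroll-aggressive"

# Hypothetical helper: print a compile command (nothing is executed).
# $1: flag set, $2: output name, $3: source file
ifx_command() {
    printf 'ifx %s -o %s %s\n' "$1" "$2" "$3"
}

# Debugging build
ifx_command "$DEBUG_FLAGS" my_program_debug my_program.f90
# OpenMP build: optimization sequence + -qopenmp
ifx_command "$OPT_FLAGS -qopenmp" my_program_omp my_program.f90
```

Keeping the sequences in variables makes it easy to switch between a debugging build and an optimized build without retyping the options.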

<ARCH> designates options to adapt the executable program to the characteristics of the processor on which it will run: AVX, AVX2, etc. The GLiCID clusters use servers from different ranges, from several origins, and therefore with different characteristics.
<ARCH> can take values such as -xCORE-AVX2 or -xCORE-AVX512. The -xHost option optimizes for the processor on which the compilation takes place.

| GLiCID server | Processor | Option |
| --- | --- | --- |
| cnode | AMD EPYC Genoa 9474F, 3.3 GHz | -xCORE-AVX2 |
| Cloudbreak | AMD EPYC Rome 7282, 2.8 GHz | -xCORE-AVX2 |
| Cribbar | Intel Xeon Silver 4114, 2.2 GHz, Skylake | -xCORE-AVX2, -xCORE-AVX512 |
| Budbud | Intel Xeon E5 2640, 2.4 GHz, Broadwell | -xCORE-AVX2 |
| Nazare (and Jaws) | Intel Xeon E5 2630, 2.66 GHz, Broadwell | -xCORE-AVX2 |

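Be careful: compatibility is backward but not forward. An executable built for an older instruction set (for example AVX2) runs on newer processors, but an executable built for a newer instruction set (for example AVX512) will not run on processors that lack it.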
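
One way to avoid an "illegal instruction" crash is to check which instruction sets a node advertises before choosing <ARCH>. The sketch below reads the flags line of /proc/cpuinfo on Linux; the helper name cpu_has and its optional file argument are illustrative and not part of any GLiCID tooling.

```shell
#!/bin/sh
# Sketch: report whether the CPU advertises a given instruction-set flag.
# cpu_has is a hypothetical helper; the second argument defaults to
# /proc/cpuinfo (Linux) and mainly exists to make the function testable.
cpu_has() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # -w matches the flag as a whole word, so "avx" does not match "avx2".
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

for isa in avx avx2 avx512f; do
    if cpu_has "$isa"; then
        echo "$isa: supported on this node"
    else
        echo "$isa: not supported on this node"
    fi
done
```

Run it on the node type where the program will execute, since login nodes and compute nodes may have different processors.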

1.2. Intel C Compiler

The C compiler is called icx (the previous name was icc). Most of the Intel Fortran compiler options also work with icc and icx.

1.3. Intel C++ Compiler

The C++ compiler is called icpx (the previous name was icpc). Most of the Intel Fortran compiler options also work with icpc and icpx.

2. GNU Software

GNU compilers are accessible through modules whose name is of the form gcc/x.y.z, where x.y.z is the version number.

2.1. GNU Fortran Compiler

The Fortran compiler is called gfortran.
Here are suggested option sequences:

debugging: -O0 -g -fbacktrace -fimplicit-none -fcheck=all -ffpe-trap=invalid,zero,overflow,underflow -Wall
optimization: <ARCH> -O3 -g -fbacktrace -fprefetch-loop-arrays
profiling (GNU gprof)/analysis: optimization + -pg
vectorization: optimization (at least -O2) + -mavx (for AVX), -mavx2 (for AVX2), or -mavx512f -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512vl (for AVX512)
vectorization/analysis: optimization + -fopt-info-vec-all
parallelization by OpenMP: optimization + -fopenmp

<ARCH> designates options to adapt the executable program to the characteristics of the processor on which it will run: AVX, AVX2, etc. The GLiCID clusters use servers from different ranges, from several origins, and therefore with different characteristics.
<ARCH> is split into two options, -march=<arg> and -mtune=<arg>, where <arg> can take values such as znver2 or core-avx2. The native argument optimizes for the processor on which the compilation takes place.
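
To see which architecture the native argument resolves to on a given node, GCC can print the target settings in effect. The sketch below wraps that query; show_march is a hypothetical helper name, and the exact output format depends on the GCC version.

```shell
#!/bin/sh
# Sketch: print the -march/-mtune values that GCC selects for -march=native.
# show_march is a hypothetical helper; $1 is the compiler driver (default: gcc).
show_march() {
    cc=${1:-gcc}
    if ! command -v "$cc" >/dev/null 2>&1; then
        echo "compiler not found: $cc (load a gcc module first)" >&2
        return 1
    fi
    # -Q --help=target lists the target options in effect for the given flags.
    "$cc" -march=native -Q --help=target | grep -E -- '-(march|mtune)='
}

# Do not abort the script if the compiler is missing or nothing matches.
show_march gcc || true
```

Because native reflects the machine doing the compilation, the result is only meaningful when the command runs on the same kind of node as the one that will execute the program.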

| GLiCID server | Processor | Option |
| --- | --- | --- |
| cnode | AMD EPYC Genoa 9474F, 3.3 GHz | -march=znver4 -mtune=znver4 -mfma -mavx512f |
| Cloudbreak | AMD EPYC Rome 7282, 2.8 GHz | -march=znver2 -mtune=znver2 -mfma -mavx2 |
| Cribbar | Intel Xeon Silver 4114, 2.2 GHz, Skylake | -march=skylake -mtune=skylake |
| Budbud | Intel Xeon E5 2640, 2.4 GHz, Broadwell | -march=broadwell -mtune=broadwell |
| Nazare (and Jaws) | Intel Xeon E5 2630, 2.66 GHz, Broadwell | -march=core-avx2 -mtune=core-avx2 |

2.2. GNU C Compiler

The C compiler is called gcc.

2.3. GNU C++ Compiler

The C++ compiler is called g++.

3. NVIDIA software

NVIDIA programs and libraries (e.g. CUDA) are accessible through modules whose names are of the form nvhpc/xx.y, where xx.y is the version number.

3.1. NVIDIA (ex-PGI) Fortran Compiler

The compiler is now called nvfortran (previous name pgf90).
Here are suggested option sequences:

debugging: -O0 -g -traceback -Minfo=all -Mchkfpstk -Mchkstk -Mdalign -Mdclchk -Mdepchk -Miomutex -Mrecursive -Msave -Ktrap=fp
optimization: <ARCH> -O3 -g -traceback -Mvect -Mcache_align -Mprefetch -Munroll
profiling/analysis: PGI provides its own profiling tool, pgprof
parallelization by OpenMP: optimization + -mp

<ARCH> designates an option to adapt the executable program to the characteristics of the processor on which it will run: AVX, AVX2, etc. The GLiCID clusters use servers from different ranges, from several origins, and therefore with different characteristics.
<ARCH> is written in the form -tp=<arg>, where <arg> can take values such as nehalem-64, ivybridge-64, or skylake-64. The native argument optimizes for the processor on which the compilation takes place.

| CCIPL server | Processor | Option |
| --- | --- | --- |
| Cloudbreak | AMD EPYC Rome 7282, 2.8 GHz | -tp=zen |
| Cribbar | Intel Xeon Silver 4114, 2.2 GHz, Skylake | -tp=skylake-64 |
| Budbud | Intel Xeon E5 2640, 2.4 GHz, Broadwell | -tp=broadwell-64 |
| Nazare (and Jaws) | Intel Xeon E5 2630, 2.66 GHz, Broadwell | -tp=broadwell-64 |
| Chezine | Intel Xeon 5650, 2.66 GHz, Westmere EP | -tp=nehalem-64 |
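
The server list above can be turned into a small lookup so that build scripts pick the -tp value from the target server name. This is a sketch: the helper name nv_tp_flag, the lowercase server keys, and the file names are illustrative assumptions, while the option values are copied from the list.

```shell
#!/bin/sh
# Sketch: map a server name to the -tp option listed for it.
# nv_tp_flag is a hypothetical helper; keys are lowercase server names.
nv_tp_flag() {
    case $1 in
        cloudbreak)   echo "-tp=zen" ;;
        cribbar)      echo "-tp=skylake-64" ;;
        budbud)       echo "-tp=broadwell-64" ;;
        nazare|jaws)  echo "-tp=broadwell-64" ;;
        chezine)      echo "-tp=nehalem-64" ;;
        *)            echo "unknown server: $1" >&2; return 1 ;;
    esac
}

# Example: print (not run) an optimized nvfortran command line for Cribbar.
ARCH=$(nv_tp_flag cribbar) &&
    echo "nvfortran $ARCH -O3 -g -traceback -Mvect -Mcache_align -Mprefetch -Munroll -o my_program my_program.f90"
```

Returning a non-zero status for unknown names lets a build script stop early instead of silently compiling without an architecture option.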

3.2. NVIDIA (PGI) C Compiler

The NVIDIA C compiler is now called nvc (the previous name was pgcc).

3.3. NVIDIA (PGI) C++ Compiler

The NVIDIA C++ compiler is now called nvc++ (the previous name was pgc++).

3.4. NVIDIA CUDA Compiler

The NVIDIA CUDA compiler is called nvcc.

4. Python Software

Python 3 is installed on the cluster; Python 2 is no longer supported and should be avoided. Python is accessible through modules whose names are of the form python/x.y.z, where x.y.z denotes the version.
Some commonly used additional libraries (such as numpy and scipy) are installed in the distribution tree. For specific needs, it is recommended to use a Guix installation to add particular versions of packages.
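
Before adding packages with Guix, it can be worth checking whether a library is already provided by the loaded Python module. The helper below is a sketch; py_has is a hypothetical name, and it relies only on the standard importlib.util.find_spec function.

```shell
#!/bin/sh
# Sketch: test whether a Python library is importable with the current python3.
# py_has is a hypothetical helper name; $1 is a top-level module name.
py_has() {
    python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

for lib in numpy scipy; do
    if py_has "$lib"; then
        echo "$lib: available"
    else
        echo "$lib: not found, consider a Guix installation"
    fi
done
```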

5. R Software

R is installed on the cluster, and two sets of additional packages have been installed: compiled with the Intel MKL scientific library, or without it.
R is accessible through modules whose names are of the form R-project/x.y.z_gnu_mkl, where x.y.z designates the version of R.

6. Julia Software

Julia is installed on the cluster and is accessible through modules whose names are of the form julia/x.y.z, where x.y.z designates the version.

7. Pre-compiled binary programs and guix

Whenever possible, we recommend recompiling the software (or using its Guix version, which amounts to the same thing) rather than using pre-compiled binaries whose build options and dependencies are unknown to us.

In the context of Guix, it is possible to run such a binary natively if the library names are compatible.

This procedure is at your own risk and may malfunction spectacularly (segfault or otherwise). There is no guarantee of proper functioning.

The required libraries must be installed in the execution context (the general user profile, or a dedicated profile with guix shell). For example, to run software that requires libstdc++ and libz, install these libraries AND include them in the search paths:

guix install gcc:lib zlib   # to do only once
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$LIBRARY_PATH   # to do before each launch of the software
./mon-logiciel

Or to better compartmentalize:

guix shell -CF bash coreutils gcc:lib zlib
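
To find out which libraries a pre-compiled binary expects, ldd lists its shared-library dependencies and marks the unresolved ones as "not found". The sketch below filters that output; missing_libs is a hypothetical helper name, and ./mon-logiciel stands for the binary to inspect.

```shell
#!/bin/sh
# Sketch: list the shared libraries that ldd cannot resolve.
# missing_libs is a hypothetical helper; it reads ldd output on stdin.
missing_libs() {
    # ldd prints unresolved entries as "libfoo.so.1 => not found".
    awk '/=> not found/ { print $1 }'
}

# Typical use: inspect the binary only if it exists and is executable.
if [ -x ./mon-logiciel ]; then
    ldd ./mon-logiciel | missing_libs
fi
```

An empty result means every dependency was resolved in the current environment; each name printed is a library that still has to be installed and added to the search paths.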

In more complicated cases, it is also possible to use podman or apptainer and run the binary in an execution context that matches what it expects.
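
As an illustration of the container route, the commands for both tools might look like the ones printed by the sketch below. The image name is an assumption (any image that provides the libraries the binary expects would do), container_cmd is a hypothetical helper, and the function only prints the command so that nothing is pulled or executed.

```shell
#!/bin/sh
# Sketch: print a container command for running a binary from the current directory.
# container_cmd is a hypothetical helper; the image name is an assumption.
# $1: tool (apptainer or podman), $2: image reference, $3: binary to run
container_cmd() {
    tool=$1
    image=$2
    binary=$3
    case $tool in
        apptainer) echo "apptainer exec docker://$image $binary" ;;
        podman)    echo "podman run --rm -v \"\$PWD\":/work -w /work $image $binary" ;;
        *)         echo "unsupported tool: $tool" >&2; return 1 ;;
    esac
}

container_cmd apptainer docker.io/library/debian:stable ./mon-logiciel
container_cmd podman docker.io/library/debian:stable ./mon-logiciel
```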