1. Submission Script Examples
1.1. MPI small example
1.1.1. Fortran
1.1.1.1. Compilation with Intel compiler / MPI library
module load intel/compiler/2023.1.0 intel/mpi/2021.9.0
mpiifort -fc=ifx -O2 -traceback main_mpi.f90 -o a_impi.out
1.1.1.2. Compilation with GNU / OpenMPI library
module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
mpif90 -O2 -g -fbacktrace main_mpi.f90 -o a_ompi.out
1.1.1.3. Fortran source file main_mpi.f90:
PROGRAM main_mpi
   USE mpi
   IMPLICIT NONE
   INTEGER :: ier
   INTEGER :: iMPI_Rank
   INTEGER :: iMPI_Size
   !
   INTEGER :: ih, il
   INTEGER :: iver, isubver
   !
   CHARACTER(MPI_MAX_PROCESSOR_NAME)         :: host_name
   CHARACTER(MPI_MAX_LIBRARY_VERSION_STRING) :: lib_version
   !
   CALL MPI_Init (ier)
   CALL MPI_Comm_Size (MPI_COMM_WORLD, iMPI_Size, ier)
   CALL MPI_Comm_Rank (MPI_COMM_WORLD, iMPI_Rank, ier)
   !
   ! Get the host name
   CALL MPI_Get_Processor_Name (host_name, ih, ier)
   !
   ! Get the version of the MPI library
   CALL MPI_Get_Library_Version (lib_version, il, ier)
   !
   IF (iMPI_Rank == 0) WRITE (6,'(A)') lib_version(1:il)
   !
   ! Get the version of the MPI standard
   CALL MPI_Get_Version (iver, isubver, ier)
   !
   WRITE (6,'(2(I4,1x),A,2x,I1,A1,I1)') iMPI_Rank, iMPI_Size, host_name(1:ih), iver, '.', isubver
   CALL MPI_Barrier (MPI_COMM_WORLD, ier)
   IF (iMPI_Rank == 0) THEN
      WRITE (6,'(A)') '====================================='
      CALL Flush (6)
   END IF
   !
   CALL MPI_Finalize (ier)
   !
   STOP
END PROGRAM main_mpi
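A quick interactive run is enough to check the build. A minimal sketch, assuming interactive srun submissions are allowed on the cluster, and reusing the I_MPI_PMI_LIBRARY setting shown in the batch scripts further down:

module load intel/compiler/2023.1.0 intel/mpi/2021.9.0
# So that Slurm and Intel MPI work together (same setting as in the batch scripts)
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
# Run 4 MPI ranks on one node; each rank prints its rank, size, host name and MPI version
srun -N 1 -n 4 --qos=short --time=00:05:00 ./a_impi.out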
1.1.2. C
1.1.2.1. Compilation with Intel compiler / MPI library
module load intel/compiler/2023.1.0 intel/mpi/2021.9.0
mpiicc -cc=icx -O2 -traceback main_mpi.c -o a_impi.out
1.1.2.2. Compilation with GNU / OpenMPI library
module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
mpicc -O2 -g main_mpi.c -o a_ompi.out
1.1.2.3. C source file main_mpi.c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}
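As with the Fortran example, a short interactive run verifies the binary, here with the OpenMPI build and again assuming interactive srun submissions are permitted:

module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
# 4 ranks spread over 2 nodes, to also exercise inter-node communication
srun -N 2 -n 4 --qos=short --time=00:05:00 ./a_ompi.out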
1.1.3. Intel MPI Multi-Node MPI Parallel Script
For example, here is how to launch a parallel code on 96 MPI processes, which therefore uses 4 cnode nodes (24 processes per node).
#!/bin/bash
#
# Name of the job
#SBATCH -J my_mpi_job
#
# Number of nodes
#SBATCH --nodes=4
#
# Number of MPI processes per node
#SBATCH --ntasks-per-node=24
#
# Memory per CPU
#SBATCH --mem-per-cpu=10g
#
# Priority
#SBATCH --qos=short
#
# No sharing of the nodes
#SBATCH --exclusive
#
# Wall clock limit
#SBATCH --time=01:00:00
#
# Mail address of the user
#SBATCH --mail-user=my.mail@domain.fr
#
# Event notification
#SBATCH --mail-type=all
#
# Standard output file
#SBATCH -o job_mpi-%j.out

module purge
module load intel/compiler/2023.1.0
module load intel/mpi/2021.9.0
#
# Go to the submission directory
cd $SLURM_SUBMIT_DIR
#
# So that Slurm and Intel MPI work together
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so

time srun ./my_mpi_app > ./output_`printf %03d ${SLURM_NTASKS}`-${SLURM_JOB_ID}.txt 2>&1
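Assuming the script above is saved as job_mpi.slurm (the file name is arbitrary), it is submitted and monitored with the standard Slurm commands:

sbatch job_mpi.slurm     # submit the job; sbatch prints the job ID
squeue -u $USER          # check the state of your pending and running jobs
scancel JOBID            # cancel the job if needed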
1.1.4. OpenMPI multi-node MPI parallel script
#!/bin/bash
#
# Name of the job
#SBATCH -J my_mpi_job
#
# Number of nodes
#SBATCH --nodes=4
#
# Number of MPI processes per node
#SBATCH --ntasks-per-node=24
#
# Priority
#SBATCH --qos=short
#
# No sharing of the nodes
#SBATCH --exclusive
#
# Wall clock limit
#SBATCH --time=01:00:00
#
# Mail address of the user
#SBATCH --mail-user=my.mail@domain.fr
#
# Event notification
#SBATCH --mail-type=all
#
# Standard output file
#SBATCH -o job_mpi-%j.out

module purge
module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
#
# Go to the submission directory
cd $SLURM_SUBMIT_DIR

time srun ./my_mpi_app > ./output_ompi_`printf %03d ${SLURM_NTASKS}`_${SLURM_JOB_ID}.txt 2>&1
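If srun fails to bootstrap the OpenMPI processes, the PMI flavour used by Slurm may need to be selected explicitly. Which plugins are available depends on how Slurm was built locally, so check first:

# List the PMI plugins known to this Slurm installation
srun --mpi=list
# If PMIx is available, it can be requested explicitly on the srun line
srun --mpi=pmix ./my_mpi_app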
1.2. Hybrid MPI-OpenMP small example
1.2.1. Fortran
1.2.1.1. Compilation with Intel compiler / Intel MPI library
module load intel/compiler/2023.1.0 intel/mpi/2021.9.0
mpiifort -fc=ifx -O2 -qopenmp -traceback main_hyb.f90 -o a_ihyb.out
1.2.1.2. Compilation with GNU / OpenMPI Library
module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
mpif90 -O2 -g -fbacktrace -fopenmp main_hyb.f90 -o a_ohyb.out
1.2.1.3. Fortran source file main_hyb.f90
PROGRAM main_hyb
   USE mpi
   !$ USE omp_lib
   IMPLICIT NONE
   INTEGER :: ier
   INTEGER :: iMPI_MyRank, iOMP_MyRank
   INTEGER :: iMPI_Size, iOMP_NbThds
   !
   INTEGER :: ih, il
   INTEGER :: iver, isubver, iProvided
   !
   CHARACTER(MPI_MAX_PROCESSOR_NAME)         :: host_name
   CHARACTER(MPI_MAX_LIBRARY_VERSION_STRING) :: lib_version
   !
   CALL MPI_Init_Thread (MPI_THREAD_MULTIPLE, iProvided, ier)
   CALL MPI_Comm_Size (MPI_COMM_WORLD, iMPI_Size, ier)
   CALL MPI_Comm_Rank (MPI_COMM_WORLD, iMPI_MyRank, ier)
   !
   ! Get the host name
   CALL MPI_Get_Processor_Name (host_name, ih, ier)
   !
   ! Get the version of the MPI library
   CALL MPI_Get_Library_Version (lib_version, il, ier)
   !
   IF (iMPI_MyRank == 0) WRITE (6,'(A)') lib_version(1:il)
   !
   ! Get the version of the MPI standard
   CALL MPI_Get_Version (iver, isubver, ier)
   !
   WRITE (6,'(2(I4,1x),A,2x,I1,A1,I1)') iMPI_MyRank, iMPI_Size, host_name(1:ih), iver, '.', isubver
   CALL MPI_Barrier (MPI_COMM_WORLD, ier)
   IF (iMPI_MyRank == 0) THEN
      WRITE (6,'(A)') '====================================='
      CALL Flush (6)
   END IF
   !
   !$OMP PARALLEL DEFAULT(NONE)              &
   !$OMP SHARED (iMPI_MyRank)                &
   !$OMP PRIVATE(iOMP_MyRank, iOMP_NbThds)
   !
   !$ iOMP_MyRank = OMP_GET_THREAD_NUM()
   !$ iOMP_NbThds = OMP_GET_NUM_THREADS()
   !
   !$OMP MASTER
   !$ IF (iMPI_MyRank == 0) THEN
   !$    WRITE (6,'(//,A,I3,//)') 'Number of OpenMP threads per MPI task : ', iOMP_NbThds
   !$    WRITE (6,'(A)') '====================================='
   !$ END IF
   !$OMP END MASTER
   !
   !$OMP END PARALLEL
   !
   CALL MPI_Finalize (ier)
   !
   STOP
END PROGRAM main_hyb
1.2.2. C
1.2.2.1. Compilation with Intel compiler / Intel MPI library
module load intel/compiler/2023.1.0 intel/mpi/2021.9.0
mpiicc -cc=icx -O2 -qopenmp -traceback main_hyb.c -o a_ihyb.out
1.2.2.2. Compilation with GNU / OpenMPI Library
module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
mpicc -O2 -g -fopenmp main_hyb.c -o a_ohyb.out
1.2.2.3. C source file main_hyb.c
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char *argv[]) {
    int numprocs, rank, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int iam = 0, np = 1;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(processor_name, &namelen);

    #pragma omp parallel default(shared) private(iam, np)
    {
        np = omp_get_num_threads();
        iam = omp_get_thread_num();
        printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
               iam, np, rank, numprocs, processor_name);
    }

    MPI_Finalize();
}
1.2.3. Multi-node hybrid parallel scripting with Intel MPI
For example, here is how to launch a hybrid MPI/OpenMP code on 384 cores, i.e. 4 nodes: 48 MPI processes (12 per node) with 8 OpenMP threads per MPI process.
#!/bin/bash
#
# Name of the job
#SBATCH -J my_hybrid_job
#
# Number of nodes
#SBATCH --nodes=4
#
# Number of MPI processes per node
#SBATCH --ntasks-per-node=12
#
# Number of cores per MPI process
#SBATCH --cpus-per-task=8
#
# Priority
#SBATCH --qos=short
#
# No sharing of the nodes
#SBATCH --exclusive
#
# Wall clock limit
#SBATCH --time=01:00:00
#
# Mail address of the user
#SBATCH --mail-user=my.mail@domain.fr
#
# Event notification
#SBATCH --mail-type=all
#
# Standard output file
#SBATCH -o job_hyb-%j.out
#
module purge
module load intel/compiler/2023.1.0 intel/mpi/2021.9.0
#
# Go to the submission directory
cd $SLURM_SUBMIT_DIR
#
# So that Slurm and Intel MPI work together
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
#
# Set the number of OpenMP threads to the number of cores per MPI task
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

time srun ./my_hyb_app > ./output_impi-`printf %03d ${SLURM_NTASKS}`_omp-`printf %02d ${SLURM_CPUS_PER_TASK}`_${SLURM_JOB_ID}.txt 2>&1
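For hybrid runs, thread placement can matter as much as thread count. The standard OpenMP affinity variables below can be exported next to OMP_NUM_THREADS; whether they help depends on the node topology, so treat them as a starting point to benchmark rather than a recommendation:

# Pin each OpenMP thread to one core and keep the threads of a task close together
export OMP_PLACES=cores
export OMP_PROC_BIND=close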
1.2.4. Multi-node hybrid parallel scripting with OpenMPI
As above, here is how to launch a hybrid MPI/OpenMP code on 384 cores, i.e. 4 nodes: 48 MPI processes (12 per node) with 8 OpenMP threads per MPI process.
#!/bin/bash
#
# Name of the job
#SBATCH -J my_hybrid_job
#
# Number of nodes
#SBATCH --nodes=4
#
# Number of MPI processes per node
#SBATCH --ntasks-per-node=12
#
# Number of cores per MPI process
#SBATCH --cpus-per-task=8
#
# Priority
#SBATCH --qos=short
#
# No sharing of the nodes
#SBATCH --exclusive
#
# Wall clock limit
#SBATCH --time=01:00:00
#
# Mail address of the user
#SBATCH --mail-user=my.mail@domain.fr
#
# Event notification
#SBATCH --mail-type=all
#
# Standard output file
#SBATCH -o job_hyb-%j.out
#
module purge
module load gcc/13.1.0 openmpi/ucx/4.1.5_gcc_8.5.0_ucx_1.14.1_rdma_46.0
#
# Go to the submission directory
cd $SLURM_SUBMIT_DIR
#
# Set the number of OpenMP threads to the number of cores per MPI task
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

time srun ./my_hyb_app > ./output_ompi-`printf %03d ${SLURM_NTASKS}`_omp-`printf %02d ${SLURM_CPUS_PER_TASK}`_${SLURM_JOB_ID}.txt 2>&1
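To check how Slurm actually binds tasks to cores, the binding report can be enabled on the srun line; --cpu-bind=verbose is a standard srun option, though its output format varies between Slurm versions:

# Print the CPU mask chosen for each task before the application starts
time srun --cpu-bind=verbose ./my_hyb_app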
1.3. OpenMP small example
1.3.1. Fortran
1.3.1.1. Compilation with Intel compiler
module load intel/compiler/2023.1.0
ifx -O2 -qopenmp -traceback main_omp.f90 -o a_iomp.out
1.3.1.2. Compilation with GNU
module load gcc/13.1.0
gfortran -O2 -g -fbacktrace -fopenmp main_omp.f90 -o a_gomp.out
1.3.1.3. Fortran source file main_omp.f90
PROGRAM main_omp
   !$ USE omp_lib
   IMPLICIT NONE
   INTEGER :: iOMP_MyRank, iOMP_NbThds
   !
   !$OMP PARALLEL DEFAULT(NONE)              &
   !$OMP PRIVATE (iOMP_NbThds, iOMP_MyRank)
   !
   !$ iOMP_MyRank = OMP_GET_THREAD_NUM()
   !$ iOMP_NbThds = OMP_GET_NUM_THREADS()
   !
   !$ WRITE (6,'(A,I3)') 'Hello from OpenMP thread : ', iOMP_MyRank
   !$OMP MASTER
   !$ WRITE (6,'(//,A,I3,//)') 'Number of OpenMP threads : ', iOMP_NbThds
   !$ WRITE (6,'(A)') '====================================='
   !$OMP END MASTER
   !
   !$OMP END PARALLEL
   !
   STOP
END PROGRAM main_omp
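To see which settings the OpenMP runtime actually picked up, the standard OMP_DISPLAY_ENV variable (OpenMP 4.0, supported by both compilers used here) can be set before a run:

export OMP_DISPLAY_ENV=true    # the runtime prints its OpenMP settings at startup
export OMP_NUM_THREADS=8
./a_iomp.out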
1.3.2. C
1.3.2.1. Compilation with Intel compiler
module load intel/compiler/2023.1.0
icx -O2 -qopenmp -traceback main_omp.c -o a_iomp.out
1.3.2.2. Compilation with GNU compiler
module load gcc/13.1.0
gcc -O2 -g -fopenmp main_omp.c -o a_gomp.out
1.3.2.3. C source file main_omp.c
#include <stdio.h>
#include <omp.h>

int main(int argc, char *argv[]) {
    int iam = 0, np = 1;

    #pragma omp parallel default(shared) private(iam, np)
    {
        np = omp_get_num_threads();
        iam = omp_get_thread_num();
        printf("Hello from thread %d out of %d OpenMP threads\n", iam, np);
    }
}
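The OpenMP binaries can be exercised directly before being wrapped in a batch script, assuming a small interactive test is acceptable on the front-end node:

export OMP_NUM_THREADS=4
./a_gomp.out      # prints one "Hello" line per thread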
1.3.3. Intel intra-node parallel scripting
This script is used for shared-memory parallel computations, for example with OpenMP or pthreads.
#!/bin/bash
#
# Name of the job
#SBATCH -J my_openmp_job
#
# Number of nodes
#SBATCH --nodes=1
#
# Number of tasks per node (only one task)
#SBATCH --ntasks-per-node=1
#
# Number of cores per task (as many cores as OpenMP threads)
#SBATCH --cpus-per-task=32
#
# Priority
#SBATCH --qos=short
#
# No sharing of the nodes
#SBATCH --exclusive
#
# Wall clock limit
#SBATCH --time=01:00:00
#
# Mail address of the user
#SBATCH --mail-user=my.mail@domain.fr
#
# Event notification
#SBATCH --mail-type=all
#
# Standard output file
#SBATCH -o job_omp-%j.out
#
module purge
module load intel/compiler/2023.1.0
#
# Go to the submission directory
cd $SLURM_SUBMIT_DIR
#
# Set the number of OpenMP threads to the number of allocated cores
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

time ./my_openmp_app > output_`printf %02d ${OMP_NUM_THREADS}`-${SLURM_JOB_ID} 2>&1
1.3.4. GNU intra-node parallel scripting
#!/bin/bash
#
# Name of the job
#SBATCH -J my_openmp_job
#
# Number of nodes
#SBATCH --nodes=1
#
# Number of tasks per node (only one task)
#SBATCH --ntasks-per-node=1
#
# Number of cores per task (as many cores as OpenMP threads)
#SBATCH --cpus-per-task=32
#
# Priority
#SBATCH --qos=short
#
# No sharing of the nodes
#SBATCH --exclusive
#
# Wall clock limit
#SBATCH --time=01:00:00
#
# Mail address of the user
#SBATCH --mail-user=my.mail@domain.fr
#
# Event notification
#SBATCH --mail-type=all
#
# Standard output file
#SBATCH -o job_omp-%j.out
#
module purge
module load gcc/13.1.0
#
# Go to the submission directory
cd $SLURM_SUBMIT_DIR
#
# Set the number of OpenMP threads to the number of allocated cores
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

time ./my_openmp_app > output_`printf %02d ${OMP_NUM_THREADS}`-${SLURM_JOB_ID} 2>&1
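Once a batch job has finished, Slurm accounting can confirm that the allocated cores were actually used; sacct is standard, but the exact fields available depend on the site's accounting configuration:

# Elapsed wall time and consumed CPU time of a finished job (replace JOBID)
sacct -j JOBID --format=JobID,JobName,AllocCPUS,Elapsed,TotalCPU,State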