   Slurm constraints

1. Using constraints

To make it easy to launch jobs from any front end, we have set up constraints. They let you target the nodes you want, which is especially useful when you are not logged in on the target cluster. To use them, add the Slurm option --constraint=<constraint_name> to your submission command.
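As an illustration, a batch script that requests constrained nodes could look like the minimal sketch below. The job name, task count and time limit are placeholder values, not site recommendations. The constraint names cpu_zen4 and net_ib come from the lists on this page. Slurm combines several constraints with & (AND) or | (OR).

```shell
#!/bin/bash
# Minimal sketch of a batch script using a constraint.
# Job name, task count and time limit are placeholders (assumptions).
#SBATCH --job-name=constraint-demo
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
# Only run on nodes that have both features ("&" means AND, "|" means OR).
#SBATCH --constraint="cpu_zen4&net_ib"

# Report which node the job landed on.
node_name=$(uname -n)
echo "Running on ${node_name}"
```

The same expression can be given on the command line, where it must be quoted so that the shell does not interpret the & character, for example sbatch --constraint="cpu_zen4&net_ib" job.sh (job.sh being a hypothetical script name).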

The available Slurm constraints are listed below. The list may change as the hardware installed in the clusters evolves.

To see which constraints are set on a specific node, use the scontrol show node command and look at the ActiveFeatures field.

Example: to list the constraints of gnode2 on the nautilus cluster:

 scontrol --cluster=nautilus show node gnode2 |grep ActiveFeatures

Result:

  ActiveFeatures=loc_ecn,cpu_amd,cpu_zen4,cpu_genoa,cpu_9474,net_ib,net_100g,gpu_a100,gpu_a100_80
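The ActiveFeatures line can also be checked from a script. The sketch below splits it into one feature per line and tests for a given feature. The function names list_features and has_feature are hypothetical helpers, and the sample line is the output shown above, so the block runs without access to Slurm.

```shell
#!/bin/bash
# Print the features found on an "ActiveFeatures=" line, one per line.
# Reads scontrol output on stdin. (Hypothetical helper name.)
list_features() {
    sed -n 's/^.*ActiveFeatures=//p' | tr ',' '\n'
}

# Succeed if the feature given as $1 appears in the scontrol output on stdin.
# -x matches whole lines only, -F treats the name as a fixed string.
has_feature() {
    list_features | grep -qxF -- "$1"
}

# Sample line copied from the gnode2 output above.
sample='  ActiveFeatures=loc_ecn,cpu_amd,cpu_zen4,cpu_genoa,cpu_9474,net_ib,net_100g,gpu_a100,gpu_a100_80'

printf '%s\n' "$sample" | list_features
if printf '%s\n' "$sample" | has_feature gpu_a100; then
    echo "gnode2 has gpu_a100"
fi
```

On a front end, the sample line would be replaced by the real command, for example scontrol --cluster=nautilus show node gnode2 | has_feature gpu_a100.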

1.1. Location of equipment

Lets you select the nodes hosted in a specific machine room. This is useful when the resources you need are mainly available at one location.

| constraint_name | meaning |
|---|---|
| loc_maths | historic room of CCIPL |
| loc_dc | Nantes University data center |
| loc_ecn | machine room of the Central School |

1.1.1. CPU type

Lets you target processors either by brand or by their characteristics.

1.1.1.1. Brand

| constraint_name | meaning |
|---|---|
| cpu_intel | Intel-brand CPU |
| cpu_amd | AMD-brand CPU |

1.1.1.2. Extensions

| constraint_name | meaning |
|---|---|
| cpu_avx | AVX extensions are required |
| cpu_avx2 or cpu_avx256 | AVX2 extensions are required |
| cpu_avx512 | AVX-512 extensions are required |

1.1.1.3. Generations

| constraint_name | meaning |
|---|---|
| cpu_westmere, cpu_X5650 | Intel, Westmere generation |
| cpu_broadwell, cpu_e5_2530v4 or cpu_e5_2640v4 | Intel, Broadwell generation |
| cpu_skylake, cpu_silver_4114 | Intel, Skylake generation |
| cpu_cascadelake, cpu_silver_4210 or cpu_silver_4210r or cpu_silver_4216 | Intel, Cascade Lake generation |
| cpu_zen2, cpu_rome, cpu_7282 | AMD Zen 2 |
| cpu_zen3, cpu_milan, cpu_7213 | AMD Zen 3 |

1.1.2. Fast interconnect network

Three types of network are available:

  • InfiniBand (IB)

  • Omni-Path (OPA)

  • RoCE

These three networks are incompatible with each other. For multi-node jobs, it is important to target one particular network type. We recommend using a general-purpose MPI implementation that can drive any of these networks efficiently, such as OpenMPI.

1.1.3. Network type

| constraint_name | meaning |
|---|---|
| net_ib | InfiniBand |
| net_opa | Omni-Path |
| net_roce | RoCE |
| net_dr | dual rail (whatever the technology) |
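Since the network types are incompatible with each other, a multi-node job should stay on a single one. One way to do this without choosing the type in advance is Slurm's "matching OR" syntax: options enclosed in square brackets and separated by | ask that all allocated nodes share the same one of the listed features. The sketch below only builds such an expression and prints the resulting command. The function name same_network_constraint and the script name job.sh are hypothetical, and whether this form suits the local scheduler configuration is an assumption to verify.

```shell
#!/bin/bash
# Build a Slurm "matching OR" expression such as [a|b|c]:
# every allocated node must carry the same one of the listed features.
# same_network_constraint is a hypothetical helper name.
same_network_constraint() {
    local IFS='|'
    # "$*" joins the arguments with the first character of IFS.
    printf '[%s]\n' "$*"
}

constraint=$(same_network_constraint net_ib net_opa net_roce)

# Print the submission command instead of running it (job.sh is hypothetical).
echo "sbatch --nodes=2 --constraint=\"${constraint}\" job.sh"
```

To pin the job to one known network type instead, a plain --constraint=net_ib (or net_opa, net_roce) is enough.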

1.1.4. Interconnect speed

| constraint_name | meaning |
|---|---|
| net_25g | 25 Gbit/s (RoCE) |
| net_40g | 40 Gbit/s (InfiniBand QDR) |
| net_50g | 50 Gbit/s (RoCE in dual-rail) |
| net_100g | 100 Gbit/s (Omni-Path 100 or RoCE 100) |

1.1.5. GPU

For GPUs, requesting resources through GRES can also be useful and lets you be more precise.

| constraint_name | meaning |
|---|---|
| gpu_k40 | Nvidia K40 |
| gpu_k80 | Nvidia K80 |
| gpu_p100 | Nvidia P100 |
| gpu_t4 | Nvidia T4 |
| gpu_a40 | Nvidia A40 |
| gpu_a100 | Nvidia A100 |
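A GPU constraint and a GRES request can be combined: the GRES request asks for a number of devices, while the constraint selects the GPU model. The batch script below is a sketch under assumptions: the GRES name gpu, the job name and the time limit are placeholders, and the exact GRES names configured on each cluster should be checked before use.

```shell
#!/bin/bash
# Sketch of a GPU batch script combining GRES and a constraint.
# The GRES name "gpu", the job name and the time limit are assumptions.
#SBATCH --job-name=gpu-demo
#SBATCH --gres=gpu:1
#SBATCH --constraint=gpu_a100
#SBATCH --time=00:10:00

# Slurm usually exports CUDA_VISIBLE_DEVICES for GPUs allocated via GRES;
# fall back to "unset" when the variable is absent.
visible_gpus=${CUDA_VISIBLE_DEVICES:-unset}
echo "GPUs visible to this job: ${visible_gpus}"
```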

1.1.6. Hardware

These constraints are currently being defined.

| constraint_name | meaning |
|---|---|
| dell | DELL compute |
| asus | ASUS compute |
| hpe | HPE compute |
| sgi | SGI compute |

1.2. Waves Table of constraints

Table of upcoming constraints (some purely proprietary nodes are omitted). Where “or” appears, only some of the nodes have the given property.

| machines | index | loc_ | cpu_ | net_ | gpu_ | hw_ |
|---|---|---|---|---|---|---|
| chezine | 001-078 | maths | intel, westmere, x5650 | ib, 40g | | sgi |
| nazare | 001-128 | dc | intel, broadwell, avx, avx2, e52630v4 | opa, opa100, 100g | | asus or dell |
| cribbar | 001-100 | dc | intel, (skylake, silver4114) or (cascadelake, cpu_silver4210 or cpu_silver4210R) avx, avx2, avx512 | (opa, opa100, 100g) or (roce, (roce25, 25g) or (roce50, dr, 50g)) | | dell |
| cloudbreak | 001-040 | dc | amd, rome or milan, zen2 or zen3, 7282 or 7353 | roce, (roce25, 25g) or (roce50, dr, 50g) | | dell or hpe |
| budbud | 001-023 | dc or ecn | (intel (broadwell, e52640v4 or skylake, silver4114 or cascadelake, silver4210)) or (amd, zen3, milan, 7313) | (opa, opa100, 100g) or (roce, (roce25, 25g) or (roce100, 100g)) | k40 or p100 or t4 or a40 or a100 | dell |

1.3. Nautilus Table of constraints

| Type | Name | Cores per node | RAM per node | GPU | constraints |
|---|---|---|---|---|---|
| Standard | cnode[301-340] | 96 | 384 GB | None | loc_ecn cpu_amd cpu_zen4 cpu_genoa cpu_9474 net_ib net_100g |
| BigMem | cnode[701-708] | 96 | 768 GB | None | loc_ecn cpu_amd cpu_zen4 cpu_genoa cpu_9474 net_ib net_100g |
| GPU | gnode[1-4] | 96 | 768 GB | 4 x Nvidia Tesla A100 80G | loc_ecn cpu_amd cpu_zen4 cpu_genoa cpu_9474 net_ib net_100g gpu_a100 gpu_a100_80 |
| Visualisation | visu[1-4] | 96 | 768 GB | 2 x Nvidia Tesla A40 48G | loc_ecn cpu_amd cpu_zen4 cpu_genoa cpu_9474 net_ib net_100g gpu_a40 |