GPU jobs

CC-IN2P3 provides a CentOS 7-based GPU computing platform that consists of two types of GPU compute servers:
  • 10 Dell C4130 with 4 GPUs and 16 CPU cores per compute server
    • 2 Xeon E5-2640v3 (8c @ 2.6 GHz)
    • 128 GB RAM
    • 2 Nvidia Tesla K80 → 4 Nvidia GK210 GPUs with 12 GB GDDR5
    • InfiniBand between the nodes
  • 6 Dell C4140 with 4 GPUs and 20 CPU cores per compute server
    • 2 Xeon Silver 4114 (10c @ 2.2 GHz)
    • 192 GB RAM
    • 4 Nvidia Tesla V100 PCIe → 4 Nvidia GPUs with 32 GB HBM2
    • No InfiniBand

The user chooses the GPU type at submission time:

-l GPUtype=K80 or -l GPUtype=V100

The queues available for GPU jobs are the following:

  • interactive GPU jobs: mc_gpu_interactive
  • multi-core GPU jobs: mc_gpu_medium, mc_gpu_long, mc_gpu_longlasting
  • parallel GPU jobs: pa_gpu_long

For the queue limits, please refer to the page Information on scheduling queues.


With the exception of mc_gpu_interactive, access to all GPU queues is restricted. You must contact your czar to access this type of resource. See the Restricted queues FAQ.

Jobs requesting access to GPU compute servers must use them as efficiently as possible: GPUs monopolized by an inefficient job are wasted. To check whether your job uses the GPUs:

  • nvidia-smi allows you to visualize the efficiency of your interactive jobs in real time;
  • the job summary at the end of the output file gives you metrics about the job's efficiency.
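For real-time monitoring, nvidia-smi can report per-GPU utilization in machine-readable form with its CSV query flags. The sketch below shows the command to run on a GPU node, then parses a sample output line (the values are illustrative, not real measurements):

```shell
# On a GPU node, utilization can be polled in CSV form:
#   nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader
#
# Below we parse a sample output line (illustrative values) to extract
# the utilization percentage of one GPU.
sample_line="0, 87 %"
util=$(echo "$sample_line" | awk -F', ' '{gsub(/ %/, "", $2); print $2}')
echo "GPU utilization: ${util}%"
```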

The CUDA 10.1 and OpenCL 1.2 environments are available in /opt/cuda-10.1. To compile CUDA code, please refer to the dedicated paragraph below.

CUDA is updated regularly, but the n-1 version is kept to meet specific needs. Currently, CUDA 9.2 is available in /opt/cuda-9.2. To use software unavailable on the computing platform, or an even earlier CUDA version, CC-IN2P3 offers the Singularity container solution.
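A typical Singularity invocation passes the --nv flag so the container sees the host's NVIDIA driver and GPUs. This sketch only builds and prints the command line (the image name and application are hypothetical placeholders):

```shell
# Sketch: run a CUDA application inside a Singularity image.
# --nv is Singularity's flag for exposing the host NVIDIA driver;
# my_cuda_image.sif and ./my_app are hypothetical placeholders.
IMAGE=my_cuda_image.sif
CMD="singularity exec --nv ${IMAGE} ./my_app"

# Print the command instead of running it, so the sketch works off-cluster.
echo "${CMD}"
```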

If you need to use the CUDA or OpenCL libraries, specify in your script:

  • bash

    if ! echo ${LD_LIBRARY_PATH} | /bin/grep -q /opt/cuda-10.1/lib64 ; then
        export LD_LIBRARY_PATH=/opt/cuda-10.1/lib64:${LD_LIBRARY_PATH}
    fi

  • csh

    if ($?LD_LIBRARY_PATH) then
        setenv LD_LIBRARY_PATH /opt/cuda-10.1/lib64:${LD_LIBRARY_PATH}
    else
        setenv LD_LIBRARY_PATH /opt/cuda-10.1/lib64
    endif
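Putting this together, a bash job-script prologue might look like the sketch below; it also prepends the CUDA bin/ directory to PATH so that nvcc and the other tools are reachable (assuming the standard layout under /opt/cuda-10.1):

```shell
#!/bin/bash
# Sketch of a CUDA 10.1 environment prologue for a bash job script.
CUDA_HOME=/opt/cuda-10.1

# Prepend the CUDA runtime libraries if they are not already present.
if ! echo "${LD_LIBRARY_PATH}" | /bin/grep -q "${CUDA_HOME}/lib64"; then
    export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
fi

# Make nvcc and the other CUDA tools reachable (assumes the usual bin/ layout).
export PATH=${CUDA_HOME}/bin:${PATH}

echo "${LD_LIBRARY_PATH}"
```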

Interactive GPU jobs

Interactive GPU jobs are started with the qlogin command (see also the page Interactive jobs), choosing the mc_gpu_interactive queue:

% qlogin -l GPU=<number_of_gpus> -l GPUtype=<gpu_type> -q mc_gpu_interactive -pe multicores_gpu 4

For example:

% qlogin -l GPU=1 -l GPUtype=V100 -q mc_gpu_interactive -pe multicores_gpu 4

Interactive job submissions must request exactly 4 CPUs (-pe multicores_gpu 4) in order to run.

Multi-core GPU jobs

To submit a GPU job, you must specify the GPU queue (for example -q mc_gpu_long), the number of GPUs required (for example -l GPU=2; up to 4 GPUs are available per server) and the dedicated multi-core environment (-pe multicores_gpu). In summary, the qsub options are:

% qsub -l GPU=<number_of_gpus> -l GPUtype=<gpu_type> -q <QueueName> -pe multicores_gpu 4 ...

The CUDA_VISIBLE_DEVICES variable is set automatically.
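Your script can inspect CUDA_VISIBLE_DEVICES to see which GPUs it was assigned. In this sketch a sample value is substituted when the variable is unset, so it runs anywhere (on the cluster the scheduler sets the real value):

```shell
# CUDA_VISIBLE_DEVICES holds a comma-separated list of assigned GPU ids.
# On the cluster it is set by the scheduler; here we fall back to a sample
# value so the sketch runs anywhere.
CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-0,1}

# Count the GPUs the job actually received.
n_gpus=$(echo "${CUDA_VISIBLE_DEVICES}" | tr ',' '\n' | wc -l)
echo "Assigned GPUs: ${CUDA_VISIBLE_DEVICES} (count: ${n_gpus})"
```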

Example of submission:

% qsub -l GPU=2 -l GPUtype=K80 -q mc_gpu_long -pe multicores_gpu 4 ...

Multicore job submissions must request exactly 4 CPUs (-pe multicores_gpu 4) in order to run. However, this request does not constrain the job, so it can actually use more or fewer than 4 CPUs, depending on its needs; please just avoid occupying all the CPUs of a server (16 on the K80 servers, 20 on the V100 servers) if you do not use all of its GPUs.
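The qsub options can also be embedded in the job script itself using standard Grid Engine #$ directives. A minimal sketch (the queue and resource values below are illustrative, not recommendations):

```shell
#!/bin/bash
# Minimal multi-core GPU job script sketch using Grid Engine directives.
# The resource values below are illustrative.
#$ -q mc_gpu_long
#$ -l GPU=2
#$ -l GPUtype=K80
#$ -pe multicores_gpu 4

# The scheduler exports CUDA_VISIBLE_DEVICES on the compute node.
echo "GPUs assigned: ${CUDA_VISIBLE_DEVICES:-none}"
```

Submitting the script is then simply `qsub myjob.sh`, with no options on the command line.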

Parallel GPU Jobs

To submit a parallel GPU job, you must specify:

  • the queue: -q pa_gpu_long
  • the number of GPUs desired per server: -l GPU=x, with 1 ≤ x ≤ 4
  • the openmpigpu_4 parallel environment, which determines the number of compute servers wanted: -pe openmpigpu_4 x, with x = 4 times the number of servers you want to use
  • the GPU type: only K80 GPUs are available for parallel jobs, so -l GPUtype=K80
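The slot count passed to -pe openmpigpu_4 is therefore always four times the number of servers. A tiny helper illustrates the arithmetic (the function name is ours, for illustration only):

```shell
# Compute the openmpigpu_4 slot count for a given number of servers.
# (Helper name is ours, for illustration only.)
openmpigpu_slots() {
    local n_servers=$1
    echo $((4 * n_servers))
}

openmpigpu_slots 2   # 2 servers -> 8 slots
```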

Your script must contain some OpenMPI-specific directives (including the mpiexec launch), which are described in the section Parallel jobs.

The options are:

% qsub -l GPU=<number_of_gpus_per_node> -l GPUtype=K80 -q pa_gpu_long -pe openmpigpu_4 <number_of_servers_times_4> ...


For example, to request 3 GPUs per server on 2 servers:

% qsub -l GPU=3 -l GPUtype=K80 -q pa_gpu_long -pe openmpigpu_4 8

Compile in CUDA

To compile your CUDA code, connect to an interactive GPU server, for example:

% qlogin -l GPU=1 -l GPUtype=K80 -q mc_gpu_interactive -pe multicores_gpu 4

Then you will be connected by SSH to the server, and you will be able to compile your code with the nvcc compiler:

% /opt/cuda-10.1/bin/nvcc

Once the code is compiled, we recommend exiting the interactive server and submitting your jobs with qsub.