Using the GE batch system at the computing centre



In order to use the GE (Grid Engine) batch system at the computing centre, you must have a local AFS account.

In the general sense, a job is a task (or a set of tasks) the user wishes to perform on a computing machine of the Computing Centre. It can be an executable file, a set of commands, a script, etc. A job can first be developed and tested on the interactive access machines of the Computing Centre before being submitted in large numbers to the computing farm.

Grid Engine – The job scheduler

Grid Engine is a job scheduler. The scheduler is the only entry point common to all users to submit jobs on the farm. Its role is to receive the jobs submitted by users, to schedule and to submit them for execution on an appropriate and available computing machine (worker).

Its main purpose is to use the computing resources (memory, disk space, CPU) as efficiently as possible. Pooling all resources among all users allows optimal use of the computing machines and of the farm as a whole.

For what kind of jobs?

The Grid Engine system accepts the following basic classes of jobs:

  • Sequential single-core jobs – The traditional job: a script executed sequentially, typically once, on a worker node.
  • Multi-core jobs – Jobs using several cores sharing memory on a single machine.
  • Parallel jobs – Jobs composed of cooperating tasks that must all be executed at the same time, often with requirements about how the tasks are distributed across the resources. Note that not just any program can run in parallel: it must be programmed as such and compiled against a particular MPI library.
  • Array jobs – Groups of similar work segments that can all be run in parallel but are completely independent of one another. All of the workload segments of an array job, known as tasks, are identical except for the arguments they take. Sequential, multi-core and parallel jobs can be launched as array jobs.
  • Interactive jobs – Jobs that provide an interactive login to a worker node in the compute cluster. You can submit an interactive job when you are building and testing your scripts, but this is not the place to run long, computationally intensive jobs or other jobs better suited to batch mode.

Two computing platforms are available: Scientific Linux SL6 (64-bit, but 32-bit compatible), and CentOS 7 (64-bit only).
For more details, see http://cctools.in2p3.fr/mrtguser/info_sge_parc.php .

How are the jobs executed?

Every job is submitted into an execution queue. Every queue has default values for disk space, CPU time and memory. There are queues dedicated to jobs that need large quantities of resources (CPU, memory, or several cores), and they have restricted access. In order for a job to execute in a restricted-access queue, the user must be in the corresponding userset (see section on queues for more details).

Once submitted, a job is automatically checked to verify whether it is allowed to execute in a particular queue:

  • If the user requests a particular queue:
    • If the user is authorised, the scheduler checks whether the resources required by the job match the resources provided by the queue
      • If they match and resources are available, the job is executed
      • If not, the job stays in queue
    • If the user is not authorised, the job stays in queue
  • If the user does not request a particular queue:
    • The scheduler considers the first queue that matches the request
      • If no queue matches the request, the job stays in queue

Every queue allows the simultaneous execution of many jobs. The batch system always tries to execute new jobs in the least loaded and most appropriate queue.

Connecting on an interactive node

In order to submit jobs on the computing farm, you must have a local AFS account and connect on an interactive node.

To build and briefly test your code, or to submit jobs, you can open an interactive session by connecting to an interactive node, cca.in2p3.fr (see also the section Interactive job):

> ssh [-X] <loginname>@cca.in2p3.fr

To use SL6 rather than CentOS 7, connect to cca6.in2p3.fr.

Grid Engine Environment

Once connected, you are ready to use Grid Engine. The GE environment is defined automatically when you log in to an interactive machine. But if you do not use the CC-IN2P3 default environment, you have to set the SGE environment with:

  • If you are using the csh or tcsh shell:
> source /usr/local/shared/bin/ge_env.csh
  • If you are using the sh, ksh or bash shell:
> source /usr/local/shared/bin/ge_env.sh
  • Or just:
> ge_env

This sets up your $PATH and $MANPATH and defines other environment variables necessary for Grid Engine to work. You can include this command in your .cshrc (for csh or tcsh) or .profile (for sh, ksh or bash) file to define this environment automatically when connecting.

If your jobs need your shell profiles, you will have to add at the beginning of the script (for bash):

#! /usr/local/bin/bash -l 

or use the “-S” option at submission:

> qsub -S "/usr/local/bin/bash -l" 

Note also that the variable BATCH_SYSTEM is defined as GE, and the variable ENVIRONMENT is defined as ACCESS for the interactive machines. For the jobs, depending on their category, ENVIRONMENT is defined as SEQUENTIAL_BATCH, PARALLEL_BATCH or INTERACTIVE_BATCH.
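
For example, a script can test these variables to behave differently depending on where it runs (a minimal sketch based on the variables described above):

#!/bin/bash
# Behave differently on an interactive login node and inside a batch job
if [ "$ENVIRONMENT" = "ACCESS" ]; then
    echo "running interactively on a login node"
else
    echo "running as a $ENVIRONMENT job under $BATCH_SYSTEM"
fi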

Grid Engine is very easy to use. The most useful commands are the following:

  • qsub - submit a job
  • qstat - verify the status of a job currently in queue or executing
  • qacct - verify the status of a completed job
  • qdel - remove a job from the queue

For a detailed description of these commands with all their available options, please see the manual of the relevant command:

> man <commandname>

To modify the resources requested by a queuing job, suspend or resume a job, see the section Advanced commands. See also the list of GE commands.

Submit a job: qsub

A job is submitted using the command qsub with the following syntax:

> qsub -P P_<myproject> [ options ] [ scriptfile | -- [ script args ]] 

The option “-P P_<myproject>” is used to declare your project; “myproject” is usually the same as your group, but it can also be a subgroup if subgroups exist. This option is mandatory.
The other options are not mandatory, but they can be very important (see useful options).
If no script file is given, the qsub command waits for you to enter the commands you want to execute; once they are entered, type Ctrl-D to close the prompt and submit the job.
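
For example, a short job can be submitted directly from standard input (a minimal sketch; replace P_<myproject> with your own project):

> qsub -P P_<myproject>
echo "submitted from standard input"
hostname
# press Ctrl-D here to close the input and submit the job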

In order to know your project, type:

> qconf -suser <loginname>

NB. Your loginname appears in this list once you have submitted your first job. Otherwise, search for your project name among the existing ones.

If you belong to several groups, you can use the command “newgroup” to switch to a group corresponding to the project used with qsub (see https://doc.cc.in2p3.fr/en:afs_changer_de_groupe).
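
For instance (a sketch; <groupname> is a placeholder for the group matching your project, see the linked page for details):

> newgroup <groupname>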

In order to see the existing projects, type:

> qconf -sprjl

In order to see the configuration of a specific project, type:

> qconf -sprj <projectname>

To know more about the possible qsub options, see:

> man submit

Or see the online man page.

First script : Hello world!

  • Create a directory, for instance ge_tests
  • Open a text editor (vi, emacs, …)
  • Copy-paste the following script:
#!/bin/bash

# "-e" makes echo interpret the "\n" escape sequences
echo -e "\nHello World!\n"

echo 'My working directory is: ' 
pwd
echo 'on the node: ' 
hostname

sleep 60
echo "Done!!"
  • Name the file hello_world.sh and make it executable
  • Launch the command:
> qsub -P P_<myproject> hello_world.sh

Once the job is submitted, the batch system returns its jobid:

Your job 5947764 ("hello_world.sh") has been submitted

Using this jobid you can obtain information about the job ( qstat / qacct ), modify the resources that have been requested ( qalter ), or remove the job ( qdel ).
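
For instance, with the jobid returned above:

> qstat -j 5947764    # information on the job while it is queued or running
> qacct -j 5947764    # accounting information once the job has completed
> qdel 5947764        # remove the job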

Verify the status of running jobs: qstat

The qstat command provides information on your currently queued or running jobs:

> qstat 

You will get some information on the current status of your job:

job-ID  prior   name       user  state submit/start at     queue slots ja-task-
-------------------------------------------------------------------------------
5947764 0.00000 hello_worl login  qw   01/31/2012 16:22:23         1        

In this case, the status is qw (queued and waiting).

The usual states are:
r : running
Rr : running job that has been restarted
qw : queued and waiting
Eqw : job in error state while queued

Other possible states are listed here:

Category    State                                            GE letter code
Pending     pending                                          qw
            pending, user hold                               hqw
            pending, system hold                             hqw
            pending, user and system hold                    hqw
            pending, user hold, re-queue                     hRwq
            pending, system hold, re-queue                   hRwq
            pending, user and system hold, re-queue          hRwq
Running     running                                          r
            transferring                                     t
            running, re-submit                               Rr
            transferring, re-submit                          Rt
Suspended   job suspended                                    s, ts
            queue suspended                                  S, tS
            queue suspended by alarm                         T, tT
            all suspended with re-submit                     Rs, Rts, RS, RtS, RT, RtT
Error       all pending states with errors                   Eqw, Ehqw, EhRqw
Deleted     all running and suspended states with deletion   dr, dt, dRr, dRt, ds, dS, dT, dRs, dRS, dRT



To show jobs in a specific state, type:

> qstat -s p/r/s

with “p” for pending, “r” for running, and “s” for suspended. To show the jobs of a group:

> qstat -u @<groupname>

To obtain information about a specific job, queued or executing, type:

> qstat -j <jobid>

The “-nenv” option reduces the verbosity. To obtain information on the resources requested by a job:

> qstat -r

and other information on the jobs (for instance the project that is used):

> qstat -ext

To see all possible options, see:

> man qstat

or see the corresponding online man page.

Obtain information on completed jobs: qacct

The qacct command provides information on past GE usage. An accounting record is kept for every completed job. The current accounting file contains data on jobs that ended during the last 5 days; data on older jobs are kept in other accounting files, one per month. By default, the command therefore gives access to jobs that ended in the last 5 days.

In order to obtain information on a specific job completed in the last 5 days, type:

> qacct -j <jobid>

To access data about older jobs, use the option “-f”:

> qacct -o <loginname> -j -f /opt/sge/ccin2p3/common/accounting.YYYY.MM

For instance, to see information on all your jobs from the last 10 days:

> qacct -o <loginname> -j -d 10 -f /opt/sge/ccin2p3/common/accounting.YYYY.MM

Note: the CPU time is expressed in HS06.seconds. See the section on the CPU consumption of jobs for more details.

In the output of the “qacct -j” command, two lines, “failed” and “exit_status”, can help understand why a job failed. If both are equal to 0, the job executed correctly and completed successfully:

...
failed       0    
exit_status  0  
...

Otherwise, there has been a problem:

- exit_status -

Frequent error codes are:

  • If a soft limit has been exceeded, the following error codes are returned (exit_status = signal number + 128):
    152 = 24 (SIGXCPU) + 128 : exceeded CPU time (s_cpu) or memory (s_rss)
    138 = 10 (SIGUSR1) + 128 : exceeded elapsed time (s_rt)
    153 = 25 (SIGXFSZ) + 128 : exceeded file size (s_fsize)
  • If a hard limit has been exceeded, a SIGKILL signal is sent: exit_status 137 = 9 (SIGKILL) + 128.
  • exit_status values lower than 128 are defined by the user's program.
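
Since these exit_status values are the signal number plus 128, the corresponding signal name can be checked from any bash shell, for instance:

> kill -l $((152 - 128))    # prints XCPU
> kill -l $((138 - 128))    # prints USR1
> kill -l $((137 - 128))    # prints KILL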



- failed -

failed indicates the error in case the job could not start on the execution node:

failed   cause                              explanation
1        assumedly before job               Job could not be started
7        before prolog                      Job could not be started
8        in prolog                          Job could not be started
10       in pestart                         Job could not be started
19       before writing exit_status
21       in recognizing job
25       rescheduling                       Job ran, job will be rescheduled
26       opening input/output file          Job could not be started, stderr/stdout file could not be opened
28       changing into working directory    Job could not be started, error changing to start directory
29       invalid execution state
37       qmaster enforced h_rt limit
100      assumedly after job                Job ran, job killed by a signal


To know more about the possible options of the qacct command, see:

> man qacct

or see the corresponding online man page.

To understand the format of the qacct command output, type:

> man accounting

Delete a job: qdel

Jobs can be deleted with the qdel command. Users can delete only their own jobs.

To delete all of your jobs:

> qdel -u <loginname>

To delete one or several jobs:

> qdel <jobid>[,jobid,...]

To delete a task of an array job ( see section Array jobs ):

> qdel <jobid>.<taskid>

To delete several tasks of an array job:

> qdel <jobid>.<taskid_first>-<taskid_last>[:interval]
# or with the option "-t" :
> qdel <jobid> -t <taskid first>-<taskid last>[:interval]

To get more details on the qdel options, see:

> man qdel

or see the online man page.

Jobs standard outputs

By default, when submitting a batch job, two files are created in your $HOME when the job completes:
<jobname>.o<jobid> (standard output stdout)
<jobname>.e<jobid> (standard error stderr)

When submitting an array job, there will be as many files as tasks. These files will be named:
<jobname>.o<jobid>.<taskid>
<jobname>.e<jobid>.<taskid>

The output of your first job Hello world! can be found in your $HOME. You will find two files:

hello_world.sh.e<jobid>   # empty, if there is no error
hello_world.sh.o<jobid>

These outputs can be read with the cat command (for instance):

> cat hello_world.sh.o5947764  
***************************************************************
*                  Grid Engine Batch System                 
*           IN2P3 Computing Centre, Villeurbanne FR         
***************************************************************
* User:                    toto                         
* Group:                   ccin2p3                          
* Jobname:                 hello_world.sh                   
* JobID:                   5947764                          
* Queue:                   long                            
* Worker:                  ccwsge0632.in2p3.fr              
* Operating system:        Linux 3.10.0-693.5.2.el7.x86_64        
* Project:                 P_ccin2p3                        
***************************************************************
* Submitted on:            Tue Jan 31 16:22:23 2017         
* Started on:              Tue Jan 31 16:29:15 2017         
***************************************************************

Hello World!

My working directory is: 
/scratch/5947764.1.long
on the node: 
ccwsge0632
Done!!

***************************************************************
* Ended on:                Tue Jan 31 16:29:15 2017         
* Exit status:             0                                
* Consumed                                                  
*   cpu (HS06):            11:34:10                         
*   cpu scaling factor:    11.350000                        
*   cpu time:              3669 / 259200                    
*   efficiency:            90 %                             
*   io:                    13.87236                         
*   vmem:                  1.129G                           
*   maxvmem:               6.240G                           
*   maxrss:                3.473G                           
***************************************************************

The standard output stdout and standard error stderr are written in the spool directory during the job execution and, at the end of the job, copied to $HOME by default. To modify the directory where these outputs are copied, or to change the name of your job, see the section about useful options below.

Note that it is recommended to keep stdout and stderr at a reasonable size (less than 100 MB). Larger outputs must be redirected to $TMPDIR rather than to $HOME.

Useful options in GE

The qsub, qalter and qlogin commands have many possible options. The complete list can be obtained with:

> man submit

or through the online man page.

There are several ways to pass options to GE:

1) Using the command line: just after a qsub or qalter command:

> qsub -l sps=1 scriptfile
> qalter -l hpss=1,sps=1 <jobid>

2) In your script:

#$ -l sps=1

In this case, options can be given at the beginning of the file, before the other lines of the script. Lines defining options must start with the prefix “#$”. The prefix can be chosen with the option “-C”.

3) In a specific file .sge_request:

-l sps=1

This file is useful for options that should always be applied by default, like the project name. For more details, see:

> man sge_request

or see the online man page.

The most frequently used GE options are listed here:

Option               Explanation
-C prefix            modifies the prefix used in scripts to define GE options. The default prefix is “#$”.
-N jobname           specifies the job name. By default the name of the script is used.
-S /path/shell       specifies the shell that will be used by the job.
-cwd                 the job will work in the submission directory (to be used only for parallel jobs). By default, outputs are written in $HOME.
-e [/path/]file      defines the path of the stderr file.
-o [/path/]file      defines the path of the stdout file.
-j [y,n]             merges stderr into the stdout file (y) or keeps them separate (n).
-r [y,n]             specifies whether a job that fails because of a system issue will be relaunched. Set to “y” by default, except for interactive jobs.
-a date_time         submits a job that will be executed at a given date [[CC]YY]MMDDhhmm[.SS].
-M emailaddress      sends an email to this address under certain conditions (see option “-m”).
-m {b,e,a,s,n}       “b”: when the job starts, “e”: when the job ends, “a”: when the job is aborted, “s”: when the job is suspended, “n”: no mail is sent (default). Be careful, this option can cause mail overload in case of massive submissions.
-q queuename         defines an execution queue; only for parallel and multi-core jobs.
-pe PE n             specifies the parallel environment (PE) and the number of slots to be used by the job. See parallel and multi-core jobs.
-l resource=value,…  requests resources for the job, see computing resources.
-V                   passes all environment variables to the job.
-p priority          reduces the priority of the job (0 by default, only negative values are allowed).



An example of submission script with some options:

#!/bin/sh
#####################################
# job script example with GE options
#####################################

#$ -S "/usr/local/bin/bash -l" 
#$ -N jobname
#$ -o jobname.out
#$ -e jobname.err 
#$ -r y
#$ -M myname@mylabo.fr
#$ -m be   ## send an email when the job starts and ends

####################################
### script here ...


### end

The “-clear” option of qsub ignores all options defined before it (in a script for instance), so that only the options defined after it are taken into account.

Most of the jobs need to use computing resources that are shared with other users on the cluster.

In order to know the available resources (called “complexes” in GE terminology) and their configuration, type:

> qconf -sc

Or see the page http://cctools.in2p3.fr/mrtguser/info_sge_complex.php .

Resources can be requestable and consumable (see the output of “qconf -sc”):
- The requestable column ( YES / NO ) indicates whether users can request this resource for their jobs.
- The consumable column ( YES / NO ) indicates whether the attribute is consumable. In that case, there is a limit on the maximum quantity available for this resource; GE keeps an accounting of its usage across all running jobs and ensures that jobs start only if enough of the resource is available.
- The default column indicates the default value used if no value is requested for this attribute. This value is relevant only for consumable resources.

For some resources, hard ( h_ ) and soft ( s_ ) limits can be set. By default, all limits are considered hard. If a hard or soft limit cannot be satisfied for a job, the job stays in queue. When a hard limit is reached during execution, the job is killed. When a soft limit is reached, GE sends a SIGUSR1 signal to the job, which can then terminate gracefully (before reaching the hard limit).
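
As an illustration, a job script can trap these signals in order to save its work before the hard limit kills it (a minimal sketch; the application and the cleanup commands are placeholders):

#!/bin/bash

# Save whatever must survive, then exit, when a soft limit is reached
cleanup() {
    echo "soft limit reached, copying partial results" >&2
    cp partial_results.txt $HOME/    # placeholder: keep only what you need
    exit 1
}
# SIGUSR1 is sent for s_rt, SIGXCPU for s_cpu/s_rss (see exit_status above)
trap cleanup SIGUSR1 SIGXCPU

./my_long_computation    # placeholder for the real workload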

Group resources

There is no specific resource to declare for each group. Group limits are managed by the batch system based on the project (the “-P” option). To see the resource quota set (RQS, Resource Quota Set) for your group, type:

> qconf -srqs | grep -i <groupname>   # <groupname> without prefix "P_" 

or see http://cctools.in2p3.fr/mrtguser/info_sge_complex.php, and choose your group.

Resources declaration: syntax

All the resources that a job will use must be declared. They must be specified when submitting the job (see qsub), including for interactive jobs. They can also be modified for a queued job (see qalter).

When submitting a job, resources must be declared in the following way:

> qsub -l resource=value scriptfile

NB. The option “-l” (lower “L”) is given only once, even when declaring several resources:

> qsub -l resource1=value1,resource2=value2,resource3=value3 scriptfile

Declaration of operating system resource

The operating systems available on the batch farm are Scientific Linux 6 and CentOS 7. SL6 is the default one, but CentOS 7 can be requested by declaring the corresponding resource:

> qsub -l os=cl7 test.sh

It is also possible to request both Operating Systems:

> qsub -l os="sl6|cl7" test.sh

In that case, one or the other will be used depending on the available resources.

Declaration of storage and software resource

For storage – dcache, hpss, irods, mysql, oracle, sps, xrootd – and software – idl, matlab – resources, the value “1” must be given. For example:

> qsub -l hpss=1,sps=1 test.sh
> qsub -l matlab=1 test.sh

The default value is “0” (resource not used).

Computing resources declaration: CPU, MEMORY, DISK

If you do not declare computing resources (CPU, MEMORY, DISK), the batch system will allocate the limits of the requested queue (see queues).

Limits can change over time depending on the worker node configuration. Any job that reaches a limit is killed by GE.

To know the CPU, MEMORY and DISK needed by your job, you can run short tests on an interactive machine or submit short test jobs to Grid Engine. The information concerning the resources used is then given in the end banner of the log file produced when the job finishes.
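
For example, a short run on an interactive node gives a first estimate (a sketch assuming GNU time is available as /usr/bin/time; my_program is a placeholder for your own code):

> /usr/bin/time -v ./my_program
# "User time" + "System time" give the CPU seconds,
# "Maximum resident set size" gives the memory (in kB)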

- CPU -

To specify the cputime (the CPU time the job needs), for example to request 1 hour and 40 minutes, use:

-l h_cpu=6000 # cpu time in seconds

or

-l h_cpu=01:40:00 # cpu time in hh:mm:ss

When the hard limit h_cpu is reached, GE sends a SIGKILL signal to the job, which is immediately killed (exit_status 137).
If you would like your job to be warned so that it can terminate properly before being killed, specify the soft limit s_cpu with a value lower than h_cpu. If s_cpu is reached, a SIGXCPU signal is sent to the job (exit_status 152).
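
For example, to combine both limits so that the job gets a warning about 10 minutes before being killed (placeholder values):

> qsub -l s_cpu=01:30:00,h_cpu=01:40:00 test.sh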

In order to know the hard and soft limits, type:

> qconf -sq "*" | egrep "qname|s_cpu|h_cpu" 


- MEMORY -

To specify the maximum memory needed for your job, for example to ask for 2 GB, use:

-l h_rss=2G

The default unit is a byte. Other possible units are K(ilo), M(ega), G(iga).
NB: at least 64M must be requested.

When the hard limit h_rss is reached, GE sends a SIGKILL signal to the job, which is immediately killed (exit_status 137).
If you would like your job to be warned so that it can terminate properly before being killed, specify the soft limit s_rss with a value lower than h_rss. If s_rss is reached, a SIGXCPU signal (exit_status 152) is sent to the job.

- DISK -

During job execution on a machine of the farm, local disk space on that machine is allocated. You can read and write data in that space, which is accessible to your task through the environment variable ${TMPDIR}. This space is cleaned at the end of your task by GE, so you need to copy the data you want to keep to an appropriate storage space. Not all machines of the farm have the same available space; in order to optimise your job submission, it is important to specify your needs. For example, if you need at most 4096 MB (4 GB), use:

-l fsize=4096M

The default unit is the byte. Other possible units are K(ilo), M(ega), G(iga).
NB: at least 64M must be requested. No file you read or write, even through remote access, should exceed this limit; otherwise, your job will be killed.

An execution queue corresponds to default values for disk space, CPU time, memory, etc. and has different limits for these resources. A job always runs in a given queue.
However, when submitting a job it is not necessary to specify the execution queue: the batch system will find the optimal queue according to the requested computing resources.
The queue must be declared only if you need a specific queue dedicated to parallel, multi-core or daemon jobs. The queue is then specified with the option “-q”:

> qsub -q <queuename> ... 

Some queues have restricted access (GPU, parallel, multi-core, demon, longlasting, huge), and you need to ask your czar to obtain this access. When using these queues, the batch system checks whether you have access to them or not; it is not necessary to declare the queue explicitly.

To see a list of available queues, type:

> qconf -sql

To see the properties of a particular queue, type:

> qconf -sq <queuename>

Or see http://cctools.in2p3.fr/mrtguser/info_sge_queue.php and choose a particular queue to see more details.
If the attribute is not equal to NONE in these queue lists, the queue has restricted access.
Queues whose names:
- start with “mc_” are for multi-core jobs
- start with “pa_” are for parallel jobs
- contain “gpu” are for GPU jobs
- contain “interactive” are for interactive jobs.

For multi-core and parallel queues, CPU and memory are expressed per core, and disk space is global for the whole job.
Note that the demon queue requires declaring the corresponding resource:

-q demon -l demon=1

Example of a batch job (serial job)

#! /usr/local/bin/bash -l 

# To specify 10 minutes of 'wallclock time'
#$ -l h_rt=00:10:00

# To ask for 1 gigabyte of memory
#$ -l s_rss=1G

# To name the job
#$ -N serial_example

# To declare the project in which the job will be running
#$ -P P_<groupname>

# To merge stdout and stderr in a single file
#$ -j y

# stdout and stderr are located in $HOME by default.
# to write them in a different directory:
#$ -o $HOME/<yourPATH>/

# Execute the command and save the result
echo $(/bin/date) > result.txt

# Your job run in the temporary working directory ($TMPDIR).
# It is located on the worker and is erased at the end of the job.
# At the end of your job you must copy the results in a different storage space:
cp result.txt $HOME/myresult

Passing parameters to qsub

Parameters for your script must be given after the script name in the qsub command:

> qsub $HOME/test.sh arg1 arg2 arg3

Then arg1 can be retrieved through the variable $1, arg2 through $2, etc.
GE options must be given before the name of the script.
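
A minimal sketch of such a script (test.sh):

#!/bin/bash
# Arguments given after the script name on the qsub command line
# are available as positional parameters inside the job.
echo "first argument:  $1"
echo "second argument: $2"
echo "all arguments:   $@"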

Interactive job

To develop and test your scripts you can use an interactive session with GE. You must first connect to an interactive node, cca.in2p3.fr.
To launch an interactive job, type:

> qlogin [ options ]

Or, if you are connected to the SL6 node cca6.in2p3.fr and want the interactive job to run on CentOS 7, type:

> qlogin -l os=cl7 [ options ] 

As for regular jobs, you must specify your resources with the option “-l”. For example, to request 1 hour of CPU time, 1 GB of memory and 1 GB of disk:

> qlogin -l s_fsize=1G,s_cpu=1:00:00,s_rss=1G

You will be connected directly to the worker node, and your interactive job will run like a batch job. You will not be able to create files larger than the value given with the s_fsize option.

You will also get a jobid, like for any other kind of job:

Your job 6043392 ("QLOGIN") has been submitted

You may have to wait a little for the resources to become available; then the session opens:

> qlogin -l s_fsize=1G,s_cpu=1:00:00,s_rss=1G
local configuration cca01.in2p3.fr not defined - using global configuration
JSV "/opt/sge/util/resources/jsv/corebinding.jsv" has been started
JSV "/opt/sge/util/resources/jsv/corebinding.jsv" has been stopped
Your job 6043392 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...timeout (3 s) expired while waiting on socket fd 4
.timeout (11 s) expired while waiting on socket fd 4
.timeout (19 s) expired while waiting on socket fd 4
.timeout (36 s) expired while waiting on socket fd 4
.timeout (57 s) expired while waiting on socket fd 4
.
Your interactive job 6043392 has been successfully scheduled.
Establishing /usr/bin/qlogin_wrapper session to host ccwige0001.in2p3.fr ...
The authenticity of host '[ccwige0001.in2p3.fr]:54508 ([134.158.48.21]:54508)' can't be established.
RSA key fingerprint is bb:d7:79:14:08:d5:b1:12:28:a8:84:2c:4e:96:94:33.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[ccwige0001.in2p3.fr]:54508,[134.158.48.21]:54508' (RSA) to the list of known hosts.
sloikkan@ccwige0001.in2p3.fr's password: 

Once connected, you have access to the local directory corresponding to your jobid:

> cd /scratch/6043392.1.interactive/ 

To terminate the session, type:

> exit

The session will also end when the requested CPU time is reached.

There are two kinds of interactive queues. The queue “interactive” is the default one. The queue “mc_interactive” is meant for multi-core jobs and must be declared with “-q”:

> qlogin -pe multicores <number_of_cores> -q mc_interactive

See also multi-cores jobs below.

Multicore jobs

To run a job using several cores on a single machine, use the multicores parallel environment and define the needed queue with:

-pe multicores <number_of_cores> -q <QueueName> 

The available queues for multi-core jobs are named “mc_long”, “mc_huge” and “mc_longlasting”. Access to them must be specifically requested.
Jobs must use the number of cores that has been requested. On the worker node, this number is available through the variable $NSLOTS.
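
As an illustration, a multi-core job script can read $NSLOTS to size its workload (a sketch; my_threaded_app and the OpenMP variable are placeholders for your own application):

#!/bin/bash
#$ -P P_<myproject>
#$ -pe multicores 8
#$ -q mc_long

# Use exactly the number of cores granted by GE
export OMP_NUM_THREADS=$NSLOTS     # relevant only if the application uses OpenMP
./my_threaded_app --threads $NSLOTS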

Parallel jobs

These jobs must be submitted on CentOS 7 only. You can connect to the interactive nodes cca.in2p3.fr to compile your code. Before submitting a parallel job, you must compile your code against a specific MPI library, OpenMPI or MPICH, located respectively in /usr/lib64/openmpi and /usr/lib64/mpich.

In the qsub command:
- The parallel environment (openmpi or mpich2) must be defined, together with the number of cores, using the option “-pe”:

-pe <pe_environment> <number_of_cores>

For example:

-pe openmpi 16


- Specify the needed queue with the option “-q”:

-q <queuename>   #  pa_medium | pa_long 

Note that for parallel queues, CPU and memory must be given per core.
The output of a parallel job is written directly in the chosen directory during the job execution.
If you use the OpenMPI or MPICH library, a file named .mpd.conf must be created in $HOME with permissions 600 ( -rw------- ) and with the following content:

# $HOME/.mpd.conf
Secretword = XXXX    # where XXXX is a secret word


- If you use OpenMPI:
Submit your job with:

-l os=cl7 -pe openmpi <number_of_cores> -q <QueueName>  

And include in your script:

 source /usr/local/shared/bin/openmpi_env.sh

 mpiexec -n $NSLOTS phello

Compilation is done with:

 mpicc -o phello phello.c 

Example of job submission:

 qsub -l os=cl7 -cwd -pe openmpi 16 -q pa_long phello.script 


- If you use MPICH:
Submit your job with:

-l os=cl7 -pe mpich2 <number_of_cores>  -q <QueueName>  

And include in your script:

export MPICH_HOME=/usr/lib64/mpich
export PATH=${MPICH_HOME}/bin:${PATH}
export MANPATH=${MPICH_HOME}/man:${MANPATH}
MPIEXEC="/usr/lib64/mpich/bin/mpiexec"
${MPIEXEC} -iface ib0 -np $NSLOTS phello 

Compile your code with:

> mpicc -o phello phello.c

Example of job submission:

> qsub -l os=cl7 -cwd -pe mpich2 16 -q pa_long phello.script



Array jobs

When you need to submit and manage a large number of similar jobs, it can be useful to use array jobs, in particular when the same script must run on different arguments or different sets of data. An array job is a job which consists of several tasks, each task behaving like an independent job. The same script is run several times; the only difference between the runs is the index number associated with each task. The index number is exported to the job via the environment variable $SGE_TASK_ID.

To submit an array job to GE, the command to use is qsub with the option “ -t ”:

> qsub -t min[-max[:interval]]

For example :

> qsub -t 1-10

The option arguments min, max and interval are available through the environment variables $SGE_TASK_FIRST, $SGE_TASK_LAST and $SGE_TASK_STEPSIZE, respectively. Note that this has nothing to do with the execution order.
Here is an example script array_job.sh that reads data from four different files ( input[1-4].txt ) and says hello to all items in files 1 and 3:

#!/bin/sh

### Merge stdout and stderr in a single file
#$ -j y

### Define index numbers 'min-max:interval'
#$ -t 1-4:2

### Name the output script
#$ -N hello_items

echo "Task id = $SGE_TASK_ID"

if [ -e input$SGE_TASK_ID.txt ]; then
    while read file
    do
        echo "hello $file"
    done < input$SGE_TASK_ID.txt
fi

The input files (input[1-4].txt):

# input1.txt
world
moon 

# input2.txt
forest
sun

# input3.txt
monkey
people

# input4.txt
banan
apple

To submit the job, type (the options are given inside the script):

> qsub array_job.sh

The command will return:

Your job-array 1331.1-4:2 ("hello_items") has been submitted

You will obtain two result files:

> cat hello_items.o1331.1 
...
hello world
hello moon
...

> cat hello_items.o1331.3
...
hello monkey
hello people
...

GPU jobs

Modify the resources of a submitted job: qalter

The qalter command allows you to modify the resources requested by jobs that are still queued.

  • To modify a job resource, for instance the requested memory:
> qalter -mods l_hard s_rss 2G <jobid>
  • To add a given resource to a pending job (adding sps=1)
> qalter -adds l_hard sps 1 <jobid>
  • To remove a given resource from a pending job (removing hpss=1)
> qalter -clears l_hard hpss <jobid>
  • To remove all resources from a pending job :
> qalter -clearp l_hard <jobid>
  • To replace the list of all “Hard Resources” by a new list :
> qalter -l resource=value[,resource2=value2,resource3=value3] <jobid>

Type “man qalter” to see other options, such as changing the queue, the email address, etc.

Suspend and release a job: qhold, qrls

You can suspend one or several submitted jobs with the qhold command:

> qhold <jobid>

In order to release the suspended job, use the command qrls:

> qrls <jobid>

There are a few environment variables that can be used in your scripts. A list is given below; for more details see:

> man qsub


Variable     Explanation
$HOME        Home directory of the user
$USER        User name
$JOB_ID      Job ID, a unique number assigned by the batch system when the job is submitted
$JOB_NAME    Name of the job, taken from the qsub script; can be replaced with the “-N” option
$HOSTNAME    Name of the execution host
$TASK_ID     Index of a task in an array job
$NSLOTS      Number of slots used by a multi-core or parallel job
$NHOSTS      Number of hosts used by a parallel job
$QUEUE       Name of the queue in which the job is running
$TMPDIR      Temporary (scratch) directory of the job
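
For example, a job script can log some of these values to make debugging easier:

#!/bin/bash
echo "Job $JOB_ID ($JOB_NAME) submitted by $USER"
echo "Running on $HOSTNAME in queue $QUEUE"
echo "Scratch directory: $TMPDIR"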

The wallclock time of a job is the time elapsed between the beginning and the end of the job. The CPU time is the time spent using the CPU during the job. The CPU time can be lower than the wallclock time for an I/O-intensive (inefficient) job, and higher than the wallclock time for a multi-core job (the CPU time is multiplied by the number of cores used).

At CC-IN2P3, CPU time is often expressed in the HS06 time unit. The conversion is done by multiplying the “physical” time by a factor that depends on the power of the core. For example, a wallclock time of 3 hours on a core with a factor of 11 HS06 gives: CPU (HS06.hours) = 3 (hours) * 11 (HS06) = 33 HS06.hours.

In the job's stdout, the line “cpu (HS06)” indicates the CPU time in HS06, formatted as days:hours:minutes:seconds; the “cpu scaling factor” corresponds to the HS06 factor of the core, and the “cpu time” indicates the CPU time in seconds.
In the output of a qacct command, “wallclock” is the wallclock time in seconds and “cpu” is the CPU time in HS06.seconds.
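
The same arithmetic can be checked from the command line, here with the 3 hours and 11 HS06 of the example above:

> awk 'BEGIN { hours = 3; hs06_factor = 11; print hours * hs06_factor " HS06.hours" }'
33 HS06.hours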
