GE : list of commands



Here is a list of the main GE commands. These commands are sorted into three categories:

  • commands for submission and monitoring of a task.
  • commands to change the status of a task, its characteristics and behavior.
  • global commands on the computing farm.

qsub

This command lets you submit a job to the GE batch system. See the documentation for the submission of a job to GE.

> qsub test.sh
Your job 1264 ("test.sh") has been submitted
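
When a script needs the job ID for later calls to qstat, qhold or qdel, it can be read from the confirmation message shown above. The following is only a sketch: parse_job_id is a hypothetical helper (not a GE command), and it assumes the message has exactly the form "Your job <id> (...) has been submitted".

```shell
#!/bin/sh
# parse_job_id: hypothetical helper, not part of GE.
# Extracts the numeric job ID from a qsub confirmation line such as:
#   Your job 1264 ("test.sh") has been submitted
# Prints nothing if the line does not have that form.
parse_job_id() {
    printf '%s\n' "$1" |
        awk '$1 == "Your" && $2 == "job" && $3 ~ /^[0-9]+$/ { print $3 }'
}

# Typical use on a submit host (assumes qsub is available):
#   job_id=$(parse_job_id "$(qsub test.sh)")
```

Checking for an empty result before using the ID avoids passing a bad argument to the other commands when the submission failed.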

qstat

This command gives the status of your tasks (pending, running or finished).

qstat -s p|r|z -j jobid1,jobid2,...           // pending,running,zombie

Examples :
- for running jobs :

> qstat -s r
job-ID  prior  name  user  state  submit/start   at  queue  slots ja-task-ID 
---------------------------------------------------------------------------- 
1210  0.55500  test  santaklaus   r   07/19/2010 14:05:44  ge.q@...   1        
1211  0.55500  test  santaklaus   r   07/19/2010 14:05:46  ge.q@...   1        
1212  0.55500  test  santaklaus   r   07/19/2010 14:05:47  ge.q@...   1         
      

- for terminated jobs :

> qstat -s z
job-ID  prior  name  user  state  submit/start   at  queue  slots ja-task-ID 
---------------------------------------------------------------------------- 
  1209 0.00000 test santaklaus      qw    07/19/2010 14:01:08           1     

- for more information on running jobs :

> qstat -j 1210,1213
==============================================================
job_number:                 1210
exec_file:                  job_scripts/1210
submission_time:            Mon Jul 19 14:05:44 2010
owner:                      santaklaus
uid:                        4628
group:                      ccin2p3
gid:                        102
sge_o_home:                 /afs/in2p3.fr/home/l/santaklaus
sge_o_log_name:             santaklaus
sge_o_path:                 /opt/sge6.2u5/bin/lx24-amd64:/usr/afsws/bin:/afs/in2p3.fr/throng/ccin2p3/scripts:/opt/sge6.2u5/bin/lx24-amd64:/opt/sge6.2u5/utilbin/lx24-amd64:/opt/sge6.2u5/bin/lx24-amd64:/opt/sge6.2u5/util:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
sge_o_shell:                /usr/local/bin/tcsh
sge_o_workdir:              /afs/test.in2p3.fr/home/l/santaklaus/test1
sge_o_host:                 ccalige01
account:                    sge
cwd:                        /afs/in2p3.fr/home/l/santaklaus/test1
mail_options:               bes
mail_list:                  rachid.santaklaus@cc.in2p3.fr
notify:                     FALSE
job_name:                   test
jobshare:                   0
hard_queue_list:            learn_ge.q
env_list:                   
script_file:                test.sh
usage    1:                 cpu=00:00:00, mem=0.00008 GBs, io=0.00930, vmem=12.180M, maxvmem=12.180M
scheduling info:            ...
==============================================================
job_number:                 1213
exec_file:                  job_scripts/1213
...
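
The tabular qstat listings shown above can be summarized with a small filter. The sketch below counts jobs per state; count_by_state is a hypothetical helper, and it assumes the layout of the examples above (two header lines, state in the fifth column, job names without spaces).

```shell
#!/bin/sh
# count_by_state: hypothetical helper, not part of GE.
# Reads a tabular qstat listing on standard input and prints one
# "<state> <count>" line per state, sorted by state.
count_by_state() {
    # NR > 2 skips the column header and the dashed separator line;
    # $5 is the "state" column in the listings shown above.
    awk 'NR > 2 && NF >= 5 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Typical use (assumes qstat is available):
#   qstat | count_by_state
```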

qacct

The qacct command provides information on completed tasks. By default, it gives access to data about jobs that ended in the last 5 days.

To get information on a specific job completed in the last 5 days, type:

> qacct -j <jobid>

To access data about older jobs, use the “-f” option to point to an accounting file:

> qacct -o <loginname> -j -f /opt/sge/ccin2p3/common/accounting.YYYY.MM

For instance, to see information on all your jobs from the last 10 days:

> qacct -o <loginname> -j -d 10 -f /opt/sge/ccin2p3/common/accounting.YYYY.MM
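
The YYYY.MM suffix in the path above has to be filled in by hand. As a sketch, the path can be built from a year and a month; accounting_file is a hypothetical helper, and it assumes that the month is zero-padded to two digits, as the "MM" placeholder suggests.

```shell
#!/bin/sh
# accounting_file: hypothetical helper, not part of GE.
# Prints the accounting file path for a given year and month, following
# the /opt/sge/ccin2p3/common/accounting.YYYY.MM pattern quoted above.
#   $1 = four-digit year, $2 = month (1-12, with or without a leading zero)
accounting_file() {
    # Strip one leading zero so that "08" is not read as an octal number.
    month=${2#0}
    printf '/opt/sge/ccin2p3/common/accounting.%04d.%02d\n' "$1" "$month"
}

# Typical use (assumes qacct is available):
#   qacct -o <loginname> -j -f "$(accounting_file 2018 2)"
```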

Note: the CPU time is expressed in HS06.seconds. See the section on the CPU consumption of jobs for more details.


The “exit_status” reported by qacct is equal to 0 if the job has completed successfully. If a soft limit has been exceeded, the following error codes are returned:

exit_status                  corresponds to
152 = 24 (SIGXCPU) + 128     SIGXCPU : exceeded cpu time ( _cpu ) or memory ( _rss )
138 = 10 (SIGUSR1) + 128     SIGUSR1 : exceeded elapsed time ( _rt )
153 = 25 (SIGXFSZ) + 128     SIGXFSZ : exceeded file size ( _fsize )

If a hard limit has been exceeded, a SIGKILL signal is sent and the exit_status is 137 = 9 (SIGKILL) + 128.
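
The arithmetic above (exit_status = signal number + 128) can be turned into a small decoder. This is a sketch only: describe_exit_status is a hypothetical helper, and the signal numbers are the ones listed above.

```shell
#!/bin/sh
# describe_exit_status: hypothetical helper, not part of GE.
# Translates an exit_status value reported by qacct into a short
# description, using exit_status = signal number + 128.
describe_exit_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "completed successfully"
        return
    fi
    if [ "$status" -le 128 ]; then
        echo "exit code $status (not a signal)"
        return
    fi
    signal=$((status - 128))
    case "$signal" in
        24) echo "SIGXCPU: soft cpu time (_cpu) or memory (_rss) limit exceeded" ;;
        10) echo "SIGUSR1: soft elapsed time (_rt) limit exceeded" ;;
        25) echo "SIGXFSZ: soft file size (_fsize) limit exceeded" ;;
        9)  echo "SIGKILL: possibly a hard limit exceeded" ;;
        *)  echo "terminated by signal $signal" ;;
    esac
}
```

For example, "describe_exit_status 152" reports the SIGXCPU case. A SIGKILL is described as only "possibly" a hard limit because the same signal can have other causes.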

qdel

This command is used to delete a task.

> qdel 1214,1215
santaklaus has registered the job 1214 for deletion
santaklaus has registered the job 1215 for deletion

qhold

This command puts a queued task on hold. While on hold, the task is not taken into account when the scheduler selects tasks for execution on a machine of the farm.

> qhold 1290
modified hold of job 1290  

qrls

This command removes the hold from a task so that the GE scheduler can release it for execution. It can, for example, be run at the end of a task to unlock another task that must only start after the first one has finished.

> qrls 1290
modified hold of job 1290
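
The chaining described above (a first task releasing a second one when it ends) could be sketched as follows. release_next is a hypothetical helper, and the sketch assumes that qrls can be run from the machine where the first task executes.

```shell
#!/bin/sh
# release_next: hypothetical helper, not part of GE.
# Releases a held job only if the preceding work succeeded.
#   $1 = ID of the job currently on hold
#   $2 = exit status of the work done by the current job
release_next() {
    next_job_id="$1"
    status="$2"
    if [ "$status" -eq 0 ]; then
        qrls "$next_job_id"
    else
        echo "work failed with status $status; job $next_job_id stays on hold" >&2
        return 1
    fi
}

# At the end of the first job's script, for example
# (do_work.sh is a placeholder name):
#   ./do_work.sh
#   release_next 1290 $?
```

Keeping the second job on hold when the first one fails leaves the choice of releasing or deleting it to the user.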

qalter

The qalter command lets you change/add/remove a given resource for a pending job.

Examples :

- To modify the requested memory of a pending job (setting s_vmem=2G)

> qalter -mods l_hard s_vmem 2G <jobid>

- To add a given resource to a pending job (adding sps=1)

> qalter -adds l_hard sps 1 <jobid>

- To remove a given resource from a pending job (removing hpss=1)

> qalter -clears l_hard hpss <jobid>

- To remove all resources from a pending job :

> qalter -clearp l_hard <jobid>

- To replace the list of all “Hard Resources” by a new list :

> qalter -l s_vmem=2G,sps=1,hpss=1 <jobid>

- To change the queue of a pending job

> qalter -q <new_queue_name> <jobid>

Type “man qalter” for other options (for example, changing the notification email).

qmod

This command sends a signal to a running job:
qmod -sj | -usj | -cj (suspend | unsuspend | clear error)

> qmod -sj 1277
santaklaus - suspended job 1277
> qmod -usj 1277
santaklaus - unsuspended job 1277

qresub

This command submits a copy of a (pending or running) job. The new job is identical to the original but gets a new job ID.

> qresub 1277
Your job 1278 ("test.sh") has been submitted

qacct

In addition to details on jobs, the qacct command reports the consumption of wall-clock time, CPU time and system time for a given hostname, queue name, group name, owner name, job name or job ID.
The report can be restricted to the queues meeting the resource requirements specified with the -l switch.

> qacct -l hpss=1
Total System Usage
WALLCLOCK   UTIME   STIME     CPU       MEMORY       IO          IOW
=========================================================================
826         0.030   0.057     0.087     0.000        0.001       0.000

qquota

qquota shows the current Grid Engine resource quota sets.

qhost

qhost shows the current status of the available Grid Engine hosts, queues and the jobs associated with the queues.

  • en/ge_list_of_commands.txt
  • Last modified: 2018/02/09 11:45
  • by Jean-René ROUET