en:distributed_task_manager [2016/12/16 10:16] (current)
 +Last modified: Oct 21, 2015 by Calvat\\
 +\\
 +
 +====== Distributed Task Manager ======
 +
 +\\
 +\\
 +
 +=====  Introduction ​ =====
 +
 +The Distributed Task Manager (DTM) is a user-oriented virtualization system that gives users the perception of working with a uniform, simplified and powerful computing system.\\
 +\\
 +With DTM, users can execute sets of tasks automatically and transparently across several heterogeneous distributed computing infrastructures, such as batch clusters and grids.\\
 +\\
 +DTM uses a pull scheduling approach: tasks are executed in a distributed way by agents (a kind of pilot job) that hide the heterogeneity of the infrastructures.\\
 +\\
 +DTM can be used to submit and monitor jobs through the TORQUE and SGE batch schedulers, the Grid (gLite) and local Linux/Unix hosts (on one, several or all of them simultaneously).
 +=====  Components ​ =====
 +
 +DTM is composed of four main components: user tasks, agents, managers and a database.
 +<​code>​
 +.         DTM database
 +          /         \
 +         / ​          \
 +        /             \
 + User Tasks ------- DTM manager
 +        \             /
 +         ​\ ​          /
 +          \         /
 +           DTM Agent
 +
 +</​code>​
 +
 +====  User Tasks  ====
 +
 +A task represents one action that a user wishes to perform on a computing machine. It can be an executable file, a set of commands, or a shell script. The task is built by the user and contains the business logic of the user application.
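As an illustration, a task script can be as small as the sketch below. Its contents are an assumption: this page does not show the helloworld.sh script used later in the tutorial, and it does not say how DTM passes the script arguments, so the function simply prints whatever arguments it receives.

```shell
#!/bin/sh
# Hypothetical task script (not taken from the DTM distribution).
# It prints the host it runs on and the arguments it was given.

run_task() {
    echo "task started on $(uname -n)"
    # "$*" joins all arguments with single spaces.
    echo "arguments: $*"
}

run_task "$@"
```

Running the script by hand with a few arguments before registering it is an easy way to check that it behaves as expected outside DTM.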
 +====  DTM Agent  ====
 +
 +Agents are lightweight, generic software components that run as independent jobs to execute one or several user tasks. Agents can run on a worker node of a site controlled by a batch system. They use a pull job scheduling paradigm, that is, each agent requests tasks when it is ready to run them. Alternatively, an agent can run as part of a job executed on a worker node, as a so-called “pilot agent”.
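The pull paradigm can be pictured as a loop in which the agent repeatedly asks for the next task and stops when none is left. The sketch below is purely illustrative and is not the DTM agent: the queue is assumed to be a plain text file with one command per line, and fetch_next_task is a hypothetical stand-in for a request to the DTM database.

```shell
#!/bin/sh
# Illustrative pull loop; NOT the real DTM agent.
# Assumption: the task queue is a text file with one command per line.

QUEUE_FILE=$(mktemp)
printf '%s\n' 'echo task-1' 'echo task-2' 'echo task-3' > "$QUEUE_FILE"

# Hypothetical helper: print the first queued task and remove it from the queue.
# Returns non-zero when the queue is empty.
fetch_next_task() {
    [ -s "$QUEUE_FILE" ] || return 1
    head -n 1 "$QUEUE_FILE"
    tail -n +2 "$QUEUE_FILE" > "$QUEUE_FILE.tmp" && mv "$QUEUE_FILE.tmp" "$QUEUE_FILE"
}

# The agent pulls tasks one at a time until none are left.
run_agent() {
    while task=$(fetch_next_task); do
        sh -c "$task"
    done
}

run_agent
rm -f "$QUEUE_FILE" "$QUEUE_FILE.tmp"
```

Because the agent decides when to ask for work, the same loop can run unchanged on a batch worker node, a grid node or a local host.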
 +====  DTM Manager ​ ====
 +
 +The manager automatically submits job agents across the different infrastructures and monitors their behaviour.
 +====  DTM Database ​ ====
 +
 +The DTM information system is managed with a DBMS. The database provides a central catalog for the DTM. A relational data model specifies information and relationships about the system components.
 +=====  Setting environment ​ =====
 +
 +DTM is available in the directory:
 +<​code>​
 +/​afs/​in2p3.fr/​grid/​toolkit/​vo.rhone-alpes.idgrilles.fr/​dtm
 +</​code>​
 +Set the DTM_PATH environment variable:
 +<​code>​
 +# sh/bash
 +export DTM_PATH=/afs/in2p3.fr/grid/toolkit/vo.rhone-alpes.idgrilles.fr/software/dtm
 +# csh/tcsh
 +setenv DTM_PATH /afs/in2p3.fr/grid/toolkit/vo.rhone-alpes.idgrilles.fr/software/dtm
 +</​code>​
 +Update your PATH variable:
 +<​code>​
 +# sh/bash
 +PATH=$PATH:$DTM_PATH/bin
 +# csh/tcsh
 +set path=($path $DTM_PATH/bin)
 +</​code>​
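For sh/bash users, both settings can be placed in a shell startup file. The sketch below is a suggestion rather than part of DTM: the helper name add_to_path is hypothetical, and it appends the directory only if it is not already on PATH, so sourcing the file several times does not make PATH grow.

```shell
# Append a directory to PATH only if it is not already present.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# Path as given in the export example above.
DTM_PATH=/afs/in2p3.fr/grid/toolkit/vo.rhone-alpes.idgrilles.fr/software/dtm
export DTM_PATH
add_to_path "$DTM_PATH/bin"
export PATH
```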
 +=====  Usage  =====
 +
 +DTM usage consists of two main operations:
 +  * Register and enable tasks
 +  * Start or perform tasks
 +=====  Tutorial ​ =====
 +
 +
 +====  Get a DTM identifier ​ ====
 +
 +Request a DTM identifier from the DTM administrator.
 +====  Create a connection environment ​ ====
 +
 +This step enables automatic authentication with your DTM identifier. It is required only once. Use the dtm-init command; to display its usage, type:
 +<​code>​
 +> dtm-init -h 
 +</​code>​
 +which prints:
 +<​code>​
 +usage: dtm-init -u username -e password [-h help] [-v version]
 +</​code>​
 +Example:
 +<​code>​
 +> dtm-init -u <​your_DTM_identifier>​
 +</​code>​
 +The command then prompts for your DTM password.
 +====  Create task  ====
 +
 +To create a task, use the dtm-task-add command, which takes both required and optional options. To display them, type:
 +<​code>​
 +> dtm-task-add -h
 +</​code>​
 +which prints:
 +<​code>​
 +usage: dtm-task-add -c cputime -m memory -p production -s script -t taskname
 +     [-n manager] [-a scriptarg] [-d dependency] [-b tmpbatch] [-v version][-h help]
 +
 +
 +REQUIRED OPTIONS:
 + ​-c,​--cputime <​cputime> ​        Max cpu time for this task in HS06 seconds
 + ​-m,​--memory <​memory> ​          Max memory for this task in Megabyes
 + ​-p,​--production <​production> ​  ​Production name
 + ​-s,​--script <​script> ​          ​Script file to be executed
 + ​-t,​--taskname <task name> ​     Task name must be unique for a given
 +                                production
 +
 +OPTIONAL OPTIONS:
 + ​-n,​--manager <manager name> ​   Manager name
 + ​-a,​--scriptarg <​scriptarg> ​    ​Quoted args for the script be executed ​
 + ​-d,​--dependency <​dependency> ​  ​Production name dependency
 + ​-b,​--tmpbatch <​tmpbatch> ​      Max tmpbatch size for this task
 +
 +</​code>​
 +Units for the options:
 +^dtm-task-add option^Unit        ^
 +|cputime ​    |HS06 seconds|
 +|memory ​     |MB          |
 +|tmpbatch ​   |MB          |
 +
 +Example:
 +<​code>​
 +> dtm-task-add -t task2 -c 100 -m 500 -p test -s /​afs/​in2p3.fr/​home/​u/​user/​helloworld.sh -a "hello world"
 +</​code>​
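To register many similar tasks, the command can be wrapped in a loop. The sketch below reuses the production name, script path and resource values from the example above and uses only options listed in the dtm-task-add usage; the task names and per-task arguments are made up for illustration. It is a dry run that only prints the commands: remove the leading echo to execute them.

```shell
#!/bin/sh
# Dry run: print one dtm-task-add command per task instead of executing it.
# Production name, script path and resource values are copied from the example above;
# the task names and per-task arguments are made up for illustration.

PRODUCTION=test
SCRIPT=/afs/in2p3.fr/home/u/user/helloworld.sh

print_task_commands() {
    for i in 1 2 3; do
        # Task names must be unique within a production.
        echo dtm-task-add -t "task$i" -c 100 -m 500 -p "$PRODUCTION" -s "$SCRIPT" -a "\"input $i\""
    done
}

print_task_commands
```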
 +
 +====  List productions ​ ====
 +
 +<​code>​
 +> dtm-prod-list
 +</​code>​
 +
 +====  List tasks  ====
 +
 +<​code>​
 +> dtm-task-list -h
 +</​code>​
 +
 +====  Enable a production ​ ====
 +
 +Enabling a production allows DTM to process its registered tasks. You can create a production and enable it later.
 +<​code>​
 +> dtm-alter -a ENABLE -p <your production name>
 +</​code>​
 +
 +====  Start production ​ ====
 +
 +DTM determines the number of agents (jobs) needed to execute the tasks of the production by adding up the CPU time requested by each task. The job agents are submitted automatically.
 +<​code>​
 +> dtm-start -h
 +</​code>​
 +which prints:
 +<​code>​
 +usage: dtm-start -i infrastructure ​ [-n manager] [-p production]
 +    [-s sites] [-j max jobs] [-f monitoring-frequency] [-b batch job arguments]
 +    [-r remained mode] [-h help] [-v version]
 +
 +REQUIRED OPTIONS:
 + ​-i,​--infrastructure <​infrastructure> ​ Infrastructure:​ LOCAL,​TORQUE,​SGE,​GRID,​ALL
 +
 +OPTIONAL OPTIONS:
 + ​-n,​--manager <manager name> ​   Manager name
 + ​-p,​--production <​production> ​  ​Production name
 + ​-s,​--sites <​sites> ​            Max number of sites (for GRID)
 + ​-j,​--max jobs <​jobs> ​          Max number of jobs
 + ​-f,​--monitoring-frequency ​     Monitor frequency to check production in minutes
 + ​-m,​--submission mode           ​Modes:​ STRICT, PROGRESSIVE(default)
 + ​-r,​--remained mode          Agent jobs remain in execution (job pilot mode)
 + -b --batch job arguments ​      ​Aditional batch job arguments for batch system
 +
 +</​code>​
 +Example:
 +<​code>​
 +> dtm-start -p test -i SGE 
 +</​code>​
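The sizing rule described at the start of this section (adding up the CPU time requested by each task) can be illustrated with a little arithmetic. This is a guess at the calculation, not DTM's actual algorithm: the page does not say how the total is turned into a number of jobs, so the sketch assumes a hypothetical per-job CPU-time limit and rounds up.

```shell
#!/bin/sh
# Estimate how many agent jobs are needed for a set of tasks.
# Assumption (not documented by DTM): each agent job may consume at most
# JOB_LIMIT HS06 seconds, and the number of jobs is the total rounded up.

# Usage: estimate_agents JOB_LIMIT CPUTIME...
estimate_agents() {
    job_limit=$1
    shift
    total=0
    for cputime in "$@"; do
        total=$((total + cputime))
    done
    # Integer ceiling division.
    echo $(( (total + job_limit - 1) / job_limit ))
}

# Three tasks requesting 100, 250 and 400 HS06 seconds, 300 per job.
estimate_agents 300 100 250 400
```

With these numbers the total is 750 HS06 seconds, so three jobs of at most 300 HS06 seconds each are needed.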
 +
 +===  Using additional batch job arguments in SGE  ===
 +
 +Use the "-b" option of dtm-start to pass additional SGE arguments, such as resource requests.\\
 +\\
 +Example:
 +<​code>​
 +> dtm-start -p test -i SGE -b "-l sps=1,​ct=01:​40:​00"​
 +</​code>​
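The ct resource in the example is written as HH:MM:SS. A small helper, which is not part of DTM and whose name is hypothetical, can build that string from a number of seconds. It only formats the value; it does not convert between HS06 seconds and the time units expected by SGE.

```shell
# Format a number of seconds as HH:MM:SS.
seconds_to_ct() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# 6000 seconds is 1 hour and 40 minutes.
seconds_to_ct 6000
```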
 +
 +===  Specifying DTM job agents in SGE  ===
 +
 +If you specify the SGE class (option -q) or the CPU time (option -l ct=HH:MM:SS), DTM does not compute the number of job agents (SGE jobs) needed to execute the tasks of the production. In this case you must specify all the SGE arguments required to run the job agents.\\
 +\\
 +Example:
 +<​code>​
 +> dtm-start -p test -i SGE -b "​-l ​ sps=1,​ct=01:​40:​00 -q G"
 +</​code>​
 +
 +====  Display Global Reports ​ ====
 +
 +<​code>​
 +> dtm-manager -h 
 +</​code>​
 +which prints:
 +<​code>​
 +usage: dtm-manager -i infrastructure ​ [-n manager] [-p production]
 +       [-s sites] [-j max jobs] [-f monitoring-frequency] [-b batch job arguments]
 +       [-r remained mode] [-h help] [-v version]
 +
 +REQUIRED OPTIONS:
 + ​-i,​--infrastructure <​infrastructure> ​ Infrastructure:​ LOCAL,​TORQUE,​SGE,​GRID,​ALL
 +
 +OPTIONAL OPTIONS:
 + ​-n,​--manager <manager name> ​   Manager name
 + ​-p,​--production <​production> ​  ​Production name
 + ​-s,​--sites <​sites> ​            Max number of sites (for GRID)
 + ​-j,​--max jobs <​jobs> ​          Max number of jobs
 + ​-f,​--monitoring-frequency ​     Monitor frequency to check production in minutes
 + ​-m,​--submission mode           ​Modes:​ STRICT, PROGRESSIVE(default)
 + ​-r,​--remained mode       Agent jobs remain in execution (job pilot mode)
 + -b --batch job arguments ​      ​Aditional batch job arguments for batch system
 +</​code>​
 +Example:
 +<​code>​
 +> dtm-manager ​ -i SGE -p test 
 +</​code>​
 +
 +====  Delete productions and tasks  ====
 +
 +<​code>​
 +> dtm-cancel -h 
 +</​code>​
 +which prints:
 +<​code>​
 +usage: dtm-cancel -n manager -p production -i infrastructure ​
 +       [-t task | -m | -g | -s | -a ] [-h help] [-v version]
 +
 +
 +REQUIRED OPTIONS:
 + ​-n,​--manager <manager name> ​   Manager name
 + ​-p,​--production <​production> ​  ​Production name
 + -i, <​infrastructure> ​          ​Infrastructure:​ LOCAL, SGE, TORQUE GRID, ALL
 + -t <​task>​ | [ -m ] [ -g ] [ -s ]  [ -a ] 
 +
 +OPTIONAL OPTIONS:
 + ​-t,​--taskname <​taskname> ​      ​Cancel a task
 + ​-m,​--jobs managers ​            ​Cancel all jobs managers
 + ​-g,​--jobs agents ​              ​Cancel all jobs agents
 + ​-s,​--tasks ​                    ​Cancel all tasks
 + ​-a,​--all ​                      ​Cancel all tasks, job agents and managers
 +
 +</​code>​
 +Example:
 +<​code>​
 +> dtm-cancel -p test -i SGE -a 
 +</​code>​
 +=====  Using DTM in Grid  =====
 +
 +DTM can be used to submit jobs to grids. You need a grid certificate and must be a member of a virtual organization (VO).
 +====  Create a grid proxy  ====
 +
 +This step creates a proxy with VOMS extensions, which dtm-manager uses to submit jobs to the grid. Run the command again each time your proxy expires. To display its usage, type:
 +<​code>​
 +> dtm-proxy-init -h 
 +</​code>​
 +which prints:
 +<​code>​
 +usage: dtm-proxy-init -vo <​vo_name> ​
 +
 +REQUIRED OPTIONS:
 + ​-vo,<​vo_name> ​ virtual organisation name
 +</​code>​
 +Example:
 +<​code>​
 +> dtm-proxy-init -vo vo.tidra.org
 +Enter GRID pass phrase:
 +Your identity: /​O=GRID-FR/​C=FR/​O=CNRS/​OU=CC-IN2P3/​CN=Foo Bar
 +Creating temporary proxy ......... Done
 +</​code>​
 +=====  Configuration and Customization ​ =====
 +
 +[[:​en:​dtm_configuration_and_customization|DTM Configuration and Customization]]
 +=====  Publications ​ =====
 +
 +**DTM a lightweight computing virtualization system based on iRODS** Y. Cardenas, P. Calvat, JY. Nief, T. Kachelhoffer. //iRODS User Group Meeting 2012//, University of Arizona, Tucson, AZ, USA. March 1-2, 2012.
 +=====  Technical Implementation Documentation ​ =====
 +
 +[[:​en:​dtm_technical_implementation|Technical Implementation]]
 +=====  Terms of Use  =====
 +
 +DTM is currently a software prototype; it is not an official service or product of CC-IN2P3. Therefore, CC-IN2P3 does not provide support for DTM.
 +
  