wiki:GridWay

GridWay is an open-source component for meta-scheduling in the Grid. It gives end users, application developers and managers of Grid infrastructures a scheduling functionality. It is fully functional on GISELA and can interface with the computing, file transfer and information services available within the GISELA infrastructure. GridWay is not intended to replace the resource brokers available in the GISELA distribution, but to provide a meta-scheduling alternative with greater functionality and higher performance for given application profiles.

Installation on a UI

Required Middleware

Install the middleware that corresponds to each driver you intend to use:

  • Globus Toolkit 4 or 5
  • gLite UI 3.1 (GRAM2-based)
  • gLite UI 3.2 (CREAM-based)

Download

Our binary GridWay-5.8 version has GT2, CREAM and DRM4G drivers for the x86_64 architecture.

  1. Download GridWay (e.g. into your HOME directory). The binary is compatible with Linux kernel 2.6.x or later:
    [user@ui~]$ wget http://www.meteo.unican.es/work/DRM4G/drm4g_gridway_x86_64_r1288.zip
    
  2. Unzip the distribution file:
    [user@ui~]$ unzip drm4g_gridway_x86_64_r1288.zip
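
Before downloading, it can help to confirm that the UI matches what the binary expects (x86_64 and Linux kernel 2.6.x or later). The following is a minimal sketch; the function name gw_platform_ok is hypothetical and is not part of GridWay:

```shell
# gw_platform_ok ARCH KERNEL_RELEASE   (hypothetical helper, not shipped with GridWay)
# Returns 0 when ARCH is x86_64 and the kernel release is 2.6 or later.
gw_platform_ok() {
    arch=$1
    release=$2
    [ "$arch" = "x86_64" ] || return 1
    major=${release%%.*}            # text before the first dot
    rest=${release#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}          # leading digits of the second component
    [ -n "$major" ] && [ -n "$minor" ] || return 1
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 6 ]; }
}

# Typical use on the UI:
#   gw_platform_ok "$(uname -m)" "$(uname -r)" && echo "platform looks compatible"
```

The check takes the architecture and kernel release as arguments so that it can be tried with sample values before running it against uname.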
    

Environment Configuration

Set up the environment variables "GW_LOCATION" and "PATH" for GridWay.

[user@ui~]$ export GW_LOCATION=$HOME/drm4g_gridway
[user@ui~]$ export PATH=$GW_LOCATION/bin:$PATH

Then set up the environment for GISELA:

  • Create a file (e.g. gisela_environment.sh) with these variables:
    export LCG_CATALOG_TYPE=lfc
    export LFC_HOST=lfc.eela.ufrj.br
    export LCG_GFAL_INFOSYS=bdii.eela.ufrj.br:2170
    export LFC_HOME=/grid/prod.vo.eu-eela.eu
    export VO="prod.vo.eu-eela.eu"
    
  • Load the variables from gisela_environment.sh into the current shell environment:
    [user@ui~]$ source gisela_environment.sh
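
A quick way to confirm that the environment is complete is to list any variable that is still unset. This is a minimal sketch; the function name gw_env_missing is hypothetical, and the variable list is simply the one used on this page:

```shell
# gw_env_missing   (hypothetical helper, not part of GridWay or gLite)
# Prints the name of every required variable that is unset or empty.
# Returns 0 when nothing is missing, 1 otherwise.
gw_env_missing() {
    missing=0
    for name in GW_LOCATION LCG_CATALOG_TYPE LFC_HOST LCG_GFAL_INFOSYS LFC_HOME VO; do
        eval "value=\${$name:-}"    # indirect lookup of the variable called $name
        if [ -z "$value" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use:
#   gw_env_missing || echo "source gisela_environment.sh and export GW_LOCATION first"
```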
    

Available Resources

The lcg-infosites command can be used to obtain VO information on Grid resources. Before using lcg-infosites, source the gisela_environment.sh file for the GISELA infrastructure. Some usage examples are shown below:

  • Find out the CEs of your VO:
    [user@ui~]$ lcg-infosites --vo $VO ce 
    #CPU    Free    Total Jobs      Running Waiting ComputingElement
    ----------------------------------------------------------
      16      16       0              0        0    gantt.cefet-rj.br:8443/cream-pbs-prod
    21561      0    3409              7     3402    ce206.cern.ch:8443/cream-lsf-grid_eela
    21561    826    3409              7     3402    ce204.cern.ch:8443/cream-lsf-grid_eela
    21561    826    3409              7     3402    ce203.cern.ch:8443/cream-lsf-grid_eela
    21561      0    3409              7     3402    ce205.cern.ch:8443/cream-lsf-grid_eela
    21561    826    3409              7     3402    ce208.cern.ch:8443/cream-lsf-grid_eela
    21561      0    3409              7     3402    ce207.cern.ch:8443/cream-lsf-grid_eela
    21561    826    3409              7     3402    ce130.cern.ch:2119/jobmanager-lcglsf-grid_eela
    21561    826    3409              7     3402    ce132.cern.ch:2119/jobmanager-lcglsf-grid_eela
    21561    826    3409              7     3402    ce131.cern.ch:2119/jobmanager-lcglsf-grid_eela
    21561    826    3409              7     3402    ce133.cern.ch:2119/jobmanager-lcglsf-grid_eela
     260     107       4              4        0    ce01-tic.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
    1160     467       0              0        0    gridgate.cs.tcd.ie:2119/jobmanager-pbs-sixhour
    1160     467       0              0        0    gridgate.cs.tcd.ie:2119/jobmanager-pbs-thirtym
    1160     467       5              4        1    gridgate.cs.tcd.ie:2119/jobmanager-pbs-threeday
    1160     467       2              2        0    gridgate.cs.tcd.ie:2119/jobmanager-pbs-oneday
      10      10       0              0        0    ce01.unlp.edu.ar:2119/jobmanager-lcgpbs-long
    ...........
    
  • List the CEs with their running jobs, free CPUs, and maximum wallclock and CPU time:
    [user@ui~]$ lcg-info --vo $VO --list-ce --attrs RunningJobs,FreeCPUs,MaxWCTime,MaxCPUTime
    - CE: axon-g01.ieeta.pt:2119/jobmanager-lcgpbs-prod
      - RunningJobs         0
      - FreeCPUs            5
      - MaxWCTime           4320
      - MaxCPUTime          2880
    
    - CE: cale.uniandes.edu.co:8443/cream-pbs-prod
      - RunningJobs         3
      - FreeCPUs            94
      - MaxWCTime           4320
      - MaxCPUTime          2880
    ...........
    
  • Find out the SEs of your VO.
    [user@ui~]$ lcg-infosites --vo $VO se
    Avail Space(Kb) Used Space(Kb)  Type    SEs
    ----------------------------------------------------------
    1258363960      8651392         n.a     se.labmc.inf.utfsm.cl
    288012854       11517683563     n.a     lnx097.eela.if.ufrj.br
    187037782       27605724        n.a     se01.macc.unican.es
    

For more information, run lcg-infosites --help or see the gLite documentation.
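
The CE listing can be narrowed to the resources that currently have free CPUs. The sketch below assumes the six-column layout shown in the lcg-infosites example above (free CPUs in column 2, CE identifier in column 6); the function name ces_with_free_cpus is hypothetical:

```shell
# ces_with_free_cpus [MIN]   (hypothetical helper)
# Reads `lcg-infosites --vo $VO ce` output on stdin and prints "free<TAB>CE"
# for each CE with at least MIN free CPUs (default 1), most free CPUs first.
ces_with_free_cpus() {
    min=${1:-1}
    awk -v min="$min" '
        # Data rows have six fields and a numeric second column (Free).
        NF == 6 && $2 ~ /^[0-9]+$/ && $2 + 0 >= min + 0 { print $2 "\t" $6 }
    ' | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Typical use:
#   lcg-infosites --vo $VO ce | ces_with_free_cpus 10
```

The header, the separator line and the trailing dots are skipped because they do not have six fields with a numeric second column.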

Configuration of GridWay to Access GISELA Resources

The following steps describe a specific configuration of the drivers for the GISELA infrastructure.

In file "$GW_LOCATION/etc/gwd.conf":

# Example MAD Configuration for GISELA
# GT2 
IM_MAD = gisela_gt2:gw_im_mad_mds2_glue-bdii:-l etc/gt2.list -q (GlueCEAccessControlBaseRule=VO\:prod.vo.eu-eela.eu) -s bdii.eela.ufrj.br:tm_gt2:em_gt2
EM_MAD = em_gt2:gw_em_mad_gram2::rsl_nsh
TM_MAD = tm_gt2:gw_tm_mad_dummy:-u gsiftp\://ui01.macc.unican.es

# CREAM
IM_MAD = gisela_cream:gw_im_mad_mds2_glue-bdii:-l etc/cream.list -q (GlueCEAccessControlBaseRule=VO\:prod.vo.eu-eela.eu) -s bdii.eela.ufrj.br:tm_cream:em_cream
EM_MAD = em_cream:gw_em_mad_cream::jdl
TM_MAD = tm_cream:gw_tm_mad_dummy:-g

gw_im_mad_mds2_glue-bdii is only available in our binary GridWay-5.8 version. If you want to use different drivers, visit GridWay.org.

There are three options for the configuration of the IM MAD:

  • -l: host list file to be used by GridWay.
    • Example of gt2.list for GISELA:
      [user@ui~]$ lcg-infosites --vo $VO ce | awk 'NR>2 {print $6}'|grep jobmanager |awk -F ":" '{print $1}' | sort | uniq > $GW_LOCATION/etc/gt2.list
      [user@ui~]$ cat $GW_LOCATION/etc/gt2.list
      ce01-tic.ciemat.es
      ce01.unlp.edu.ar
      ce.labmc.inf.utfsm.cl
      tochtli.nucleares.unam.mx
      grid012.ct.infn.it
      ce01.eela.if.ufrj.br
      ce.cp.di.uminho.pt
      ce01.macc.unican.es
      ce01.up.pt
      grid001.fe.up.pt
      
    • Example of cream.list for GISELA:
      [user@ui~]$ lcg-infosites --vo $VO ce | awk 'NR>2 {print $6}'|grep cream |awk -F ":" '{print $1}' | sort | uniq > $GW_LOCATION/etc/cream.list
      [user@ui~]$ cat $GW_LOCATION/etc/cream.list
      gantt.cefet-rj.br
      ce206.cern.ch
      ce204.cern.ch
      ce205.cern.ch
      ce207.cern.ch
      ce208.cern.ch
      tochtli64.nucleares.unam.mx
      ce02.eela.if.ufrj.br
      cream01.cecalc.ula.ve
      ce.egee.di.uminho.pt
      cale.uniandes.edu.co
      grid001.fc.up.pt
      
  • -q: filter that restricts the GridWay instance to the queues authorized for your VO.
  • -s: information server in a hierarchical configuration.
    IM_MAD = gisela_gt2:gw_im_mad_mds2_glue-bdii:-l etc/gt2.list -q (GlueCEAccessControlBaseRule=VO\:prod.vo.eu-eela.eu) -s bdii.eela.ufrj.br:tm_gt2:em_gt2
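
The two host-list pipelines shown for the -l option differ only in the pattern they select, so they can be wrapped in one helper. This is a minimal sketch with a hypothetical function name, ce_hosts; it assumes the same lcg-infosites column layout as the examples on this page:

```shell
# ce_hosts PATTERN   (hypothetical helper)
# Reads `lcg-infosites --vo $VO ce` output on stdin and prints the unique host
# names of the CEs whose identifier contains PATTERN ("jobmanager" or "cream").
ce_hosts() {
    awk -v pat="$1" '
        NR > 2 && index($6, pat) > 0 {
            split($6, parts, ":")   # host:port/queue -> keep the host part
            print parts[1]
        }
    ' | sort -u
}

# Typical use:
#   lcg-infosites --vo $VO ce | ce_hosts jobmanager > $GW_LOCATION/etc/gt2.list
#   lcg-infosites --vo $VO ce | ce_hosts cream      > $GW_LOCATION/etc/cream.list
```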
    

There are two options for the configuration of the TM MAD:

  • -g: starts a GASS server for each user.
    TM_MAD = tm_gt2:gw_tm_mad_dummy:-g
    
  • -u: specifies the URL of a GridFTP server running in the client. For example:
    TM_MAD = tm_gt2:gw_tm_mad_dummy:-u gsiftp\://ui01.macc.unican.es
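
Each IM_MAD line ends with the names of the TM and EM MADs it uses, so a mistyped name is easy to miss. The sketch below checks that every name referenced by an IM_MAD is defined by a TM_MAD or EM_MAD line. It assumes the layout name:executable:arguments:tm_name:em_name, which is inferred from the examples on this page, and the function name check_mads is hypothetical:

```shell
# check_mads [FILE...]   (hypothetical helper, not part of GridWay)
# Reads gwd.conf-style lines (from FILE or stdin) and reports every IM_MAD whose
# trailing TM/EM names have no matching TM_MAD or EM_MAD definition.
# Assumed layout, inferred from the examples on this page:
#   IM_MAD = name:executable:arguments:tm_name:em_name
check_mads() {
    awk '
        /^[ \t]*#/ || !/=/ { next }            # skip comments and non-assignments
        {
            key = $0
            sub(/[ \t]*=.*$/, "", key)         # text before the first "="
            sub(/^[ \t]+/, "", key)
            val = $0
            sub(/^[^=]*=[ \t]*/, "", val)      # text after the first "="
            sub(/[ \t]+$/, "", val)
            gsub(/\\:/, "%3A", val)            # hide escaped colons before splitting
            n = split(val, f, ":")
            if (key == "TM_MAD") tm[f[1]] = 1
            else if (key == "EM_MAD") em[f[1]] = 1
            else if (key == "IM_MAD" && n >= 5) { want_tm[f[1]] = f[n-1]; want_em[f[1]] = f[n] }
        }
        END {
            bad = 0
            for (name in want_tm) {
                if (!(want_tm[name] in tm)) { print name ": TM_MAD " want_tm[name] " is not defined"; bad = 1 }
                if (!(want_em[name] in em)) { print name ": EM_MAD " want_em[name] " is not defined"; bad = 1 }
            }
            exit bad
        }
    ' "$@"
}

# Typical use:
#   check_mads "$GW_LOCATION/etc/gwd.conf" && echo "MAD names are consistent"
```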
    

Accessing the VOMS servers

To use the GISELA resources, the user should initialize a proxy through the VOMS server:

[user@ui~]$ voms-proxy-init --voms prod.vo.eu-eela.eu
Cannot find file or dir: /oceano/gmeteo/users/carlos/.glite/vomses
Enter GRID pass phrase:
Your identity: /DC=es/DC=irisgrid/O=unican/CN=josecarlos.blanco
Creating temporary proxy ........................................ Done
Contacting  voms.eela.ufrj.br:15003 [/C=BR/O=ICPEDU/O=UFF BrGrid CA/O=UFRJ/OU=IF/CN=host/voms.eela.ufrj.br] "prod.vo.eu-eela.eu" Done
Creating proxy ................................... Done
Your proxy is valid until Tue Aug 23 22:15:06 2011
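
Because the proxy expires, it may be worth checking the remaining lifetime before starting long runs. The sketch below only compares a number of seconds against a threshold; the function name proxy_lasts is hypothetical, and the commented usage assumes that voms-proxy-info on your UI supports the --timeleft option:

```shell
# proxy_lasts SECONDS_LEFT [MIN_SECONDS]   (hypothetical helper)
# Returns 0 when SECONDS_LEFT is a whole number of at least MIN_SECONDS (default 3600).
proxy_lasts() {
    left=$1
    min=${2:-3600}
    case $left in
        ''|*[!0-9]*) return 1 ;;    # empty or non-numeric: treat as no valid proxy
    esac
    [ "$left" -ge "$min" ]
}

# Typical use, assuming voms-proxy-info is on the PATH and supports --timeleft:
#   proxy_lasts "$(voms-proxy-info --timeleft 2>/dev/null)" 7200 ||
#       voms-proxy-init --voms prod.vo.eu-eela.eu
```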

Quick Start Guide

By default, the distribution includes a configuration to use the GISELA infrastructure. Follow the steps below:

  1. Start GridWay. The "GW_LOCATION" and "PATH" variables must be exported.
    [user@ui~]$ gwd
    
  2. Show information about all available resources. The gwhost command needs a few seconds to update the information:
    [user@ui~]$ gwhost
    HID PRIO OS              ARCH  MHZ  %CPU    MEM(F/T)     DISK(F/T)      N(U/F/T) LRMS               HOSTNAME                                             
    0   1    ScientificSLBor x86_6 3200    0   1024/1024           0/0      0/78/260 jobmanager-lcgpbs  ce01-tic.ciemat.es            
    1   1    ScientificSLBer i686  1865    0     900/900           0/0       0/10/10 jobmanager-lcgpbs  ce01.unlp.edu.ar              
    2   1    ScientificSLBer x86_6 1600    0   2048/2048           0/0     0/116/132 jobmanager-lcgpbs  ce.labmc.inf.utfsm.cl         
    3   1    ScientificSLBer i686  2400    0   3072/3072           0/0         0/4/4 jobmanager-lcgpbs  tochtli.nucleares.unam.mx     
    4   1    ScientificSLBer i686  2193    0   4096/4096           0/0      0/43/115 jobmanager-lcglsf  grid012.ct.infn.it            
    5   1    Scientific Linu x86_6 2000    0   8150/8150           0/0       0/17/48 cream-pbs          ce01.eela.if.ufrj.br          
    6   1    ScientificCERNS i386  2330    0     512/512           0/0       0/12/12 jobmanager-lcgpbs  ce.cp.di.uminho.pt            
    7   1    CentOSFinal     x86_6 2400    0 16000/16000           0/0     0/229/454 jobmanager-lcgpbs  ce01.macc.unican.es           
    8   1    ScientificSLSL  x86_6 2400    0   4058/4058           0/0       0/34/36 jobmanager-lcgsge  ce01.up.pt                    
    9   1    ScientificSLSL  x86_6 2400    0   4058/4058           0/0       0/22/22 jobmanager-lcgsge  grid001.fe.up.pt              
    10  1    ScientificSLBer i686  2330    0   2048/2048           0/0       0/18/18 cream-pbs          gantt.cefet-rj.br             
    11  1                             0    0         0/0           0/0     0/0/21818 cream-lsf          ce206.cern.ch                 
    12  1                             0    0         0/0           0/0   0/833/21818 cream-lsf          ce204.cern.ch                 
    13  1                             0    0         0/0           0/0     0/0/21818 cream-lsf          ce205.cern.ch                 
    14  1                             0    0         0/0           0/0     0/0/21818 cream-lsf          ce207.cern.ch                 
    15  1                             0    0         0/0           0/0   0/833/21818 cream-lsf          ce208.cern.ch                 
    16  1    CentOSFinal     x86_6 2670    0 12000/12000           0/0       0/25/40 cream-pbs          tochtli64.nucleares.unam.mx   
    17  1    Scientific Linu x86_6 2000    0   8178/8178           0/0      0/55/200 cream-pbs          ce02.eela.if.ufrj.br          
    18  1    ScientificSLBor x86_6 3000    0   2048/2048           0/0       0/24/24 cream-pbs          cream01.cecalc.ula.ve         
    19  1                             0    0         0/0           0/0         0/0/0                    ce.egee.di.uminho.pt          
    20  1    ScientificCERNS x86_6 1600    0   4096/4096           0/0     0/138/188 cream-pbs          cale.uniandes.edu.co          
    21  1    ScientificSLSL  x86_6 2400    0   4058/4058           0/0       0/20/22 cream-sge          grid001.fc.up.pt 
                         
    
  3. Prepare for job submission. "$GW_LOCATION/examples" contains job templates ready to submit. Job templates let you configure a job in terms of the files it needs, the files it generates, and the requirements and ranks of execution hosts. The following template simply runs the Linux date command. Submit it to check GridWay.
    [user@ui~]$ cat $GW_LOCATION/examples/date/date.jt
    EXECUTABLE=/bin/date
    
  4. Submit the job. The gwsubmit command submits a job to the available resources:
    [user@ui~]$ gwsubmit $GW_LOCATION/examples/date/date.jt
    
  5. Check the evolution of the job. The gwps command reports the current job status:
    [user@ui~]$ gwps
    USER         JID DM   EM   START    END      EXEC    XFER    EXIT NAME            HOST                                          
    user:0       0   pend ---- 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt         --                        
    
    If you run gwps repeatedly, you can see the different states of the job:
    user:0       0   pend ---- 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt         --                                            
    user:0       0   prol ---- 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt         --                                                            
    user:0       0   wrap pend 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt         ce01.macc.unican.es/jobmanager-lcgpbs                                                                         
    user:0       0   wrap actv 19:39:09 --:--:-- 0:00:05 0:00:00 --   date.jt         ce01.macc.unican.es/jobmanager-lcgpbs                                                         
    user:0       0   epil ---- 19:39:09 --:--:-- 0:00:10 0:00:00 --   date.jt         ce01.macc.unican.es/jobmanager-lcgpbs                                            
    user:0       0   done ---- 19:39:09 19:39:27 0:00:10 0:00:01 0    date.jt         ce01.macc.unican.es/jobmanager-lcgpbs         
    
    pend => the job is waiting for a resource to run on.
    prol => the remote system is being prepared for execution.
    wrap pend => the job has been successfully submitted to the local DRM system and is waiting for it to start the job.
    wrap actv => the job is being executed by the local DRM system.
    epil => the job is finalizing.
    done => the job has finished.
  6. The results are the standard output (stdout) and standard error (stderr) files; both are written to the same directory as the job template:
    [user@ui~]$ cat $GW_LOCATION/examples/date/stdout.0
    jue may  5 20:01:56 CEST 2011
    [user@ui~]$ cat $GW_LOCATION/examples/date/stderr.0
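
Instead of running gwps by hand until the job reaches the done state, the state column can be read from its output in a loop. This is a minimal sketch that assumes the gwps column layout shown in step 5 (JID in column 2, DM state in column 3). The function names are hypothetical, and treating "fail" as a terminal state is an assumption not shown on this page:

```shell
# job_state JID   (hypothetical helper)
# Reads gwps output on stdin and prints the DM state (third column) of job JID.
job_state() {
    awk -v jid="$1" '$2 == jid { print $3; exit }'
}

# wait_for_job JID [INTERVAL]   (hypothetical helper, requires gwps on the PATH)
# Polls gwps every INTERVAL seconds (default 10) until the job is done.
# Returns 1 if the job disappears or reports "fail" (assumed failure state).
wait_for_job() {
    while :; do
        state=$(gwps | job_state "$1")
        case $state in
            "done") return 0 ;;
            "fail"|"") return 1 ;;
        esac
        sleep "${2:-10}"
    done
}

# Typical use:
#   gwsubmit $GW_LOCATION/examples/date/date.jt
#   wait_for_job 0 && cat $GW_LOCATION/examples/date/stdout.0
```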
    

Support

If you have any problem or question, please open a ticket.

Last modified on Nov 21, 2011 10:15:09 AM
