Version 1 (modified by valva, 11 years ago)
GridWay is an open-source meta-scheduling component for the Grid. It provides scheduling functionality to end users, application developers and managers of Grid infrastructures. It is fully functional on Gisela and interfaces with the computing, file transfer and information services available within the Gisela infrastructure. GridWay is not intended to replace the resource brokers available in the Gisela distribution, but to provide a meta-scheduling alternative with greater functionality and higher performance for certain application profiles.
Installation on a UI
Required Middleware
The following middleware should be installed to use the corresponding drivers:
- Globus Toolkit 4 or 5
- gLite UI 3.1 (GRAM2-based)
- gLite UI 3.2 (CREAM-based)
Download
Our GridWay-5.7 version modifies the latest development release of GridWay (see Appendix A).
- Download GridWay (e.g. into your HOME directory). It is compatible with Linux kernel 2.6.x or later:
[user@ui~]$ wget http://www.meteo.unican.es/work/DRM4G/drm4g_gridway_x86_64_r1087.zip
- Unzip the distribution file:
[user@ui~]$ unzip drm4g_gridway_x86_64_r1087.zip
Environment Configuration
Set up the environment variables "GW_LOCATION" and "PATH" for GridWay.
[user@ui~]$ export GW_LOCATION=$HOME/drm4g_gridway
[user@ui~]$ export PATH=$GW_LOCATION/bin:$PATH
Then set up the environment for Gisela:
- Create a file (e.g. gilesa.sh) with these variables:
export LCG_CATALOG_TYPE=lfc
export LFC_HOST=lfc.eela.ufrj.br
export LCG_GFAL_INFOSYS=bdii.eela.ufrj.br:2170
export LFC_HOME=/grid/prod.vo.eu-eela.eu
export VO="prod.vo.eu-eela.eu"
- Evaluate the file.
[user@ui~]$ source gilesa.sh
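Before going further, it can help to confirm that the variables set above are really present in the current shell. The following is a minimal sketch, not part of GridWay; the function name check_env is hypothetical.

```shell
# Report which of the variables named as arguments are unset or empty.
# Prints the missing names (one per line) and returns non-zero if any is missing.
check_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, with the variables from the GridWay and Gisela setup:
# check_env GW_LOCATION LCG_CATALOG_TYPE LFC_HOST LCG_GFAL_INFOSYS LFC_HOME VO
```

An empty output and a zero exit status mean every listed variable is set.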
Available Resources
The lcg-infosites command can be used to obtain VO information on Grid resources. Before using lcg-infosites, you have to source the gilesa.sh file for the Gisela infrastructure. Some usage examples are shown below:
- Find out the CEs of your VO:
[user@ui~]$ lcg-infosites --vo $VO ce
#CPU    Free   Total Jobs   Running   Waiting   ComputingElement
----------------------------------------------------------
16      16     0            0         0         gantt.cefet-rj.br:8443/cream-pbs-prod
21561   0      3409         7         3402      ce206.cern.ch:8443/cream-lsf-grid_eela
21561   826    3409         7         3402      ce204.cern.ch:8443/cream-lsf-grid_eela
21561   826    3409         7         3402      ce203.cern.ch:8443/cream-lsf-grid_eela
21561   0      3409         7         3402      ce205.cern.ch:8443/cream-lsf-grid_eela
21561   826    3409         7         3402      ce208.cern.ch:8443/cream-lsf-grid_eela
21561   0      3409         7         3402      ce207.cern.ch:8443/cream-lsf-grid_eela
21561   826    3409         7         3402      ce130.cern.ch:2119/jobmanager-lcglsf-grid_eela
21561   826    3409         7         3402      ce132.cern.ch:2119/jobmanager-lcglsf-grid_eela
21561   826    3409         7         3402      ce131.cern.ch:2119/jobmanager-lcglsf-grid_eela
21561   826    3409         7         3402      ce133.cern.ch:2119/jobmanager-lcglsf-grid_eela
260     107    4            4         0         ce01-tic.ciemat.es:2119/jobmanager-lcgpbs-prod_eela
1160    467    0            0         0         gridgate.cs.tcd.ie:2119/jobmanager-pbs-sixhour
1160    467    0            0         0         gridgate.cs.tcd.ie:2119/jobmanager-pbs-thirtym
1160    467    5            4         1         gridgate.cs.tcd.ie:2119/jobmanager-pbs-threeday
1160    467    2            2         0         gridgate.cs.tcd.ie:2119/jobmanager-pbs-oneday
10      10     0            0         0         ce01.unlp.edu.ar:2119/jobmanager-lcgpbs-long
...........
- List the CEs with their running jobs, free CPUs, and maximum wall-clock and CPU times:
[user@ui~]$ lcg-info --vo $VO --list-ce --attrs RunningJobs,FreeCPUs,MaxWCTime,MaxCPUTime
- CE: axon-g01.ieeta.pt:2119/jobmanager-lcgpbs-prod
  - RunningJobs 0
  - FreeCPUs 5
  - MaxWCTime 4320
  - MaxCPUTime 2880
- CE: cale.uniandes.edu.co:8443/cream-pbs-prod
  - RunningJobs 3
  - FreeCPUs 94
  - MaxWCTime 4320
  - MaxCPUTime 2880
...........
- Find out the SEs of your VO.
[user@ui~]$ lcg-infosites --vo $VO se
Avail Space(Kb)   Used Space(Kb)   Type   SEs
----------------------------------------------------------
1258363960        8651392          n.a    se.labmc.inf.utfsm.cl
288012854         11517683563      n.a    lnx097.eela.if.ufrj.br
187037782         27605724         n.a    se01.macc.unican.es
For more information, run lcg-infosites --help or see the gLite documentation.
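The CE listing above can be narrowed down to the computing elements that currently report free CPUs. The following is a rough sketch that assumes the six-column layout shown in the example output (#CPU, Free, Total Jobs, Running, Waiting, ComputingElement); the function name ces_with_free_cpus is hypothetical.

```shell
# List computing elements reporting at least MIN free CPUs (default 1).
# Reads the output of `lcg-infosites --vo $VO ce` on stdin and prints
# "<free CPUs> <computing element>", highest free count first.
ces_with_free_cpus() {
    min="${1:-1}"
    awk -v min="$min" '
        # Keep data rows only: six fields and a numeric Free column.
        NF == 6 && $2 ~ /^[0-9]+$/ && $2 + 0 >= min + 0 { print $2, $6 }
    ' | sort -k1,1nr
}

# Usage:
# lcg-infosites --vo "$VO" ce | ces_with_free_cpus 10
```

The header, the separator line and the trailing dots are skipped because they do not have six fields with a numeric second column.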
Configuration to Access Gisela Resources
The next steps describe a specific configuration of the drivers for the Gisela infrastructure.
In file "$GW_LOCATION/etc/gwd.conf":
# Example GT2
IM_MAD = glisela_gt2:gw_im_mad_mds2_glue-bdii:-l etc/gt2.list -q (GlueCEAccessControlBaseRule=VO\:prod.vo.eu-eela.eu) -s bdii.eela.ufrj.br:tm_gt2:em_gt2
EM_MAD = em_gt2:gw_em_mad_gram2::rsl_nsh
TM_MAD = tm_gt2:gw_tm_mad_dummy:-u gsiftp\://ui01.macc.unican.es
# Example CREAM
IM_MAD = glisela_cream:gw_im_mad_mds2_glue-bdii:-l etc/cream.list -q (GlueCEAccessControlBaseRule=VO\:prod.vo.eu-eela.eu) -s bdii.eela.ufrj.br:tm_cream:em_cream
EM_MAD = em_cream:gw_em_mad_cream::jdl
TM_MAD = tm_cream:gw_tm_mad_dummy:-g
There are three options for the configuration of the IM MAD:
- -l: host list file to be used by GridWay.
- Example of gt2.list for Gisela:
[user@ui~]$ lcg-infosites --vo $VO ce | awk 'NR>2 {print $6}' | grep jobmanager | awk -F ":" '{print $1}' | uniq
ce01-tic.ciemat.es
ce01.unlp.edu.ar
ce.labmc.inf.utfsm.cl
tochtli.nucleares.unam.mx
grid012.ct.infn.it
ce01.eela.if.ufrj.br
ce.cp.di.uminho.pt
ce01.macc.unican.es
ce01.up.pt
grid001.fe.up.pt
- Example of cream.list for Gisela:
[user@ui~]$ lcg-infosites --vo $VO ce | awk 'NR>2 {print $6}' | grep cream | awk -F ":" '{print $1}' | uniq
gantt.cefet-rj.br
ce206.cern.ch
ce204.cern.ch
ce205.cern.ch
ce207.cern.ch
ce208.cern.ch
tochtli64.nucleares.unam.mx
ce02.eela.if.ufrj.br
cream01.cecalc.ula.ve
ce.egee.di.uminho.pt
cale.uniandes.edu.co
grid001.fc.up.pt
- -q: query filter that restricts a GridWay instance to the queues authorized for your VO.
- -s: information server in a hierarchical configuration.
IM_MAD = glisela_gt2:gw_im_mad_mds2_glue-bdii:-l etc/gt2.list -q (GlueCEAccessControlBaseRule=VO\:prod.vo.eu-eela.eu) -s bdii.eela.ufrj.br:tm_gt2:em_gt2
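The two host lists passed with -l can be produced from a single saved lcg-infosites listing. The sketch below is one possible way to do it, not the method used by the distribution; the function name ce_hosts is hypothetical, and writing the lists under $GW_LOCATION/etc assumes that the relative path etc/gt2.list in gwd.conf is resolved against $GW_LOCATION.

```shell
# Print the unique CE host names whose endpoint contains PATTERN
# (for example "jobmanager" for GRAM2 or "cream" for CREAM).
# Reads the output of `lcg-infosites --vo $VO ce` on stdin.
ce_hosts() {
    awk -v pat="$1" '
        NF == 6 {
            # Field 6 looks like host:port/queue, so split it at the colon.
            split($6, parts, ":")
            # Match on the port/queue part and print each host only once.
            if (index(parts[2], pat) > 0 && !seen[parts[1]]++)
                print parts[1]
        }
    '
}

# Usage (paths are assumptions, see above):
# lcg-infosites --vo "$VO" ce > ce.txt
# ce_hosts jobmanager < ce.txt > "$GW_LOCATION/etc/gt2.list"
# ce_hosts cream      < ce.txt > "$GW_LOCATION/etc/cream.list"
```

Unlike uniq, which only collapses adjacent duplicates, the seen array removes repeated hosts wherever they appear while keeping the original order. Matching on the part after the colon also avoids picking up a host merely because its name contains the pattern.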
There are two options for the configuration of the TM MAD:
- -g: starts a GASS server.
- -u: specifies the URL of a GridFTP server running in the client. For example:
TM_MAD = tm_gt2:gw_tm_mad_dummy:-u gsiftp\://ui01.macc.unican.es
Accessing the VOMS servers
To use the Gisela resources, the user must initialize a proxy through the VOMS server:
[user@ui~]$ voms-proxy-init --voms prod.vo.eu-eela.eu
Cannot find file or dir: /oceano/gmeteo/users/carlos/.glite/vomses
Enter GRID pass phrase:
Your identity: /DC=es/DC=irisgrid/O=unican/CN=josecarlos.blanco
Creating temporary proxy ........................................ Done
Contacting voms.eela.ufrj.br:15003 [/C=BR/O=ICPEDU/O=UFF BrGrid CA/O=UFRJ/OU=IF/CN=host/voms.eela.ufrj.br] "prod.vo.eu-eela.eu" Done
Creating proxy ................................... Done
Your proxy is valid until Tue Aug 23 22:15:06 2011
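Since the proxy expires, as the "valid until" line above shows, it may be worth checking the remaining lifetime before submitting jobs. The sketch below assumes that voms-proxy-info is installed and that its timeleft option prints the remaining lifetime in seconds; the function name proxy_has_time_left and the one-hour default are hypothetical choices.

```shell
# Succeed only if the remaining proxy lifetime LEFT (in seconds)
# is at least MIN seconds (default 3600).
proxy_has_time_left() {
    left="$1"
    min="${2:-3600}"
    # Treat empty or non-numeric input (e.g. no proxy found) as expired.
    case "$left" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$left" -ge "$min" ]
}

# Usage (assumes voms-proxy-info is on the PATH):
# if ! proxy_has_time_left "$(voms-proxy-info --timeleft 2>/dev/null)" 3600; then
#     voms-proxy-init --voms "$VO"
# fi
```

Keeping the comparison separate from the voms-proxy-info call makes the check easy to try out without a Grid certificate.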
Quick Start Guide
By default, the distribution includes a configuration to use the Gisela infrastructure. Follow the steps below:
- Start GridWay. "GW_LOCATION" and "PATH" variables must be exported.
[user@ui~]$ gwd
- Show information about all available resources. The gwhost command needs a few seconds to update the information:
[user@ui~]$ gwhost
HID PRIO OS              ARCH  MHZ  %CPU MEM(F/T)    DISK(F/T) N(U/F/T)    LRMS              HOSTNAME
0   1    ScientificSLBor x86_6 3200 0    1024/1024   0/0       0/78/260    jobmanager-lcgpbs ce01-tic.ciemat.es
1   1    ScientificSLBer i686  1865 0    900/900     0/0       0/10/10     jobmanager-lcgpbs ce01.unlp.edu.ar
2   1    ScientificSLBer x86_6 1600 0    2048/2048   0/0       0/116/132   jobmanager-lcgpbs ce.labmc.inf.utfsm.cl
3   1    ScientificSLBer i686  2400 0    3072/3072   0/0       0/4/4       jobmanager-lcgpbs tochtli.nucleares.unam.mx
4   1    ScientificSLBer i686  2193 0    4096/4096   0/0       0/43/115    jobmanager-lcglsf grid012.ct.infn.it
5   1    Scientific Linu x86_6 2000 0    8150/8150   0/0       0/17/48     cream-pbs         ce01.eela.if.ufrj.br
6   1    ScientificCERNS i386  2330 0    512/512     0/0       0/12/12     jobmanager-lcgpbs ce.cp.di.uminho.pt
7   1    CentOSFinal     x86_6 2400 0    16000/16000 0/0       0/229/454   jobmanager-lcgpbs ce01.macc.unican.es
8   1    ScientificSLSL  x86_6 2400 0    4058/4058   0/0       0/34/36     jobmanager-lcgsge ce01.up.pt
9   1    ScientificSLSL  x86_6 2400 0    4058/4058   0/0       0/22/22     jobmanager-lcgsge grid001.fe.up.pt
10  1    ScientificSLBer i686  2330 0    2048/2048   0/0       0/18/18     cream-pbs         gantt.cefet-rj.br
11  1                          0    0    0/0         0/0       0/0/21818   cream-lsf         ce206.cern.ch
12  1                          0    0    0/0         0/0       0/833/21818 cream-lsf         ce204.cern.ch
13  1                          0    0    0/0         0/0       0/0/21818   cream-lsf         ce205.cern.ch
14  1                          0    0    0/0         0/0       0/0/21818   cream-lsf         ce207.cern.ch
15  1                          0    0    0/0         0/0       0/833/21818 cream-lsf         ce208.cern.ch
16  1    CentOSFinal     x86_6 2670 0    12000/12000 0/0       0/25/40     cream-pbs         tochtli64.nucleares.unam.mx
17  1    Scientific Linu x86_6 2000 0    8178/8178   0/0       0/55/200    cream-pbs         ce02.eela.if.ufrj.br
18  1    ScientificSLBor x86_6 3000 0    2048/2048   0/0       0/24/24     cream-pbs         cream01.cecalc.ula.ve
19  1                          0    0    0/0         0/0       0/0/0                         ce.egee.di.uminho.pt
20  1    ScientificCERNS x86_6 1600 0    4096/4096   0/0       0/138/188   cream-pbs         cale.uniandes.edu.co
21  1    ScientificSLSL  x86_6 2400 0    4058/4058   0/0       0/20/22     cream-sge         grid001.fc.up.pt
- Prepare for job submission. "$GW_LOCATION/examples" contains job templates that can be submitted. A job template lets you configure a job in terms of input files, output files, requirements and ranks of execution hosts. The following template simply runs the Linux date command. Submit it to check DRM4G.
[user@ui~]$ cat $GW_LOCATION/examples/date/date.jt
EXECUTABLE=/bin/date
- Submit the job. The gwsubmit command submits a job to the available resources:
[user@ui~]$ gwsubmit $GW_LOCATION/examples/date/date.jt
- Check the evolution of the job. The gwps command reports the current job status:
If you run gwps repeatedly, you can see the different states of the job:
[user@ui~]$ gwps
USER   JID DM   EM   START    END      EXEC    XFER    EXIT NAME    HOST
user:0 0   pend ---- 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt --

user:0 0   pend ---- 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt --
user:0 0   prol ---- 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt --
user:0 0   wrap pend 19:39:09 --:--:-- 0:00:00 0:00:00 --   date.jt ce01.macc.unican.es/jobmanager-lcgpbs
user:0 0   wrap actv 19:39:09 --:--:-- 0:00:05 0:00:00 --   date.jt ce01.macc.unican.es/jobmanager-lcgpbs
user:0 0   epil ---- 19:39:09 --:--:-- 0:00:10 0:00:00 --   date.jt ce01.macc.unican.es/jobmanager-lcgpbs
user:0 0   done ---- 19:39:09 19:39:27 0:00:10 0:00:01 0    date.jt ce01.macc.unican.es/jobmanager-lcgpbs
pend => the job is waiting for a resource to run on.
prol => the remote system is being prepared for execution.
wrap pend => the job has been successfully submitted to the local DRM system and is waiting for it to start the job.
wrap actv => the job is being executed by the local DRM system.
epil => the job is finalizing.
done => the job has finished.
- Results are written to standard output (stdout) and standard error (stderr) files, both located in the same directory as the job template:
[user@ui~]$ cat $GW_LOCATION/examples/date/stdout.0
jue may 5 20:01:56 CEST 2011
[user@ui~]$ cat $GW_LOCATION/examples/date/stderr.0
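The job states described in the steps above can also be read programmatically from the gwps output. This is a small sketch assuming the column order shown earlier (USER, JID, DM, EM, ...); the function names job_dm_state and describe_state are hypothetical, and the descriptions are shortened from the list of states given in this guide.

```shell
# Print the DM state of job JID, reading gwps output on stdin.
# If the job appears on several lines, the last one wins.
job_dm_state() {
    awk -v jid="$1" '
        $2 == jid { state = $3 }
        END { if (state != "") print state }
    '
}

# Map a DM state abbreviation to a short description.
describe_state() {
    case "$1" in
        'pend') echo "waiting for a resource" ;;
        'prol') echo "preparing the remote system" ;;
        'wrap') echo "submitted to, or running on, the local DRM system" ;;
        'epil') echo "finalizing" ;;
        'done') echo "finished" ;;
        *)      echo "unknown state: $1" ;;
    esac
}

# Usage:
# describe_state "$(gwps | job_dm_state 0)"
```

For a job in the wrap state, the EM column (pend or actv) distinguishes between waiting in the local queue and running.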
Appendix A
The following changes have been
Support
If you have any problems or questions, please open a ticket.
Attachments (5)
- drm4g_gridway_x86_64_r1158.zip (1.9 MB), added by carlos 11 years ago.
- drm4g_gridway_x86_64_r1199.zip (3.5 MB), added by carlos 11 years ago.
- drm4g_gridway_x86_64_r1283.zip (3.7 MB), added by carlos 11 years ago.
- drm4g_gridway_x86_64_r1288.tar.gz (3.0 MB), added by carlos 11 years ago.
- gw_im_mad_mds2_glue-bdii (7.9 KB), added by carlos 11 years ago.