Version 5 (modified by carlos, 9 years ago) (diff)

--

# ​DKRZ

How to use DKRZ facilities?

Workflows in climate modelling research are complex and comprise, in general, a number of different tasks, such as model formulation and development (including debugging, platform porting, and performance optimization), generation of input data, performing model simulations, postprocessing, visualization and analysis of output data, long-term archiving of the data, documentation and publication of results. The DKRZ hardware and software infrastructure is optimally adapted to accomplish these tasks in an efficient way. In the graphic below we give a schematic overview on the DKRZ systems.

For a more detailed description of the different systems shown in the picture and basic software installed on these systems click here.

## Blizzard

ssh  <userid>@blizzard.dkrz.de


## Lizard

ssh  <userid>@lizard.dkrz.de


# RES - Red Española de Supercomputación

## Altamira

ssh  <userid>@altamira1.ifca.es


### Running Jobs

SLURM is the utility used at Altamira for batch processing support, so all jobs must be run through it. This document provides information for getting started with job execution at Altamira.

In order to keep the login nodes in a propper load, a 10 minutes limitation in the cpu time is set for processes running interactively in these nodes. Any execution taking more than this limit should be carried out through the queue system.

### Submitting Jobs

A job is the execution unit for the SLURM. A job is defined by a text file containing a set of directives describing the job, and the commands to execute.

These are the basic directives to submit jobs:

mnsubmit <job_script> submits a job script to the queue system (see below for job script directives).

mnq shows all the jobs submitted.

mncancel <job_id> removes his/her job from the queue system, canceling the execution of the job if it was already running.

### Job directives

A job must contain a series of directives to inform the batch system about the characteristics of the job. These directives appear as comments in the job script, with the following syntax:

#@ directive = value


Additionally, the job script may contain a set of commands to execute. If not, an external script must be provided with the 'executable' directive. Here you may find the most common directives:

#@ class = class_name


The queue where the job is to be submitted. Let this field empty unless you need to use "debug" or special queues.

#@ wall_clock_limit = HH:MM:SS


The limit of wall clock time. This is a mandatory field and you must set it to a value greater than the real execution time for your application and smaller than the time limits granted to the user. Notice that your job will be killed after the elapsed period.

#@ initialdir = pathname


The working directory of your job (i.e. where the job will run). If not specified, it is the current working directory at the time the job was submitted.

#@ error = file


The name of the file to collect the stderr output of the job.

#@ output = file


The name of the file to collect the standard output (stdout) of the job.

#@ total_tasks = number


The number of processes to start.

#@ cpus_per_task = number


The number of cpus allocated for each task. This is useful for hybrid MPI+OpenMP applications, where each process will spawn a number of threads. The number of cpus per task must be between 1 and 16, since each node has 16 cores (one for each thread).

#@ tasks_per_node = number


The number of tasks allocated in each node. When an application uses more than 3.8 GB of memory per process, it is not possible to have 16 processes in the same node and its 64GB of memory. It can be combined with the cpus_per_task to allocate the nodes exclusively, i.e. to allocate 2, processes per node, set both directives to 2. The number of tasks per node must be between 1 and 16.

# @ gpus_per_node = number


The number of GPU cards assigned to the job. This number can be [0,1,2] as there are 2 cards per node.

### Job Examples

In the examples, the %j part in the job directives will be sustitute by the job ID.

Sequential job:

#!/bin/bash
#@ job_name = test_serial
#@ initialdir = .
#@ output = serial_%j.out
#@ error = serial_%j.err
#@ wall_clock_limit = 00:02:00

./serial_binary


Parallel job:

#!/bin/bash
#@ job_name = test_parallel
#@ initialdir = .
#@ output = mpi_%j.out
#@ error = mpi_%j.err
#@ wall_clock_limit = 00:02:00

srun ./parallel_binary


GPGPU job:

#!/bin/bash
#@ job_name = test_gpu
#@ initialdir = .
#@ output = gpu_%j.out
#@ error = gpu_%j.err
#@ gpus_per_node = 1
#@ wall_clock_limit = 00:02:00

./gpu_binary


The jobs with GPU should execute module load CUDA in order to set the library paths before running mnsubmit.

## MareNostrum

ssh  <userid>@mn1.bsc.es


### Running Jobs

LSF is the utility used at MareNostrum? III for batch processing support, so all jobs must be run through it. This document provides information for getting started with job execution at the Cluster. 5.1. Submitting jobs A job is the execution unit for LSF. A job is defined by a text file containing a set of directives describing the job, and the commands to execute. Please, bear in mind that there is a limit of 3600 bytes for the size of the text file. 5.1.1. LSF commands These are the basic directives to submit jobs:

• bsub < job_script

submits a “job script” to the queue system (see below for job script directives). Remember to pass it through STDIN '<'

• bjobs [-w][-X][-l job_id]

shows all the submitted jobs.

• bkill <job_id>

remove the job from the queue system, canceling the execution of the processes, if they were still running. 5.1.2. Job directives A job must contain a series of directives to inform the batch system about the characteristics of the job. These directives appear as comments in the job script, with the following syntax: #BSUB -option value #BSUB -J job_name The name of the job. #BSUB -q debug This queue is only intended for small tests, so there is a limit of 1 job per user, using up to 64 cpus (4 nodes), and one hour of wall clock limit. #BSUB -W HH:MM NOTE: take into account that you can not specify the amount of seconds in LSF. The limit of wall clock time. This is a mandatory field and you must set it to a value greater than the real execution time for your application and smaller than the time limits granted to the user. Notice that your job will be killed after the elapsed period #BSUB -cwd pathname The working directory of your job (i.e. where the job will run). If not specified, it is the current working directory at the time the job was submitted. #BSUB -e/-eo file The name of the file to collect the stderr output of the job. You can use %J for job_id. -e option will APPEND the file, -eo will REPLACE the file. 7 MareNosutrm? III User's Guide #BSUB -o/-oo file The name of the file to collect the standard output (stdout) of the job. -o option will APPEND the file, -oo will REPLACE the file. #BSUB -n number The number of processes to start. #BSUB -R"span[ptile=number]" The number of processes assigned to a node. We really encourage you to read the manual of bsub command to find out other specifications that will help you to define the job script. man bsub 5.1.3. Examples

Sequential job :

#!/bin/bash
#BSUB -n 1
#BSUB -oo output_%J.out
#BSUB -eo output_%J.err
#BSUB -J sequential
#BSUB -W 00:05

./serial.exe


The job would be submitted using:

bsub < ptest.cmd


Sequential job using OpenMP :

#!/bin/bash
#BSUB -n 1
#BSUB -oo output_%J.out
#BSUB -eo output_%J.err
#BSUB -J sequential_OpenMP
#BSUB -W 00:05

./serial.exe


Parallel job :

#!/bin/bash
#BSUB -n 128
#BSUB -o output_%J.out
#BSUB -e output_%J.err
# In order to launch 128 processes with 16 processes per node:
#BSUB -R"span[ptile=16]"
#BSUB -x # Exclusive use
#BSUB -J parallel
#BSUB -W 02:00
# You can choose the parallel environment through modules

mpirun ./wrf.exe


#!/bin/bash
# The total number of MPI processes:
#BSUB -n 128
#BSUB -oo output_%J.out
#BSUB -eo output_%J.err
# It will allocate 4 MPI processes per node:
#BSUB -R"span[ptile=4]"
#BSUB -x # Exclusive use
#BSUB -J hybrid
#BSUB -W 02:00
# You can choose the parallel environment through
# modules

# 4 MPI processes per node and 16 cpus available
# (4 threads per MPI process):

mpirun ./wrf.exe


# ECMWF

ssh <user>@ecaccess.ecmwf.int