Further Job Options¶
Processor Types¶
BlueBEAR is made up of nodes with different processors. By default, your job will be allocated to use any of the available processor types, but a single job will not be split across multiple different processor types. It is not normally necessary to choose a processor type, but if your job has particular performance or processor instruction needs (see below) then you may wish to do so.
Note
Some of the memory a node has is for running system processes and will be unavailable to jobs.
Emerald Rapids¶
In autumn 2024 we launched water-cooled nodes with Emerald Rapids CPUs:
#SBATCH --constraint=emerald
Each of these nodes has 2x 56-core Intel® Xeon® Platinum 8570 processors and 512GB memory.
Sapphire Rapids¶
In summer 2024 we launched water-cooled nodes with Sapphire Rapids CPUs:
#SBATCH --constraint=sapphire
Each of these nodes has 2x 56-core Intel® Xeon® Platinum 8480CL processors and 512GB memory.
Ice Lake¶
In 2021 we launched water-cooled nodes with Ice Lake CPUs:
#SBATCH --constraint=icelake
Each of these nodes has 2x 36-core Intel® Xeon® Platinum 8360Y processors and 512GB memory.
Cascade Lake¶
In 2019 we launched water-cooled nodes with Cascade Lake CPUs:
#SBATCH --constraint=cascadelake
Each of these nodes has 2x 20-core Intel® Xeon® Gold 6248 processors and 192GB memory.
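As an illustration, the constraint line sits alongside the other resource requests in a job script header. The sketch below requests the Ice Lake nodes; the core count and wall-time are placeholder values:
#!/bin/bash
#SBATCH --constraint=icelake
#SBATCH --ntasks=8
#SBATCH --time=1:0:0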
Use Local Disk Space¶
If a job uses significant I/O (Input/Output) then files should be created using the local disk space and only written back to the final directory when the job is completed. This is particularly important if a job makes heavy use of disk for scratch or work files. Heavy I/O to the network file-stores, such as the home directories or /rds, can cause problems for other users. There is a directory /scratch, local to each node, that can be used for temporary files associated with a job. The size of /scratch on each node type is detailed on the resources page.
For jobs that are running on a single node, this file-store can also be used for input and output files for a job. Since /scratch is not shared across nodes it cannot be used for parallel jobs that span multiple nodes, where all of the processes need to be able to read from or write to a shared file-store; it can, however, be used for multi-core jobs on a single node.
To use scratch space, include the following lines at the start of your job script (after the #SBATCH headers):
BB_WORKDIR=$(mktemp -d /scratch/${USER}_${SLURM_JOBID}.XXXXXX)
export TMPDIR=${BB_WORKDIR}
These two lines will create a directory for you named after your username (USER) and the ID of the job you are running (SLURM_JOBID), followed by a period and six random characters. The use of random characters is recommended for security reasons. The path is then exported to the environment as TMPDIR, which many applications use for temporary storage while they are running. If you want to copy your files back to the current directory at the end of the job, insert this line:
test -d ${BB_WORKDIR} && /bin/cp -r ${BB_WORKDIR} .
This checks that the directory still exists (using test -d) and then, if it does, copies it and everything in it (using -r, for recursive copy) to the current directory (.). Then, at the end of your job script, insert the line:
test -d ${BB_WORKDIR} && /bin/rm -rf ${BB_WORKDIR}
This checks that the directory still exists (using test -d) and then removes it if it does.
Multi-core Jobs¶
There are several ways to run a multi-core job using Slurm, but the methods we recommend are to use the options --ntasks and --nodes.
For most people, it is enough to specify only --ntasks. This is because Slurm is sophisticated enough to take your job and spread it across as many nodes as necessary to meet the number of cores that the job requires. This is practical for MPI jobs and means that the cluster is used more efficiently (as multiple users can share nodes). For example, adding the following line will request a job with 20 cores on an undefined number of nodes:
#SBATCH --ntasks=20
For jobs that require multiple cores but must remain on a certain number of nodes (e.g. OpenMP jobs, which can only run on a single node), the option --nodes can be specified with a minimum and maximum number of nodes. For example, the first line below specifies that between 3 and 5 nodes should be used, and the second that a single node should be used:
#SBATCH --nodes=3-5
#SBATCH --nodes=1
The environment variable SLURM_NTASKS is set to the number of tasks requested. For single-node, multi-core jobs this can be used directly to make the software fit the resources assigned to the job. For example:
./my_application --my-threads=${SLURM_NTASKS}
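As a minimal sketch, a single-node job script for a threaded application might look like the following. The application name (my_threaded_application) is a placeholder, and setting OMP_NUM_THREADS is only relevant for OpenMP-based software:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --time=1:0:0

# Match the thread count to the cores allocated by Slurm
export OMP_NUM_THREADS=${SLURM_NTASKS}

./my_threaded_application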
It is important to note that not all jobs scale well over multiple cores: sadly, doubling the number of cores will often not double the speed. Software must be written to use extra cores, so make sure that the application you wish to run can make use of the cores you request. If you are not certain, a useful approach is to submit some short test runs of your job and check whether they perform as well as they should.
Note that requesting more cores will probably mean that you need to queue for longer before the job can start. Using the --nodes option will also often lengthen the queue time.
An efficient way to parallelise your work without needing to know whether your application supports multi-threading is to use array jobs. See the following section for more details on this.
Job Checkpointing¶
For jobs containing commands that you expect to run for 1 day or longer, we advise implementing some form of checkpointing to allow the job to be resumed should it fail. The method for achieving this will vary depending on the software applications being used, but many that are designed for long-running processes have checkpoint features built in.
Checkpointing can also be used in conjunction with an array job in situations where a workflow needs to exceed the maximum wall-time of the cluster. For example, including the following Slurm SBATCH header will run your job 10 times in series:
#SBATCH --array=1-10%1
See the documentation on array job syntax for further information.
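As a sketch of how this can fit together, the script below assumes a hypothetical application that writes its state to a checkpoint file and accepts a --restart option (both are assumptions for illustration); each task in the array runs in series and resumes from the checkpoint left by the previous one:
#!/bin/bash
#SBATCH --array=1-10%1
#SBATCH --ntasks=4
#SBATCH --time=1-0:0:0

CHECKPOINT=checkpoint.dat

# If an earlier task in the series has written a checkpoint, resume from it
if [ -f "${CHECKPOINT}" ]; then
    ./my_application --restart "${CHECKPOINT}"
else
    ./my_application
fi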