Further Job Options¶
Processor Types¶
BlueBEAR is made up of nodes with different processors. By default, your job will be allocated to use any of the available processor types, but a single job will not be split across multiple different processor types. It is not normally necessary to choose a processor type, but if your job has particular performance or processor instruction needs (see below) then you may wish to do so.
Note
Some of the memory a node has is for running system processes and will be unavailable to jobs.
Emerald Rapids¶
In autumn 2024 we launched water-cooled nodes with Emerald Rapids CPUs:
#SBATCH --constraint=emerald
Each of these nodes has 2x 56-core Intel® Xeon® Platinum 8570 processors and 512GB memory.
Sapphire Rapids¶
In summer 2024 we launched water-cooled nodes with Sapphire Rapids CPUs:
#SBATCH --constraint=sapphire
Each of these nodes has 2x 56-core Intel® Xeon® Platinum 8480CL processors and 512GB memory.
Ice Lake¶
In 2021 we launched water-cooled nodes with Ice Lake CPUs:
#SBATCH --constraint=icelake
Each of these nodes has 2x 36-core Intel® Xeon® Platinum 8360Y processors and 512GB memory.
Cascade Lake¶
In 2019 we launched water-cooled nodes with Cascade Lake CPUs:
#SBATCH --constraint=cascadelake
Each of these nodes has 2x 20-core Intel® Xeon® Gold 6248 processors and 192GB memory.
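As an illustration, the constraint line sits alongside the other resource requests in a job script header. The sketch below requests the Ice Lake nodes; the core count and wall-time are placeholder values:
#!/bin/bash
#SBATCH --constraint=icelake
#SBATCH --ntasks=8
#SBATCH --time=1:0:0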
Use Local Disk Space¶
If a job uses significant I/O (Input/Output) then files should be created using the local disk space and only written back to the final directory when the job is completed. This is particularly important if a job makes heavy use of disk for scratch or work files. Heavy I/O to the network file-stores, such as the home directories or /rds, can cause problems for other users. There is a directory /scratch, local to each node, that can be used for temporary files associated with a job. The size of /scratch on each node type is detailed on the resources page.
For jobs that are running on a single node, this file-store can also be used for input and output files for a job. Since /scratch is not shared across nodes it cannot be used for parallel jobs that span multiple nodes, where all of the processes need to be able to read from or write to a shared file-store; it can, however, be used for multi-core jobs on a single node.
To use scratch space, include the following lines at the start of your job script (after the #SBATCH headers):
BB_WORKDIR=$(mktemp -d /scratch/${USER}_${SLURM_JOBID}.XXXXXX)
export TMPDIR=${BB_WORKDIR}
These two lines will create a directory for you named after your username (USER) and the ID of the job you are running (SLURM_JOBID), followed by a period and six random characters. The use of random characters is recommended for security reasons. The path is then exported to the environment as TMPDIR, which many applications use for temporary storage while they are running. If you want to copy your files back to the current directory at the end of the job, insert this line:
test -d ${BB_WORKDIR} && /bin/cp -r ${BB_WORKDIR} .
This checks that the directory still exists (using test -d) and then, if it does, copies it and everything in it (using -r, for recursive copy) to the current directory (.). Then, at the end of your job script, insert the line:
test -d ${BB_WORKDIR} && /bin/rm -rf ${BB_WORKDIR}
This checks that the directory still exists (using test -d) and then removes it if it does.
Multi-core Jobs¶
There are several ways to run a multi-core job using Slurm, but the methods we recommend are to use the options --ntasks and --nodes.
For most people, it is enough to specify only --ntasks. This is because Slurm is sophisticated enough to take your job and spread it across as many nodes as necessary to meet the number of cores that the job requires. This is practical for MPI jobs and means that the cluster is used more efficiently (as multiple users can share nodes). For example, adding the following line will request a job with 20 cores on an undefined number of nodes:
#SBATCH --ntasks=20
For jobs that require multiple cores but must remain on a certain number of nodes (e.g. OpenMP jobs, which can only run on a single node), the option --nodes can be specified with a minimum and maximum number of nodes. For example, the first line below specifies that between 3 and 5 nodes should be used, and the second that a single node should be used:
#SBATCH --nodes=3-5
#SBATCH --nodes=1
The environment variable SLURM_NTASKS is set to the number of tasks requested. For single-node, multi-core jobs this can be used directly to make the software fit the resources assigned to the job. For example:
./my_application --my-threads=${SLURM_NTASKS}
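As a minimal sketch, a single-node job script for a threaded application might look like the following. The application name (my_threaded_application) is a placeholder, and setting OMP_NUM_THREADS is only relevant for OpenMP-based software:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --time=1:0:0

# Match the thread count to the cores allocated by Slurm
export OMP_NUM_THREADS=${SLURM_NTASKS}

./my_threaded_application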
It is important to note that not all jobs scale well over multiple cores: sadly, doubling the number of cores will often not double the speed. Software must be written to use extra cores, so make sure that the application you wish to run can make use of the cores you request. If you are not certain, a useful approach is to submit some short test runs of your job and check whether they perform as well as they should.
Note that requesting more cores will probably mean that you need to queue for longer before the job can start. Using the --nodes option will also often lengthen the queue time.
An efficient way to parallelise your work without needing to know whether your application supports multi-threading is to use array jobs. See the following section for more details on this.
Job Checkpointing¶
For jobs containing commands that you expect to run for 1 day or longer, we advise implementing some form of checkpointing to allow the job to be resumed should it fail. The method for achieving this will vary depending on the software applications being used, but many that are designed for long-running processes have checkpoint features built in.
Checkpointing can also be used in conjunction with an array job in situations where a workflow needs to exceed the maximum wall-time of the cluster. For example, including the following Slurm SBATCH header will run your job 10 times in series:
#SBATCH --array=1-10%1
See the documentation on array job syntax for further information.
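As a sketch of how this can fit together, the script below assumes a hypothetical application that writes its state to a checkpoint file and accepts a --restart option (both are assumptions for illustration); each task in the array runs in series and resumes from the checkpoint left by the previous one:
#!/bin/bash
#SBATCH --array=1-10%1
#SBATCH --ntasks=4
#SBATCH --time=1-0:0:0

CHECKPOINT=checkpoint.dat

# If an earlier task in the series has written a checkpoint, resume from it
if [ -f "${CHECKPOINT}" ]; then
    ./my_application --restart "${CHECKPOINT}"
else
    ./my_application
fi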