Using sbatch instead of srun #
Currently, Enroot / Pyxis only integrates with `srun`. You can, however, create your `sbatch` script as you normally would and make it internally call `srun` with the required container-specific options for you. Make sure that no actual command appears within the commented `#SBATCH` preamble!
```bash
#!/bin/bash
# let's set the following defaults (can be overridden on the command line):
#SBATCH --job-name sbatch_test
#SBATCH --partition batch

# put your srun command with args here
srun \
  --container-image=/enroot/nvcr.io_nvidia_pytorch_23.12-py3.sqsh \
  --container-workdir="`pwd`" \
  --container-mounts=/netscratch:/netscratch,/ds:/ds:ro,"`pwd`":"`pwd`" \
  echo "hello world!"
```
Finally, submit your batch job (`sbatch [script]`). Output is saved by default to a `slurm-JOBID.out` file in the current directory. See the `sbatch` documentation for more details.
Job Arrays #
One very handy feature that `sbatch` supports which isn't available for `srun` is job arrays. These are especially useful if you want to run a whole series of jobs / experiments (e.g., for different hyperparameters, or one run per input file).
```bash
#!/bin/bash
# let's set the following defaults (can be overridden on the command line):
#SBATCH --array 0-4%3
#SBATCH --job-name sbatch_array_test
#SBATCH --partition batch

srun \
  --container-image=/enroot/nvcr.io_nvidia_pytorch_23.12-py3.sqsh \
  --container-workdir="`pwd`" \
  --container-mounts=/netscratch:/netscratch,/ds:/ds:ro,"`pwd`":"`pwd`" \
  echo "hello world! array index: $SLURM_ARRAY_TASK_ID"
```
Finally, submit your batch job (`sbatch [script]`). The example runs 5 jobs in total (`0-4`), taking care to run at most 3 in parallel (`%3`). (One can also run arrays with a step size, e.g., `:7`.) As you can see, each job can access its array index via the `$SLURM_ARRAY_TASK_ID` environment variable. Output is saved by default to a `slurm-JOBID-TASKID.out` file in the current directory. See the `sbatch` documentation for more details.
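The hyperparameter use case mentioned above can be sketched with a plain bash lookup table inside the batch script; the values and variable names below are made up for illustration, and in a real script the selected value would be passed on to the `srun` command:

```shell
#!/bin/bash
# illustrative only: map the array task index (0-4) to a learning rate
LRS=(0.1 0.01 0.001 0.0001 0.00001)
LR="${LRS[$SLURM_ARRAY_TASK_ID]}"
echo "task $SLURM_ARRAY_TASK_ID uses learning rate $LR"
```

Since `$SLURM_ARRAY_TASK_ID` is only set inside a job, you can test such a mapping locally by exporting the variable by hand.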
One run per input file #
When using job arrays, the following bash pattern might be useful to run a script once per input file:

```bash
#SBATCH --array=0-42

# glob all input files into a bash array, then pick one by task index
FILES=(somedir/*.csv.gz)

srun ... my_script.py "${FILES[$SLURM_ARRAY_TASK_ID]}"
```
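The indexing pattern can be verified locally without Slurm; this sketch uses a temporary directory with dummy file names and sets the task index by hand:

```shell
#!/bin/bash
set -e
# create a few dummy input files (illustration only)
dir="$(mktemp -d)"
touch "$dir/a.csv.gz" "$dir/b.csv.gz" "$dir/c.csv.gz"

# same pattern as in the batch script: glob into a bash array
FILES=("$dir"/*.csv.gz)

# pretend Slurm assigned us index 1; globs expand in sorted order,
# so this selects the second file
SLURM_ARRAY_TASK_ID=1
echo "would process: ${FILES[$SLURM_ARRAY_TASK_ID]}"
rm -r "$dir"
```

Note the `"${FILES[...]}"` syntax: without the `${...}` expansion, bash would pass the literal string `FILES[...]` to the script.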
Wrap #
If you don’t want to create a separate script file, you can also use the `--wrap` option to create a (kind of lengthy and cumbersome) command-line-only version:

```bash
sbatch \
  --array "0-4%3" --job-name sbatch_test --partition batch \
  --wrap "srun \
    --container-image=/enroot/nvcr.io_nvidia_pytorch_23.12-py3.sqsh \
    --container-workdir=\"`pwd`\" \
    --container-mounts=/netscratch:/netscratch,/ds:/ds:ro,\"`pwd`\":\"`pwd`\" \
    echo \"hello world! array index: \$SLURM_ARRAY_TASK_ID\""
```

You need to be a bit more careful with parameter expansion, though (note the escaping of the last variable, so it is not expanded at submit time).
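The expansion behavior can be demonstrated with plain bash string building, no Slurm needed: inside double quotes, an unescaped variable is substituted immediately (at "submit time"), while an escaped one stays literal and is only expanded when the wrapped command eventually runs:

```shell
#!/bin/bash
WHEN="submit-time"
# unescaped: $WHEN is replaced while the string is built
EAGER="echo $WHEN"
# escaped: the literal text $WHEN survives into the string
LAZY="echo \$WHEN"
echo "$EAGER"   # prints: echo submit-time
echo "$LAZY"    # prints: echo $WHEN
```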