These scripts perform an efficient, minimal setup for running jobs with
various computational codes. Each script:

- Loads the required modules.
- Checks that the required modules are loaded.
- Activates a virtual environment with ASE installed (assumed to be located at
  `~/software/python/virtualenvs/ase/`).
```bash title="samples/slurm/activateASEVasp.sh"
#!/bin/bash
module purge
module load StdEnv/2023 intel/2023.2.1 openmpi/4.1.5
module load vasp/6.4.2
module load python/3.11.5 scipy-stack

if [[ $(module list | grep 'intel/2023.2.1') == "" || $(module list | grep 'python/3.11.5') == "" || $(module list | grep 'vasp/6.4.2') == "" ]]; then
    echo "Your modules are not loaded correctly. Cancelling job..."
    exit 1
else
    echo "Your modules are loaded correctly. Proceeding to activate ASE..."
fi

echo "Changing directory to ~/software/python/virtualenvs/ase ..."
cd ~/software/python/virtualenvs/ase || exit

function load_ase() {
    source ~/software/python/virtualenvs/ase/bin/activate
}

if [[ $(pwd | grep 'ase') == */software/python/virtualenvs/ase ]]; then
    pwd
    echo "You are in the right location! Activating ase..."
    load_ase
else
    echo "Please ensure you have the correct directory structure (~/software/python/virtualenvs/ase)..."
    echo "Exiting"
    exit 1
fi
```
```bash title="samples/slurm/activateASEGaussian.sh"
#!/bin/bash
# Gaussian does not have any dependencies :angel:
module purge
module load gaussian/g16.c01
module load python/3.11.5 scipy-stack

if [[ $(module list | grep 'gaussian/g16.c01') == "" || $(module list | grep 'python/3.11.5') == "" ]]; then
    echo "Your modules are not loaded correctly for Gaussian. Cancelling job..."
    exit 1
else
    echo "Your modules are loaded correctly for Gaussian. Proceeding to activate ASE..."
fi

echo "Changing directory to ~/software/python/virtualenvs/ase ..."
cd ~/software/python/virtualenvs/ase || exit

function load_ase() {
    source ~/software/python/virtualenvs/ase/bin/activate
}

if [[ $(pwd | grep 'ase') == */software/python/virtualenvs/ase ]]; then
    pwd
    echo "You are in the right location! Activating ase..."
    load_ase
else
    echo "Please ensure you have the correct directory structure (~/software/python/virtualenvs/ase)..."
    echo "Exiting"
    exit 1
fi
```
```bash title="samples/slurm/activateASEORCA.sh"
#!/bin/bash
module purge
module load StdEnv/2020 gcc/10.3.0 openmpi/4.1.1
module load orca/5.0.4
module load python/3.11.5 scipy-stack

if [[ $(module list | grep 'gcc/10.3.0') == "" || $(module list | grep 'python/3.11.5') == "" || $(module list | grep 'orca/5.0.4') == "" ]]; then
    echo "Your modules are not loaded correctly for ORCA. Cancelling job..."
    exit 1
else
    echo "Your modules are loaded correctly for ORCA. Proceeding to activate ASE..."
    export PATH="${EBROOTORCA}/:$PATH"
fi

echo "Changing directory to ~/software/python/virtualenvs/ase ..."
cd ~/software/python/virtualenvs/ase || exit

function load_ase() {
    source ~/software/python/virtualenvs/ase/bin/activate
}

if [[ $(pwd | grep 'ase') == */software/python/virtualenvs/ase ]]; then
    pwd
    echo "You are in the right location! Activating ase..."
    load_ase
else
    echo "Please ensure you have the correct directory structure (~/software/python/virtualenvs/ase)..."
    echo "Exiting"
    exit 1
fi
```
!!! note "Reminder"

    Don't forget to add logic to actually run your job at the end of these
    scripts (e.g., `python3 run.py`).
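For instance, a minimal job script (a sketch only; the resource requests are placeholders) might source one of these helpers and then launch the calculation:

```bash
#!/bin/bash
#SBATCH --account=def-samiras
#SBATCH --ntasks=4
#SBATCH --time=01:00:00

# Set up modules and the ASE virtual environment, then run the job.
# "run.py" is a placeholder for your own ASE driver script; the helper is
# assumed to sit in the submission directory.
source activateASEVasp.sh
python3 run.py
```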
## Submit a DFT calculation
These sample scripts are relatively bloated compared to those in the
previous section. Each script:
- Loads your bash profile and the required modules for the computational code.
- Creates a scratch directory dedicated to the job, uniquely identified by the
  SLURM job ID, and creates a symlink to the scratch directory for convenience.
  (This is especially useful if the job terminates unexpectedly during execution.)
- Copies input files to the scratch directory.
- Initiates the calculation by running a Python script (presumably `run.py`).
- Stops the job at 90% of the maximum run time to ensure enough time remains
  to copy files from the scratch directory back to the submission directory.
- Cleans up the scratch directory.
- Logs the completion of the job to `~/job.log` in your home directory.
Additionally, the scripts print debugging information that may be useful
for identifying issues with running jobs (e.g., resource information and
the job ID).
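Once the placeholders are filled in, a script such as `samples/slurm/vasp.sh` below can be submitted from the directory containing your input files and Python script; a typical (illustrative) workflow is:

```bash
# Submit the job script and check the queue (file name is illustrative).
sbatch vasp.sh
squeue -u "$USER"
```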
```bash title="samples/slurm/vasp.sh" linenums="1" hl_lines="85"
#!/bin/bash
#SBATCH --account=def-samiras
#SBATCH --job-name=JOB_NAME
#SBATCH --mem-per-cpu=1000MB
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=24
#SBATCH --time=23:00:00
#SBATCH --mail-user=SFU_ID@sfu.ca
#SBATCH --mail-type=BEGIN,END,FAIL,TIME_LIMIT,TIME_LIMIT_90

echo " "
echo "### Setting up shell environment ..."
echo " "
if test -e "/etc/profile"; then
    source "/etc/profile"
fi
if test -e "$HOME/.bash_profile"; then
    source "$HOME/.bash_profile"
fi
unset LANG
module purge
module load vasp
module load python/3.11.9
# Replace "$COMP_CHEM_ENV" with the path to your Python virtual environment
source "$COMP_CHEM_ENV"
export LC_ALL="C"
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1
ulimit -s unlimited

echo " "
echo "### Printing basic job infos to stdout ..."
echo " "
echo "START_TIME = $(date '+%y-%m-%d %H:%M:%S %s')"
echo "HOSTNAME = ${HOSTNAME}"
echo "USER = ${USER}"
echo "SLURM_JOB_NAME = ${SLURM_JOB_NAME}"
echo "SLURM_JOB_ID = ${SLURM_JOB_ID}"
echo "SLURM_SUBMIT_DIR = ${SLURM_SUBMIT_DIR}"
echo "SLURM_JOB_NUM_NODES = ${SLURM_JOB_NUM_NODES}"
echo "SLURM_NTASKS = ${SLURM_NTASKS}"
echo "SLURM_NODELIST = ${SLURM_NODELIST}"
echo "SLURM_JOB_NODELIST = ${SLURM_JOB_NODELIST}"
if test -f "${SLURM_JOB_NODELIST}"; then
    echo "SLURM_JOB_NODELIST (begin) ----------"
    cat "${SLURM_JOB_NODELIST}"
    echo "SLURM_JOB_NODELIST (end) ------------"
fi
echo "--------------- ulimit -a -S ---------------"
ulimit -a -S
echo "--------------- ulimit -a -H ---------------"
ulimit -a -H
echo "----------------------------------------------"

echo " "
echo "### Creating TMP_WORK_DIR directory and changing to it ..."
echo " "
if test -e "$HOME/scratch"; then
    TMP_WORK_DIR="$HOME/scratch/${SLURM_JOB_ID}"
elif test -e /scratch/"${SLURM_JOB_ID}"; then
    TMP_WORK_DIR=/scratch/${SLURM_JOB_ID}
else
    TMP_WORK_DIR="$(pwd)"
fi
TMP_BASE_DIR="$(dirname "$TMP_WORK_DIR")"
JOB_WORK_DIR="$(basename "$TMP_WORK_DIR")"
echo "TMP_WORK_DIR = ${TMP_WORK_DIR}"
echo "TMP_BASE_DIR = ${TMP_BASE_DIR}"
echo "JOB_WORK_DIR = ${JOB_WORK_DIR}"
# Creating a symbolic link to temporary directory holding work files while job running
if ! test -e "${TMP_WORK_DIR}"; then
    mkdir "${TMP_WORK_DIR}"
fi
ln -s "${TMP_WORK_DIR}" scratch_dir
cd "${TMP_WORK_DIR}" || exit

echo " "
echo "### Copying input files for job (if required):"
echo " "
script_name="${BASH_SOURCE[0]}"
AUTOJOB_SLURM_SCRIPT="$(basename "$script_name")"
export AUTOJOB_SLURM_SCRIPT
export AUTOJOB_PYTHON_SCRIPT="{{ python_script }}"
export AUTOJOB_COPY_TO_SCRATCH="CHGCAR,*py,*cif,POSCAR,coord,*xyz,*.traj,CONTCAR,*.pkl,*xml,WAVECAR"
cp -v "$SLURM_SUBMIT_DIR"/{CHGCAR,*py,*cif,POSCAR,coord,*xyz,*.traj,CONTCAR,*.pkl,*xml,WAVECAR} "$TMP_WORK_DIR"/
echo " "

# Preemptively end the job when getting close to the time limit
timeline=$(grep -E -m 1 '^#SBATCH[[:space:]]*--time=' "$script_name")
timeslurm=${timeline##*=}
IFS=- read -ra day_split_time <<< "$timeslurm"
no_days_time=${day_split_time[1]}
days=${no_days_time:+${day_split_time[0]}}
no_days_time=${day_split_time[1]:-${day_split_time[0]}}
IFS=: read -ra split_time <<< "$no_days_time"
# Time formats with days: D-H, D-H:M, D-H:M:S
if [[ $days ]]; then
    slurm_days="$days"
    slurm_hours=${split_time[0]}
    slurm_minutes=${split_time[1]:-0}
    slurm_seconds=${split_time[2]:-0}
# Time format without days: M, M:S, H:M:S
else
    slurm_days=0
    if [[ ${#split_time[*]} == 3 ]]; then
        slurm_hours=${split_time[0]}
        slurm_minutes=${split_time[1]}
        slurm_seconds=${split_time[2]}
    else
        slurm_hours=0
        slurm_minutes=${split_time[0]}
        slurm_seconds=${split_time[1]:-0}
    fi
fi
echo "Running for $(echo "$slurm_days*1" | bc)d $(echo "$slurm_hours*1" | bc)h $(echo "$slurm_minutes*1" | bc)m and $(echo "$slurm_seconds*1" | bc)s."
timeslurm=$(echo "$slurm_days*86400 + $slurm_hours*3600 + $slurm_minutes*60 + $slurm_seconds" | bc)
echo "This means $timeslurm seconds."
timeslurm=$(echo "$timeslurm * 0.9" | bc)
echo "Will terminate at ${timeslurm}s to copy back necessary files from scratch"
echo ""
echo ""

# Run the ASE calculation; timeout kills it at 90% of the requested walltime
timeout "${timeslurm}" python3 "$AUTOJOB_PYTHON_SCRIPT"
exit_code=$?
if [ "$exit_code" -eq 124 ]; then
    echo " "
    echo "Cancelled due to time limit."
else
    echo " "
    echo "Time limit not reached."
fi

echo " "
echo "### Cleaning up files ... removing unnecessary scratch files ..."
echo " "
AUTOJOB_FILES_TO_DELETE="*.d2e *.int *.rwf *.skr *.inp EIGENVAL IBZKPT PCDAT PROCAR ELFCAR LOCPOT PROOUT TMPCAR vasp.dipcor"
rm -vf $AUTOJOB_FILES_TO_DELETE  # intentionally unquoted so the glob patterns expand
sleep 10 # Sleep some time so potential stale NFS handles can disappear.

echo " "
echo "### Compressing results and copying back result archive ..."
echo " "
cd "${TMP_BASE_DIR}" || exit
mkdir -vp "${SLURM_SUBMIT_DIR}" # in case the user has deleted or moved the submit dir
echo " "
echo "Creating result tgz-file '${SLURM_SUBMIT_DIR}/${JOB_WORK_DIR}.tgz' ..."
echo " "
tar -zcvf "${SLURM_SUBMIT_DIR}/${JOB_WORK_DIR}.tgz" "${JOB_WORK_DIR}" \
    || { echo "ERROR: Failed to create tgz-file. Please clean up TMP_WORK_DIR $TMP_WORK_DIR on host '$HOSTNAME' manually (if not done automatically by the queueing system)."; exit 102; }

echo " "
echo "### Remove TMP_WORK_DIR ..."
echo " "
rm -rvf "${TMP_WORK_DIR}"

echo " "
echo "Extracting result tgz-file"
echo " "
cd "${SLURM_SUBMIT_DIR}" || exit
tar -xzf "${JOB_WORK_DIR}".tgz
mv "${JOB_WORK_DIR}"/* .
rm -r "${JOB_WORK_DIR}".tgz "${JOB_WORK_DIR}"
rm "${SLURM_SUBMIT_DIR}/scratch_dir"

echo "END_TIME = $(date +'%y-%m-%d %H:%M:%S %s')"
# Record job completion in a log file
echo "${SLURM_JOB_ID}-${SLURM_JOB_NAME}" is complete: on "$(date +'%y.%m.%d %H:%M:%S')" "${SLURM_SUBMIT_DIR}" >> ~/job.log
echo " "
echo "### Exiting with exit code ${exit_code}..."
echo " "
exit "$exit_code"
```

```bash title="samples/slurm/espresso.sh" linenums="1"
#!/bin/bash
#SBATCH --account=def-samiras
#SBATCH --job-name=JOB_NAME
#SBATCH --mem-per-cpu=1000MB
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=24
#SBATCH --time=23:00:00
#SBATCH --mail-user=SFU_ID@sfu.ca
#SBATCH --mail-type=BEGIN,END,FAIL,TIME_LIMIT,TIME_LIMIT_90

echo " "
echo "### Setting up shell environment ..."
echo " "
if test -e "/etc/profile"; then
    source "/etc/profile"
fi
if test -e "$HOME/.bash_profile"; then
    source "$HOME/.bash_profile"
fi
unset LANG
module --force purge
module load gentoo/2020 python/3.11.9 espresso
# Replace "$COMP_CHEM_ENV" with the path to your Python virtual environment
source "$COMP_CHEM_ENV"
export LC_ALL="C"
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1
ulimit -s unlimited

echo " "
echo "### Printing basic job infos to stdout ..."
echo " "
echo "START_TIME = $(date '+%y-%m-%d %H:%M:%S %s')"
echo "HOSTNAME = ${HOSTNAME}"
echo "USER = ${USER}"
echo "SLURM_JOB_NAME = ${SLURM_JOB_NAME}"
echo "SLURM_JOB_ID = ${SLURM_JOB_ID}"
echo "SLURM_SUBMIT_DIR = ${SLURM_SUBMIT_DIR}"
echo "SLURM_JOB_NUM_NODES = ${SLURM_JOB_NUM_NODES}"
echo "SLURM_NTASKS = ${SLURM_NTASKS}"
echo "SLURM_NODELIST = ${SLURM_NODELIST}"
echo "SLURM_JOB_NODELIST = ${SLURM_JOB_NODELIST}"
if test -f "${SLURM_JOB_NODELIST}"; then
    echo "SLURM_JOB_NODELIST (begin) ----------"
    cat "${SLURM_JOB_NODELIST}"
    echo "SLURM_JOB_NODELIST (end) ------------"
fi
echo "--------------- ulimit -a -S ---------------"
ulimit -a -S
echo "--------------- ulimit -a -H ---------------"
ulimit -a -H
echo "----------------------------------------------"

echo " "
echo "### Creating TMP_WORK_DIR directory and changing to it ..."
echo " "
if test -e "$HOME/scratch"; then
    TMP_WORK_DIR="$HOME/scratch/${SLURM_JOB_ID}"
elif test -e /scratch/"${SLURM_JOB_ID}"; then
    TMP_WORK_DIR=/scratch/${SLURM_JOB_ID}
else
    TMP_WORK_DIR="$(pwd)"
fi
TMP_BASE_DIR="$(dirname "$TMP_WORK_DIR")"
JOB_WORK_DIR="$(basename "$TMP_WORK_DIR")"
echo "TMP_WORK_DIR = ${TMP_WORK_DIR}"
echo "TMP_BASE_DIR = ${TMP_BASE_DIR}"
echo "JOB_WORK_DIR = ${JOB_WORK_DIR}"
# Creating a symbolic link to temporary directory holding work files while job running
if ! test -e "${TMP_WORK_DIR}"; then
    mkdir "${TMP_WORK_DIR}"
fi
ln -s "${TMP_WORK_DIR}" scratch_dir
cd "${TMP_WORK_DIR}" || exit

echo " "
echo "### Copying input files for job (if required):"
echo " "
script_name="${BASH_SOURCE[0]}"
AUTOJOB_SLURM_SCRIPT="$(basename "$script_name")"
export AUTOJOB_SLURM_SCRIPT
export AUTOJOB_PYTHON_SCRIPT="{{ python_script }}"
export AUTOJOB_COPY_TO_SCRATCH="CHGCAR,*py,*cif,POSCAR,coord,*xyz,*.traj,CONTCAR,*.pkl,*xml,WAVECAR"
cp -v "$SLURM_SUBMIT_DIR"/{CHGCAR,*py,*cif,POSCAR,coord,*xyz,*.traj,CONTCAR,*.pkl,*xml,WAVECAR} "$TMP_WORK_DIR"/
echo " "

# Preemptively end the job when getting close to the time limit
timeline=$(grep -E -m 1 '^#SBATCH[[:space:]]*--time=' "$script_name")
timeslurm=${timeline##*=}
IFS=- read -ra day_split_time <<< "$timeslurm"
no_days_time=${day_split_time[1]}
days=${no_days_time:+${day_split_time[0]}}
no_days_time=${day_split_time[1]:-${day_split_time[0]}}
IFS=: read -ra split_time <<< "$no_days_time"
# Time formats with days: D-H, D-H:M, D-H:M:S
if [[ $days ]]; then
    slurm_days="$days"
    slurm_hours=${split_time[0]}
    slurm_minutes=${split_time[1]:-0}
    slurm_seconds=${split_time[2]:-0}
# Time format without days: M, M:S, H:M:S
else
    slurm_days=0
    if [[ ${#split_time[*]} == 3 ]]; then
        slurm_hours=${split_time[0]}
        slurm_minutes=${split_time[1]}
        slurm_seconds=${split_time[2]}
    else
        slurm_hours=0
        slurm_minutes=${split_time[0]}
        slurm_seconds=${split_time[1]:-0}
    fi
fi
echo "Running for $(echo "$slurm_days*1" | bc)d $(echo "$slurm_hours*1" | bc)h $(echo "$slurm_minutes*1" | bc)m and $(echo "$slurm_seconds*1" | bc)s."
timeslurm=$(echo "$slurm_days*86400 + $slurm_hours*3600 + $slurm_minutes*60 + $slurm_seconds" | bc)
echo "This means $timeslurm seconds."
timeslurm=$(echo "$timeslurm * 0.9" | bc)
echo "Will terminate at ${timeslurm}s to copy back necessary files from scratch"
echo ""
echo ""

# Run the ASE calculation; timeout kills it at 90% of the requested walltime
timeout "${timeslurm}" python3 "$AUTOJOB_PYTHON_SCRIPT"
exit_code=$?
if [ "$exit_code" -eq 124 ]; then
    echo " "
    echo "Cancelled due to time limit."
else
    echo " "
    echo "Time limit not reached."
fi

echo " "
echo "### Cleaning up files ... removing unnecessary scratch files ..."
echo " "
AUTOJOB_FILES_TO_DELETE="*.mix* *.wfc*"
rm -vf $AUTOJOB_FILES_TO_DELETE  # intentionally unquoted so the glob patterns expand
sleep 10 # Sleep some time so potential stale NFS handles can disappear.

echo " "
echo "### Compressing results and copying back result archive ..."
echo " "
cd "${TMP_BASE_DIR}" || exit
mkdir -vp "${SLURM_SUBMIT_DIR}" # in case the user has deleted or moved the submit dir
echo " "
echo "Creating result tgz-file '${SLURM_SUBMIT_DIR}/${JOB_WORK_DIR}.tgz' ..."
echo " "
tar -zcvf "${SLURM_SUBMIT_DIR}/${JOB_WORK_DIR}.tgz" "${JOB_WORK_DIR}" \
    || { echo "ERROR: Failed to create tgz-file. Please clean up TMP_WORK_DIR $TMP_WORK_DIR on host '$HOSTNAME' manually (if not done automatically by the queueing system)."; exit 102; }

echo " "
echo "### Remove TMP_WORK_DIR ..."
echo " "
rm -rvf "${TMP_WORK_DIR}"

echo " "
echo "Extracting result tgz-file"
echo " "
cd "${SLURM_SUBMIT_DIR}" || exit
tar -xzf "${JOB_WORK_DIR}".tgz
mv "${JOB_WORK_DIR}"/* .
rm -r "${JOB_WORK_DIR}".tgz "${JOB_WORK_DIR}"
rm "${SLURM_SUBMIT_DIR}/scratch_dir"

echo "END_TIME = $(date +'%y-%m-%d %H:%M:%S %s')"
# Record job completion in a log file
echo "${SLURM_JOB_ID}-${SLURM_JOB_NAME}" is complete: on "$(date +'%y.%m.%d %H:%M:%S')" "${SLURM_SUBMIT_DIR}" >> ~/job.log
echo " "
echo "### Exiting with exit code ${exit_code}..."
echo " "
exit "$exit_code"
```

!!! note "Reminder"

    This script assumes that you are using a self-compiled version of
    Quantum Espresso and have created a corresponding module named
    `espresso`. See this tutorial for how to compile Quantum Espresso and
    create the necessary modulefile.
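The details are in that tutorial; as a rough sketch (the directory and module names below are assumptions, not prescribed by the tutorial), a self-compiled code is typically exposed to the `module` command by registering a personal modulefile directory:

```bash
# Hypothetical example: register a personal modulefile tree and load the module.
# Assumes a modulefile was created under ~/modulefiles/espresso/.
module use ~/modulefiles
module load espresso
which pw.x   # should now resolve to your self-compiled Quantum Espresso binary
```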
```bash linenums="1"
#!/bin/bash
#SBATCH --account=def-samiras
#SBATCH --job-name=JOB_NAME
#SBATCH --mem-per-cpu=1000MB
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=24
#SBATCH --time=23:00:00
#SBATCH --mail-user=SFU_ID@sfu.ca
#SBATCH --mail-type=BEGIN,END,FAIL,TIME_LIMIT,TIME_LIMIT_90

echo " "
echo "### Setting up shell environment ..."
echo " "
if test -e "/etc/profile"; then
    source "/etc/profile"
fi
if test -e "$HOME/.bash_profile"; then
    source "$HOME/.bash_profile"
fi
unset LANG
module purge
module load gaussian/g16.c01
module load python/3.11.9
# Replace "$COMP_CHEM_ENV" with the path to your Python virtual environment
source "$COMP_CHEM_ENV"
export LC_ALL="C"
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1
ulimit -s unlimited

echo " "
echo "### Printing basic job infos to stdout ..."
echo " "
echo "START_TIME = $(date '+%y-%m-%d %H:%M:%S %s')"
echo "HOSTNAME = ${HOSTNAME}"
echo "USER = ${USER}"
echo "SLURM_JOB_NAME = ${SLURM_JOB_NAME}"
echo "SLURM_JOB_ID = ${SLURM_JOB_ID}"
echo "SLURM_SUBMIT_DIR = ${SLURM_SUBMIT_DIR}"
echo "SLURM_JOB_NUM_NODES = ${SLURM_JOB_NUM_NODES}"
echo "SLURM_NTASKS = ${SLURM_NTASKS}"
echo "SLURM_NODELIST = ${SLURM_NODELIST}"
echo "SLURM_JOB_NODELIST = ${SLURM_JOB_NODELIST}"
if test -f "${SLURM_JOB_NODELIST}"; then
    echo "SLURM_JOB_NODELIST (begin) ----------"
    cat "${SLURM_JOB_NODELIST}"
    echo "SLURM_JOB_NODELIST (end) ------------"
fi
echo "--------------- ulimit -a -S ---------------"
ulimit -a -S
echo "--------------- ulimit -a -H ---------------"
ulimit -a -H
echo "----------------------------------------------"

echo " "
echo "### Creating TMP_WORK_DIR directory and changing to it ..."
echo " "
if test -e "$HOME/scratch"; then
    TMP_WORK_DIR="$HOME/scratch/${SLURM_JOB_ID}"
elif test -e /scratch/"${SLURM_JOB_ID}"; then
    TMP_WORK_DIR=/scratch/${SLURM_JOB_ID}
else
    TMP_WORK_DIR="$(pwd)"
fi
# Pass memory request, cpu list, and scratch directory to Gaussian
export GAUSS_MDEF="${SLURM_MEM_PER_NODE}MB"
GAUSS_CDEF=$(taskset -cp $$ | awk -F':' '{print $2}')
export GAUSS_CDEF
export GAUSS_SCRDIR=${TMP_WORK_DIR}
TMP_BASE_DIR="$(dirname "$TMP_WORK_DIR")"
JOB_WORK_DIR="$(basename "$TMP_WORK_DIR")"
echo "TMP_WORK_DIR = ${TMP_WORK_DIR}"
echo "TMP_BASE_DIR = ${TMP_BASE_DIR}"
echo "JOB_WORK_DIR = ${JOB_WORK_DIR}"
# Creating a symbolic link to temporary directory holding work files while job running
if ! test -e "${TMP_WORK_DIR}"; then
    mkdir "${TMP_WORK_DIR}"
fi
ln -s "${TMP_WORK_DIR}" scratch_dir
cd "${TMP_WORK_DIR}" || exit

echo " "
echo "### Copying input files for job (if required):"
echo " "
script_name="${BASH_SOURCE[0]}"
AUTOJOB_SLURM_SCRIPT="$(basename "$script_name")"
export AUTOJOB_SLURM_SCRIPT
export AUTOJOB_PYTHON_SCRIPT="run.py"
export AUTOJOB_COPY_TO_SCRATCH="*.chk,*.py,*.traj,*.rwf"
cp -v "$SLURM_SUBMIT_DIR"/{*.chk,*.py,*.traj,*.rwf} "$TMP_WORK_DIR"/
echo " "

# Preemptively end the job when getting close to the time limit
timeline=$(grep -E -m 1 '^#SBATCH[[:space:]]*--time=' "$script_name")
timeslurm=${timeline##*=}
IFS=- read -ra day_split_time <<< "$timeslurm"
no_days_time=${day_split_time[1]}
days=${no_days_time:+${day_split_time[0]}}
no_days_time=${day_split_time[1]:-${day_split_time[0]}}
IFS=: read -ra split_time <<< "$no_days_time"
# Time formats with days: D-H, D-H:M, D-H:M:S
if [[ $days ]]; then
    slurm_days="$days"
    slurm_hours=${split_time[0]}
    slurm_minutes=${split_time[1]:-0}
    slurm_seconds=${split_time[2]:-0}
# Time format without days: M, M:S, H:M:S
else
    slurm_days=0
    if [[ ${#split_time[*]} == 3 ]]; then
        slurm_hours=${split_time[0]}
        slurm_minutes=${split_time[1]}
        slurm_seconds=${split_time[2]}
    else
        slurm_hours=0
        slurm_minutes=${split_time[0]}
        slurm_seconds=${split_time[1]:-0}
    fi
fi
echo "Running for $(echo "$slurm_days*1" | bc)d $(echo "$slurm_hours*1" | bc)h $(echo "$slurm_minutes*1" | bc)m and $(echo "$slurm_seconds*1" | bc)s."
timeslurm=$(echo "$slurm_days*86400 + $slurm_hours*3600 + $slurm_minutes*60 + $slurm_seconds" | bc)
echo "This means $timeslurm seconds."
timeslurm=$(echo "$timeslurm * 0.9" | bc)
echo "Will terminate at ${timeslurm}s to copy back necessary files from scratch"
echo ""
echo ""

# Run the ASE calculation; timeout kills it at 90% of the requested walltime
timeout "${timeslurm}" python3 "$AUTOJOB_PYTHON_SCRIPT"
exit_code=$?
if [ "$exit_code" -eq 124 ]; then
    echo " "
    echo "Cancelled due to time limit."
else
    echo " "
    echo "Time limit not reached."
fi

echo " "
echo "### Cleaning up files ... removing unnecessary scratch files ..."
echo " "
AUTOJOB_FILES_TO_DELETE="*.d2e *.int *.rwf *.skr *.inp"
rm -vf $AUTOJOB_FILES_TO_DELETE  # intentionally unquoted so the glob patterns expand
sleep 10 # Sleep some time so potential stale NFS handles can disappear.

echo " "
echo "### Compressing results and copying back result archive ..."
echo " "
cd "${TMP_BASE_DIR}" || exit
mkdir -vp "${SLURM_SUBMIT_DIR}" # in case the user has deleted or moved the submit dir
echo " "
echo "Creating result tgz-file '${SLURM_SUBMIT_DIR}/${JOB_WORK_DIR}.tgz' ..."
echo " "
tar -zcvf "${SLURM_SUBMIT_DIR}/${JOB_WORK_DIR}.tgz" "${JOB_WORK_DIR}" \
    || { echo "ERROR: Failed to create tgz-file. Please clean up TMP_WORK_DIR $TMP_WORK_DIR on host '$HOSTNAME' manually (if not done automatically by the queueing system)."; exit 102; }

echo " "
echo "### Remove TMP_WORK_DIR ..."
echo " "
rm -rvf "${TMP_WORK_DIR}"

echo " "
echo "Extracting result tgz-file"
echo " "
cd "${SLURM_SUBMIT_DIR}" || exit
tar -xzf "${JOB_WORK_DIR}".tgz
mv "${JOB_WORK_DIR}"/* .
rm -r "${JOB_WORK_DIR}".tgz "${JOB_WORK_DIR}"
rm "${SLURM_SUBMIT_DIR}/scratch_dir"

echo "END_TIME = $(date +'%y-%m-%d %H:%M:%S %s')"
# Record job completion in a log file
echo "${SLURM_JOB_ID}-${SLURM_JOB_NAME}" is complete: on "$(date +'%y.%m.%d %H:%M:%S')" "${SLURM_SUBMIT_DIR}" >> ~/job.log
echo " "
echo "### Exiting with exit code ${exit_code}..."
echo " "
exit "$exit_code"
```

Edit the brace expansion in the `cp -v "$SLURM_SUBMIT_DIR"/{...}` line of each
script (highlighted in `vasp.sh` above) to change the files copied to the
scratch directory.
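For instance, to also copy `INCAR` and `KPOINTS` files to scratch (additions chosen purely for illustration), extend the brace list and keep `AUTOJOB_COPY_TO_SCRATCH` in sync:

```bash
cp -v "$SLURM_SUBMIT_DIR"/{CHGCAR,*py,*cif,POSCAR,INCAR,KPOINTS,coord,*xyz,*.traj,CONTCAR,*.pkl,*xml,WAVECAR} "$TMP_WORK_DIR"/
```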
!!! note "Reminder"

    Don't forget to replace `JOB_NAME`, `SFU_ID`, and `PYTHON_SCRIPT` with
    appropriate values, in addition to setting your desired SLURM parameters.
    Also, if you don't define the path to a Python virtual environment in your
    `.bashrc` file, replace `$COMP_CHEM_ENV` with the path to the `activate`
    script (usually `path-to-environment/bin/activate`).
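For example, assuming the ASE virtual environment used by the activation scripts above (the path is an assumption; adjust it to your own setup), `$COMP_CHEM_ENV` could be defined in `~/.bashrc` like this:

```bash
# In ~/.bashrc: point COMP_CHEM_ENV at the environment's activate script.
export COMP_CHEM_ENV="$HOME/software/python/virtualenvs/ase/bin/activate"
```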
## Launch an interactive job
The following script initiates an interactive session on the cluster to avoid
using the login node.
```bash title="samples/slurm/InteractiveJob.sh"
#!/bin/bash
# This script simply initiates an interactive session on the cluster to avoid using the login node.
# The session has 4 cores, 12 GB memory, X11 forwarding, and a 1 hour time limit.
(cd ~/scratch/ || exit; salloc --x11 --time=01:00:00 --mem-per-cpu=3G --ntasks=4 --account="$SALLOC_ACCOUNT")
```
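One possible way to use it (the module names and driver script below are illustrative) is to start the session and then run your work inside the allocation:

```bash
# SALLOC_ACCOUNT must be set (e.g., in ~/.bashrc) for the --account flag to resolve.
bash InteractiveJob.sh

# Once the interactive shell on the compute node opens:
module load python/3.11.5 scipy-stack
source ~/software/python/virtualenvs/ase/bin/activate
python3 run.py   # placeholder for your own script
```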