Batch jobs on this system are scheduled by SLURM, which is responsible for resource management, job scheduling, and job monitoring.
Fairshare Scheduling Policy
We use the SLURM Fairshare feature to distribute the available resources fairly among users. Fairshare incorporates historical resource utilization into job feasibility and priority decisions. It is normally the most significant component of a job's priority, which ultimately determines the job's position in the queue. We do not use a FIFO (First-In-First-Out) scheduler. Instead, the priority of your jobs is determined by your utilization over the past seven days (a sliding window): the higher your recent utilization, the lower the priority of your new jobs.
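The relationship between recent usage and priority can be illustrated with a minimal Python sketch. It is not the scheduler's actual implementation. The names UsageRecord, windowed_usage, and fairshare_factor are hypothetical, the hard seven-day cutoff is a simplifying assumption, and the 2 ** (-usage / share) form is borrowed from the classic fair-share formula described in the SLURM documentation. The configuration on this system may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(days=7)  # the seven-day sliding window described above


@dataclass
class UsageRecord:  # hypothetical record type, for illustration only
    finished: datetime  # when the job ended
    core_hours: float   # resources the job consumed


def windowed_usage(records, now):
    """Sum core-hours for jobs that finished within the last seven days."""
    cutoff = now - WINDOW
    return sum(r.core_hours for r in records if cutoff < r.finished <= now)


def fairshare_factor(user_usage, total_usage, share):
    """Map recent usage to a factor in (0, 1]; heavier usage gives a lower factor.

    Assumption: 'share' is the fraction of the machine allotted to the user,
    and the factor follows the 2 ** (-usage_fraction / share) form.
    """
    if total_usage <= 0:
        return 1.0  # nobody has used anything, so no penalty applies
    usage_fraction = user_usage / total_usage
    return 2.0 ** (-usage_fraction / share)
```

Under these assumptions, a user who consumed exactly their allotted share in the window gets a factor of 0.5, a user with no recent usage gets 1.0, and jobs that finished more than seven days ago no longer count against anyone.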
Backfill Scheduling Policy
Backfill is a scheduling optimization that allows SLURM to make better use of available resources by running jobs out of order. Using job data such as the requested walltime and resources, the scheduler can start lower-priority jobs as long as they do not delay the highest-priority jobs. Because backfill works by filling holes in node space, it tends to favor smaller, shorter-running jobs over larger, longer-running ones.
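The backfill idea can be sketched in a few lines of Python. This is a deliberately simplified model, not SLURM's algorithm: it assumes a single hole of idle nodes that the highest-priority job will claim after a known number of hours, and the names Job, can_backfill, and pick_backfill are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:  # hypothetical job description, for illustration only
    name: str
    nodes: int       # nodes requested
    walltime: float  # requested walltime in hours


def can_backfill(job, free_nodes, hours_until_top_job_starts):
    """A lower-priority job may start now if it fits in the idle nodes and
    its requested walltime ends before the top job needs those nodes."""
    return job.nodes <= free_nodes and job.walltime <= hours_until_top_job_starts


def pick_backfill(candidates, free_nodes, hours_until_top_job_starts):
    """Walk candidates in priority order and start each job that fits the hole."""
    started = []
    for job in candidates:
        if can_backfill(job, free_nodes, hours_until_top_job_starts):
            started.append(job.name)
            free_nodes -= job.nodes  # the hole shrinks as jobs are placed
    return started
```

With 4 idle nodes and 2 hours before the top job starts, an 8-node job and a 6-hour job are both skipped, while a 2-node, 1-hour job and a 1-node, half-hour job can start immediately. This is why an accurate, modest walltime request improves a job's chances of being backfilled.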
Automatic Queue Routing
Each of our compute resources has a pre-defined default queue. If you submit a job without specifying a queue, it is automatically routed to that default queue, which may not be the one you intended. Decide which queue the job should run in and specify it explicitly in your SLURM batch script.
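SLURM refers to queues as partitions, and a batch script selects one with the --partition option of sbatch. The following Python sketch renders the header lines of a batch script to show the difference between naming a queue and falling back to the default. The helper batch_header and the queue name "short" are hypothetical. Substitute a queue that exists on the resource you are using.

```python
def batch_header(job_name, walltime, queue=None):
    """Render the #SBATCH header lines of a batch script.

    If queue is None, no --partition line is written, so the job would be
    routed to the resource's default queue.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={walltime}",
    ]
    if queue is not None:
        # An explicit queue request overrides the automatic routing.
        lines.append(f"#SBATCH --partition={queue}")
    return "\n".join(lines)


# "short" is a placeholder queue name, not a queue known to exist here.
print(batch_header("demo", "01:00:00", queue="short"))
```

Leaving out the queue argument produces a header with no --partition line, which is the case where automatic routing applies.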