SCD News > News item: December 1, 2000

Job scheduler changes on blackforest, babyblue may affect you

A new factor -- allocation threshold -- affects when jobs will run

The IBM SP RS/6000, blackforest

After users submit their jobs to the LoadLeveler batch system, the scheduler determines the order in which jobs are run. Numerous factors determine which jobs will run next, including:

  • Priority of the job class
  • Time when the job was submitted
  • The number of nodes and the amount of time the job requests
  • The project (proposal id) under which the job was submitted
  • The allocation threshold for the project under which the job was submitted

This article briefly describes how two allocation thresholds now affect the order in which jobs run, and how to select the best queue for your jobs.

Changes in the job scheduler

With the increasing number of jobs on blackforest, SCD changed the job scheduler on 4 December 2000 to manage jobs from projects and divisions that exceed their allocations. The process by which queued jobs are selected for execution on blackforest and babyblue has also been changed. These changes affect all users, both community computing users and Climate Simulation Laboratory (CSL) users.

For projects and groups with a monthly allocation (including NCAR divisions and CSL proposals):

  1. When a project has exceeded one allocation threshold (either its one-month cutoff limit or its three-month cutoff limit), it is flagged, and any jobs submitted under that project will run after jobs from non-flagged projects.
  2. Projects that exceed both allocation thresholds (the one-month cutoff limit and the three-month cutoff limit) are flagged in a different way. Jobs from these projects will run after jobs from non-flagged projects and after jobs from projects that have exceeded only one threshold.
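
The two rules above amount to counting how many thresholds a project has exceeded. A minimal sketch of that classification follows; the function name, its parameters, and the use of a strict "greater than" comparison are assumptions made for illustration, not the scheduler's actual implementation.

```python
def flag_level(used_1mo, cutoff_1mo, used_3mo, cutoff_3mo):
    """Return how many allocation thresholds a project has exceeded.

    0 = not flagged, 1 = one threshold exceeded, 2 = both exceeded.
    All names and the strict '>' comparison are hypothetical.
    """
    over_one_month = used_1mo > cutoff_1mo
    over_three_month = used_3mo > cutoff_3mo
    return int(over_one_month) + int(over_three_month)
```

A project over its one-month cutoff but under its three-month cutoff would get level 1 in this sketch; a project over both would get level 2.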

Jobs from flagged projects (projects exceeding either one or both thresholds) stay in the queue to which they were submitted and are charged the same number of GAUs as other jobs in that queue, but they run only after all jobs from non-flagged projects have run in that queue.
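
The within-queue ordering described above can be illustrated with a short sketch. It assumes each job carries a flag level (0 for non-flagged, 1 for one threshold exceeded, 2 for both) and uses submission time as the only tiebreaker; the real scheduler weighs additional factors, and the `Job` type and `run_order` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    queue: str
    flag_level: int  # 0 = not flagged, 1 = one threshold exceeded, 2 = both
    submitted: int   # hypothetical submission timestamp


def run_order(jobs, queue):
    """Order the jobs of one queue: non-flagged first, then by submission time."""
    in_queue = [job for job in jobs if job.queue == queue]
    return sorted(in_queue, key=lambda job: (job.flag_level, job.submitted))
```

Jobs are never moved to another queue in this sketch; flagged jobs only move later within the queue to which they were submitted.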

The system sends email to notify users when a job they submit will be delayed because their project has been flagged.

GAUs charged are based on the queue the job was submitted to; there is no longer any charging benefit for jobs that started in a higher-priority queue and completed in the standby queue. You must submit jobs to the standby queue to get the benefit of no charges on blackforest.
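
The charging rule can be sketched as a lookup keyed only on the queue the job was submitted to. The queue names other than standby, the charge factors, and the `base_gaus` quantity are all hypothetical placeholders; only the zero charge for standby and the dependence on the submission queue come from this article.

```python
# Hypothetical charge factors; only the zero for standby reflects the article.
CHARGE_FACTOR = {
    "premium": 1.5,   # hypothetical queue name and factor
    "regular": 1.0,   # hypothetical queue name and factor
    "standby": 0.0,   # standby jobs accrue no GAU charges on blackforest
}


def gaus_charged(base_gaus, submitted_queue):
    """Charge depends only on the queue the job was submitted to."""
    return base_gaus * CHARGE_FACTOR[submitted_queue]
```

Because the function takes no argument for where a job finished, a job submitted to a higher-priority queue is charged at that queue's rate regardless of how it completes.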

Note: You can employ two strategies to avoid charges for jobs from flagged projects.

  1. After your project has been flagged, you can submit jobs to the standby queue to avoid charges. The turnaround can be slower, but no charges will accrue.
  2. Before your project exceeds its one-month or three-month allocation, select lower-priority work to be run in the standby queue and save GAUs for higher-priority work that will need to be completed before the end of the time period.

Note: Standby jobs still generate MSS charges in the normal fashion.

As in the past, projects with a lifetime allocation (including university projects and some joint projects) will not be able to run jobs once they exceed their lifetime allocation. University projects may request additional resources by contacting Ginger Caldwell at cal@ucar.edu or 303-497-1229.

Finally, the round-robin fairness scheme used by the job scheduler has been modified on blackforest and babyblue to take into account the number of CPUs used by jobs.

A more thorough description of job scheduling on blackforest and babyblue is provided on the LoadLeveler job scheduler page.

In 2001 these scheduler changes will be applied to the other SCD compute servers.

NCAR is managed by UCAR and sponsored by the National Science Foundation