To submit a job:
The system will automatically pick the least privileged queue that has at least as many resources as you requested in your script.
If the selected queue requires special validation, you will need permission to use it (the first time only). Email email@example.com.
Check the status of your submitted job using qstat (on redwood, either qstat or qstat2).
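A status check might look like the following; the username and job ID shown are illustrative.

```shell
# Show all of your queued and running jobs (username is illustrative)
qstat -u jsmith

# Show full details for a single job, by job ID
qstat -f 12345

# On redwood, the second PBS instance has its own status command
qstat2 -u jsmith
```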
- Estimate the resources your job will need: CPU time, number of processors, and amount of memory (except on redwood).
- Determine the PBS resource options that you will need to specify in order to request the resources your job needs.
- Create a shell script to run your job. In addition to invoking your executable, this script might redirect input and output or remove temporary files. Importantly, the PBS resource options you determined above may be included as #PBS directives at the top of this shell script, and thereby reused across multiple runs. See -l resource_list in PBS Batch Script Options. Also see our Example script.
- Submit the job to PBS using qsub.
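The steps above can be combined in a minimal batch script. This is a sketch only: the executable name, file names, and resource values are illustrative, and you should substitute your own estimates from step 1.

```shell
#!/bin/sh
# Resource requests (example values -- estimate your own):
# 2 processors, 1 GB of memory, 10 hours of CPU time
#PBS -l ncpus=2
#PBS -l mem=1gb
#PBS -l cput=10:00:00
# Mail a notification when the job ends or aborts
#PBS -m ea

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# Invoke the executable, redirecting input and output
./my_program < input.dat > output.dat

# Clean up temporary files
rm -f scratch.tmp
```

Saved as, say, myjob.sh, this would be submitted with `qsub myjob.sh`, after which the system picks a queue as described above.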
How PBS runs a Job:
- Assume you are a funded researcher and have submitted a job to the LM-defR (Large Memory, Default Runtime) queue. Further assume that there are already 4 other jobs executing in this queue, and another 4 queued, or in-line, waiting to run. Of those four queued jobs, two are owned by funded researcher accounts and the other two by unfunded researcher accounts.
- Your job will be queued (put "in-line") in the LM-defR queue you requested, provided you have previously requested and been granted authorization (special validation). The PBS scheduler will give your job an initial priority. Since you are a funded researcher, your priority will be higher than it would have been were you an unfunded user. The other jobs in the queue will each have a priority that is some function of their initial priority (funded or unfunded) and the cumulative time each has spent in line. Your job will be scheduled (inserted in line) according to its priority, relative to the other jobs already in line. (Note that unfunded jobs will always be placed at the end of the line, but will move up as other jobs finish, and will gradually become stronger at holding their place against subsequently submitted funded jobs, as their in-line time increases and drives up their priority.)
- When one of the four running jobs in the LM-defR queue finishes, the PBS scheduler will choose another LM-defR queued job to run. The chosen job will be the next one in line, according to its accumulated priority. Your job will then move up one place in line.
- Eventually, as queued jobs run and complete, your job's turn will come, and the PBS scheduler will start your job with the indicated resource authorizations. Your job will then be one of the four LM-defR jobs in a running state.
- While your job is running, the Irix scheduler will mediate its access to the CPU and other resources according to its own rationale, which has nothing to do with your research status (funded or unfunded) or your PBS queue priority. Provided your program does not try to use more than 4 processors, does not ask for more than 4 GB of memory, and does not accumulate more than 288 hours, your job should eventually complete. If your program requests more resources than the LM-defR queue provides, it will be killed by the OS.
- Mimosa's PBS configuration does not respect funded status over other statuses. Everyone is equal on mimosa, in terms of getting their jobs into a run status.
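Given the LM-defR limits described above (4 processors, 4 GB of memory, 288 hours), a request that stays within them might look like the following; whether these exact resource keywords apply on your system is an assumption worth checking against the local PBS documentation.

```shell
# Resource requests at the LM-defR queue limits
# (4 processors, 4 GB memory, 288 hours -- exceeding any
# of these will get the job killed by the OS)
#PBS -l ncpus=4
#PBS -l mem=4gb
#PBS -l cput=288:00:00
```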
A few Tips:
- If you overestimate your resource needs, your job will wait in the queue longer than necessary.
- If you underestimate your resource needs, your job will run out of resources and die.
- Funded researchers' accounts will be set up so that their jobs are initially more likely to be run. However, unfunded jobs that have been waiting for some time will gradually accumulate priority, allowing them to compete for run status with recently submitted funded jobs.
- On redwood, for jobs submitted to the 2nd PBS instance (usually 4 processors and above), dynamic cpusets have been configured. This means you will automatically be granted exclusive access to all memory local to those CPUs (just under 1 GB per processor), so you should not explicitly request a memory amount in your PBS job or g03sub script. Doing so may result in either (1) your using less memory than you have been allocated, or (2) your being granted more processors than you requested or needed.
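On redwood's second PBS instance, the advice above amounts to requesting processors only and letting the dynamic cpuset supply the local memory. A sketch, with an illustrative processor count:

```shell
# Request 8 processors on redwood's 2nd PBS instance.
# Deliberately NO mem request: the dynamic cpuset grants
# just under 1 GB of local memory per processor automatically.
#PBS -l ncpus=8
```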