Large parallel computers are usually shared by many users. Two mechanisms that support this sharing are space-sharing, in which the processors are divided into disjoint clusters and each job runs on a dedicated cluster, and time-sharing, in which several jobs share the same processors over time.
The advantages of space-sharing are a dedicated system image (a cluster behaves like a small dedicated system), low overheads (since jobs are not preempted, which is expensive), and high parallel efficiencies (since jobs tend to run on fewer processors). One disadvantage is that run-to-completion (RTC) scheduling sometimes allows short jobs to sit in the queue waiting for long jobs, resulting in poor average turnaround times. The systems we are studying, the Intel Paragon at the San Diego Supercomputer Center and the IBM SP2 at the Cornell Theory Center, are configured as space-sharing systems using RTC scheduling.
Many programs running on these systems are malleable, meaning that they can run on a range of cluster sizes. At present, few if any of these jobs can change cluster size dynamically; once they begin execution, they neither release nor acquire processors. In this context, internal resource selection means choosing a cluster size for each job before it begins execution.
When a job arrives at the head of the queue, it may allocate some or all of the processors that are currently free, or it may wait for additional processors to become available. If it chooses to wait, we assume for now that the other jobs in the queue must also wait. In other words, we assume first-come-first-served (FCFS) queueing. In future work, we will relax this assumption and address more general queueing policies.
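As a concrete illustration, the following sketch shows this queueing discipline in miniature. The dispatch function, the queue contents, and the acquire helper are hypothetical names introduced here for illustration, not part of the systems described above; each job is assumed to have already chosen its cluster size.

    from collections import deque

    def dispatch(queue, free_processors, acquire):
        # queue           -- deque of (job_id, cluster_size) in arrival order
        # free_processors -- number of processors currently idle
        # acquire(job, n) -- hypothetical helper that starts `job` on `n` processors
        while queue:
            job_id, requested = queue[0]
            if requested > free_processors:
                break                      # head of queue waits; everyone behind it waits too
            queue.popleft()
            acquire(job_id, requested)
            free_processors -= requested
        return free_processors

Whenever a running job completes, the scheduler would return its processors to the free pool and call dispatch again.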
In summary, we use the following model of computation: the machine is space-shared and jobs run to completion; jobs are malleable, but their cluster sizes are fixed once they begin execution; and jobs are queued first-come-first-served, so a job that waits for processors delays the jobs behind it.
Our approach to internal resource selection is to allow each job, when it arrives at the head of the queue, to choose the cluster size that minimizes its turnaround time, which is the sum of its queue time and run time. To make this decision, the job must know its run time on a range of cluster sizes, and predict the queue time until each cluster size is available.
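A compact sketch of this selection rule follows; the two prediction functions are hypothetical stand-ins for the run-time model of [4] and the queue-time predictor developed below.

    def choose_cluster_size(candidate_sizes, predict_queue_time, predict_run_time):
        # Turnaround time = predicted queue time + predicted run time;
        # the job picks the cluster size that minimizes it.
        return min(candidate_sizes,
                   key=lambda n: predict_queue_time(n) + predict_run_time(n))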
In [4] we develop a speedup model that predicts jobs' run times; in this paper we present techniques for predicting queue times. We make these predictions in two steps: first, we develop a conditional lifetime model that predicts the remaining lifetime of a running job from its current age; then we aggregate these job-by-job predictions into a prediction of the queue time.
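To make the two steps concrete, the sketch below treats the conditional lifetime model as a parameter and aggregates its job-by-job predictions by accumulating the processors each running job is expected to release. The function names and the sorting of predicted finish times are illustrative assumptions, not the method developed later in the paper.

    def predict_queue_time(running_jobs, free_now, requested, predict_remaining):
        # running_jobs            -- (age, cluster_size) for each job currently running
        # predict_remaining(age)  -- conditional lifetime model: predicted remaining
        #                            lifetime of a job that has been running for `age`
        if free_now >= requested:
            return 0.0
        # Step 1: job-by-job predictions of each running job's finish time.
        finishes = sorted((predict_remaining(age), size) for age, size in running_jobs)
        # Step 2: aggregate -- accumulate released processors until enough are free.
        available = free_now
        for finish, size in finishes:
            available += size
            if available >= requested:
                return finish
        return float("inf")   # the request exceeds the machine size

A predictor of this form, with the current system state bound in, could serve as the predict_queue_time argument in the selection sketch above.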