Note on ELZ88

A note on ``The Limited Performance Benefits of Migrating Active Processes for Load Sharing''

by Allen B. Downey and Mor Harchol-Balter.

This paper has been published as U.C. Berkeley Technical Report CSD-95-888.

Abstract

The 1988 paper, ``The Limited Performance Benefits of Migrating Active Processes for Load Sharing,'' by Eager, Lazowska and Zahorjan concludes that migrating active processes for load balancing offers little additional performance benefit beyond that obtained using only remote execution (placement). This result is based on analysis and simulation of a system model that is intended to overestimate the performance benefit of migrating active processes. This report examines the system model used by Eager, Lazowska and Zahorjan and concludes (1) that it does not describe many systems, like networks of workstations, to which its results have been applied, and (2) that it underestimates the potential performance benefit of migrating active processes.

Introduction

Based on analysis and simulation with synthetic workloads, Eager, Lazowska and Zahorjan [EagerLazowskaZahorjan88] claim that ``there are likely no conditions under which migration could yield major performance improvements beyond those offered by non-migratory load sharing...''

This result has been widely cited, and in several cases used to justify the decision not to implement migration or not to use migration for load balancing. For example, [ZhouWangZhengDelisle93] explain, ``Our second design decision is to support remote execution only at task initiation time; no checkpointing or task migration is supported. ... For improving performance, initial task transfer may be sufficient; a modeling study by Eager, Lazowska and Zahorjan suggests that dynamic task migration does not yield much further performance benefit except in some extreme cases.''

ELZ's system model is intended to be conservative in the sense that it overestimates the benefits of migration of active processes and underestimates the benefits of non-migratory load-sharing. In this report we point out that there are, in fact, several ways in which ELZ's analysis and workload model understate the benefits of migrating active processes. We also discuss their system model and its applicability to current systems.

We conclude that the general result of ELZ does not apply to current systems. Elsewhere [Harchol-BalterDowney96] we use a trace-driven simulation to show a wide range of conditions in which migrating active processes provides significant performance benefit. Based on these results, and similar results from simulations [KruegerLivny88] and implemented systems (MOSIX [BarakShaiWheeler93]), we feel that the benefits of preemptive migration in current systems should be reexamined.

Conclusions

The purpose of this report is to suggest that the benefits of preemptive migration on current systems may be greater than previously believed. This conclusion differs from that of ELZ for three reasons:

(1) Under ELZ's system model, non-preemptive migration is able to achieve near-perfect load balance; thus, the additional benefit of preemptive migration is small. But this result may not apply to systems like networks of workstations that do not fit their model. ELZ use a system model in which jobs arrive at a server farm and have no affinity for particular hosts; thus the system can maintain balance by placing arrivals at hosts with low load. In this environment, non-preemptive migration is far more effective than it can be in an environment where jobs arrive at particular hosts and migration by remote execution has significant cost.
(2) ELZ use a workload description that has few short jobs (lifetimes greater than zero and less than one second). In [Harchol-BalterDowney96] we observed that short jobs are the primary beneficiaries of preemptive migration; thus ELZ ignore what we find to be a major benefit of preemptive migration: its effect on the short jobs.
(3) ELZ use a workload description in which a majority of jobs have zero lifetime. This workload introduces artifacts that make it difficult to apply the results of their model to real systems.
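
As a rough illustration of the point about job placement (a toy sketch, not ELZ's model and not the trace-driven simulator of [Harchol-BalterDowney96]), the following Python compares how evenly total work is spread when every arrival can be placed on the least-loaded host with how it is spread when each job stays on the host where it arrived. The function names, the exponentially distributed durations, and the uniformly random home hosts are all assumptions made for illustration; the sketch ignores queueing, departures, and transfer costs.

```python
import random


def spread(loads):
    """Imbalance measure: gap between the most and least loaded host."""
    return max(loads) - min(loads)


def place_least_loaded(durations, n_hosts):
    """No host affinity: each arrival is placed on the currently least-loaded host."""
    loads = [0.0] * n_hosts
    for d in durations:
        target = loads.index(min(loads))  # first host with the minimum load
        loads[target] += d
    return loads


def run_at_home(durations, homes, n_hosts):
    """Host affinity: each job runs where it arrived and is never moved."""
    loads = [0.0] * n_hosts
    for d, h in zip(durations, homes):
        loads[h] += d
    return loads


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n_hosts, n_jobs = 8, 1000
    # Hypothetical workload: exponential durations, uniformly random home hosts.
    durations = [rng.expovariate(1.0) for _ in range(n_jobs)]
    homes = [rng.randrange(n_hosts) for _ in range(n_jobs)]

    print("spread with free placement:", spread(place_least_loaded(durations, n_hosts)))
    print("spread with host affinity: ", spread(run_at_home(durations, homes, n_hosts)))
```

With free placement the spread can never exceed the largest single job duration, which is why placement alone looks so effective when jobs have no affinity. When jobs stay at their arrival hosts no such bound holds, and the imbalance left over is what preemptive migration could address.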

In light of these observations we feel that the benefits of preemptive migration should be reexamined. Several recent systems have chosen to implement preemptive migration for purposes other than load balancing (e.g. preserving autonomy). We would urge the developers of these systems to explore the benefits of load balancing by preemptive migration.