Scale in Distributed Systems

B. Clifford Neuman
Thomas Lee Casavant, Mukesh Singhal (eds.)
Readings in Distributed Computing Systems, pages 463-489
IEEE CS Press, Los Alamitos, CA, USA

In recent years, scale has become a factor of increasing importance in the design of distributed systems.
The scale of a system has three dimensions: numerical, geographical, and administrative.
The numerical dimension consists of the number of users of the system, and the number of objects and services encompassed.
The geographical dimension consists of the distance over which the system is scattered.
The administrative dimension consists of the number of organizations that exert control over pieces of the system.

The three dimensions of scale affect distributed systems in many ways.
Among the affected components are naming, authentication, authorization, accounting, communication, the use of remote resources, and the mechanisms by which users view the system.
Scale affects reliability: as a system scales numerically, the likelihood that some host will be down increases; as it scales geographically, the likelihood that all hosts can communicate will decrease.
Scale also affects performance: its numerical component affects the load on the servers and the amount of communication; its geographical component affects communication latency.
Administrative complexity is also affected by scale: administration becomes more difficult as changes become more frequent and as they require the interaction of different administrative entities, possibly with conflicting policies.
Finally, scale affects heterogeneity: as the size of a system grows it becomes less likely that all pieces will be identical. 
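
The reliability claim above can be illustrated with a small calculation. The following is a minimal sketch, not taken from the paper: it assumes every host fails independently with the same probability, and the function name `prob_some_host_down` and the 1% per-host figure are hypothetical choices made only for illustration.

```python
def prob_some_host_down(n_hosts: int, p_down: float) -> float:
    """Probability that at least one of n_hosts is down.

    Assumption (not from the paper): hosts fail independently,
    each with the same probability p_down.
    """
    if n_hosts < 0:
        raise ValueError("n_hosts must be non-negative")
    if not 0.0 <= p_down <= 1.0:
        raise ValueError("p_down must be in [0, 1]")
    # P(all hosts up) = (1 - p_down) ** n_hosts, so take the complement.
    return 1.0 - (1.0 - p_down) ** n_hosts


if __name__ == "__main__":
    # Hypothetical per-host downtime probability of 1%.
    for n in (1, 10, 100, 1000):
        print(n, round(prob_some_host_down(n, 0.01), 3))
```

Under these assumptions the probability rises toward 1 as the number of hosts grows, which is the numerical effect on reliability described above. Real failures are often correlated, so this is only a rough model.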

This paper looks at scale and how it affects distributed systems.
Approaches taken by existing systems are examined and their common aspects highlighted.
The limits of scalability in these systems are discussed.
A set of principles for scalable systems is presented along with a list of questions to be asked when considering how far a system scales.