IMMD-IV. Last change: 7.4.1995

Distributed Shared Memory

Two main attempts to solve the problems arising with the DSM approach have been made:

A broad survey covering all kinds of DSM systems is "A. Mohindra, U. Ramachandran, A Survey of Distributed Shared Memory in Loosely-coupled Systems". The report dates from 1991, however, and is therefore somewhat outdated.

A paper discussing the lack of user acceptance of current DSM systems is "John B. Carter, Dilip Khandekar, Linus Kamb, Distributed Shared Memory: Where We Are and Where We Should Be Headed".

Theoretical aspects of DSM systems, mainly concerning sequentially consistent memory models, are covered in "M. Mizuno, M. Raynal, J.Z. Zhou, Sequential Consistency in Distributed Systems: Theory and Implementation".


Case Studies

The following case studies provide an overview of software-based DSM systems.

IVY

One of the first designs ever made for a DSM runtime system was IVY. It was implemented at Yale University and provides the abstraction of two classes of memory: private and shared.

IVY uses a write-invalidate protocol and implements multiple-reader, single-writer semantics. The granularity of access is a 1 Kbyte page; accesses to shared memory locations are detected with the virtual memory primitives. Write accesses and first read accesses to a shared page cause page faults, and the page fault handler acquires the page from the current holder. Using these techniques, IVY provides a strictly consistent memory model.
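The fault handling described above can be sketched as a small state machine for a single page. This is a minimal illustration of a write-invalidate, multiple-reader/single-writer scheme in general, not IVY's actual code; the class `Page` and its method names are hypothetical, and message passing and the page contents themselves are left out.

```python
class Page:
    """Bookkeeping for one shared page (hypothetical structure, not IVY's)."""

    def __init__(self, owner):
        self.owner = owner        # node that currently holds ownership
        self.copies = {owner}     # nodes that hold a valid copy
        self.writable = True      # True while the owner may write

    def read_fault(self, node):
        """First read on `node`: hand out a read-only copy."""
        self.writable = False     # the owner is downgraded to read-only
        self.copies.add(node)

    def write_fault(self, node):
        """Write on `node`: invalidate every other copy, then take ownership."""
        invalidated = self.copies - {node}
        self.copies = {node}
        self.owner = node
        self.writable = True
        return invalidated        # nodes that must drop their copy


page = Page(owner="A")
page.read_fault("B")              # A and B now share the page read-only
page.read_fault("C")              # A, B and C share the page read-only
dropped = page.write_fault("B")   # B becomes the single writer
print(sorted(dropped), page.owner, sorted(page.copies))
```

At any time the page is either readable on several nodes or writable on exactly one, which is the invariant the protocol has to maintain.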

Three page management implementations were integrated into IVY:

The double fault problem is inherent in all three implementations: a read access followed by a write access to the same page on a single node causes two page faults, and the page is transferred twice. The authors provide a scheme that eliminates this problem by using sequence numbers for every shared page.
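How a per-page sequence number can save the second transfer is sketched below. The details of IVY's scheme are not given here, so the logic is an assumption: the faulting node is taken to report the sequence number of its local copy, and the page body is shipped again only if that copy is stale. The function `page_transfers` and its parameters are hypothetical.

```python
def page_transfers(use_sequence_numbers, remote_write_between=False):
    """Count page-body transfers for a read followed by a write on one node."""
    owner_seq = 0     # assumed to be bumped on every write to the page
    transfers = 0

    # Read fault: the node has no copy yet, so the body is always shipped.
    transfers += 1
    local_seq = owner_seq

    # Optionally another node writes the page between the two faults.
    if remote_write_between:
        owner_seq += 1

    # Write fault: ship the body again unless the local copy is still current.
    copy_is_current = use_sequence_numbers and local_seq == owner_seq
    if not copy_is_current:
        transfers += 1
    return transfers


print(page_transfers(use_sequence_numbers=False))   # plain double fault
print(page_transfers(use_sequence_numbers=True))    # ownership moves, body does not
```

In this sketch the write fault still occurs; only the redundant copy of the page contents is avoided.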

IVY's synchronisation primitives, which are needed to serialize concurrent accesses to shared memory locations, are eventcounts. Eventcounts offer atomic operations on shared counters, which are implemented through the system's shared memory semantics.
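The eventcount idea can be illustrated with a counter that only ever increases and on which a process can wait for a given value. The sketch below uses threads and a condition variable inside a single Python process, whereas IVY builds the counters on its shared memory; the class `EventCount` and the method name `await_value` are assumptions made for this example.

```python
import threading


class EventCount:
    """Monotonic counter with read, advance and await (local-thread sketch)."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def read(self):
        with self._cond:
            return self._value

    def advance(self):
        """Atomically increment the counter and wake up all waiters."""
        with self._cond:
            self._value += 1
            self._cond.notify_all()
            return self._value

    def await_value(self, target, timeout=None):
        """Block until the counter has reached `target` (or the timeout expires)."""
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= target, timeout)


def demo():
    ec = EventCount()
    seen = []

    def consumer():
        ec.await_value(2)          # proceeds only after two advances
        seen.append(ec.read())

    t = threading.Thread(target=consumer)
    t.start()
    ec.advance()
    ec.advance()
    t.join()
    return seen
```

Calling `demo()` lets one thread wait for the second event while the main thread produces two events.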

Mirage

Mirage extends the IVY mechanisms by introducing a time interval during which a page is pinned to a certain processor. During this interval, ownership of the page is not forwarded to another processor. This avoids page thrashing when two processors reference a single page repeatedly.
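The pinning interval can be sketched as follows. The class `PinnedPage`, the parameter `delta` and the explicit `now` timestamps are hypothetical and serve only to illustrate the idea; a real system would use its own clock and would queue a deferred request rather than simply reject it.

```python
class PinnedPage:
    """Page whose ownership may not move for `delta` time units after a transfer."""

    def __init__(self, owner, delta, now=0.0):
        self.owner = owner
        self.delta = delta
        self.acquired_at = now    # time at which the current owner got the page

    def request(self, node, now):
        """Return True if `node` owns the page afterwards, False if it must wait."""
        if node == self.owner:
            return True
        if now - self.acquired_at < self.delta:
            return False          # still pinned: the request is deferred
        self.owner = node         # interval expired: ownership moves
        self.acquired_at = now
        return True


page = PinnedPage(owner="A", delta=5.0)
print(page.request("B", now=2.0))   # A still holds the page
print(page.request("B", now=5.0))   # interval over, B takes the page
print(page.request("A", now=7.0))   # now B is protected for a while
```

Each processor is thus guaranteed a minimum period of uninterrupted access before the page can move again.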

Clouds

Clouds enables the programmer to define "pin intervals" for certain shared data segments. It also allows the shared memory granularity to be reduced to match the needs of the application. A paper further describing the Clouds programming model and distribution mechanisms is "M. Ahamad, et al., Shared Memory Programming in a Distributed System".

Munin

Munin attacks the main problems in conventional DSM systems with four techniques. These techniques mostly deal with reducing the communication overhead and lowering message counts. Munin also provides distinct consistency protocols for different types of access patterns.

All these issues are discussed in "John B. Carter, University of Utah; John K. Bennett and Willy Zwaenepoel, Rice University, Techniques for Reducing Consistency-Related Communication in Distributed Shared Memory Systems".

Detailed implementation issues are presented in "J.B. Carter, Design of the Munin Distributed Shared Memory System" and "J.B. Carter, et al., Implementation and Performance of Munin".

A new kind of consistency model for DSM systems called lazy release consistency (LRC) is currently being evaluated in Munin and TreadMarks. P. Keleher wrote his Ph.D. thesis, "Lazy Release Consistency for Distributed Shared Memory", about these issues. LRC reduces communication related to memory coherence with mechanisms similar to those of entry consistency, which was developed for the Midway system. The thesis discusses LRC in great detail and deals extensively with performance and correctness issues.

Midway

The Midway project is introduced in "The Midway Distributed Shared Memory System". The write detection mechanism is described in "Matthew J. Zekauskas, Wayne A. Sawdon and Brian N. Bershad, Software Write Detection for a Distributed Shared Memory". Finally, the concept of entry consistency is discussed further in "Brian N. Bershad, Matthew J. Zekauskas, Shared Memory Parallel Programming with Entry Consistency for Distributed Memory Multiprocessors".

An unsupported snapshot of the Midway code is available here.


Erich Meier, Uni Erlangen, 1995