Distributed shared memory (DSM) systems support cost-effective parallel computing on networks of workstations by providing the application with a shared memory abstraction. Conventional DSM systems that support sharing of an untyped memory region can provide good performance at only one granularity. Indeed, DSM systems have been divided into those that support coarse-grained sharing and those that support fine-grained sharing. Coarse-grained sharing systems are typically page-based and use the virtual memory hardware for access and modification detection. Although relaxed memory models and multiple-writer protocols mitigate the impact of the large page size, fine-grained sharing and false sharing remain problematic. Fine-grained sharing systems typically augment the code with instructions to detect reads and writes. This frees them from the large consistency unit of virtual memory-based systems, but it introduces per-access overhead that reduces performance for coarse-grained applications. In addition, these systems do not benefit from the implicit aggregation effect present in page-based systems.
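The effect of the consistency unit size on false sharing can be illustrated with a toy model. The sketch below is not taken from any DSM implementation: the function name, the 4096-byte page size, the 64-byte object size, and the write trace are all assumptions chosen for illustration. It reports which consistency units are written by more than one node, which is where false sharing can arise when the nodes touch disjoint data.

```python
from collections import defaultdict

PAGE_SIZE = 4096    # assumed page size in bytes
OBJECT_SIZE = 64    # assumed object size in bytes


def units_with_multiple_writers(writes, unit_size):
    """Return the consistency units written by more than one node.

    `writes` is a list of (node, address) pairs; each write is assumed
    to touch a single address inside one consistency unit.
    """
    writers = defaultdict(set)
    for node, addr in writes:
        writers[addr // unit_size].add(node)
    return sorted(unit for unit, nodes in writers.items() if len(nodes) > 1)


# Nodes A and B write to different objects that share the first page.
writes = [("A", 0), ("B", 64), ("A", 4096), ("A", 4160)]

page_conflicts = units_with_multiple_writers(writes, PAGE_SIZE)
object_conflicts = units_with_multiple_writers(writes, OBJECT_SIZE)
```

Under these assumptions, the page-sized unit reports one unit with two writers even though A and B never touch the same data, while the object-sized unit reports none.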
We have developed a new runtime system, DOSA, that supports sharing of objects in a safe language among the computers in a cluster. Our system exploits the key insight that the ability to distinguish pointers from data at run-time enables the decoupling of the shared object space from the address space, which allows efficient and transparent sharing of data with both fine-grained and coarse-grained access patterns. Like earlier systems designed for fine-grained sharing, DOSA improves the performance of fine-grained applications by eliminating false sharing. Unlike these earlier systems, DOSA's VM-based approach and read aggregation enable it to match a page-based system for coarse-grained applications. Furthermore, its architecture permits optimizations, such as lazy object allocation, that are not possible in conventional fine-grained or coarse-grained DSM systems. Lazy object allocation transparently improves the locality of reference in many applications, improving their performance. Our performance evaluation demonstrates that the new system performs comparably to a state-of-the-art page-based DSM (TreadMarks) for coarse-grained applications, and significantly outperforms TreadMarks for fine-grained applications (by up to 98%) and a garbage-collected application (by 65% for OO7).
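One way to picture the decoupling of the object space from the address space is a table that maps object identifiers to local copies and allocates local storage only on first access. The sketch below is a simplified model under that assumption, not the system's actual implementation: the `HandleTable` class, its methods, and the in-memory `remote_store` standing in for the network are all hypothetical.

```python
class HandleTable:
    """Maps object IDs to local copies, allocated lazily on first access."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: object ID -> fields (stands in for the network)
        self._local = {}      # object ID -> local copy of the object
        self.fetches = 0      # number of whole-object fetches performed

    def deref(self, oid):
        # Local storage is created only when the object is first touched.
        if oid not in self._local:
            self._local[oid] = dict(self._fetch(oid))
            self.fetches += 1
        return self._local[oid]

    def invalidate(self, oid):
        # Drop the local copy; the next access fetches the object again.
        self._local.pop(oid, None)


# A two-element linked list whose "next" fields hold object IDs, not addresses.
remote_store = {
    1: {"value": 10, "next": 2},
    2: {"value": 20, "next": None},
}

table = HandleTable(remote_store.__getitem__)

total = 0
oid = 1
while oid is not None:
    obj = table.deref(oid)
    total += obj["value"]
    oid = obj["next"]

fetches_after_traversal = table.fetches
```

Because references are object IDs rather than addresses, each node in this model is free to place, discard, and re-fetch an object without affecting the references held elsewhere, and each object is its own unit of consistency.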
In addition, the new system offers many advantages over earlier distributed object sharing systems. It supports direct access to object data through a reference, unlike Java/RMI, where remote object access is restricted to method invocation. Furthermore, in languages with suitable multithreading support, such as Java, distributed execution is transparent: no new API is introduced for distributed sharing.
Runtime Support for Distributed Sharing in Typed Languages. Y. Charlie Hu, Weimin Yu, Alan Cox, Dan Wallach, and Willy Zwaenepoel, In Proceedings of LCR2000: the Fifth Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers. Rochester, NY, May 2000.