This is the first paper in which I describe the General-purpose Reliable In-order Messages (GRIM) communication layer that I've been developing for Myrinet. My idea is that we should be building communication libraries that can route data between host processors and remote peripheral devices distributed throughout the cluster. We call such systems "resource rich clusters" because their nodes contain high-performance peripheral devices in addition to host CPUs. The current version of GRIM offloads much of the message layer management to the network interface, which reduces the effort required for peripheral devices to communicate in the platform.
Resource rich clusters are an emerging category of computational platform in which cluster nodes have both CPUs and high-performance I/O cards. These clusters target specific applications such as digital libraries, web servers, and multimedia kiosks. The presence of communication endpoints at locations other than the host CPU requires a re-examination of how middleware for these clusters should be constructed.
A key issue in middleware design is the management of flow control for the reliable delivery of messages. We propose a network-interface-based optimistic flow control scheme to address the requirements of resource rich clusters. We implement this functionality in a message layer called GRIM and compare its general performance to that of other well-known message layers. This implementation suggests that the necessary middleware functionality can be constructed not only efficiently, but also in a way that provides additional middleware benefits.