2002-02-06 Wed
fpga net pub
After getting GRIM up and running, I did a lot of work hooking it up to different peripheral devices. For the Novel Uses of SANs workshop I wrote a paper that describes how to build a GRIM communication endpoint for FPGA accelerators, and how users could set up a computational pipeline for the devices with some simple forwarding tables.
Abstract
This paper explores the view that the SAN network infrastructure can be an active computational entity capable of supporting certain classes of data intensive computations effectively during communication. The performance is achieved via the use of Field Programmable Gate Arrays (FPGAs) in the network interfaces (NIs). This paper describes the programming model and the design of a prototype hardware/software implementation using commercial FPGA devices coupled with Myrinet. An active messages style of programming is used to support application-transparent, dynamic reconfiguration of the FPGA hardware to accommodate different computations over time. Performance evaluation of this implementation quantifies the overheads and sources of performance improvement.
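The idea of chaining endpoints with forwarding tables can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's implementation: the names `Message`, `Endpoint`, `forward_table`, and `run_pipeline` are hypothetical, and plain Python functions stand in for the FPGA computations. Each endpoint runs the handler named in the incoming message, then looks up the next hop (endpoint and handler) in its own table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# All names here are hypothetical; none of them come from GRIM's API.


@dataclass
class Message:
    handler_id: int      # which computation the receiving endpoint should run
    payload: List[int]   # data carried by the message


class Endpoint:
    """Software stand-in for an FPGA endpoint with a per-handler forwarding table."""

    def __init__(
        self,
        handlers: Dict[int, Callable[[List[int]], List[int]]],
        forward_table: Dict[int, Tuple[str, int]],
    ) -> None:
        self.handlers = handlers
        # handler_id -> (next endpoint name, handler id to invoke there).
        # A missing entry means the result stops at this endpoint.
        self.forward_table = forward_table

    def process(self, msg: Message) -> Tuple[Message, Optional[str]]:
        """Run the requested handler and return (outgoing message, next endpoint)."""
        result = self.handlers[msg.handler_id](msg.payload)
        hop = self.forward_table.get(msg.handler_id)
        if hop is None:
            return Message(msg.handler_id, result), None
        next_endpoint, next_handler = hop
        return Message(next_handler, result), next_endpoint


def run_pipeline(endpoints: Dict[str, Endpoint], start: str, msg: Message) -> List[int]:
    """Follow forwarding tables from `start` until an endpoint has no next hop."""
    current: Optional[str] = start
    while current is not None:
        msg, current = endpoints[current].process(msg)
    return msg.payload


# Two-stage pipeline: fpga0 doubles each value, then forwards to fpga1, which adds one.
endpoints = {
    "fpga0": Endpoint({1: lambda d: [x * 2 for x in d]}, {1: ("fpga1", 7)}),
    "fpga1": Endpoint({7: lambda d: [x + 1 for x in d]}, {}),
}
result = run_pipeline(endpoints, "fpga0", Message(1, [1, 2, 3]))
```

In this model, changing the pipeline only means editing the table entries; the handlers themselves do not need to know where their output goes next.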
Publications
- Novel SANs Paper Craig Ulmer, Chris Wood, and Sudhakar Yalamanchili, "Active SANs: Hardware Support for Integrating Computation and Communication". HPCA Workshop on Novel Uses of System Area Networks.
Presentation
This paper I wrote for the first Myrinet User Group meeting provides an example of how GRIM interacts with an I2O peripheral device. It also goes into the details of how the NI uses some cut-through optimizations to boost the bandwidth for sends in the system.
Abstract
Resource rich clusters are an emerging category of clusters of workstations where cluster nodes comprise of modern CPUs as well as high-performance peripheral devices such as intelligent I/O interfaces, active disks, and capture devices that directly access the network. These clusters target specific applications such as digital libraries, web servers, and multimedia kiosks. We argue that such clusters benefit from a re-examination of the design of the message layer to retain high performance communication while facilitating the interface to endpoints for a variety of devices.
This paper describes a message layer design which includes optimistic flow control, the use of logical channels, a push-style cut-through injection optimization, and an API supporting cluster-wide active message handler management. The goal is to support a number of diverse cluster hardware configurations where communication endpoints exist in a variety of locations within a node. The current implementation has been tested on a Myrinet cluster with communication endpoints located in the host CPUs as well as Intel i960 based I2O server cards.
Publications
- MUG Paper Craig Ulmer and Sudhakar Yalamanchili, "A Messaging Layer for Heterogeneous Endpoints in Resource-Rich Clusters", Myrinet User Group.
2000-08-01 Tue
wsn data pub
The initial three years of my PhD were funded through a NASA Graduate Student Research Program fellowship. One of the benefits of this program was that they encouraged students to come out during the summers to learn more about the problems that NASA faces. I spent my 1999 and 2000 summers in Pasadena, working in the Center for Integrated Space Microsystems. While I initially had plans to evaluate how well my GRIM software worked with one of their clusters, my center director asked if I'd be interested in trying something a little different. He told me that the recent success with the Sojourner rover had sparked a great deal of interest in deploying more in situ sensors on Mars. He challenged me to think about engineering problems NASA would face if it were to cast hundreds to thousands of wireless sensor nodes across Mars.
It was very different than the PhD topic I had been studying, but it was too interesting to pass up. I dove into the papers to learn what people had been doing with wireless sensor networks on Earth. As a networking person, the part that interested me the most was figuring out how a collection of low power, low performance sensor nodes would boot up, establish a routable network, and then collect meaningful information over a geographic region. There were many examples to draw from on Earth, including battlefield sensors, buoy networks, and arctic tumbleweed sensors. My director introduced me to people from all over the lab to learn more about how NASA builds resilient embedded systems that are designed to survive being dropped out of the atmosphere into an environment with harsh thermal constraints.
While the NASA summers took me off course into a side topic that delayed my graduation, it was one of the best things I did during my academic career because it encouraged me to think about hard problems that were outside of my comfort zone. I wrote up the technical report below summarizing some of the things I learned, though I never got it officially entered for a report number at JPL or Georgia Tech. I put the paper up on my school web page, which wound up getting referenced more than I would have thought.
Publications
- Summer Report Craig Ulmer, Sudhakar Yalamanchili, and Leon Alkalai, "Wireless Distributed Sensor Networks for In-situ Exploration of Mars". NASA JPL Summer Internship Report.
This is the first paper where I talk about the General-purpose Reliable In-order Messages (GRIM) communication layer that I've been developing for Myrinet. My idea is that we should be building communication libraries that can route data between host processors and remote peripheral devices that are distributed throughout the cluster. We're calling these types of systems "resource rich clusters" because they have more computing resources than other types of systems. The current version of GRIM offloads a lot of the message layer management responsibilities to the network interface, which simplifies the amount of effort required for peripheral devices to communicate in the platform.
Abstract
Resource rich clusters are an emerging category of computational platform where cluster nodes have both CPUs as well as high-performance I/O cards. These clusters target specific applications such as digital libraries, web servers, and multimedia kiosks. The presence of communication endpoints at locations other than the host CPU requires a re-examination of how middleware for these clusters should be constructed.
A key issue of middleware design is the management of flow control for the reliable delivery of messages. We propose using a network interface based optimistic flow control scheme to address resource rich cluster requirements. We implement this functionality with a message layer called GRIM, and compare its general performance to other well-known message layers. This implementation suggests that the necessary middleware functionality can not only be constructed efficiently, but also in a way that provides additional middleware benefits.
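To make the term concrete, here is a minimal sketch of optimistic flow control in general, not of GRIM's actual network interface code. It assumes a simple model in which the sender transmits without first reserving buffer space, the receiver rejects a message when its queue is full or the sequence number is out of order, and the sender keeps rejected messages for a later retry. The classes `Sender` and `Receiver` and their methods are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple

# Hypothetical model of optimistic flow control; not GRIM's implementation.


class Receiver:
    def __init__(self, slots: int) -> None:
        self.slots = slots                  # number of receive buffers
        self.queue: Deque[str] = deque()    # accepted, not yet consumed messages
        self.expected_seq = 0               # next in-order sequence number

    def deliver(self, seq: int, data: str) -> bool:
        """Return True (ACK) if the message is accepted, False (NACK) if dropped."""
        if seq != self.expected_seq or len(self.queue) >= self.slots:
            return False
        self.queue.append(data)
        self.expected_seq += 1
        return True

    def drain(self, count: int) -> List[str]:
        """Consume up to `count` messages, freeing receive buffers."""
        return [self.queue.popleft() for _ in range(min(count, len(self.queue)))]


class Sender:
    def __init__(self, receiver: Receiver) -> None:
        self.receiver = receiver
        self.next_seq = 0
        self.unacked: Deque[Tuple[int, str]] = deque()  # kept until acknowledged

    def send(self, data: str) -> None:
        """Queue a message and transmit optimistically, without asking for credits."""
        self.unacked.append((self.next_seq, data))
        self.next_seq += 1
        self.retry()

    def retry(self) -> None:
        """(Re)transmit from the oldest unacknowledged message; stop at the first NACK."""
        while self.unacked:
            seq, data = self.unacked[0]
            if not self.receiver.deliver(seq, data):
                return  # receiver is full; later messages would be out of order
            self.unacked.popleft()


receiver = Receiver(slots=2)
sender = Sender(receiver)
for word in ["a", "b", "c", "d"]:
    sender.send(word)            # "c" and "d" are rejected while the queue is full

pending_before = len(sender.unacked)
first = receiver.drain(2)        # the application consumes two messages
sender.retry()                   # retransmission now succeeds
second = receiver.drain(2)
```

The trade-off in this model is that the common case (buffers available) costs no extra handshake, while the uncommon case pays for a retransmission.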
Publications
- PDPTA Paper Craig Ulmer and Sudhakar Yalamanchili, "An Extensible Message Layer for High-Performance Clusters". Parallel and Distributed Processing Techniques and Applications.
After passing the preliminary exams to enter the Ph.D. program at Georgia Tech, the next major step is passing the qualifying exam. The purpose of the qualifying exam is to show that a student can walk into a new topic, learn everything they can about it, write a reasonable summary of the current state of the art in that topic, and present the material to a committee, all within one month's time. I picked a committee of people that I knew were tough but fair and hoped for the best on topics. I was relieved when they asked me to focus my attention on using FPGAs to enable a field of computing called configurable computing.
I spent a considerable amount of time that month downloading papers and looking up conference proceedings in the library. While I'd worked with FPGAs some during my master's degree, I didn't know the ideas went all the way back to the 1960s (I actually found Gerald Estrin's fixed+variable paper in print at the library). I learned a good bit about the commercial chip families and marveled at some of the ideas people were talking about with custom computing machines. There's nothing better than having an excuse to set aside time to learn all you can about an interesting subject. It took a bit of work to get a handle on how to summarize the material I'd covered and put it into a presentable form that had a hard page limit.
My review committee included three computer architecture professors and two network professors who had a background in hardware. Of the five, Vijay was the only professor who worried me. I'd taken a rapid prototyping class with him earlier and had seen him rip into a few students when we had to take turns presenting other people's research papers. Fortunately, I realized before it was my turn to do a class presentation that he was only vicious because he wanted to get to the truth of an idea. When he raised issues during my class presentation, I defended the topic and didn't back down, which I think he respected. However, I wasn't sure how well this confidence would carry me in my own exam.
In the end, things went pretty well. Dave probed me on some parallelism questions, which ironically Vijay answered for me, thinking Dave simply didn't understand the material. They sifted through the cookies I'd brought, asked me to leave for the closed discussion, invited me back in, and then told me I'd passed, just like that. It was a huge relief and a boost to my confidence.
Publications
- Qualifying Exam Craig Ulmer, "Configurable Computing: Practical Use of Field Programmable Gate Arrays".
Presentation