Craig Ulmer
SmartNICs Project Final Report
In December we finished our three-year, ASCR-funded "Offloading Data Management Services to SmartNICs" project. One of our deliverables was a final report that consolidates what we learned in one place. This 144-page (!) report includes sections from our proposal and previous papers, and examines the use of SmartNICs from multiple perspectives. There are three new topics in this report that we haven't covered before:
Abstract
Modern workflows for high-performance computing (HPC) platforms rely on data management and storage services (DMSSes) to migrate data between simulations, analysis tools, and storage systems. While DMSSes help researchers assemble complex pipelines from disjoint tools, they currently consume resources that ultimately increase the workflow's overall node count. In FY21-23 the DOE ASCR project "Offloading Data Management Services to SmartNICs" explored a new architectural option for addressing this problem: hosting services in programmable network interface cards (SmartNICs). This report summarizes our work in characterizing the NVIDIA BlueField-2 SmartNIC and defining a general environment for hosting services in compute-node SmartNICs that leverages Apache Arrow for data processing and Sandia's Faodel for communication. We discuss five different aspects of SmartNIC use. Performance experiments with Sandia's Glinda cluster indicate that while SmartNIC processors are an order of magnitude slower than servers, they offer an economical and power efficient alternative for hosting services.
Publication
Presentations