We published this unclassified unlimited release (UUR) paper summarizing our experiences offloading floating-point operations to FPGAs on systems like the Cray XD1.
Abstract
Reconfigurable computing leveraging field programmable gate arrays (FPGAs) is one of many accelerator technologies that are being investigated for application to high performance computing (HPC). Like most accelerators, FPGAs are very efficient at both dense matrix multiplication and FFT computations, but two important aspects of how to deliver that performance to applications have received too little attention. First, the standard API for important compute kernels hides parallelism from the system. Second, the issue of system architecture is virtually never addressed. This paper explores both issues and their implications for applications. We find that high bandwidth, low latency connectivity can be important, but the right API can be even more important.
Publications
- SC06 Paper: Keith D. Underwood, K. Scott Hemmert, and Craig D. Ulmer, "Architectures and APIs: Assessing Requirements for Delivering FPGA Performance to Applications", SuperComputing 2006.
We published an unclassified unlimited release (UUR) technical report summarizing our LDRD work investigating how FPGAs could be leveraged as computational accelerators in HPC platforms.
Abstract
Field programmable gate arrays (FPGAs) have been used as alternative computational devices for over a decade; however, they have not been used for traditional scientific computing due to their perceived lack of floating-point performance. In recent years, there has been a surge of interest in alternatives to traditional microprocessors for high performance computing. Sandia National Labs began two projects to determine whether FPGAs would be a suitable alternative to microprocessors for high performance scientific computing and, if so, how they should be integrated into the system. We present results that indicate that FPGAs could have a significant impact on future systems. FPGAs have the potential to have order of magnitude levels of performance wins on several key algorithms; however, there are serious questions as to whether the system integration challenge can be met. Furthermore, there remain challenges in FPGA programming and system level reliability when using FPGA devices.
Publications
- LDRD Report: K. Scott Hemmert, Keith D. Underwood, Craig D. Ulmer, and David C. Thompson, "FPGAs in High Performance Computing: Results from Two LDRD Projects", Sandia Technical Report SAND2006-6888.
We published an unclassified unlimited release (UUR) paper on reusing floating-point units in an FPGA implementation of a ray-triangle intersection algorithm.
Publications
- ERSA Paper: Craig Ulmer and Adrian Javelo, "Floating-Point Unit Reuse in an FPGA Implementation of a Ray-Triangle Intersection Algorithm", Engineering of Reconfigurable Systems and Algorithms, June 2006.
2005-09-29 Thu
fpga net pub bestof
We had a paper reviewed for unclassified unlimited release (UUR) that covered a TCP/IP offload engine and an OpenGL primitive serializer that I built in FPGA hardware.
Publications
Even though I didn't find a place to publish the paper, I did get it reviewed and approved for external release. Here's a copy of the draft as it stood in 2005.
- VizNic Draft: Craig Ulmer and David Thompson, "A Network Interface for Enabling Visualization with FPGAs", draft.
2005-05-16 Mon
fpga pub hpc
We published an unclassified unlimited release (UUR) paper about our work with the Cray XD1.
Publications
- CUG Paper: Craig Ulmer, Ryan Hilles, and David Thompson, "Reconfigurable Computing Aspects of the Cray XD1", Cray User Group (CUG) 2005.
Presentations