Service Node Proxies

2009-05-06 Wed
hpc io pub

We published an unclassified unlimited release (UUR) paper.

Abstract

Partitioning massively parallel supercomputers into service nodes running a full-fledged OS and compute nodes running a lightweight kernel has many well-known advantages but renders it difficult to access externally located resources such as high-performance databases that may only communicate via TCP. We describe an implementation of a proxy service that allows service nodes to act as a relay for SQL requests issued by processes running on the compute nodes. This implementation allows us to move toward using HPC systems for scalable informatics on large data sets that simply cannot be processed on smaller machines.
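The relay idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the wire format (JSON), the function names `handle_request` and `compute_node_query`, and the use of an in-memory SQLite database in place of an external TCP-only database are all assumptions made for illustration. The `send` callable stands in for whatever transport the compute nodes actually use to reach a service node.

```python
import json
import sqlite3


def handle_request(db, raw_request: bytes) -> bytes:
    """Service-node side (hypothetical): decode one request, run it, encode the reply."""
    request = json.loads(raw_request.decode("utf-8"))
    try:
        cursor = db.execute(request["sql"], request.get("params", []))
        rows = cursor.fetchall()
        db.commit()
        reply = {"ok": True, "rows": rows}
    except sqlite3.Error as exc:
        # Report database errors back to the caller instead of crashing the proxy.
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode("utf-8")


def compute_node_query(send, sql, params=()):
    """Compute-node side (hypothetical): package SQL, hand it to the relay, unpack rows."""
    raw = json.dumps({"sql": sql, "params": list(params)}).encode("utf-8")
    reply = json.loads(send(raw).decode("utf-8"))
    if not reply["ok"]:
        raise RuntimeError(reply["error"])
    return reply["rows"]


if __name__ == "__main__":
    # Stand-in for the external database that only the service node can reach.
    database = sqlite3.connect(":memory:")

    # Stand-in for the compute-node-to-service-node transport.
    def send(raw):
        return handle_request(database, raw)

    compute_node_query(send, "CREATE TABLE samples (value INTEGER)")
    compute_node_query(send, "INSERT INTO samples VALUES (?), (?)", (1, 2))
    print(compute_node_query(send, "SELECT value FROM samples ORDER BY value"))
```

In this sketch the compute-node code never opens a database connection itself; it only needs a way to pass bytes to a service node and receive bytes back, which is the property the abstract highlights.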

Publications

Presentations

Feature Characterization Library (FCLib)

2008-11-08 Sat
mesh code pub

We published an unclassified unlimited release (UUR) technical report and received permission to release the software.

Publications

Code

Threading Opportunities in Flash-Memory

2008-09-24 Wed
io pub

We presented this unclassified unlimited release (UUR) poster/paper.

Publications

Presentations

High-Performance Data-Intensive Computing

2008-08-14 Thu
io pub

We published this unclassified unlimited release (UUR) article.

Abstract

Data-intensive problems challenge conventional computing architectures with demanding CPU, memory, and I/O requirements. Experiments with three benchmarks suggest that emerging hardware technologies can significantly boost performance of a wide range of applications by increasing compute cycles and bandwidth and reducing latency.

Publication

Leveraging FPGAs: Architectures and APIs

2006-11-16 Thu
fpga pub

We published this unclassified unlimited release (UUR) paper, summarizing our experience offloading floating-point operations to FPGAs on systems like the Cray XD1.

Abstract

Reconfigurable computing leveraging field-programmable gate arrays (FPGAs) is one of many accelerator technologies that are being investigated for application to high-performance computing (HPC). Like most accelerators, FPGAs are very efficient at both dense matrix multiplication and FFT computations, but two important aspects of how to deliver that performance to applications have received too little attention. First, the standard API for important compute kernels hides parallelism from the system. Second, the issue of system architecture is virtually never addressed. This paper explores both issues and their implications for applications. We find that high-bandwidth, low-latency connectivity can be important, but the right API can be even more important.

Publications