One thing FAODEL has been missing is a tool for managing resources and launching services. After the EMPIRE release, we addressed this by building a new command-line tool named faodel. It can start and stop services, set and remove DirMan resource info, and put and get Kelpie objects in resource pools. We've received approval from DOE to release this work as version 1.1906.1 (Excelsior!) at https://github.com/faodel/faodel. Here's the changelog:
Release Improvements
- New faodel-cli tool for manipulating many things
  - Gets build/configure info (replaces faodel-info)
  - Start/stop services (dirman, kelpie)
  - Define/query/remove dirman resources
  - Put/get/list kelpie objects
  - New example/kelpie-cli script shows how to use
- Support for ARM platform
- NNTI adds On-Demand Paging capability
- NNTI adds Cereal as alternative for serialization
- NNTI has better detection and selection of IB devices
- Fixes
  - SBL could segfault due to Boost if exit without calling finish
  - FAODEL couldn't be included in a larger project's cmake
  - LDO had a race condition in destructor
Significant User-Visible Changes:
- faodel-info and whookie tools replaced by faodel cli tool
- Dirman's DirInfo "children" renamed to "members"
- Faodel now has a package in the Spack develop branch
Known Issues
- FAODEL's libfabric transport is still experimental. It does not fully implement Atomics or Long Sends. While Kelpie does not require these operations, other OpBox-based applications may break without this support.
- On Cray machines with the Aries interconnect, FAODEL can be overwhelmed by a sustained stream of sends larger than the MTU. To avoid this problem, the sender should limit itself to bursts of 32 long sends at a time.
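The Aries workaround above amounts to sender-side throttling. Here is a minimal sketch of one way an application might structure it, assuming every message in the batch is a long send (larger than the MTU). The `Message` type, `SendInBursts`, and the `send_one` and `wait_for_burst` callbacks are hypothetical placeholders supplied by the caller; they are not FAODEL APIs, and how a sender waits for a burst to drain depends on the application.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical message type; stands in for whatever the application sends.
struct Message {
  std::vector<char> payload;
};

// Issues `messages` in bursts of at most `burst_size` sends. After each burst,
// `wait_for_burst` is called with the number of sends just issued so the caller
// can block until they complete before more are posted.
// `send_one` and `wait_for_burst` are caller-supplied placeholders (assumptions),
// not part of FAODEL. Returns the number of bursts issued.
std::size_t SendInBursts(const std::vector<Message>& messages,
                         const std::function<void(const Message&)>& send_one,
                         const std::function<void(std::size_t)>& wait_for_burst,
                         std::size_t burst_size = 32) {
  assert(burst_size > 0);
  std::size_t bursts = 0;
  for (std::size_t start = 0; start < messages.size(); start += burst_size) {
    const std::size_t end = std::min(start + burst_size, messages.size());
    for (std::size_t i = start; i < end; ++i) {
      send_one(messages[i]);
    }
    wait_for_burst(end - start);  // let this burst drain before continuing
    ++bursts;
  }
  return bursts;
}
```

The default burst size of 32 comes from the limit stated above. Taking the send and wait steps as callbacks keeps the sketch independent of any particular transport.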