Craig Ulmer

Data Warehouse Becomes FAODEL

2018-01-12 faodel

It's official: I'm renaming my main project at work to FAODEL: Flexible, asynchronous, data-object exchange libraries. FAODEL (pronounced fā-ō-del) comes from a simplification of the Gaelic term faodhail, which is a land bridge used to cross between islands. Here are two examples between the Monach Islands in Scotland:


Nessie, Kelpie, and Scottish Names

My main project at work for the last few years has been writing data management services for HPC applications. Sandia's I/O group previously built an RDMA portability layer called Nessie to support the Lightweight File System (LFS). I initially built a key/value store on top of Nessie. I wanted to keep the Scottish monster theme going, so I decided to call it Kelpie (a kelpie is a sea horse in Scottish mythology that drags people to a watery death). As Kelpie evolved we started adding more packages with Scottish/Gaelic terms. We named our memory manager Lunasa (a Gaelic harvest festival) and our boot services became Gutties (a cheap gym shoe in Scotland). It didn't take long for us to realize that there were a lot of issues with using Scottish/Gaelic terms to name things. First, the words are often difficult to spell and pronounce. Second, we've had trouble finding other Scottish mythical beasts we could swipe. And finally, it seems like every Scottish word has a slang meaning that would make us hesitant to use it at a conference. As such, when we talk about our different software packages, we've been referring to them by our project name, which is "Data Warehouse".

ATDM "Data Warehouse" Origins and Problems

Three years ago the labs realized that they needed to do something different if they wanted to be able to have their codes scale up to exascale computing platforms. The ATDM project was formed to develop new software infrastructure that would allow new codes to achieve better performance than MPI-based approaches. The main idea was to use overdecomposition and task-dag programming models (aka, "asynchronous many-task" or AMT) to overcome dynamic load balancing problems while improving developer productivity. Existing frameworks (e.g., Charm++, Legion) didn't fully meet our requirements, so the DARMA team set about building a new AMT API that leveraged modern C++ features and could be retargeted to run on top of different runtimes (eg, DARMA on Charm++). From an I/O perspective applications needed a way to allow dynamically-placed tasks to exchange data with existing mesh databases and storage tools (all of which are built on static distributions). Our project was started to serve as a way to manage AMT's data in this context. Given that other AMT's used the term "Data Warehouse" for their storage, we became ATDM's "Data Warehouse".

The problem with the term Data Warehouse is that it has a specific meaning for I/O people. In the 1970's Bill Inmon started using Data Warehouse to refer to the idea of centrally storing/indexing all of an organizations data, instead of spreading it out among many smaller databases. Inmon has written articles that point out how NoSQL people have hijacked his term (which I agree with), so I've always cringed at having to refer to ourselves as ATDM's DW. It's difficult to change a funded program's name though, once it's on the books.

Faodail and Faodhail

Recently, we've been reorganizing our code so that we can go through the official open-source release process. We've generalized our scope so we can serve more than just DARMA, so I decided it was a good time to revisit project names. I found the name "Faodail", which people say is a "lucky find, usually of a lost item". That seemed like a good fit for a key/blob service, so we started using it (it even made its way into a paper an intern wrote). There were a few problems with faodail though: (1) we found it was difficult to pronounce, (2) the internet already had some references to it, (3) taking the "aod" out of "faodail" makes it fail, and (4) google translates faodail to "maiden" (??). It seemed like a bad idea.

I went back to the naming game and searched some more. After a lot of misses (and general disgust from my team), I noticed in WikiSource that the term right after faodail is "faodhail". The definition for faodhail is:

faodhail, ford, a narrow channel fordable at low water, a hollow in the sand
          retaining tide water: from N. vaðill, a shallow, a place where straits
          can be crossed, Shet vaadle, Eng. wade.

Looking around more, I found maps of different faodhails around Scotland. These are regions where the tides go out and leave a land bridge that lets you cross between the islands. This meaning is perfect for what our software does: we build communication libraries that let you move data between different application islands.


Choosing FAODEL

Faodhail was longer than Faodail and had the same problems. I finally realized that what I needed was to ditch the actual name and just make an acronym that fixed the problems. Changing the name that was shorter and more phonetic helped me a good bit. I finally worked out some words that fit: flexible, asynchronous, object data-exchange libraries (FAODEL). It's not great, but the words do relate to what we're doing. Once I convinced myself it was the right thing, it was easy to tell the team what I wanted and have the confidence to make it stick.