2025-02-16 Sun
ai nlp

I hit a weird problem with Google's Gemini 2.0 this morning: I asked it to transcribe some images of handwritten notes into a CSV file, and it swapped some of the lines. As someone who does a lot of data engineering, I find this especially dangerous because the output was convincing enough that I almost didn't catch the errors in the actual numbers. This makes me wonder how much bad data is going to flow into important records in the near future.

Transcribing Power Bills

This morning my wife and I were talking about how our power bills have gone up a bit in the last few years. When I asked if we had this in a spreadsheet somewhere, she said no, but pointed me at 5 pages of handwritten notes she had made over the years that concisely summarized all the data I'd need to make some plots. It occurred to me that this was exactly the kind of thing that AI should be able to do, so I pulled up Gemini, typed the prompt below, and proceeded to take pictures of the notes.

I would like to convert 5 pictures of notes about my pg&e
gas bills into a csv file with the following columns: date,
dollar amount, electric amount, gas amount, PG&E code, Human
notes, and conversion notes. The human notes should be any
comments that were written in the entry. The conversion
notes should be any problems you came across while translating
a particular entry.

I then sent a picture of each page, with a statement about how this was the nth page. The response from the first picture was very reassuring: it printed out a CSV file that looked like everything was in the right place. The next few pictures appended more data to the CSV. Gemini said something went wrong on the last two pictures, so I switched to my Chromebook, started a new conversation, and repeated the process until I got to the end. It even gave me a link for downloading the CSV file so I didn't have to copy and paste it. I moved over to Google Sheets, imported the CSV, and started asking Gemini to remind me how to do a pivot in Sheets so I could look at the years in a monthly timeline.

Transcription Problem

The Sheets spreadsheet plots had some issues because of missing cell entries, so I went back and started manually inserting values. This is fine, as I'm happy to use any tool that automates 90% of the brute force work. However, as I went through the table, I noticed two problems. First, the gas numbers had some bad values because of my wife's handwriting (eg "g .55" became 8.55). This Brazil-like mistake is still fine, as the errors would be caught with some range checking. The bigger problem was that Gemini swapped a few of the data values in the lines:

A few mistakes

This problem is significant because it's hard to detect. It's also very puzzling because the image has a lot of guides to help keep things aligned: the original text is structured, each line has blue lines to guide the reader, and there isn't anything obvious to me that would trip it up. In fact, if I send just that one page, it transcribes the data just fine. That makes me wonder if it has something to do with appending knowledge from one prompt to the next.

Dealing with Djinnis

This kind of problem is what leads to the endless prompt engineering cycle that makes AI so difficult to use these days for tasks that require accuracy. Should I just send 5 separate tasks to transcribe the images and merge the results myself? Should I tell it valid ranges for the fields? Should I take a video and scan all the pages in one shot?
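Whatever the prompt ends up being, the transcribed CSV can at least be sanity checked afterwards. Here is a minimal Python sketch of that idea. The column names come from my prompt, but the ranges in `RANGES` are made-up placeholders, and the cross-check assumes the dollar amount is simply the electric amount plus the gas amount, which may not hold for every bill. A range check catches mistakes like the 8.55 one, while the sum check has a chance of flagging values that were swapped between lines.

```python
import csv
import io

# Hypothetical plausible ranges (in dollars); tune these to your own bills.
RANGES = {
    "dollar amount": (10.0, 1000.0),
    "electric amount": (0.0, 800.0),
    "gas amount": (0.0, 400.0),
}


def check_rows(csv_text, tolerance=0.01):
    """Return a list of (line_number, problem) tuples for suspicious rows."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        values = {}
        for field, (low, high) in RANGES.items():
            try:
                values[field] = float(row[field])
            except (KeyError, TypeError, ValueError):
                problems.append((line_no, f"{field}: missing or not a number"))
                continue
            if not low <= values[field] <= high:
                problems.append((line_no, f"{field}: {values[field]} out of range"))
        # Assumption: the total is just the electric and gas charges added up.
        if len(values) == len(RANGES):
            expected = values["electric amount"] + values["gas amount"]
            if abs(expected - values["dollar amount"]) > tolerance:
                problems.append((line_no, "total does not match electric + gas"))
    return problems
```

Calling `check_rows()` on the text of the downloaded CSV returns an empty list when nothing looks off. It won't catch every swap (two lines with swapped but self-consistent values would pass), so it's a safety net rather than a proof.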

Prompt engineering feels like you're dealing with a Djinni that's granted you three wishes. There's always a loophole that wrecks your commands, so you spend more and more time perfecting additional clauses to your prompt to try to convince it to do the thing you want. In the end I'm not sure you can ever be certain that it's not stabbing you in the back in some unexpected way.

New Org Blog Implementation

2024-12-14 Sat

If everything is working right, you should be seeing this website rendered with a new backend that I've had on the back burner for a few years now. It's still missing some features, but I decided to switch over to it so I can move on to other projects. It's written in C++ this time, but still uses Emacs Org mode files for the content.


Over the last 20+ years or so I've implemented my website in a few different ways. Back in the 1990s I wrote a few web pages in HTML to talk about the things I was doing in college. HTML was tedious, so I wrote some Perl scripts to generate static web pages from plain text files. Writing in plain text really helped speed up the writing, but after a few dozen posts it became a chore to regenerate and upload the content whenever there was a change. The next step was to port the Perl script over to a CGI script that could generate the pages on demand. While people scoff at it now, Perl was a pretty decent language for generating HTML because it was straightforward to hack together a bunch of regexes to convert chunks of text into HTML templates. It was hard to re-read the code and make sense of it, but since it was an interpreted language you could just iterate on it until things looked right.

The next big step for me was to reformat all the text content into an Emacs Org mode file. Org mode made it easy to put each post as its own section in a single-file outline. Merging everything into one file meant I could insert extra posts into the timeline without having to renumber everything. Plus, the Emacs spell checker has been great, when I remember to run it. I updated the Perl CGI script to read from the Org file and dump it to HTML. Looking back through my posts, I see I made that change 10 years ago.

Needing to Change

The Perl version of this website was ok, but a few things bugged me. First, the HTML looked a little old and didn't do the CSS magic to make the site readable on smaller devices. It was pretty awful on a cell phone because it would shrink a wide desktop display of content down to fit a narrow device. Also, Google said at some point that pages that were missing a proper viewport tag (like mine) would be pushed down in their search results. Second, I noticed that generating my webpages with Perl took longer than I wanted. It could have just been the cheap webhosting plan I'd been using, but I suspected that all those complicated regexes I'd written were adding up. Finally, I wanted to do more things with the website but I don't write anything in Perl anymore.

So why C++? Most of my coding these days is in C++ so I had a good idea of how to split the code up into useful classes and implement some of the harder things like string formatting. The main hardship of going with C++ is that you have to recompile the code every time you make a major change. It took a while before I found the right balance in the code. Initially I went all-in on jamming everything except the org-mode content into C++. I wasted a lot of time working out clever ways to have CMake generate custom header files with constants for the different sites I wanted to run (ie, devel at home and production on the site). I later came to my senses and switched to a system that has hooks to read in a template file at runtime, so I could change the look of the website without having to recompile everything.


I don't think the site looks too different, but there are some changes:

There's still a lot missing (eg index and tags pages), but it's good enough that I've switched things over. If you can't find a page you're looking for, flip around some and use the date to find the new post.

Printer Calibration for a Headrest Display

2024-07-07 Sun
3d print

I've had the Anycubic Kobra-2 printer for about a year now. It's been fun to download and print trinkets from Thingiverse (like Nicolas Cage's head), but I've been meaning to get back to printing things that are designed to connect with other things in the real world. For example, one of the neat things Ford has done is release CAD specs to allow you to design accessories that can be plugged into the FITS connectors of their cars. While there haven't been too many designs that use FITS (other than this awesome chicken nugget holder), I appreciate that Ford is trying. While looking around for things to print for my Ford Maverick, I found that someone had made a Nintendo Switch carrier that plugged into the FITS headrest adapter. It seemed like something the kids would like, so I downloaded the designs and started printing.

Rob Stu's Nintendo Switch holder for FITS

Calibration Problems

The first headrest adapter print came out fine, but when I went to try it out on my truck I found that the holes were a little small for my posts. Thinking it was just some residual plastic in the hole, I pushed one of the posts through its hole by twisting it through. Given that the twisting weakened the arm a little and there was no way to push the second post through, I decided something was wrong with the dimensions. I used calipers to measure and compare the print and headrest. In addition to the holes being small, I noticed that some features of the print were just a little smaller than round numbers (eg 49.3mm for the adapter block). I'm not great at Blender but some quick measurements on the original model confirmed the print was slightly smaller.

A 20mm block printed at 19.87mm (YYZ for Todd)

Looking around online confirmed that Kobra-2 prints were just slightly smaller than they're supposed to be. To make matters worse, the printer's menu doesn't have any settings to help you scale the steps per mm. I wound up downloading a simple calibration cube so I could manually figure out the right scaling value to use in Prusa. After several iterations, the scaling value I settled on was 100.7%. People say there are some adjustments you can have the slicer insert into the gcode stream, but for now I'm just going to plug this scaling value in on anything that needs to be sized accurately.
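The arithmetic behind the scaling value is simple enough to sketch in Python. This assumes the shrinkage is uniform in every direction, which I haven't verified, and the function name is just something I made up for illustration. Using the 20mm block that measured 19.87mm, a single measurement gives about 100.65%, which rounds to the 100.7% I settled on.

```python
def scale_percent(nominal_mm, measured_mm):
    """Percentage to scale a model by so the print matches its nominal size.

    Assumes the printer shrinks every dimension by the same ratio.
    """
    if measured_mm <= 0:
        raise ValueError("measured size must be positive")
    return 100.0 * nominal_mm / measured_mm


# The calibration block: designed at 20mm, measured at 19.87mm.
print(round(scale_percent(20.0, 19.87), 1))
```

In practice it's worth measuring a few different faces of the cube and averaging, since caliper readings on a small print can easily be off by a few hundredths of a millimeter.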

Making the Arms Weaker by Mistake

Since the original arms seemed to be a little thin, I loaded up the model in Blender and added a modified cube on each end to make a thicker connector. Printing went fine, but when I started pulling off the support structure I managed to rip apart the connector. I've had bad luck removing PLA supports, so I printed the whole headrest again (3 hours) and tried removing the supports more carefully. No luck: the arm broke in about the same spot. I took a look at it with magnifying glasses and noticed that the break was pretty clean, almost as if there was an empty spot in the arm. I went back to Prusa to look at the print and found that there was a hole exactly where I'd put the second block.

Started off good, but I wound up making it worse

I'm not sure how I got it wrong in Blender. My guess is that when I added the block I did a boolean difference instead of a union. I redid the blocks, verified the arms were solid in Prusa, and printed again.

Problems with Ledges

I thought I'd ironed out all the problems when I got to this point, but no: I started getting prints that had one or two major alignment problems (ie, the whole design shifted in the Y by a few millimeters). I don't care how things look, but the shifts were big enough that the Switch connector didn't fit all the way into the socket. I had two failed prints with a design that tried to minimize supports and then another that was vertical. I hadn't had problems like this before, so I suspected that some Z adjustments I'd made earlier were to blame. Resetting the Z height to the value where I'd started fixed things.

Everything shifted right twice


So there, after many missteps I successfully printed both the Switch holder and the headrest attachment. Was it worth it? Well, no, not really, if the end goal was just to print things from the web. The headrest part took at least 8 tries before I got it right (and at least 3 tries for the holder), each one taking a few hours to grind away. Prints are easy to set up, but that's a lot of wasted plastic.

Mistakes were made

However, I learned a lot from seeing this print all the way through. I now know: designs need to be scaled by 1.007x to be accurate; I should check the sliced layers to make sure there aren't unexpected holes in a design; each first-layer island increases the probability of an island popping off and causing a failure; and being too high in the Z dimension can lead to shifts that don't show up until later in the print.

Choosing a University

2024-05-15 Wed

For the last year or so a lot of our family's attention has been focused on helping our son figure out what college to attend. He ultimately picked a place where we think he'll be happy, but it was a lot of work sorting out all the details to make a decision. In this post I want to write down some thoughts on what we learned so we'll have a better starting point when we go through this process again with my other son in a few years.

Good Scores, but a Competitive System

My son worked hard in high school and did very well by our standards. He had nearly all A's, a respectable SAT, multiple APs, and a few varsity letters. We discouraged him from taking on more APs because we were worried about burnout and had doubts about the quality of some of the AP teachers in our high school (eg, students getting all A's in one class but then getting a 1 on the AP exam). Other students piled on more clubs and awards than our son, but I'm proud of his results and relieved he found a balance between work and happiness.

Our son didn't have a major picked out, but he was interested in a medium-sized school that was located somewhere interesting and had a path to a UC-level degree. California's school system is complicated, as it divides its schools into three tiers for different needs: the UCs serve "research-bound" students, the Cal State schools focus on "profession-oriented" students, and the Community Colleges serve students who want a stepping stone. While everyone wants to go to a UC (research or not), Livermore's Las Positas College was a distinct possibility because it's highly ranked and offers transfers to the UCs after two years.

Unfortunately, there are several things that make the UCs a mess right now. First, many of the UCs are overcrowded and lack the space, funds, or willpower to expand. Nearly all the schools are doubling up their dorm rooms. Some don't have dorm space for sophomores and are located in places that don't have any affordable housing options. Second, the UCs are currently test blind, which means that they don't look at your SATs during the selection process. While I believe this makes it slightly harder for wealthy families to game the system (I know someone who paid $7k to have their kids coached on beating the SAT), it also means that everyone is thrown into a big pile without a simple metric to help sort them out. Given the prevalence of ChatGPT use among my son's peers, I also wonder if it makes any sense to have the essays as part of the admission process. Finally, there are a massive number of students applying, with each school receiving between 80k and 100k applicants a year. The numbers are pushing acceptance rates down, which means you have to look at each school and gamble on which major will give you the best chance of getting accepted (eg, CS is extremely competitive, but some schools have data science or cognitive science majors that might be easier or better).


Our son applied to enough colleges that I had to make a spreadsheet to keep everything straight. We basically did three waves of applications: UCs, Cal States, and then out-of-state schools. We spent the most time on the UCs because they came first and had the earliest deadline (though it was easy to submit to multiple schools). The Cal States were much simpler and required almost no info (even Cal Poly, which is one of CA's most desired schools). The other schools were last. I had thought we'd apply to more, but my son was burnt out by this point and doubting whether he wanted to go to school at all. The shotgun approach gave mixed results. He received rejections from Davis, Irvine, Santa Barbara, and Cal Poly SLO. He got waitlisted at Riverside and San Diego, with the former going positive in the summer and the latter negative just before the Fall.

Here's a top-level summary of the important parts:

| School     | Cost/Year | Div | SAT Upper | Small Class | Student:Faculty | Male | Undergrads | Travel Time |
| Chico      | $24k      | 44% |      1186 |         37% |            20:1 |  46% |        13k | 3h          |
| Monterey   | $22k      | 64% |      1263 |         29% |            23:1 |  40% |         6k | 2h          |
| SJSU       | $27k      | 78% |      1370 |         21% |            24:1 |  52% |        27k | 1h          |
| San Marcos | $22k      | 71% |      1100 |         17% |            21:1 |  42% |        13k | 2h          |
| UCSC       | $42k      | 61% |       N/A |         28% |            23:1 |  50% |        17k | 2h          |
| Riverside  | $42k      | 88% |       N/A |         22% |            24:1 |  48% |        23k | 6h          |
| RIT        | $47k      | 31% |      1450 |         48% |            13:1 |  66% |        14k | 8h          |
| St Mary's  | $44k      | 57% |      1260 |         68% |             8:1 |  44% |         2k | 1h          |
| U. Denver  | $43k      | 26% |      1400 |         56% |             8:1 |  44% |         6k | 3h          |
| U. Oregon  | $54k      | 34% |      1370 |         38% |            19:1 |  45% |        19k | 3h          |
| OSU        | $44k      | 31% |      1380 |         28% |            19:1 |  51% |        29k | 2h          |

Some comments:

Scholarships and Deals

The list price for many of the out-of-state and private schools really made us question whether they would be worth it. However, most of these schools sent scholarships or deals in their acceptance letters that helped drive the cost down. As summarized below, everyone but U of Oregon had a deal that dropped the price down to something that would be competitive with UC costs. These numbers are consistent with what other people have told us (eg, U of Oregon rarely gives a good-enough discount because they know they're a top choice for Californians).

| School    | List | Discount | Total |
| U Denver  | $79k | $36k     | $43k  |
| St Mary's | $78k | $34k     | $44k  |
| RIT       | $76k | $29k     | $47k  |
| OSU       | $59k | $15k     | $44k  |
| U Oregon  | $66k | $12k     | $54k  |

One special call out here: the Western Undergraduate Exchange (WUE) is a great way for students in western states to save money on some schools in neighboring states. It's a reciprocal program that allows out-of-state students with good grades to get cheaper rates (eg, 1.5x the in-state rate). Oregon State University offered this to our son, but other schools like U of Utah, U of New Mexico, Boise State, U of Hawaii, U Colorado Denver, etc. also participate.

In theory, there was an option for us to get in-state tuition in some schools in Texas via my employer. However, the only schools that were appealing were UT Austin and maybe TAMU, and admissions were extremely competitive for out-of-state students (they only allow 10% out-of-state, so the scores are very high). My wife doesn't want to have anything to do with Texas, so it was an easy one to write off.

Watch out for Prison Food

On every tour we took there was some talk of how great the food was. It sounded frivolous to us, but it turns out that food quality can be a serious problem in schools. A friend's son reported that the food at RIT was really bad and that he had gotten sick from it enough times he had to work out alternative options for the meal plan required by his dorm (seems common). People are reporting that universities are looking for cheaper food options, which means they sometimes work with companies that also serve prisons. Interestingly, we read that UCSC students protested bad food service from Sodexo in 2004 and got the school to replace it with UCSC Dining.

Tours and Advisement

We visited a lot of different types of campuses over the last few years. Too many, actually. We saw about 10 schools in CA, 3 in Boston, 2 in Oregon, and 2 in Georgia. While many of the visits were more of an excuse to wander around somewhere while we were on vacation, the trips wore all of us out. Most campus tours are a complete waste of time. They may tell you what each of the buildings is for, but they rarely tell you why you should be interested in the major.

The University of Oregon really stood out in terms of selling the school. The visitor center had sample dorm rooms you could explore without having to crawl over other parents in a campus tour. They also had a small theater with a tile display for presenting an overview of the campus. Before the session started, they showed nature videos from around the state and listed the drive time to get to each place. It was clever of them to sell Oregon to the (mostly Californian) parents- I can see how it would be fun to come visit and go explore the rest of the state. It made UC Davis's video "In the middle of.. Everywhere!" seem extremely flimsy.

Oregon State University's orientation also impressed me. One thing they did was help us schedule a one-on-one session with an admissions advisor to talk about majors and the exploratory studies program. The advisor talked about how they had a path for students to try out different majors before committing to one. Interestingly, exploratory students at OSU often wind up graduating on time and before other students because the courses count towards a degree and students often lock in better when they're presented with options and are given a chance to choose on their own. I was impressed that a large school would have people available to meet with prospective students. At the end of the session, I wanted to go there myself.

Don't Believe the Rankings

The last comment I'd like to make about our college search is that there's a lot of phoniness going on in all directions. The same way that students game the admission process, schools game the college ranking systems. For example, UChicago is famous for blasting out mailings to every student it can find (our son got his first flier in 9th grade). It does this so more people will apply. With a fixed number of slots, this means that the school appears more selective, which in turn boosts the school in some of the college ranking sites. We noticed that other schools like Northeastern seem to be following the same plan (see this article). The college ranking systems themselves have their own biases. US News lists 19 factors they use in their rankings, each with its own weighting factor. Some of these factors are really useful for some people, but not so important for our situation (eg, first-generation graduation rate).

The Fiske guide (above) did turn out to be a good reference for sorting schools out. It was easy to get overwhelmed by the number of schools that are listed in it, but their star ranking system seemed to match my expectations for schools I knew about. One feature that was really useful for us was the "overlaps" box, which listed schools that were similar to the one you looked up. This info helped us draw comparisons between schools (eg, WPI is like Rose-Hulman but near Boston).

In any case, I think our son is happy with where he decided, and we're relieved to be done with the whole process. Now all that needs to be done is for him to do the hard work and for us to cough up the money.

Anycubic Kobra-2 FDM Printer

2023-06-18 Sun
3d print

A few years ago I bought a 3D resin printer so the kids and I could learn a little bit more about modeling and fabricating 3D objects. While it's been a great experience, we haven't printed much in the last year because of all the headaches of dealing with resin. Every time we do a print we have to deal with temperatures, level the plate, put on all the safety gear, and then clean up everything at the end. It's a lot of overhead and dangerous enough that I don't want my kids doing it when I'm not home. I've been thinking it would be nice to have a traditional FDM printer on hand to lower the barrier for printing simple things so that printing will be more accessible to my kids. After a lot of internet wandering, I decided to get the new Anycubic Kobra-2. It's new, works with Linux, and shipped from Amazon with a 1kg spool of filament for $300.


The kids and I set up the Kobra-2 on my desk in the garage. The assembly wasn't too difficult, although it took us a while to figure out how to hold the frame so we could get some of the machined screws lined up properly. It was also a little unclear how the feeder tube was supposed to go in the header (does this go any farther in?). Once it was set up we ran the auto calibration tool to probe the height of the build plate. Auto calibration was a required feature for me, and one of the reasons why I'm happy to be buying a printer after the technology has had a chance to mature. We then preheated the filament and had it print the famous 3DBenchy boat design. The kids and I watched with wonder as the extruder spun around the plate with robot brrrrr noises. FDM printing is so much more exciting to watch than resin because you really see it happen. With resin the plate moves up and down every few seconds, with an upside-down design that's coated in excess resin. While you add a whole layer at a time, it takes a long time to get through all the pads and supports before you get to your actual design.

Sample Prints

3D Benchy only took 30 minutes to print out. One of the other selling points of this printer is that it can do higher speed prints (150mm/s to 250mm/s, compared to the 60mm/s of the stock Ender printers). I was really tempted to get one of the $600 Bambu printers, which can do up to 500mm/s, but decided we should start with a basic printer and see how much we like it first. Benchy came out looking pretty good, though you can see some pixelation in the windows that I don't think you'd have in resin. That's fine though- I think I'm more interested in building functional widgets with this printer than detailed figures.

The next thing we printed was a small mesh cup I pulled from Thingiverse. This design came as a plain STL object, so I had to load it into a slicer to render it to gcode. Anycubic says to use PrusaSlicer, which is a powerful slicer built for Prusa printers. It's free and has a Linux version that worked on my Chromebook's Linux container. I had to download the settings from the Anycubic support site, but they came up fine. For this design I just loaded the cup, hit slice, and saved the gcode. Prusa had a lot of detailed info about how it built the object. I liked that it recognized the interior and autofilled it with a grid to save on material. The scaled down version of the print took about an hour to build (correctly predicted by Prusa). I was impressed that the printer was able to build a thin mesh and have it come out ok (though later I broke it trying to trim some of the base).

Next up was a micro-sd card holder. I found a clever design someone had made that had a radial container with a screw-on lid. The threading is really interesting to me because it gives you a way to connect parts together (someone also modified the design so you could screw together multiple micro-sd containers, though I doubt I'll ever fill this one). The parts I printed screwed together just fine. Two of the slots weren't deep enough, but that's ok. I should have added an up label though, as the slots don't have enough friction to keep cards in place if you open it upside down.

Finally, I printed a baby guardian dragon dice holder from Thingiverse for my niece. This design has a spot for you to put a die. It's a cute design, though the FDM version resulted in a bunch of lines on the angled surfaces.


We have had a few issues with the Kobra-2 during our first week of use. My son had a few failed prints that we're trying to figure out. The printer would get partway through the base of the design, get stuck, and then go into an endless calibration loop. It's possible this is because we installed a newer version of the slicer than we were previously using. When I went back and sliced the design with my Chromebook it printed fine. Again, it's nice that the setup/cleanup for a print is so easy. The other main issue has been quality. The FDM prints look good, but they're not as detailed as the resin prints. Below are some zoom-ins that show how this results in the FDM prints coming out jagged in certain spots.


One thing I've noticed about the FDM printer is that the motors really take a beating, zigzagging back and forth all the time. Our house doesn't have great wiring, so the lights in the garage (and bathroom) flicker slightly when the printer is bouncing. Also, there's a spike in power when you start up because it needs to warm up the build plate and nozzle. Maybe I'll look into getting a battery or power conditioner for the plug to smooth out the signal.


Overall, I'm pretty happy with the Kobra-2 so far. After dealing with all the resin printing pains it's been a breeze to get FDM working. I don't think we'll print a ton of things, but it's nice to have the option to design and build stuff when we want.