“Tripping over the Potholes in Too Many Libraries”, 2020-08-09:
While on my most recent break from writing, I pondered a bunch of things that keep seeming to come up in issues of reliability or maintainability in software. At least one of them is probably not going to make me many friends based on the reactions I’ve had to the concept in its larval form. Still, I think it needs to be explored.
In short, I think it’s become entirely too easy for people using certain programming languages to pull in libraries from the wide world of clowns that is the Internet. Their ecosystems make it very, very easy to become reliant on this stuff. Trouble is, those libraries are frequently shit. If one of them is broken, you might not be able to code around it, and may have to actually deal with the maintainers to get it fixed. Repeat 100 times, and now you have a real problem brewing.
…When you ran it [the buggy library], it just opened the file and did a write. If you ran it a bunch of times in parallel, they’d all stomp all over each other, and unsurprisingly, the results sometimes yielded a config file that was not entirely parseable. It could have used
flock() or something like that. It didn’t. It could have written the result to a mktemp()-type temporary file and then used rename() to atomically drop it into place. It didn’t. Expecting that, I got a copy of their source and went looking for the spot which was missing the file-writing paranoia stuff. I couldn’t find it. All I found was some reference to this library that did config file reading and writing, and a couple of calls into it. The actual file I/O was hidden away in that other library which lived somewhere on the Internet…

…The only way to fix it would be in this third-party library. That would mean either forking it and maintaining it from there, or working with the upstream and hoping they’d take me seriously and accept it.

…It seems to boil down to this: people rely on libraries. They turn out to be mostly crap. The more you introduce, the more likely it is that you will get something really bad in there. So, it seems like the rational approach would be to be very selective about these things, and not grab too many, if at all. But, if you work backwards, you can see that making it very easy to add some random library means that it’s much more likely that someone will. Think of it as an “attractive nuisance”. That turns the crank and the next thing you know, you have breathtaking dependency trees chock-full of dumb little foibles and lacking best practices.
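For concreteness, here is roughly what the missing write-paranoia looks like. This is a minimal sketch, assuming Python on a POSIX filesystem; the function name atomic_write is mine, not anything from the tool or its library. The point is the shape: write to a mkstemp()-style temporary file in the same directory, force it to disk, then rename() it over the old file, so a reader sees either the old config or the new one and never a half-written mess.

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` without ever exposing a torn write."""
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the *same* directory as the target, so the
    # final rename stays within one filesystem and remains atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".cfg-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, path)  # the atomic rename() step
    except BaseException:
        os.unlink(tmp_path)  # don't leave temp droppings behind on failure
        raise
```

Parallel writers can still race over whose version wins, but each one now lands whole, so the "config file that was not entirely parseable" failure mode goes away.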
Now we have this conundrum. That one library lowered the barrier to entry for someone to write that tool. True. Can’t deny that. It let someone ship something that sometimes works. Also true. But it gave them a false sense of completion and safety, when the result is neither done nor safe. The tool will fail eventually given enough use, and (at least until they added the “ignore the failed read” thing) will latch itself into a broken state and won’t ever work again without manual intervention.
Ask yourself: is that really a good thing? Do you want people being able to ship code like that without understanding the finer points of what’s going on? Yeah, we obviously have to make the point that the systems should not be so damned complicated underneath, and having to worry about atomic writes and locking is annoying as hell, but it’s what exists. If you’re going to use the filesystem directly, you have to solve for it. It’s part of the baggage which comes with the world of POSIX-ish filesystems.