A Children's Treasury of Critiques and Concerns About the Current LLM Hype Cycle

We’re a couple of years into the LLM hype cycle and the volume at this point is deafening. I use LLMs, and I find them interesting, and they have played a major part in my professional life for the past two and a half years. That said, I can’t declare them an unqualified success or a miracle cure for anything. I approach them with a combination of skepticism and grief. I understand they are here to stay and I will do my best to use them effectively in the use cases where they are indeed effective.

Classicube at Work

Problem Space: At my new job, my new team has a team activity: every other week we get together for 30 minutes to play a game online. We’re mixed hybrid, meaning about half of us work in the office and half of us are grandfathered/exceptioned in to working remotely. This means we can’t play a board or card game in person; whatever we play has to be playable over the internet in a Zoom call.

Trying Out PaperWM

I’m an enthusiast of playing around with different window management schemes. I like the idea of a tiling window manager, but I’m not smart enough to ever get really productive in something like i3 or sway. For the past few years I’ve been using Rectangle on macOS as something of a band-aid. I’ve been using the Tiling WM Extension in Gnome for a while, and I’ve been pretty happy with it.

A Practical Guide to Self-Hosting a RAG

Post Contents: Introduction Goals Anti-Goals First Things First Retrieval-Augmented Generation What The Hell is a RAG? What does this mean? Case Study: The Dumbest Possible Person in the Room Getting the Data A Eulogy for wget Enter Playwright Our Playwright Script Cleaning Up the Data Install Unstructured Processing the HTML into Something More Usable Getting a Locally-Runnable LLM Finding the Right Stuff to Add to Prompts to Get The Thing Doing Semantic Search Getting an Embedding from Text Indexing our Transcripts with Embeddings Inserting text into DB Premature Optimization: Use a Sliding Window Adding Embeddings Kicking the Tires on Semantic Search Trying it Out You Have to Post the Entire Transcript Over and Over Not Very Good End Results Making it Harder Conclusion Things I Could Have Done Better Realize Some Things Are Slow Don’t Be “Cute” With the Source of Data Use Python The Code

Introduction: As an exercise in self-improvement and understanding what the hell it is I actually do at work, I had promised myself (and, unfortunately, my manager) I’d implement a RAG from conception to finish by Valentine’s Day, 2024.

Job #1 of Codebases Is Onboardability

I got my first job as a software engineer in 2000, so in 2024 I’m pushing a quarter century of being a developer. This is one of the topics I have begun to hold near and dear to my heart. Software engineering is a team sport. If you’re not setting new players up for success, you’re not being the best teammate you could be. Supporting others is just as important as getting your needs met.

Framework Syndrome: Solving Software Problems by Not Solving Them

A common antipattern I’ve been both perpetrator of and victim to is what I describe as Framework Syndrome. This is the act of writing software that does not solve the inherent problem at hand, but provides a poorly designed scaffold in which one is expected to eventually “fill in the blanks.” That is, in the absence of the ability or desire to make the code do what it needs to do, kick the can down the road while at the same time writing mountains of not particularly useful code that will never be fully excised from the codebase.

I'm Not Participating in This Year's Advent of Code for Very Good Reasons

I’m exhausted. I’ve been guilted/peer pressured into participating in the AoC for at least 5 years. I’m tired, I’m defeated, I’m unable to meet its weird expectations consistently and still I remain a competent software engineer. I’m not participating in this year’s Advent of Code. Every year I try and get stymied by day 14 or so because of practical considerations. It Emphasizes Fast Solutions Over Good Ones: The entire gamification of the problems is about solving them as quickly as possible.

Keeping Alive With Long-Running Sync Work in Python

In my previous post I wanted to run a keepalive green thread on the side while doing work, to let the coordination framework (in this case just a plain old Postgres database) know which workers were around and still processing work. Now, I have a long-running block of synchronous code. Herein lies the problem: synchronous code does not play nicely in the async event loop. It doesn’t take breaks to “breathe.” It doesn’t yield control to a scheduler.
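A minimal sketch of one common way around this, not necessarily the approach the post settles on: push the blocking call into a worker thread with `asyncio.to_thread` (Python 3.9+) so the event loop stays free to run the keepalive. The names `long_sync_job`, `keepalive`, and `run_with_keepalive` are made up for illustration, and a list of timestamps stands in for the Postgres write.

```python
import asyncio
import contextlib
import time


def long_sync_job(duration: float) -> str:
    """Stand-in for blocking work that never yields to the event loop."""
    time.sleep(duration)
    return "done"


async def keepalive(beats: list, interval: float) -> None:
    """Record a timestamp every `interval` seconds until cancelled."""
    while True:
        beats.append(time.monotonic())  # a real worker would update the DB here
        await asyncio.sleep(interval)


async def run_with_keepalive(duration: float = 0.2, interval: float = 0.05):
    beats: list = []
    beat_task = asyncio.create_task(keepalive(beats, interval))
    try:
        # The blocking call runs in a worker thread, so the loop can keep
        # scheduling keepalive() while it waits for the result.
        result = await asyncio.to_thread(long_sync_job, duration)
    finally:
        beat_task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await beat_task
    return result, len(beats)
```

Calling `asyncio.run(run_with_keepalive())` returns the job result plus the number of heartbeats recorded while the synchronous work was running. This only helps when the blocking work releases the GIL often enough (sleeping, I/O, most C extensions); a tight pure-Python CPU loop may still starve the loop.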

Registering Signs of Life in Long-Running Async Jobs in Python

At work I’m currently working on a fairly large system in which we have a pool of greedy workers, of unknown size, which can opt in at any time to the flow of work. A job is considered abandoned if it is marked as IN_PROGRESS but the worker who has claimed it hasn’t phoned home in some amount of time. The project is async, which makes things both more and less interesting.
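The abandonment rule above can be sketched in a few lines. This is an illustration under assumptions, not the system described: an in-memory `Job` object stands in for the database row, and `Job`, `is_abandoned`, `heartbeat`, and `do_job` are hypothetical names.

```python
import asyncio
import contextlib
import time
from dataclasses import dataclass, field


@dataclass
class Job:
    status: str = "IN_PROGRESS"
    last_seen: float = field(default_factory=time.monotonic)


def is_abandoned(job: Job, now: float, timeout: float) -> bool:
    """A job is abandoned if it is in progress but its worker went quiet."""
    return job.status == "IN_PROGRESS" and now - job.last_seen > timeout


async def heartbeat(job: Job, interval: float) -> None:
    """Phone home on a fixed interval until cancelled."""
    while True:
        job.last_seen = time.monotonic()
        await asyncio.sleep(interval)


async def do_job(job: Job, work_seconds: float, interval: float) -> None:
    beat = asyncio.create_task(heartbeat(job, interval))
    try:
        await asyncio.sleep(work_seconds)  # stand-in for real async work
        job.status = "DONE"
    finally:
        beat.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await beat
```

Because the work itself awaits, the heartbeat task gets scheduled in the gaps without any threads; a separate reaper process would call something like `is_abandoned` on every in-progress row.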

2023: Linux on the Desktop This Year

I have a Gaming PC I bought from Costco when my wife told me “I should maybe get back into my old hobbies” in the summer of 2020. It came with Win10, which is fine and it’s probably good to have at least one Windows machine in the house at any given time. The thing is: Windows is annoying. Every time Windows ran an update I had to run a third-party uninstall tool to undo the changes Windows updates made to the graphics drivers, which started hanging and crashing if they weren’t the exact specific ones the machine shipped with.

Hardware Review: Anbernic RG35XX

Wanting a small, pocketable handheld for bus rides, etc., I saw the Miyoo Mini and attempted to purchase one. They are incredibly hard to acquire. The Anbernic RG35XX presented itself as an alternative, and it was actually possible to buy one. I got one for about $60 on Amazon and was immediately in love. It had a decent built-in game library, decent battery life, and was the perfect size. Not too big, not too small.

VS Code: Dark Terminal on a Light Theme

I’m a weirdo: on my IDEs I prefer a light theme, having used light themed IDEs since time immemorial (still miss using Visual Studio regularly). But I prefer white on black for my terminal emulators, as I have used that since time immemorial and a black on white terminal window doesn’t feel like a serious thing. So here’s what I added to my settings.json to get the best of both worlds (light theme turned on):

Hardware Review: Devterm

I bought a Devterm a couple of years ago and it’s mostly been sitting in a drawer. The good: the thermal printer is novel, and the form factor makes it super portable. The bad: OS support is not great (at one point I built my own Raspbian image because the official one had drifted so far out of date that the apt repos stopped working), and the keyboard and trackball (even with flashed firmware) are painful to use.

Hardware Review: 1st Gen SteamDeck (512GB)

I bought a Steam Deck. I was on a waiting list that said I wouldn’t get it until this year (2023), but it arrived in October 2022. This thing is so versatile thanks to its Desktop Mode. For games like the boomer shooters I so love, I can dock the Deck and play with a mouse and keyboard and nicer monitor. I can install Flatpaks. The gaming experience is bar none the best.

Hardware Review: AOKZOE A1

I kickstarted the AOKZOE A1 when I was utterly convinced that I would never get a Steam Deck. Joke’s on me; the Deck arrived before the A1. I primarily use this system as my sole Win11 machine, so it does a lot of light desktop work. It is also my “travel” laptop; when I’m travelling I take this and not my Steam Deck. Everything is tunable, but it draws way more power than a Deck and generally lasts about the same amount of time even with the larger battery.

Hardware Review: iPhone 13 Mini

I’m not going into specs or specific details on this device. I was on an iPhone XS, which has slowly been degrading over time. The battery was slowly dying; I had it replaced at the Apple Store, and the ribbon cable for the display was not adequately clipped back into place. It was easier to just order a new phone than to take the bus up to Emeryville again in the middle of the day for an appointment.

Mixed Async Code in Sync Python: Disappointingly Simple

One thing I love about Python’s practical approach to type annotations and enforcement is that it’s gradual: you can rapidly code a large ball of mud and get it working, then refine it to make it safer with typing later on. Chalk this up as another good idea (possibly by accident) for Python: you can do the same with async. At work, someone lamented that threads aren’t quite safe but they needed to do multiple HTTP requests in parallel.
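A rough sketch of what "gradual async" can look like: an ordinary synchronous function that spins up an event loop just long enough to run several requests concurrently. The `fetch` coroutine fakes the HTTP call with `asyncio.sleep`; in real code it would wrap an async HTTP client of your choosing. All function names here are hypothetical.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    """Stand-in for one HTTP request."""
    await asyncio.sleep(delay)
    return f"response from {name}"


async def fetch_all(names):
    # gather() runs the coroutines concurrently and keeps results in order.
    return await asyncio.gather(*(fetch(name, 0.05) for name in names))


def get_everything(names):
    """Plain synchronous function; callers never see the event loop."""
    return asyncio.run(fetch_all(names))
```

The rest of the codebase keeps calling `get_everything(["a", "b"])` like any other blocking function. The one catch is that `asyncio.run` raises if it is called from a thread that already has a running event loop, so this works for sync callers only.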

A Case for Match

The Python 3.10 release includes the new match statement, which superficially looks like the case/switch statements in other languages but semantically is closer to pattern matching in Haskell or Rust. Like the walrus operator*, I struggled to find a use case for this and it seemed like a feature that was added just because the language is 30+ years old and all the good new functionality is taken. However, I found a pretty good case for it that used to be a lot more work: duck-typey arguments that make default case rules easy but enable more complex functionality as needed.

Python as a Language Is Inescapably Coupled With Its Implementation Part 2: The Tracer

I was recently discussing some dumb Python tricks at work with some colleagues and showed them this old Gist I wrote, which in modern times I would rewrite to look like this:

    import functools
    import inspect
    import sys

    @functools.lru_cache
    def getlines(filename):
        with open(filename, "r") as file_handle:
            return tuple(file_handle)

    @functools.lru_cache
    def getline(filename, line_number):
        return getlines(filename)[line_number - 1]

    def tracefunction(frame, event, arg):
        if event == "line":
            info = inspect.getframeinfo(frame)
            fname, lineno, fn = info.
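The excerpt cuts off before the tracer is installed, so here is a separate, self-contained sketch of the underlying mechanism rather than a completion of the author's code. It uses `sys.settrace` to record which lines of one function run; `record_lines` is a hypothetical helper name.

```python
import sys


def record_lines(func, *args, **kwargs):
    """Call func and return (result, line offsets executed inside func)."""
    code = func.__code__
    offsets = []

    def tracer(frame, event, arg):
        # The global trace function sees a "call" event for each new frame;
        # returning a function opts that frame in to per-line events.
        if frame.f_code is not code:
            return None
        if event == "line":
            offsets.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever was there before
    return result, offsets
```

Offsets are relative to the `def` line, so offset 1 is the first line of the body. The trace hook is a CPython-flavored facility, which is the coupling the post title is pointing at.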

Async Programming: Understanding It From Fundamentals

This was inspired by a short chat I had with a coworker, trying to give a simple, 15-minute explanation of something that took me a decade to wrap my head around due to poor teaching resources online. Async programming in modern “industrial” languages is shrouded in magic, abstractions, and years of atrocious decisions (looking at you, JavaScript/Python). Most tutorials start out with “just mark your function async and await it and use these magic incantations and you’re good to go!
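One way to strip the magic away, offered as a sketch and not as the post's own walkthrough: cooperative multitasking needs only functions that can pause and a loop that resumes them in turn. Plain generators are enough. `worker` and `run_round_robin` are illustrative names.

```python
from collections import deque


def worker(name, steps, log):
    """A task that does a little work, then hands control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # pause here; the scheduler decides who runs next


def run_round_robin(tasks):
    """The entire 'event loop': resume each task in turn until all finish."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # task finished, drop it
        queue.append(task)  # not finished, back of the line
```

Running two workers interleaves their log entries even though nothing here is threaded. `async`/`await` and real event loops add I/O readiness and nicer syntax on top, but the control flow is the same handoff.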

Modern Python: Features I Haven't Used but Plan To

Python has continued to progress and introduce new features and modules. In this post I’ll cover features I haven’t used much (or at all) and how I plan on using or not using them. Walrus Operator: I’ve been aware of this for a few years. I’ve found about 3 occasions where it was appropriate to use. It’s nice but not a huge change to the way I code. Generally in the pattern
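The excerpt stops before showing which pattern is meant, so the two below are assumptions: they are simply the places the walrus operator most commonly earns its keep, namely "match, then use the match" and "read until empty". The function names are hypothetical.

```python
import io
import re

VERSION = re.compile(r"(\d+)\.(\d+)")


def parse_version(text):
    """Bind the match and test it in one expression."""
    if (found := VERSION.search(text)) is not None:
        return int(found.group(1)), int(found.group(2))
    return None


def read_chunks(stream, size):
    """Loop until read() returns an empty (falsy) chunk."""
    chunks = []
    while chunk := stream.read(size):
        chunks.append(chunk)
    return chunks
```

For example, `read_chunks(io.BytesIO(b"abcde"), 2)` yields the data in two-byte pieces without the usual "read once before the loop, read again at the bottom" duplication.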

Modern Python Has Changed How I Code

I can’t overstate how much the following have changed and improved the way I write Python and my confidence in its correctness: Continuous Integration Black Dataclasses Mypy Type Hints Continuous Integration: This isn’t particularly new to me (or the industry), but a good CI workflow that runs tests and linting on every commit pushed to the repo tracker gives confidence that the code is clean to merge into the main branch.

Python as a Language Is Inescapably Coupled With Its Implementation Part 1: LET'S DO DUMB SHIT WITH the GC

There is a convenient but untrue fiction about Python that the language specification is somehow cleanroom and CPython is actually “just an implementation.” This has always been false, and harmful at best. Look at __dict__. Near every Python object has a dictionary that fuels and consumes it. All your dotted getters are mere passthroughs for dot __getitem__ers. Another fun thing is the leaking of implementation details in Bad Ways. Here’s something you can do but should not do, lest I find out where you live and poop in your mailbox:
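A small, safe demonstration of the `__dict__` point only (it does not attempt the GC trick the excerpt is about to show): for an ordinary class instance, attribute access and the instance dictionary are two views of the same storage. `Thing` and `demo` are made-up names.

```python
class Thing:
    def __init__(self):
        self.name = "thingy"


def demo():
    thing = Thing()
    before = dict(thing.__dict__)    # snapshot: attributes set in __init__
    thing.__dict__["color"] = "red"  # write through the dictionary...
    via_attribute = thing.color      # ...and read it back with a dot
    thing.size = 3                   # a dotted write lands in the same dict
    return before, via_attribute, thing.__dict__
```

The "nearly" in "nearly every object" matters: instances of classes that define `__slots__`, and many built-in types, have no per-instance `__dict__` at all.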

__All__ Is a Sacred Space and You Murderous Goblins Are All Profaning It

Let me spell something out for you trickster-meanies:

    # HELLO I AM thingy.py
    __all__ = [X, Y, Z]
    X = True
    Y = True
    Z = True

Reasonable, right?

    >>> from thingy import *
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/scheirer/thingy.py", line 1, in <module>
        __all__ = [X, Y, Z]
    NameError: name 'X' is not defined

Python is older than my son (who is 3) and yet you abuse it.
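For contrast with the traceback above, a sketch of the documented form: `__all__` is a list of names as strings, so it can sit at the top of the module before anything is defined. To keep the example self-contained, the module is built in memory instead of on disk; `thingy_demo`, `load_demo_module`, and `star_import` are invented for the demonstration.

```python
import sys
import types

SOURCE = '''
__all__ = ["X", "Y"]  # strings, so nothing has to exist yet
X = True
Y = True
Z = True              # defined, but deliberately not exported
'''


def load_demo_module(name="thingy_demo"):
    """Build a module from SOURCE and register it so it can be imported."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name="thingy_demo"):
    """Run `from <name> import *` in a fresh namespace and return what landed."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    namespace.pop("__builtins__", None)  # exec adds this; it is not an import
    return namespace
```

After `load_demo_module()`, `star_import()` contains only `X` and `Y`: the star import consults `__all__` and leaves `Z` behind.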

ZPL-O-Rama Part 5: Postscript

End product: Please disregard the poor camera placement or the labels; it has since been fixed. Thinking Back, It Was All So Simple: Now that we have the system up and running, let’s talk random obstacles and next steps. This is something of an epilogue to the saga, as it’s a list of small things that accrued while working on the project. Networks are slow and the dual tier service thing is bad, actually: I initially chose to run a service on the Nginx server that then called to a service running on the RPi.

ZPL-O-Rama Part 4: The Hardware

Hardware The RPi The Raspberry Pi is a (I think) Pi 3 with Wifi I found in the garage with a cheap clear acrylic case. It might have been a RetroPie rig in a prior life? Or one I was “gonna get around to” doing something with and finally did? Then for this project I bought a Raspberry Pi camera and a small acrylic case for it, too. The Printer The printer is a hefty boi, a Zebra something or other.

ZPL-O-Rama Part 3: Software

Running the software: Frontend server. The frontend has three responsibilities: display information, perform access control (don’t let strangers on the internet print out jobs), and forward appropriate calls to the backend. I’m using Go’s base templates for dynamic content, the Echo framework for the webapp endpoints, and plain old bare bones modern javascripto for the scripting stuff with no polyfills, no build environments, no minifiers, and boy does it feel good.

ZPL-O-Rama Part 2: Concepts and Architecture

Problem Space: To automate the process of printing and reporting back a ZPL payload, we need: a way to get the ZPL from the user; a way to send the ZPL to a printer; a way to take a picture; a way to send it back to the user. A way to get the ZPL from the user: A web service makes sense here. We want an API or a frontend (or both) to send the ZPL along and check its printing status.

ZPL-O-Rama Part 1: A Personal/Work Project (Introduction)

Introduction: In my spare time on weekends in between errands and mornings before everyone wakes up, I’ve been working on a little project I’ve been having a lot of fun with: ZPL-O-Rama. The Problem: A large part of my employer’s line of business is creating shipping labels, and a large number of those aren’t simply printed images, but printed on very high volume, heavy duty, industrial grade printers using a proprietary language called ZPL.

The Shanling Q1 Media Player

Introduction: A while ago I was bored with the Mechanical Keyboard rabbit hole and started looking into other, equally strange rabbit holes to dive into. At around the same time I hit my iPod Classic’s 160GB limit. I’m not yet ready to hack it up to have bigger storage: I plan to keep it in working “original” condition as it may find a better home in the future with some collector who is better at soldering than I.

SFTP Is Still Around and That's OK

So it’s 2021 and about the entirety of my job is integrating third-party systems with internal ones, which then reach out to other third-party services. A lot of stuff uses SFTP still. In this day and age anything not on HTTP seems barbaric, but SFTP does have its advantages. SFTP is format agnostic: This goes for HTTP as well, but you need to correctly set headers, and there’s a constant, incessant push for change for change’s sake.

You Don't Need to Make That Temporary File, Dude

This was initially a blog post I wrote on my employer’s internal system, but it’s interestingly useful and it doesn’t contain any trade secrets so I figure I’ll share. A common pattern that seems obvious when you need to shuttle data around in file form is to use a temporary file against the filesystem using the tempfile module. You very seldom ACTUALLY need to do this. The BytesIO class follows the exact same protocol, the file protocol, so any API that accepts a “file-like object” will accept an in-memory piece of information in addition to a file on disk.
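A minimal sketch of the point, using the standard library's `zipfile` as the "API that accepts a file-like object". The helper names `build_zip` and `list_zip` are made up; the technique is just handing `io.BytesIO` to code that would otherwise get an open file.

```python
import io
import zipfile


def build_zip(files: dict) -> bytes:
    """Create a zip archive entirely in memory and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_zip(payload: bytes) -> dict:
    """Read an archive back from bytes, again without touching disk."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

The resulting bytes can go straight into an upload or a response body. A temporary file is still the right call when the data may not fit in memory, or when the consumer insists on a real filesystem path.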

Trying Out Rectangle

I’ve been using the Rectangle window manager for Mac for the last couple of weeks and it’s been the most helpful thing since Mission Control (and setting up a hot corner to activate it). I’ve tried to use full tiling managers before but I’ve found it difficult because 1) irregularly sized windows totally mess up the flow, 2) I am so used to the WIMP paradigm, including moving windows around, that I have only been semi-functional with them, and 3) there are really weird shortcuts I have to memorize.

The 2020s and the Post-Office World

I moved to the Bay Area 6 years ago, after stubbornly refusing to for over a decade, because I wanted to be in the middle of the world of software. 2020 made some of that luster wear off. Our office closed at the end of March, 2020, for what was scheduled to be 2 months, which eventually stretched out into over 6 months, until finally we were told to clean off our desks by Thanksgiving as we were permanently remote.

Pinebook Pro: The Tinkering Laptop

As the parent of an almost-three-year-old, I don’t get much time to myself, and I’ve given up on video games that don’t have playtimes under 15 minutes (that discounts anything with load times or cutscenes). In my spare time I have to find other things to do that are low impact and can be cut into small amounts of time. I’ve taken to watching a lot of retrocomputing stuff on YouTube, which has inspired me to tinker with old software and resource constrained devices.

My Dingus Chromebook Still Works

So I bought that dang Chromebook over three years ago at this point and it keeps chugging on. Google has continued to ship OS updates (which I only notice as weird, arbitrary UI changes) and I can still use it to code but its main purpose now is its new life: ChromeOS runs Android apps pretty well, so I put an SD card full of movies on it and play them via VLC for my kid.

Five Years in the Bay Area

February 8th marks the fifth anniversary of me moving to the Bay Area to work for tech startups. In retrospect it’s been a great experience despite it being the Bay Area. Culture: Living in the Inland Empire, I was an hour away from LA and all that culture, but I never bothered doing it. Living 5 miles outside of San Francisco has meant the City is a constant part of my life: restaurants, concerts, museums, lots of things I could have theoretically done more of in SoCal but never bothered to.

Own Your Infrastructure

I’ve been sharecropping on Amazon’s server farms since I moved to the Bay Area 5 years ago. That is, every startup I’ve worked for has utilized AWS (and sometimes GCP or Azure in addition). This started out great for my career because I have not built a server machine from parts out since I was in college and I could use all my developer muscles to be an operations person. However, when you’re on-call, you no longer own your uptime.

Python Meetup Talk - 2019-10-09

I gave a short (~10 minute) talk on preparing to move to microservices at the Python meetup in San Francisco. The main points: we all start out with a monolith; the monolith never fully goes away; that’s fine; scope out a new project to make your first microservice; pull out a relatively isolated piece of code in the monolith to make your next microservice.

Turn a Chromebook Into the ULTIMATE GOLANG/PYTHON DEVELOPER MACHINE

This is a recycled post from my tumblr weblog. Ha ha ha, just lying: the real title should be “Turning a Piece of Shit Chromebook into a Good Enough Development Machine Because You’re Unemployed and Feel Like An Ass Trying to Justify Spending $2000 on a God Damned Macbook so You Wound Up Buying a Chromebook Instead”. Anyway, I’m unemployed because of reasons and figured there were better things to do with the credit limit on my credit card than spend $2000 on a Macbook, so I bought one of the highest rated Chromebooks at my “willing to pay this much” price point: the Chromebook C100P.

What I Like About My New Job

This is a recycled post from my tumblr weblog. I’ve since had three other jobs, but a lot of what I like still rings true 5 years later in 2020. Please note I am contractually prohibited from saying what I don’t like about most of my prior employers so don’t expect any negative posts. I’ve gone from developer at a large software corporation in the suburbs of Southern California to being a developer at a startup in SoMa in San Francisco.

Stupid Simple API Reference for Bottle.py Web Services

I have a stupid JSON-only REST API I implemented in bottle.py. This introspects the default app and gives a dumb readout that should act as an adequate reference for discovery:

    import bottle

    @bottle.route('/')
    def index():
        # Plain text so the listing is readable in a browser or curl.
        bottle.response.content_type = 'text/plain'
        # One line per registered route rule on the default app.
        return ("=== API REFERENCE ===\n" +
                "\n".join(x['rule'] for x in bottle.app().routes))

C++ Is Not So Bad

For as much shit as I like to talk about C++, I sure can get a lot done quite efficiently in it. I read an interview with Bjarne a while back and he said C++’s most important feature was destructors. After thinking about it, yeah, they are pretty awesome and I’ve been using the with statement for the same tightly scoped data lifetime in Python.
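To make the destructor comparison concrete, here is a small sketch of the Python side: a context manager whose cleanup runs when the `with` block exits, whether normally or by exception, much as a C++ destructor runs when a scope unwinds. `scoped_resource` and `use_resource` are illustrative names, and a log list stands in for a real resource.

```python
from contextlib import contextmanager


@contextmanager
def scoped_resource(name, log):
    log.append(f"acquire {name}")  # constructor-ish: runs on entering the block
    try:
        yield name
    finally:
        log.append(f"release {name}")  # destructor-ish: runs on every exit path


def use_resource(log, fail=False):
    with scoped_resource("db", log) as resource:
        log.append(f"use {resource}")
        if fail:
            raise RuntimeError("boom")
```

Both the happy path and the failing path leave the same acquire, use, release sequence in the log. The difference from C++ is that the scoping is opt-in: the lifetime is tied to the `with` block, not to the variable going out of scope.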

I Don’t Really Use OSX Anymore

I have two desktop systems, side-by-side: an Intel Mac Mini and an Intel 21" iMac. The Mini runs Leopard and the iMac runs Ubuntu Karmic Koala, and I find myself completely satisfied with the Linux desktop, and switch back to OSX as an auxiliary rather than as my primary. I started on Debian back in 1999. I wanted to get into Linux, but both Red Hat and SuSE were a little hard to get going for beginners, and the packages supplied were always a weird grab bag of old and new.