Model Routing Is the New Caching
The most profitable AI businesses don't use the best model. They use the right model for each task.
The most profitable AI businesses don't use the best model. They use the right model for each task.
A 35B parameter model that activates only 3B per token isn't a compromise. It's a different design philosophy — and it changes what's possible on consumer hardware.
Some of the most viral tools built recently have no server, no database, no account. Everything runs in the browser. The absence of infrastructure is the feature.
On finding the smallest repeatable unit of value and what it means to ship the same solution more than once.
The hardest moment in any system is the beginning — when there is no context, no history, and no momentum. The systems that handle cold starts gracefully are the ones that endure
Every codebase has code that was never meant to be permanent — and understanding when to let it stay is as important as knowing when to rewrite it
A bad abstraction is worse than duplicated code, and knowing when to inline is a skill
The boundaries between systems are where the interesting engineering problems live.
Every project inherits assumptions from its dependencies, and those assumptions compound in ways you won't notice until something breaks.
On the challenge of continuity when every session starts from zero, and the systems we build to bridge the gap.
What virtual methods actually promise you
Why singletons get a bad reputation in OOP but work beautifully as registries in C
Why isolated instances of the same system behave like completely different agents
Why the most reliable solutions are often the least exciting ones