The Commit Archaeologist

There are two ways to understand a codebase. The first is to read the code as it exists right now – the functions, the control flow, the data structures. This tells you what the system does. The second is to read the history – the commits, the merges, the reverts. This tells you why it does it that way.

Most people stop at the first. The best engineers I’ve worked alongside never do.

Reading the What vs. Reading the Why

Code in its current state is a snapshot. It’s a photograph of a living thing, frozen at one moment. You can study the photograph and learn a great deal – the structure, the patterns, the obvious intentions. But photographs don’t explain themselves.

Why is there a special case for empty strings on line 84? Why does this function accept three arguments when it only seems to use two? Why is there a try/catch wrapping something that should never throw?

The code says what these things are. The commit that introduced them says why. And the difference between those two answers is often the difference between a safe refactor and a subtle regression.

Blame Is a Terrible Name for a Beautiful Tool

Let’s talk about git blame. It’s one of the most powerful tools in a developer’s arsenal, and it has the worst name in all of version control.

“Blame” implies fault-finding. It sounds like something you’d use when things go wrong, to point a finger. But what it actually does is far more constructive: it annotates every line of a file with the commit that last changed it. It’s not a weapon. It’s a time machine.

When I run blame on a confusing piece of code, I’m not looking for who to hold responsible. I’m looking for the commit message that explains the thinking. I’m looking for the pull request that provides the discussion. I’m looking for the issue that motivated the change.

A single blame annotation can unravel hours of confusion. The line that seems wrong turns out to be a fix for a race condition that only surfaces under load. The parameter that seems unnecessary exists because an upstream API changed its contract six months ago. The dead code that looks deletable is actually gated behind a feature flag, waiting for a configuration value that hasn’t shipped yet.

If it were named git context or git history-of or even git explain, more people would reach for it instinctively. Instead, the name puts people off, and they lose access to one of the richest sources of understanding a codebase has to offer.
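
For anyone who wants to script this, here is a minimal sketch of reading blame output from Python. It assumes git is on your PATH and relies on the --line-porcelain format, which repeats the commit details for every line. The names BlameLine, parse_line_porcelain and blame are my own, invented for illustration.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class BlameLine:
    sha: str       # commit that last changed this line
    line_no: int   # line number in the file as blamed
    author: str
    summary: str   # subject line of that commit
    text: str      # the line's content


def parse_line_porcelain(output: str) -> list:
    """Parse the text printed by `git blame --line-porcelain <file>`."""
    records = []
    sha, line_no, fields = "", 0, {}
    expect_header = True
    for raw in output.split("\n"):
        if expect_header:
            if not raw:
                continue  # trailing newline at the end of the output
            # Header: <sha> <original line> <final line> [<group size>]
            parts = raw.split()
            sha, line_no, fields = parts[0], int(parts[2]), {}
            expect_header = False
        elif raw.startswith("\t"):
            # A tab-prefixed line is the file content and closes the record.
            records.append(BlameLine(sha, line_no, fields.get("author", ""),
                                     fields.get("summary", ""), raw[1:]))
            expect_header = True
        else:
            # Everything else is "key value" metadata about the commit.
            key, _, value = raw.partition(" ")
            fields[key] = value
    return records


def blame(path: str, repo: str = ".") -> list:
    """Run git blame on one file and return one BlameLine per line."""
    result = subprocess.run(
        ["git", "-C", repo, "blame", "--line-porcelain", "--", path],
        check=True, capture_output=True, text=True,
    )
    return parse_line_porcelain(result.stdout)
```

From there, the summary field is the thread to pull: it is the subject of the commit whose full message, pull request and issue you actually want to read.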

The Stories in Merge Commits

Individual commits tell small stories. Merge commits tell bigger ones.

A merge commit is the point where two lines of development converge. The commit itself might be empty – just the mechanical act of combining branches. But the context around it is rich. What branches were merged? How long had they diverged? Were there conflicts, and if so, in what files?

Merge conflicts are especially telling. When two branches modify the same code, it means two efforts had overlapping concerns. The resolution of that conflict – which changes survived, which were adapted, which were discarded – is a design decision that rarely gets documented anywhere except in the merge itself.

I’ve found entire architectural shifts hidden in merge commits. A branch developed over weeks, touching dozens of files, merged with a terse “merge feature-x.” The message tells you nothing. The diff tells you everything.
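
One way to get at that diff, sketched in Python under the assumption that git is on your PATH: list the merge commits with their parents, then compare each merge against its first parent to see what the branch brought in. The helper names (Merge, parse_merges, list_merges, branch_diff_command) are hypothetical, chosen for this example.

```python
import subprocess
from dataclasses import dataclass

# %H = commit hash, %P = parent hashes, %s = subject, %x09 = a tab separator
LOG_FORMAT = "%H%x09%P%x09%s"


@dataclass
class Merge:
    sha: str
    parents: list   # parents[0] is the branch that was merged into
    subject: str


def parse_merges(output: str) -> list:
    """Parse the text printed by `git log --merges --format=<LOG_FORMAT>`."""
    merges = []
    for line in output.split("\n"):
        if not line:
            continue
        sha, parents, subject = line.split("\t", 2)
        merges.append(Merge(sha, parents.split(), subject))
    return merges


def list_merges(repo: str = ".") -> list:
    """Return the merge commits reachable from HEAD, newest first."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--merges", f"--format={LOG_FORMAT}"],
        check=True, capture_output=True, text=True,
    )
    return parse_merges(result.stdout)


def branch_diff_command(merge: Merge) -> list:
    """Build the git command that summarizes what the merge brought in,
    measured against the first parent (conflict resolutions included)."""
    return ["git", "diff", "--stat", merge.parents[0], merge.sha]
```

Dropping --stat from the last command gives the full diff, which is where a terse “merge feature-x” finally starts talking.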

The Fixup Trail

Then there are the fixup commits. The ones with messages like “fix typo,” “address review comments,” “actually fix the thing,” “ok now it works.” In many repositories, these get squashed before merge, compressed into a single clean commit that presents the final result as if it arrived fully formed.

Squashing has its place. Clean history is easier to scan. But something is lost when you squash: the trail of iteration. The first attempt, the revision, the edge case someone caught in review, the approach that was tried and abandoned.

When I can see the unsquashed history, I learn not just what the final solution was, but what solutions were considered and rejected. That’s often more valuable than the solution itself. Knowing that someone tried approach A, hit a wall, and pivoted to approach B saves me from independently rediscovering the same wall.

The best commit histories I’ve encountered are the ones where squashing is deliberate rather than automatic. Mechanical fixes get squashed. Meaningful iterations get preserved. The result is a history that reads like a thoughtful narrative rather than either a raw stream of consciousness or an artificially clean timeline.

Commit Messages as Documentation

For routine changes, brief messages are fine. “Update dependency versions” doesn’t need three paragraphs. But for non-obvious changes – the kind that a future developer will stare at and wonder about – the commit message is the most durable documentation available. Comments get deleted. Wiki pages go stale. Tickets get archived and their links rot. The commit message lives with the code, permanently attached to the diff it describes.

The best commit messages follow a simple pattern: explain what the change does (briefly), then explain why it was needed (thoroughly). “Add null check before dereferencing response” is the what. “The upstream service occasionally returns 204 with an empty body during deploys, which caused a segfault in production” is the why.

That second part is what will save someone, six months from now, from removing the null check because it “seems unnecessary.”
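
To make the pattern concrete, here is a small sketch that assembles a message in that shape: a one-line subject for the what, a blank line, then a wrapped body for the why. The function name format_commit_message is made up for this example, and the 72-column wrap is a common convention rather than anything git enforces.

```python
import textwrap


def format_commit_message(what: str, why: str, width: int = 72) -> str:
    """Return a commit message: subject line, blank line, wrapped body."""
    subject = what.strip()
    body = textwrap.fill(why.strip(), width=width)
    return f"{subject}\n\n{body}\n"


message = format_commit_message(
    what="Add null check before dereferencing response",
    why=("The upstream service occasionally returns 204 with an empty body "
         "during deploys, which caused a segfault in production."),
)
print(message)
```

The subject is what shows up in one-line log views, so it carries the what. The body is what someone finds when they run blame and open the commit, so it carries the why.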

The Revert That Tells a Story

Reverts are some of the most interesting commits in any history. A revert says: we tried something, it didn’t work, and we backed it out.

But why didn’t it work? The revert commit message sometimes explains, but often it’s just Git’s default: “Revert” followed by the original subject line, plus the hash of the commit being undone. The real story is in the surrounding context – the incident that triggered the revert, the discussion that followed, the subsequent attempt that succeeded or the decision to abandon the approach entirely.

I’ve learned to treat reverts as signposts. They mark places where assumptions collided with reality. Those collision points are some of the most educational parts of any codebase’s history.
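
Finding those signposts can be partly automated. The message that git revert writes by default contains a line of the form “This reverts commit” followed by a hash, so a sketch like the following can pair each revert with the commit it undid. It assumes nobody rewrote that default line, and the names reverted_sha and find_reverts are hypothetical.

```python
import re
from typing import Optional

# Matches the line that `git revert` writes into its default message.
REVERT_RE = re.compile(r"This reverts commit ([0-9a-f]{7,40})")


def reverted_sha(message: str) -> Optional[str]:
    """Return the hash a commit message says it reverts, or None."""
    match = REVERT_RE.search(message)
    return match.group(1) if match else None


def find_reverts(messages: dict) -> dict:
    """Map each revert commit's hash to the hash of the commit it undid.

    `messages` maps commit hash -> full commit message.
    """
    pairs = {}
    for sha, message in messages.items():
        target = reverted_sha(message)
        if target is not None:
            pairs[sha] = target
    return pairs
```

A hand-edited revert message will slip past this, so I would treat the result as a starting list, not a complete one. Each pair it does find points at two commits worth reading side by side.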

The Practice

Reading commit history is a skill, and like any skill, it develops with practice. A few habits that help:

Start with blame when confused. Before asking a colleague or writing a comment about confusing code, check the commit that last changed it. The answer is often already there.

Read merge commits, not just file diffs. Understand what converged and why. The merge is where the narrative threads come together.

Follow the revert trail. When you find a revert, find what replaced it. The sequence of attempt, failure, and correction is where the deepest understanding lives.

Write the messages you wish you could find. Every commit you author is a message to a future archaeologist. Give them something to work with.

The code tells you where the system is. The commits tell you how it got there. Both are essential, but only one of them explains the journey.