The Fields You Choose Not to Extract

The instinct when building an extraction tool is to extract everything. Every field in the document is a field the tool could pull, so the roadmap becomes a race to cover them all, and “supports more fields” feels like unambiguous progress. But not every field is equally extractable, and treating them as if they are produces a tool that’s reliable on some fields and quietly unreliable on others, with no way for the user to tell which is which. The more useful discipline is the opposite of the instinct: deciding which fields the tool should not extract — and being explicit that it doesn’t.

Some fields are clean to extract: they appear in predictable forms, they mean one thing, and reading them correctly is a bounded problem. Other fields are treacherous: they require interpretation the tool can’t reliably do, they’re defined by the interaction of clauses scattered across the document, or getting them right depends on domain judgment that varies by context. A tool that extracts the clean fields reliably and declines the treacherous ones — explicitly, saying “this requires human judgment, here’s the relevant text” — is more useful than a tool that attempts everything and is silently wrong on the hard parts. The second tool forces the user to independently figure out which outputs to trust, which is the work they were trying to avoid.
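One way to make the explicit decline concrete is to give every field a result that is either a value or a hand-back carrying the relevant source text. The sketch below is illustrative only: the `FieldResult` type, the helper names, and the sample contract fields are all hypothetical, not taken from any particular tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldResult:
    """Outcome for one field: either a value or an explicit hand-back."""
    name: str
    value: Optional[str] = None            # set only when the field was extracted
    declined_reason: Optional[str] = None  # set only when the tool declines
    relevant_text: Optional[str] = None    # source passage for the human reviewer

    @property
    def declined(self) -> bool:
        return self.declined_reason is not None


def extract(name: str, value: str) -> FieldResult:
    """Record a field the tool read and is willing to stand behind."""
    return FieldResult(name=name, value=value)


def decline(name: str, relevant_text: str) -> FieldResult:
    """Hand a field back to the user, with the text they will need."""
    return FieldResult(
        name=name,
        declined_reason="requires human judgment",
        relevant_text=relevant_text,
    )


# Hypothetical sample output for a contract-like document.
results: List[FieldResult] = [
    extract("effective_date", "2024-03-01"),
    decline("termination_rights", "Either party may terminate, subject to Sections 4.2 and 9.1."),
]

# The user can see at a glance where their attention is needed.
needs_review = [r.name for r in results if r.declined]
```

Because a declined field has no value at all, downstream code cannot mistake a hand-back for a low-quality guess.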

Choosing not to extract a field is a positive product decision, not a gap. It’s the tool drawing a clear line around what it’s reliable at, and that line is itself valuable information for the user. “I extracted these fifteen fields and I’m flagging these three as requiring your judgment because they depend on interpretation” is a more trustworthy output than eighteen fields presented with uniform confidence, three of which are guesses. The explicit refusal tells the user exactly where to spend their attention, and it signals that the fields the tool did extract were ones it could stand behind. Honesty about scope builds trust the same way honesty about uncertainty does.

This runs against the demo incentive, which rewards breadth — the more fields a tool fills in, the more impressive it looks in a five-minute walkthrough. But breadth bought by extracting fields the tool can’t do reliably is breadth that fails exactly when a real user leans on it. A narrower tool that’s trustworthy across its whole stated scope beats a broader tool that’s trustworthy across an unstated subset of its scope. The user can build a workflow on the first; the second forces them to keep a mental map of which fields are actually reliable, which defeats the purpose.

The strategic version of this: a tool’s scope should be a promise, not an aspiration. Every field inside the scope is something the tool commits to handling reliably; everything outside is something it explicitly hands back. Defining that boundary well — and having the discipline to keep treacherous fields outside it until you can actually do them — is what makes the tool’s output something a user can rely on without re-checking. Extracting everything is easy to promise and hard to honor. Extracting a well-chosen set, reliably, and being honest about the rest, is the harder and more valuable thing to build.
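As a rough sketch of scope-as-promise, the scope can be an explicit mapping from field name to extractor, with anything outside that mapping handed back by name instead of being attempted. The function `run_in_scope`, the `first_line` extractor, and the sample document are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Extractor = Callable[[str], str]


def run_in_scope(
    document: str,
    requested: List[str],
    in_scope: Dict[str, Extractor],
) -> Dict[str, object]:
    """Extract only fields inside the declared scope; hand back the rest."""
    extracted: Dict[str, str] = {}
    handed_back: List[str] = []
    for name in requested:
        extractor = in_scope.get(name)
        if extractor is None:
            # Outside the promise: say so explicitly, do not guess.
            handed_back.append(name)
        else:
            extracted[name] = extractor(document)
    return {"extracted": extracted, "handed_back": handed_back}


def first_line(document: str) -> str:
    """Hypothetical 'clean' extractor: the title is the first line."""
    return document.splitlines()[0].strip()


report = run_in_scope(
    "ACME Master Agreement\nEither party may terminate, subject to Sections 4.2 and 9.1.",
    requested=["title", "termination_rights"],
    in_scope={"title": first_line},
)
```

Adding a field to the scope then becomes a deliberate act: it only moves into `in_scope` once there is an extractor the tool can stand behind.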