The Data Room
There’s an assumption embedded in most discussions about AI and document processing: the hard problem is reading. If you can get the AI to accurately extract information from a document, the rest follows.
This assumption is wrong, and the error explains why AI implementations often fall short in high-stakes professional workflows.
What the Data Room Actually Contains
In any complex transaction, due diligence generates a growing pile of documents: contracts, assessments, reports, surveys, compliance filings, financial records, expert analyses, correspondence. In a large transaction, the count can reach into the thousands.
Each document exists within a context. A physical assessment report means something different when the property is in a flood zone. An environmental report means something different when the transaction involves assumed liabilities. A financial projection means something different when the market comparables suggest the assumptions are aggressive.
The documents don’t announce their context. A professional reading them brings that context. The documents interact — what one says modifies how you read another. The synthesis is where the judgment lives.
The Reading Problem vs. the Synthesis Problem
AI has largely solved the reading problem. Information extraction from structured and semi-structured documents is now reliable enough for production use. The documents can be read.
The synthesis problem is different in kind, not just degree.
Synthesis requires knowing what to look for. In a professional assessment workflow, the questions aren’t generic — they’re specific to the asset type, the transaction structure, the regulatory environment, the risk tolerance of the parties involved. A professional doesn’t summarize documents; they answer a set of implicit questions that the documents collectively address.
Synthesis requires understanding how documents relate. Document A establishes a baseline. Document B modifies it. Document C introduces a risk that doesn’t appear in either A or B but follows from their interaction. The synthesis isn’t the sum of the documents — it’s the pattern across them.
Synthesis requires translating findings into decisions. A list of facts extracted from five hundred documents is not useful to a decision-maker. What is useful is a coherent picture of what those facts mean for the decision at hand, what the risks are, and what needs further investigation.
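To make the interaction point concrete, here is a minimal sketch in Python of a risk that no single document states but that follows from combining several. Everything in it is an assumption made for illustration: the `Finding` and `InteractionRule` types, the fact names, the file names, and the rule itself are hypothetical, and a real workflow would need far richer handling of conflicts and uncertainty.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Finding:
    doc_id: str  # document the fact was extracted from
    key: str     # hypothetical normalized fact name
    value: Any


@dataclass
class InteractionRule:
    name: str
    conditions: dict[str, Callable[[Any], bool]]  # fact key -> predicate
    message: str


def synthesize(findings: list[Finding], rules: list[InteractionRule]) -> list[dict]:
    # Later findings overwrite earlier ones with the same key,
    # a crude stand-in for "Document B modifies Document A".
    facts: dict[str, Finding] = {}
    for f in findings:
        facts[f.key] = f

    flags = []
    for rule in rules:
        # Collect the findings that satisfy each condition of the rule.
        matched = [
            facts[key]
            for key, predicate in rule.conditions.items()
            if key in facts and predicate(facts[key].value)
        ]
        # The rule fires only when every condition is met across documents.
        if len(matched) == len(rule.conditions):
            flags.append({
                "rule": rule.name,
                "message": rule.message,
                "sources": sorted({f.doc_id for f in matched}),
            })
    return flags


# Hypothetical extracted facts, one per document.
findings = [
    Finding("survey.pdf", "in_flood_zone", True),
    Finding("assessment.pdf", "mechanicals_in_basement", True),
    Finding("insurance.pdf", "has_flood_coverage", False),
]

rules = [
    InteractionRule(
        name="uninsured_flood_exposure",
        conditions={
            "in_flood_zone": lambda v: v is True,
            "mechanicals_in_basement": lambda v: v is True,
            "has_flood_coverage": lambda v: v is False,
        },
        message="Critical equipment is exposed to flood risk with no coverage.",
    )
]

flags = synthesize(findings, rules)
```

None of the three documents flags the exposure on its own; the flag appears only when the facts are read together, and it carries the source documents so a reviewer can trace it back.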
Why Volume Changes the Problem
At low document volumes, a professional can hold the synthesis in working memory. Read the documents, take notes, build the picture.
At high volumes, this breaks down. The professional can no longer read everything with full attention. They have to triage, sampling the documents and hoping the important signals surface. This is where things get missed.
The standard solution is to add staff: more reviewers, more time, more cost. AI offers a different possibility: complete coverage of the full document set, with synthesis performed at the level of judgment the professional would apply given unlimited time.
This is the product that doesn’t exist yet in most high-volume professional domains. Not a document reader — those exist. Not a summarizer — those exist. Something that applies domain-specific judgment across the full data room and produces output that a decision-maker can act on.
The Hardest Part
The hardest part of building this product isn’t the AI. It’s encoding the judgment.
What questions should be answered? What signals are important in this domain? What’s the difference between a finding that requires flagging and one that’s routine? How do findings from different document types interact?
These questions have answers. Professionals answer them constantly in their work. The answers are implicit in the professional’s training, experience, and institutional knowledge. The product that encodes those answers explicitly — as structured prompts, evaluation criteria, synthesis frameworks — is the product that closes the gap between AI reading and AI judgment.
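As a rough illustration of what "encoding the answers explicitly" might look like, the sketch below captures one review question as structured data and renders it into a prompt for a single document. The `ReviewQuestion` fields, the `build_prompt` helper, the document-type labels, and the example criteria are all hypothetical; in practice the criteria would come from the professionals who do the work, not from the engineer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReviewQuestion:
    id: str
    text: str                   # the implicit question, made explicit
    doc_types: tuple[str, ...]  # document types the question applies to
    flag_when: str              # criterion for a finding that needs flagging
    routine_when: str           # criterion for a finding that is routine


def build_prompt(q: ReviewQuestion, doc_type: str, doc_text: str) -> Optional[str]:
    # Skip documents the question does not apply to.
    if doc_type not in q.doc_types:
        return None
    return "\n".join([
        f"Question ({q.id}): {q.text}",
        f"Answer FLAG if: {q.flag_when}",
        f"Answer ROUTINE if: {q.routine_when}",
        "Quote the passage that supports your answer.",
        "",
        "Document:",
        doc_text,
    ])


# Hypothetical question; the wording is illustrative, not a professional standard.
question = ReviewQuestion(
    id="ENV-01",
    text="Does the report identify contamination the buyer would assume?",
    doc_types=("environmental_report",),
    flag_when="Contamination is identified and the deal involves assumed liabilities.",
    routine_when="No contamination is identified, or liabilities stay with the seller.",
)

prompt = build_prompt(question, "environmental_report", "Sample report text.")
skipped = build_prompt(question, "lease", "Sample lease text.")
```

The code is trivial by design. The value sits in the contents of `flag_when` and `routine_when`, which is exactly the judgment that has to be drawn out of the professionals.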
That encoding work is not glamorous. It requires spending time with the professionals who do the work, understanding the workflow from the inside, and translating expertise into structure. It’s consulting as much as it is engineering.
But it’s also the moat. A general-purpose AI can read the documents. Only a domain-specific product can synthesize them the way a professional would.