The Pursuit of Context Engineering

Strategic Machines

Changing the Subject

Last July, we published Enabling the Agent with Context Engineering — a foundational piece on why context, not prompting, is the true lever of AI agent performance. The response was strong. The idea landed. Context is king.

A lot has happened since.

The past year produced a flood of research, prototypes, and hard-won production experience that has sharpened our understanding considerably. We now know more about what context should contain, how it should be structured, and — critically — why most agents still fail without it. This post is the update you asked for.

The Wall Everyone Hit

Here's a confession from the industry: most data and analytics agents deployed in 2024 and 2025 didn't work. MIT's State of AI in Business 2025 report put it plainly — AI deployments fail due to "brittle workflows, lack of contextual learning, and misalignment with day-to-day operations."

That's a polite way of saying agents were starving for context.

Companies built agentic workflows on top of their existing data stacks and discovered a hard truth: even the cleanest data warehouse can't answer "what was revenue growth last quarter?" reliably when the underlying definitions, exceptions, and business rules live in Slack threads, spreadsheets, and people's heads. The data existed. The context did not.

"Data and analytics agents are essentially useless without the right context — they aren't able to tease apart vague questions, decipher business definitions, and reason across disparate data effectively." — Jason Cui & Jennifer Li, a16z

The Great Debate: Systems of Record vs. Agents

This failure sparked a productive argument among some of the sharpest minds in enterprise software. Jamin Ball's essay "Long Live Systems of Record" pushed back hard on the "agents kill everything" narrative. His argument: agents don't replace Salesforce, Workday, or SAP. They raise the bar for what a good system of record looks like.

We agree — but Foundation Capital's Jaya Gupta and Ashu Garg go further, and we think they're right:

"Agents don't just need rules. They need access to the decision traces that show how rules were applied in the past, where exceptions were granted, how conflicts were resolved, who approved what, and which precedents actually govern reality."

This is the insight that changes everything. Your system of record captures what happened. It almost never captures why it was allowed to happen. The gap between those two things is where agents break — and where the next generation of infrastructure is being built.

The Two Clocks Problem

Think of every system in your organization as running two clocks simultaneously.

The state clock tracks what's true right now. This is the world that databases, CRMs, and ERPs were built to serve. Fast, queryable, current. A product price, an open opportunity, a headcount number.

The event clock tracks what happened, in what order, with what reasoning. This clock barely exists in most enterprises. Events are ephemeral — they fire and disappear. The reasoning that connected observation to action? It was never treated as data. It lived in meetings that weren't recorded and email threads that were never indexed.

We've built trillion-dollar infrastructure for the state clock. The event clock has been almost entirely ignored.

Context engineering, done properly, is the project of reconstructing the event clock — and keeping it current.
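The two clocks can be sketched directly in code. This is a minimal, illustrative model (the names and schema are ours, not a prescribed design): the state clock is a mutable current-state table, while the event clock is an append-only log whose entries carry the reasoning as data rather than letting it evaporate.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# State clock: mutable, current truth only. Overwrites erase history.
state = {"discount": 0.0}

# Event clock: append-only, ordered, with reasoning attached as data.
@dataclass
class Event:
    timestamp: datetime
    actor: str
    change: dict
    reasoning: str  # the part most systems of record never capture

event_log: list[Event] = []

def apply_change(change: dict, actor: str, reasoning: str) -> None:
    """Update current state AND record why the change was allowed."""
    event_log.append(Event(datetime.now(timezone.utc), actor, change, reasoning))
    state.update(change)

apply_change(
    {"discount": 0.20},
    actor="renewal-agent",
    reasoning="Three SEV-1 incidents this quarter; VP approved the exception.",
)

# The state clock answers "what is true now"; the event clock answers "why".
assert state["discount"] == 0.20
assert "SEV-1" in event_log[-1].reasoning
```

The state table alone would show only `discount: 0.20`. Replaying the event log recovers the order, the actor, and the justification — exactly the record the event clock exists to keep.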

From CRM to CRCG: A Concrete Example

Abstract enough? Here's what this looks like in practice.

Imagine a VP of Sales trying to understand why POCs aren't converting. The traditional approach: add fields to the CRM. POC Start Date. POC End Date. Success Criteria. The AI fills them in from call transcripts. By the third meeting, the "Success Criteria" field has been overwritten twice and contains a one-liner. The state is there. The reasoning is gone.

Now consider a Context Graph approach. In the first meeting, Jim (a salesperson at the prospect company) complains about two things: prospecting time and CRM updates. The agent doesn't just log these — it checks them against your product's actual capabilities. Prospecting? You don't do that. Flag it. CRM updates? Strong match. Surface a case study.

In the second meeting, Michael (the manager) reveals the company is targeting an IPO in two years and needs forecasting accuracy to jump from 73% to 90%. The agent links this to your core capability, creates a Success Metric node, and — because Michael is the decision-maker — weights his priority above Jim's.

Now ask the system: What defines success for this deal?

The naive CRM says: "Prospecting, CRM updates, forecasting." A soup of keywords.

The context graph says: "The decision-maker (Michael) prioritizes forecasting accuracy to enable an IPO — this aligns with our core capabilities and is the key success criterion. Prospecting is a mismatch. CRM updates are secondary given Jim's limited influence."

Same data. Completely different intelligence. That difference is context engineering.
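The POC example can be reduced to a small sketch. Everything here is hypothetical — the node names, weights, and capability set are illustrative, not a real product schema — but it shows the mechanism: signals are linked to the stakeholders who raised them, scored by that stakeholder's influence, and checked against actual capabilities.

```python
# What our (hypothetical) product actually does.
CAPABILITIES = {"crm_updates", "forecasting"}

# Stakeholder nodes, weighted by decision-making influence.
stakeholders = {
    "Jim": {"role": "salesperson", "weight": 0.3},
    "Michael": {"role": "decision-maker", "weight": 1.0},
}

# Signal nodes, each linked back to the stakeholder who raised it.
signals = [
    {"topic": "prospecting", "raised_by": "Jim"},
    {"topic": "crm_updates", "raised_by": "Jim"},
    {"topic": "forecasting", "raised_by": "Michael",
     "success_metric": "forecast accuracy 73% -> 90% ahead of IPO"},
]

def rank_success_criteria():
    """Score each signal by stakeholder weight; zero out capability mismatches."""
    ranked = []
    for s in signals:
        score = stakeholders[s["raised_by"]]["weight"]
        matches = s["topic"] in CAPABILITIES
        ranked.append((s["topic"], score if matches else 0.0, matches))
    return sorted(ranked, key=lambda r: r[1], reverse=True)

ranking = rank_success_criteria()
# Forecasting (decision-maker, capability match) outranks everything;
# prospecting scores zero because it is a capability mismatch.
```

The naive CRM stores the three topics as equals. The graph stores who raised what and what it connects to — which is why the same data yields a ranked, explainable answer instead of a keyword soup.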

Decision Traces: The Asset Nobody Knew They Had

Aparna Dhinakaran of Arize AI frames this with precision: agent traces are not ephemeral telemetry. They are durable business artifacts.

"Business reasoning itself becomes a first-class asset."

When a renewal agent proposes a 20% discount, routes it for approval, and gets it — the CRM records "20% discount." The context graph records everything else: the three SEV-1 incidents that justified the exception, the VP who approved it, the prior precedent it was modeled on. The former is a fact. The latter is understanding.

Organizations that instrument their agent orchestration layer to emit decision traces on every run accumulate something that most enterprises have never had: a structured, replayable history of how context became action. Gupta and Garg call this accumulation a context graph — not a static database, but a living record of decisions stitched across entities and time.

The technical underpinning matters here too. Early assumptions favored vector databases for agent memory. Emerging evidence suggests file-centric architectures with structural embeddings — ones that encode role and relationship rather than mere semantic similarity — are pulling ahead. The shape of a decision matters as much as its meaning.
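One way to read "structural embeddings" — purely our illustration, not any vendor's actual implementation — is a vector that appends a record's role and connectivity to its semantic content, so two records with identical wording but different structural positions embed differently:

```python
import math

def semantic_embed(text: str, dim: int = 8) -> list[float]:
    """Toy stand-in for a real semantic embedding model."""
    vec = [float(hash((text, i)) % 1000) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# Hypothetical role vocabulary for decision-trace records.
ROLES = ["decision", "exception", "approval", "precedent"]

def structural_embed(text: str, role: str, n_relations: int) -> list[float]:
    """Encode what a record *is* and how connected it is, not just
    what it says: semantic vector + one-hot role + degree feature."""
    role_onehot = [1.0 if r == role else 0.0 for r in ROLES]
    return semantic_embed(text) + role_onehot + [math.log1p(n_relations)]

v = structural_embed("20% discount approved", role="approval", n_relations=4)
# The structural portion (role + degree) distinguishes an approval
# from a precedent even when their text is identical.
```

Under this reading, retrieval over such vectors can match on the shape of a decision — its role and its place in the graph — not merely its semantic similarity.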

World Models: The Horizon

Where does all this lead? To something AI researchers call world models — learned, compressed representations of how an environment behaves. Not just what's true now, but what happens when you act.

An agent with a rich context graph doesn't just retrieve past decisions — it can simulate them. Given what we know about how this customer segment behaves, what's the likely response to this pricing change? The context graph becomes a simulator. The quality of your agent's reasoning scales not because the model improved, but because the world model it reasons over expanded.

This is the horizon. And it's closer than most enterprises realize.

Where We Are

The progression is clear:

  1. LLMs — answer from training data
  2. RAG — retrieve text chunks, stuff the prompt
  3. GraphRAG — navigate relationships, richer retrieval
  4. Ontology RAG — structured precision, controlled recall
  5. Context Graphs — decision traces, dynamic world models, true organizational memory

Most organizations are somewhere between steps 2 and 3. The gap to step 5 is not primarily a technology gap — it's an architecture and discipline gap. Context engineering is the bridge.

What Strategic Machines Is Doing About It

We built our agent deployments on the principle that context enables and constrains — and that both matter equally. Our clients benefit not just from agents that know what to do, but from agents that know what not to do, and can explain why they acted as they did.

We are actively helping clients architect context layers: designing decision trace schemas, building dynamic context pipelines, and connecting agent outputs back into the business intelligence layer where they belong.

If your agents are hitting the wall — if the answers feel right but the reasoning feels hollow — context engineering is where the work is.

Let's talk.

SOURCES AND REFERENCES

Enabling the Agent with Context Engineering - Strategic Machines

AI's Trillion-Dollar Opportunity: Context Graphs - Foundation Capital

How Context Graphs Turn Agent Traces Into Durable Business Assets - Arize AI

Long Live Systems of Record - Jamin Ball, Clouded Judgement

RIP to RPA: The Rise of Intelligent Automation - a16z

MIT State of AI in Business 2025