
Context Changes Everything: Two Living Examples of What AI Becomes When Intelligence Is Integrated

By Josh Shepherd · 16 min read

In July 2025, MIT's Project NANDA published The GenAI Divide: State of AI in Business 2025 — a study of 52 executive interviews, 153 leader surveys, and 300 public AI deployments. The headline number is now well known: 95% of enterprise generative AI pilots returned no measurable P&L impact, despite $30–40 billion in collective spend.

The reflexive read of that number is that AI is overhyped. The more careful read — and the one the report itself pushes — is that the failure is not in the models. It is in what surrounds them.

Generic AI tools, the report observes, are remarkably useful for individuals because of their flexibility, but they stall in enterprise contexts because they "don't learn from or adapt to workflows." Most deployed systems do not retain feedback, do not adapt to context, and do not improve over time. Meanwhile, 90% of workers quietly use personal AI tools to do their jobs, while only 40% of companies have official subscriptions. The shadow AI economy is thriving. The official AI strategy is failing. Same model. Different context.

That gap between the shadow and the strategy is the subject of this article. The 95% are not failing because the model is wrong. They are failing because the model has nothing to stand on. There is no integrated body of intelligence behind the prompt — no voice, no corpus, no relationships, no operational memory — so the model does what it does best in a vacuum: produce confident, generic, average output.

This article works through what changes when that vacuum is filled. It does so through two living examples — a movement leader and a nonprofit — and shows, side by side, what AI produces under three conditions:

  1. A naked prompt — the way most people use ChatGPT today.
  2. A context-laden prompt — what a thoughtful operator can do by hand.
  3. A full agentic context with file search — what becomes possible when intelligence has been integrated into a real system: voice, corpus, data, relationships, and workflows.

The first two are familiar. The third is what the 5% are doing.


Part 1: Why Context Is the Real Variable

Before the examples, a precise framing of the problem.

Large language models are, to a first approximation, completion engines. They take what they are given — a prompt, a system message, retrieved documents, tool results — and produce the most plausible continuation. If the input is a thin question, they produce a thin answer drawn from the average of the internet. If the input is rich and grounded, they produce something specific, defensible, and useful.

This means the gap between a useless AI deployment and a transformative one is almost never the choice of model. It is the surface area of the context provided to that model. The MIT report calls this the "GenAI Divide" — the chasm between organizations that have wrapped AI in workflow, memory, and grounded data versus those that have not.

For knowledge organizations — leaders building bodies of work, nonprofits building donor and program intelligence, churches building disciple-making systems — this divide maps almost perfectly onto the integration problem. You cannot give AI what your organization has not yet gathered. You cannot ground a model in a corpus that exists only as PDFs in a folder, transcripts on a podcast feed, and a CRM that does not talk to anything else.

Said simply: AI exposes the integration debt that fragmentation has been hiding. When an organization moves from fragmented to integrated, the same model produces dramatically different output — not because the model has changed, but because the surface area of grounded context has expanded by orders of magnitude.
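To make the mechanics concrete, here is a minimal sketch of the point in code. Nothing vendor-specific is assumed; `build_prompt` and the passage strings are illustrative stand-ins for whatever retrieval layer an organization actually runs.

```python
# Minimal sketch: the model is constant; the input surface area is the variable.
# All names here are illustrative; no specific vendor API is assumed.

def build_prompt(question: str, context_passages: list[str]) -> str:
    """Assemble the text a completion engine actually sees."""
    if not context_passages:
        # Naked prompt: the model completes from the average of its training data.
        return question
    # Grounded prompt: retrieved passages narrow the plausible continuations.
    grounded = "\n\n".join(f"[Source {i+1}] {p}" for i, p in enumerate(context_passages))
    return (
        "Answer using only the sources below. Cite sources by number.\n\n"
        f"{grounded}\n\nQuestion: {question}"
    )

naked = build_prompt("Explain APEST in a church plant.", [])
grounded = build_prompt(
    "Explain APEST in a church plant.",
    ["APEST names five functions latent in the body, not five personality types."],
)
print(len(naked), len(grounded))  # same question, very different input surface area
```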

The two examples below make this concrete.


Part 2: The Movement Leader

Consider a senior missiologist with thirty years of writing, teaching, and movement-building behind him. Fifteen books. Frameworks like mDNA, APEST, and metanoia that have shaped a generation of practitioners. A Jewish-South African background that informs nearly everything he writes about marginality and the prophetic edge. Two organizations he founded — one a training network, one a publishing and movement house. Hundreds of talks. A voice that is unmistakable to anyone who has read him: gravitas without bombast, theological density carried by accessible image, a recurring move from biblical text to systems thinking to embodied practice.

Now consider three different attempts to use AI to extend his work.

Use Case A: A reader trying to understand a framework

A pastor in Indonesia is trying to apply APEST in his church plant. He has read one of the books but is fuzzy on how the apostolic and prophetic functions actually interact in a young community. He asks AI.

Naked prompt to ChatGPT:

"Explain APEST and how the apostolic and prophetic gifts work together in a new church plant."

The output is a competent but flattened summary. It correctly names the five gifts from Ephesians 4. It offers generic application notes that could have been written by anyone who skimmed a Wikipedia page. It does not distinguish the leader's particular reading of APEST from other treatments. It does not surface his crucial argument that APEST is a latent intelligence in the body, not a personality assessment. It conflates apostolic with "church planter" — a reduction the leader has spent two decades pushing back against. The pastor walks away with a malformed mental model that will quietly distort his decisions for years.

Context-laden prompt to ChatGPT:

The pastor pastes in three pages from one of the books and asks the same question. The output improves measurably. It now uses the leader's actual phrasing in places. It picks up the key distinction between gift and function. But because the pasted excerpt is partial, the answer still misses the argument's spine — the leader's claim that the prophetic function exists primarily to protect the apostolic from drift, which only becomes clear when you see how the argument unfolds across two different books written eleven years apart. The answer is better. It is still not faithful.

Full agentic context with file search:

Now imagine the same question routed to an assistant that has been built around the leader's integrated corpus: every book, every published article, transcripts of every public talk, the leader's own voice and style guide, a graph of how concepts relate across works, and a clear separation between his canonical positions and his more exploratory thinking. The assistant performs file search across the corpus, retrieves the most relevant passages from three different books and one talk, weights them by canonicity, and produces an answer that:

  • States the leader's actual definition of APEST, in language that matches his published voice.
  • Distinguishes apostolic-as-function from apostolic-as-personality, citing the specific chapter where he makes the distinction.
  • Surfaces the apostolic/prophetic relationship as he frames it — the prophetic as the conscience that keeps apostolic mission from becoming colonial expansion.
  • Translates the framework into the pastor's specific Indonesian church-plant context using the leader's own incarnational logic.
  • Cites its sources so the pastor can read further.

The pastor walks away formed, not just informed. He has, in effect, had a tutorial with the leader — at three in the morning, in a language and context the leader has never personally encountered. The leader's voice scaled without distortion.
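For readers who want the mechanics, here is a minimal sketch of the canonicity-weighted retrieval just described. The `Passage` schema, the keyword-overlap relevance function, and the sample corpus are all hypothetical stand-ins; a production system would use embedding similarity, but the weighting logic is the point.

```python
# Hypothetical sketch: score passages by relevance, weight by canonicity,
# return the top-k with citations so the answer can point back to sources.
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    source: str          # e.g. a book chapter or talk transcript
    canonicity: float    # 1.0 = published canon, lower = exploratory drafts

def relevance(query: str, passage: Passage) -> float:
    """Stand-in for an embedding similarity score (0..1)."""
    q = set(query.lower().split())
    p = set(passage.text.lower().split())
    return len(q & p) / max(len(q), 1)

def retrieve(query: str, corpus: list[Passage], k: int = 3) -> list[Passage]:
    scored = sorted(
        corpus,
        # at equal relevance, canonical positions outrank exploratory drafts
        key=lambda p: relevance(query, p) * p.canonicity,
        reverse=True,
    )
    return scored[:k]

corpus = [
    Passage("APEST is a latent intelligence in the body, not a personality assessment.",
            "Book A, ch. 3", 1.0),
    Passage("The prophetic function protects the apostolic from drift.",
            "Book B, ch. 7", 1.0),
    Passage("Early musing on fivefold typologies.", "Drafts folder", 0.4),
]
for p in retrieve("how do apostolic and prophetic work together", corpus):
    print(f"{p.source}: {p.text}")
```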

Use Case B: The leader writing a new piece

The leader himself sits down to draft a chapter on movement renewal in post-Christendom contexts. He has the spine of an argument but needs to weave together threads from three previous books, a recent conversation with a colleague, and an unpublished essay sitting in his drafts folder.

Naked prompt: Useless. The model has no idea what he has already said, what he has retracted, what he is currently testing. Anything it produces is either generic missiology or a hallucinated version of his thinking. He cannot use a single sentence.

Context-laden prompt: Better. He pastes in the relevant chapters and the unpublished essay. The model can now produce a passable draft in something approximating his voice. But it cannot reach the colleague's conversation, cannot pull from the talk he gave last spring that crystallized a key idea, and cannot remember the editorial decisions he made with his publisher about which terminology to retire. The draft requires significant rewriting. The leader does most of the actual thinking, with AI as a faster typist.

Full agentic context: The assistant has the entire corpus, including the unpublished drafts and the talk transcripts. It knows which terms the leader has retired (and substitutes the current ones), surfaces a passage from the colleague's recent interview that maps onto the argument, and proposes three structurally different drafts — each grounded in actual source passages with citations. The leader chooses one, edits, and produces a chapter in a fraction of the time it would have taken without the system. The voice is his. The thinking is his. The system did not replace his judgment; it gave his judgment a much larger working memory.

Use Case C: An emerging leader building on the work

A thirty-five-year-old church planter in Kenya wants to teach the leader's frameworks to her cohort, but adapt them for an East African oral-learning context.

Naked prompt: Generates a generic PowerPoint that distorts the frameworks and contains theological errors the leader would never make.

Context-laden prompt: Produces something more accurate, but the planter has no way to know what is faithful and what is invented. She has to verify every line against the books she has not had time to read.

Full agentic context with file search: Generates a culturally adapted curriculum that explicitly draws on the leader's argument that frameworks must be re-incarnated in each context (citing chapter and verse from the actual corpus), maintains theological fidelity, surfaces specific passages where the leader has discussed African church contexts, and flags the few areas where the leader has not written and the planter is therefore in genuinely original territory. The curriculum is hers. The grounding is his. The next generation inherits the work intact rather than as folklore.


What Becomes Possible When the Leader's Intelligence Is Integrated

Walk a leader through these three tiers and the use cases proliferate. Once the integration work is done, the same grounded context unlocks all of the following at near-zero marginal effort:

  • Faithful Q&A at scale. Readers, students, and practitioners can interrogate the work in their own language and context, with the leader's actual answers.
  • Drafting partner. The leader writes faster, with full recall of his own corpus and a working memory larger than his own.
  • Editorial guardrails. The system flags drift from established positions, terminology changes, and internal contradictions before publication.
  • Curriculum generation. Cohorts, courses, and reading pathways can be assembled from the corpus with proper sequence and citation.
  • Translation with fidelity. Material is translated into other languages and oral cultures without losing the argument's spine.
  • Succession. The leader's frameworks remain accessible, queryable, and applicable long after his speaking calendar slows.
  • Network leverage. Endorsers, collaborators, and movement allies can build on the work with shared grounding rather than improvised recollection.

None of these are possible without integration. All of them are inevitable with it.


Part 3: The Nonprofit (Anonymized)

Now consider a midsize, mission-driven nonprofit. Call it Riverbend. Sixty staff. A donor base of several thousand. A program portfolio that spans direct service, leadership formation, and field partnerships across multiple regions. An institutional history of about forty years and a deep bench of stories, frameworks, and operational knowledge — most of it living in three places simultaneously: a CRM, a shared drive, and the heads of about a dozen long-tenured staff.

Riverbend has, over the past two years, done what most thoughtful nonprofits have done. They built a donor dashboard. They added an AI assistant to it. They have a dashboard view, a pipeline view, a connections graph, a prospect profile, an insider directory, and an upload tool for ingesting outside lists. The system is real. It is in production. The AI panel sits in the corner of every page.

The question is what that AI panel can actually do — and the answer depends entirely on how much of Riverbend's intelligence has been integrated into its working context.

Use Case A: Donor research before a major-gift conversation

A development officer has a meeting tomorrow with a prospective major donor. She has thirty minutes to prepare.

Naked prompt to ChatGPT:

"Help me prepare for a meeting with [Donor Name], a potential major donor for our nonprofit."

The output is generic meeting-prep advice. Open with rapport. Listen more than you talk. Have a clear ask. None of it is wrong; none of it is useful. The model has never heard of this donor, does not know which of Riverbend's programs would resonate with her giving history, and cannot tell whether she is connected to anyone already in the network. The development officer ignores the panel and goes back to googling.

Context-laden prompt:

The development officer pastes in the donor's public bio, a few notes from the CRM, and a paragraph about the program area she is hoping to match. The output is meaningfully better. The model now offers donor-specific talking points and a draft agenda. But it does not know that this donor sat on a board with Riverbend's chair fifteen years ago, does not know that her previous gift to a peer organization was conditioned on field-based reporting, and does not know that one of Riverbend's program directors went to graduate school with her brother. The prep is competent. It is not strategic.

Full agentic context with file search:

Now imagine the same panel grounded in Riverbend's full integrated context: the CRM data, the connections graph that links donors to insiders to staff, the prospect-research history, the program inventory written in Riverbend's actual voice, the institutional story library, and the policy memos that govern how Riverbend handles restricted gifts.

The development officer asks the panel to prepare for the meeting. The agent, in a single composed response:

  • Pulls the donor's record and giving history from the CRM.
  • Surfaces three warm connections through the insider directory (including the board overlap with Riverbend's chair fifteen years ago and the program director's grad-school link).
  • Cross-references the donor's known philanthropic interests with Riverbend's current program portfolio and recommends the two programs most likely to resonate.
  • Generates three distinct conversational openings, each grounded in real Riverbend stories from the story library.
  • Drafts a one-page brief in Riverbend's voice, including a proposed ask range calibrated against her giving history at peer organizations.
  • Flags one risk: Riverbend has had a prior relationship with this donor's family foundation that ended awkwardly, per a memo in the institutional archive. The agent surfaces the memo so the development officer is not blindsided.

What changed is not the model. What changed is the surface area of grounded context. The development officer walks into the meeting with the institution behind her, not a search box.
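A rough sketch of what that composed response looks like as a query pattern, with toy dictionaries standing in for the CRM, connections graph, story library, and institutional archive. Every identifier below is invented for illustration:

```python
# Illustrative sketch: one request fans out across integrated stores and
# composes a single brief. The data and schema are hypothetical stand-ins.

CRM = {"d-417": {"interests": ["field programs"], "lifetime_giving": 250_000}}
GRAPH = {"d-417": ["board overlap with chair (15 yrs ago)",
                   "program director: grad-school tie"]}
STORIES = {"field programs": ["Story: the 2018 field partnership turnaround"]}
ARCHIVE = {"d-417": ["Memo: prior family-foundation relationship ended awkwardly"]}

def prepare_brief(donor_id: str) -> dict:
    record = CRM[donor_id]
    return {
        "record": record,
        "warm_paths": GRAPH.get(donor_id, []),      # who can open the door
        "openings": [s for i in record["interests"] for s in STORIES.get(i, [])],
        "risk_flags": ARCHIVE.get(donor_id, []),    # surfaced, not stumbled into
    }

print(prepare_brief("d-417"))
```

The design point is not the toy data; it is that each field of the brief comes from a different store, and only integration makes them reachable in one pass.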

Use Case B: A program director writing a board memo

A program director has to draft a board memo on a strategic shift in one of the field programs.

Naked prompt: Produces a generic memo template. Useless.

Context-laden prompt: Produces a more relevant draft, but in a voice that is almost-but-not-quite Riverbend. It uses words Riverbend does not use ("stakeholders," "synergies"). It frames the decision in business terms when Riverbend's board has, for forty years, framed strategic shifts in mission terms. The director rewrites most of it.

Full agentic context: The agent has Riverbend's full memo archive, board-meeting minutes from the past five years, the language of the mission and theory of change, and the specific rhetorical patterns that Riverbend's board responds to. The draft uses Riverbend's actual voice, references the relevant precedents, anticipates the two questions the chair always asks, and includes a recommended decision framework that mirrors how the board has made similar decisions before. The director edits at the level of substance, not voice.

Use Case C: A new staff member trying to understand a program

A new program associate, three weeks into the job, wants to understand the history and theory behind one of Riverbend's signature programs.

Naked prompt: Generates a competent-sounding but invented history. Hallucinates dates. Misattributes the program's origin. The associate, not knowing better, internalizes errors.

Context-laden prompt: With access to the program one-pager, produces a more accurate but shallow summary. Cannot answer follow-up questions about why the program shifted methodologies in 2019 or why a particular partner was let go in 2021.

Full agentic context: The agent draws on the full program history archive, internal evaluation reports, board memos, and staff retrospectives. The associate can interrogate the program's history at depth, ask why decisions were made, see the actual data behind methodology shifts, and understand the program as it is currently being lived rather than as a static artifact. Three weeks of onboarding compress into an afternoon. Institutional memory becomes something a new hire can actually access.

Use Case D: Surfacing risks the leadership team has missed

A development director, late on a Friday, asks the agent a deliberately open question:

"Are there donor relationships at risk that we are not actively managing?"

Naked prompt: Cannot answer. Has no data.

Context-laden prompt: Can offer generic frameworks for donor attrition. Useless.

Full agentic context: The agent runs across the CRM, connections graph, and recent communications, identifies seven previously high-engagement donors whose touchpoints have dropped below historical baselines, cross-references against any known life events or news, and produces a ranked watch list with suggested next actions. No single dashboard could have surfaced these signals. They emerged from the model's ability to query across an integrated context.
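A minimal sketch of the underlying watch-list logic, assuming nothing more exotic than per-donor touchpoint counts; the data, threshold, and ranking below are illustrative:

```python
# Hypothetical sketch: flag donors whose latest quarter falls well below
# their own historical baseline, then rank by the size of the drop.
from statistics import mean

# touchpoints per quarter, oldest -> newest (illustrative data)
touchpoints = {
    "donor_a": [6, 7, 5, 6, 1],
    "donor_b": [2, 3, 2, 3, 3],
    "donor_c": [8, 9, 7, 8, 2],
}

def at_risk(history: list[int], drop_ratio: float = 0.5) -> bool:
    baseline = mean(history[:-1])        # everything before the latest quarter
    return history[-1] < baseline * drop_ratio

watch_list = sorted(
    (d for d, h in touchpoints.items() if at_risk(h)),
    key=lambda d: touchpoints[d][-1] - mean(touchpoints[d][:-1]),  # biggest drop first
)
print(watch_list)  # ['donor_c', 'donor_a'] — donor_b is stable
```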

This is the use case the MIT report repeatedly emphasizes: AI's biggest measurable returns come from back-office workflows where integrated context lets the system see what humans would otherwise miss. The 5% of pilots that work tend to look like this.


What Becomes Possible When Riverbend's Intelligence Is Integrated

Once the integration work is done, the use cases compound:

  • Major-gift preparation in minutes, not hours, with the full institution behind every brief.
  • Drafting partner for memos, proposals, and grant reports, in Riverbend's actual voice.
  • Onboarding compressed from quarters to weeks, because institutional memory is queryable.
  • Risk surfacing across donor, program, and operational data, because the agent can see across silos.
  • Faithful external communication, because the same grounded context that informs internal work also informs the website, the newsletter, and the campaign.
  • Defensible AI, because every output can cite its source in the corpus. Hallucination drops sharply when the model retrieves before it generates.
  • Continuity through staff turnover, because the institution's intelligence no longer walks out the door with departing employees.

Each of these is a use case that fails under a naked prompt, partially succeeds under a context-laden prompt, and becomes durable only under a fully integrated agentic context.


Part 4: Why the 95% Are Failing

Return to the MIT finding. The report names several patterns in the failed pilots. Each one is a context problem in disguise.

Pilots that don't learn from workflows. A model deployed without persistent memory or feedback loops is permanently a tourist. It cannot improve because it has no integrated record of what worked, what didn't, and why. The 5% that succeed bake feedback into the system.
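What "baking feedback into the system" can mean in practice is mundane: persist each interaction's outcome so future retrieval can prefer what actually worked. A minimal sketch, with a hypothetical JSONL log standing in for the memory layer:

```python
# Illustrative sketch: record outcomes so the system is no longer a tourist.
# The log file and schema are hypothetical, not any particular product's API.
from pathlib import Path
import json, time

LOG = Path("feedback.jsonl")

def record_feedback(query: str, answer_id: str, helpful: bool) -> None:
    with LOG.open("a") as f:
        f.write(json.dumps({"ts": time.time(), "query": query,
                            "answer_id": answer_id, "helpful": helpful}) + "\n")

def success_rate(answer_id: str) -> float:
    """Feeds back into ranking: answers that worked get preferred next time."""
    if not LOG.exists():
        return 0.0
    rows = [json.loads(line) for line in LOG.open()]
    hits = [r for r in rows if r["answer_id"] == answer_id]
    return sum(r["helpful"] for r in hits) / len(hits) if hits else 0.0
```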

Tools that don't adapt to context. A generic chatbot dropped into a specialized organization is, by definition, undercontextualized. It knows everything in general and nothing in particular. Integration gives the model the particularity that generality cannot.

Internal builds that fail more often than partnerships. The report observes that purchased, specialized tooling succeeds about 67% of the time, while internal builds succeed at one-third that rate. This is not a verdict on internal capability — it is a verdict on integration discipline. Vendors who succeed have already done the hard work of wrapping models in workflow, memory, and grounded data. Internal builds that try to skip that step inherit the 95% failure rate.

Most spending going to sales and marketing while the biggest ROI sits in back-office workflows. This is fragmentation revealed by spend allocation. Sales and marketing are visible; back-office integration is invisible. Organizations spend where they can see, not where the integration lift is highest.

The single common thread: the failed pilots tried to deploy intelligence without first integrating it. They expected the model to compensate for the missing context. Models cannot compensate for absent context. They can only amplify whatever context is — or is not — there.


Part 5: The Sequence That Survives the Divide

There is a sequence underneath all of this. It is the same sequence that determines whether a movement leader's work outlives him and whether a nonprofit's mission scales with integrity:

Fragmentation → Integration → Activation → Formation → Multiplication.

AI is the activation layer. It is the point at which integrated intelligence becomes usable at scale. But it cannot be the integration layer. It cannot do for an organization what the organization has not yet done for itself.

The leaders and organizations on the right side of the GenAI Divide are not the ones with the best prompts. They are the ones who did the integration work first — gathered the corpus, structured the relationships, wrote down the voice, unified the data — and then put a model on top of it.

The leaders and organizations stuck inside the 95% are running models against fragments and wondering why the output disappoints. The model is not the problem. The fragments are.

Both examples in this article — the missiologist and Riverbend — describe the same arc. In both cases, a naked prompt produces something close to useless. A context-laden prompt produces something better but unreliable. A fully integrated agentic context produces something durable, faithful, and genuinely transformative. The distance between those three tiers is not measured in tokens or model weights. It is measured in how much of the organization's intelligence has been gathered, structured, and made retrievable.

This is what we mean when we say context changes everything. The model is the same. The prompt may even be the same. What changes is what the model can reach.

When intelligence has been integrated, AI extends a leader's voice without distorting it, lets a development officer walk into a meeting with the full institution behind her, lets a new staff member inherit forty years of memory in an afternoon, and lets a movement reproduce itself in places its founder will never visit. When intelligence remains fragmented, AI produces the same generic output that 95% of enterprise pilots are quietly producing today — and the leaders and organizations who deployed it learn the hard way that the model was never the bottleneck.

The bottleneck is the integration that was never done.

The 5% are the ones who did it.


Sources

  • MIT Project NANDA, The GenAI Divide: State of AI in Business 2025 (July 2025).
