The Compliance Cost of a Million-Token Context
Anthropic shipped Claude Opus with a 1M-token context window earlier this year, and Google, OpenAI, and the open-weights ecosystem are within a quarter of matching it. The framing everywhere is productivity, and as an engineering matter that framing is mostly correct — dropping a whole codebase, an entire ticket thread, or a full design review into a single request removes a class of prompt-stitching complexity that consumed a lot of agent-engineering time.
The compliance framing is that you just changed three controls without writing it down.
Data minimization stopped meaning what it used to mean
Every regulated environment has some version of SC-28 or a data-minimization clause requiring that production systems only process data necessary for the task at hand. Before 1M context, this control was usually satisfied by the natural economics of prompt assembly — you paid per token, so you pruned aggressively, and the pruning doubled as minimization. Nobody wrote it that way in the SSP, but that was the effect.
A 1M-token context changes the economics. The marginal cost of including a whole file instead of the relevant function is small enough that teams stop pruning. A code-review agent that used to see the diff now sees the diff plus the file plus the surrounding module plus the last ten commits touching it. Every one of those tokens is potentially in scope for whatever data-residency, secrets-detection, or PII-scoping obligation applies to the repository.
You have not relaxed the control. The control still says "only necessary data." You have changed what teams consider necessary, and you have done it without an ECR, without updating the SSP, and almost certainly without telling the compliance team. The first time an assessor asks "how do you enforce data minimization in AI-assisted workflows," you will not have a good answer.
Audit log volume is a budget problem before it is a compliance problem
Regulated environments log AI prompts and completions. This is usually written into the SSP under AU-2 or an AI-RMF-adjacent control — the record of what the model saw and what it produced. When the average prompt was two thousand tokens, the log volume was tractable: an engineering team doing a hundred thousand AI-assisted requests per day produced a few gigabytes of prompt logs. SIEM ingestion and retention economics absorbed it.
A 1M-token prompt is five hundred times larger. The same team doing the same number of requests, with even a fraction of them approaching the new ceiling, and with agent loops logging the full context on every turn, is now producing terabytes per day. The SIEM bill changes category. The retention window quietly shrinks because the platform team cannot keep seven years of what is now petabyte-scale log data at the original price point.
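The scaling is easy to sanity-check. A back-of-envelope sketch, assuming roughly 4 bytes per token and the hundred-thousand-requests-per-day figure from the text (both illustrative, not measurements):

```python
# Back-of-envelope scaling for verbatim prompt logging.
# BYTES_PER_TOKEN and REQUESTS_PER_DAY are illustrative assumptions.

BYTES_PER_TOKEN = 4          # rough average for English text
REQUESTS_PER_DAY = 100_000

def daily_log_gb(avg_prompt_tokens: int, turns_logged: int = 1) -> float:
    """GB/day of raw prompt logs if every turn logs the full context verbatim."""
    return (avg_prompt_tokens * BYTES_PER_TOKEN
            * REQUESTS_PER_DAY * turns_logged) / 1e9

baseline = daily_log_gb(2_000)                        # ~0.8 GB/day at 2k-token prompts
ceiling = daily_log_gb(1_000_000)                     # ~400 GB/day if every prompt hits 1M
agentic = daily_log_gb(1_000_000, turns_logged=10)    # ~4 TB/day for 10-turn agent loops
```

The multi-turn line is the one that moves the number into terabyte territory: an agent that re-sends the full context every turn, with each turn logged, multiplies the per-request volume again.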
Retention shrinkage is a compliance finding. It is not usually discovered until the first assessor asks for prompt records from eight months ago and the team answers that the oldest records on hand are three months back.
The fix is not to cap context at the old ceiling. The fix is to decouple the audit record from the raw prompt — store a hash of the input, a manifest of what was included, and the references used to assemble it, so the record is reconstructable without being megabytes per request. Most teams have not built that pipeline.
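A minimal sketch of such a decoupled audit record, assuming a record schema and field names invented here for illustration (they are not a standard):

```python
# Manifest-based audit record: hash + references instead of the raw prompt.
# The schema below is an illustrative assumption, not an established format.
import hashlib
import json
import time

def audit_record(prompt: str, sources: list[dict]) -> dict:
    """Record what the model saw without storing megabytes per request.

    `sources` lists the references used to assemble the prompt, e.g.
    {"ref": "repo/src/billing.py", "revision": "abc123", "bytes": 48210}.
    The raw content stays in its source system under its own retention
    policy; the hash lets you prove a reconstructed prompt matches what
    was actually sent.
    """
    data = prompt.encode("utf-8")
    return {
        "ts": time.time(),
        "prompt_sha256": hashlib.sha256(data).hexdigest(),
        "prompt_bytes": len(data),
        "manifest": sources,
    }

record = audit_record(
    "def charge(invoice): ...",
    [{"ref": "repo/src/billing.py", "revision": "abc123", "bytes": 26}],
)
print(json.dumps(record, indent=2))
```

The record is a few hundred bytes regardless of context size, so retention math stays where it was when prompts were two thousand tokens.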
Exfiltration surface expands with context, and so does its deniability
An agent with a small context window has a small working surface. If that agent is compromised by a prompt injection, the data an attacker can exfiltrate in one turn is bounded by what is already in context. You do not have to trust the agent with everything; you can trust it incrementally.
With a million tokens of context, the incremental-trust model breaks. Teams default to loading "everything potentially relevant" because the model is good enough to filter, and because the cost of over-inclusion is low. This makes any successful prompt injection a much bigger incident — the attacker has access to whatever was included, and whatever was included is often everything the team considered plausibly useful.
The deniability problem is the quieter consequence. In a small-context regime, an incident involves a small, reviewable set of inputs. In a large-context regime, reconstructing which specific pieces of data were in scope at the moment of the exploit requires the audit pipeline described above — which most teams have not built. "We cannot definitively say what data was exposed" is a worse story to tell than "the following thirty-two records were in scope."
What controls actually look like
Teams handling the transition well are doing four things, all of which belong in an SSP update.
First, context budgeting as a documented control. The SSP specifies a target context size and a justification process for exceeding it — not to throttle capability, but to force the conversation about why this task needs that much.
Second, manifest-based audit logging. Each request logs the references (file paths, ticket IDs, record IDs) rather than the resolved content, with content retained in its source system under its existing retention policy. The audit record is reconstructable; the prompt log is not the system of record.
Third, per-scope context ceilings. A code-review agent operating on a public repository gets a larger ceiling than a billing agent operating on customer records. The ceiling is part of the role definition, not a platform default.
Fourth, an updated threat model for prompt injection that accounts for the expanded blast radius. Red-team exercises are run against the new context size, not the old one.
None of this is exotic. It is the same discipline regulated teams apply to every other control — write it down, measure it, review it. The 1M-token context is a capability change that looks like a configuration change and behaves like a control-environment change. Treat it as the latter.