AI Gateway as a Compliance Control Point
Vercel shipped AI Gateway to general availability in August of last year, and comparable products are in-flight from the other major platforms. The pitch is uniform: one API across OpenAI, Anthropic, Google, Meta, Mistral, and the open-weights ecosystem, with automatic fallback and a single billing surface. The developer-experience case is legitimate and well-documented. The compliance case is not yet being made at the level it deserves, and it is the more consequential one.
A gateway is the only architectural chokepoint a regulated organization reliably has for AI traffic. If you do not control the chokepoint, you do not control the AI estate, and no amount of policy writing will produce evidence an assessor will accept.
Why provider SDKs lose to gateways on compliance
Most organizations started their AI work by wiring individual teams directly to individual provider SDKs. The pattern is organic — the team working on document summarization picks Anthropic, the team working on embeddings picks OpenAI, the team doing structured extraction ends up on a third provider because it tested better for that task. Each team signs its own DPA, configures its own retention settings, negotiates its own data-residency terms, and instruments its own logging.
The compliance failure mode of this pattern is invisible until the first audit. Evidence of the AI controls — what models are in use, which provider is processing which data class, whether zero-data-retention is actually enforced, where logs live, who has accessed them — is spread across seven SDK integrations with seven different observability stories. Each team is confident its own piece is covered. Nobody can produce the whole picture on demand, which is exactly what an assessor asks for.
A gateway collapses this to a single control point. One place logs every request. One place enforces provider selection. One place pins model versions. One place implements the data-residency routing. The control surface becomes a single system with a single audit interface instead of N systems with N audit interfaces. This is the same move compliance-mature orgs made with egress proxies twenty years ago, and the logic is identical.
The controls a gateway can enforce that SDKs cannot
Three controls matter, and all three are far easier to enforce at the gateway than in application code.
Provider-level data-residency routing. Your EU-customer data should not touch a US-only inference endpoint. At the SDK level, this is enforced by asking every team to remember which provider and region to select — a policy statement with no mechanism. At the gateway level, it is a routing rule. The gateway inspects the request context, selects a provider endpoint that satisfies the residency obligation, and rejects the request if no compliant option exists. The rule is testable. The evidence is produced as a byproduct of operation.
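A residency rule of this shape can be sketched in a few lines. This is an illustrative sketch, not any vendor's actual API: the endpoint table, region labels, and function names are all assumptions made for the example.

```typescript
// Hypothetical residency-routing rule. Endpoint entries, region labels,
// and model names are illustrative placeholders.
type Region = "eu" | "us";

interface ProviderEndpoint {
  provider: string;
  region: Region;
  model: string;
}

const endpoints: ProviderEndpoint[] = [
  { provider: "provider-a", region: "eu", model: "summarizer" },
  { provider: "provider-b", region: "us", model: "summarizer" },
  { provider: "provider-b", region: "us", model: "extractor" },
];

// Select an endpoint that satisfies the residency obligation attached
// to the request context; reject the request if no compliant option exists.
function route(dataResidency: Region, model: string): ProviderEndpoint {
  const candidates = endpoints.filter(
    (e) => e.region === dataResidency && e.model === model
  );
  if (candidates.length === 0) {
    throw new Error(`no compliant endpoint for ${model} in ${dataResidency}`);
  }
  return candidates[0];
}
```

The point of the sketch is that the rejection path is part of the rule: a request with no compliant endpoint fails closed, and both outcomes are logged at the same chokepoint.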
Model-version pinning. Regulated teams need change control on the inference layer. When a provider silently updates a model under a stable name, the behavior of your production system changes without a change record. At the SDK level, pinning requires every team to remember to pin and to notice when it has drifted. At the gateway level, pinning is a policy — the gateway rewrites model identifiers to specific snapshots, blocks requests to unpinned models, and produces the change record when a pinned version is advanced.
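The rewrite-and-block policy described above can be sketched as a pin map. The model and snapshot identifiers here are invented for illustration; a real pin map would hold the snapshot names your change-control process has actually approved.

```typescript
// Hypothetical pin map: stable model names rewritten to specific,
// change-controlled snapshots. Identifiers are illustrative.
const pins: Record<string, string> = {
  "chat-large": "chat-large-2025-01-15",
  "embed-small": "embed-small-2024-11-02",
};

// Rewrite a requested model to its pinned snapshot; block requests
// to any model that has no pin on record.
function pinModel(requested: string): string {
  const pinned = pins[requested];
  if (pinned === undefined) {
    throw new Error(`model ${requested} is not pinned; request blocked`);
  }
  return pinned;
}
```

Advancing a pin is then an edit to this map, which is exactly the artifact a change record can attach to.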
Per-class zero-data-retention enforcement. Provider ZDR terms are real contracts with real operational semantics, but they require that the calling code pass the correct headers or opt-in flags. A team that forgets, or a new team that did not know, is not covered by the ZDR contract even though the org signed one. At the gateway level, ZDR is enforced on every outbound request by default, overridable only through an explicit, audited code path.
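Default-on ZDR with an audited override path might look like the following. The header name, override shape, and audit hook are all assumptions for the sketch; actual ZDR opt-in mechanics vary by provider.

```typescript
// Hypothetical default-on ZDR enforcement. The header name and
// override structure are illustrative, not any provider's real API.
interface OutboundRequest {
  headers: Record<string, string>;
  zdrOverride?: { ticket: string }; // explicit, audited exception path
}

function enforceZdr(
  req: OutboundRequest,
  audit: (msg: string) => void
): OutboundRequest {
  if (req.zdrOverride) {
    // The override is allowed, but it always leaves an audit record.
    audit(`ZDR override authorized under ${req.zdrOverride.ticket}`);
    return req;
  }
  // Default path: every outbound request carries the ZDR opt-in,
  // regardless of whether the calling team remembered it.
  return {
    ...req,
    headers: { ...req.headers, "X-Zero-Data-Retention": "true" },
  };
}
```

The inversion is the point: forgetting now produces a compliant request, and only a deliberate, ticketed act produces a non-compliant one.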
What a gateway is not, by itself
A gateway does not solve prompt injection. It does not produce an AI bill of materials for the models it proxies. It does not replace model-card review or red-teaming. It is not a substitute for the controls that live inside the application, where the prompts are built and the completions are used.
What it does is concentrate the perimeter. Before the gateway, the AI perimeter was every piece of application code that could reach a provider API. After the gateway, the perimeter is the gateway configuration and whatever the gateway cannot see. This is a meaningful reduction of attack surface and a meaningful consolidation of audit surface, and regulated orgs should be adopting it on compliance grounds even when the engineering case is lukewarm.
Writing it into the SSP
Teams adopting gateways seriously are updating their system security plans (SSPs) in three places.
The boundary diagram shows the gateway as a distinct component with its own authorization inheritance. Provider endpoints are external dependencies reached only through the gateway, not accessed directly from application compute.
The AU-family controls are rewritten around the gateway as the system of record for AI traffic. The gateway log is the primary audit artifact. Application-side logging is the secondary record and is reconciled against the gateway on a documented cadence.
The SA-family controls on supply chain reference the gateway's model-pinning and provider-selection policy as the mechanism by which third-party model changes are gated. Model advancement is a tracked change; a provider updating a model under a stable name is not an invisible event.
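The reconciliation cadence in the AU-family treatment above implies a concrete check: any request an application logged that the gateway never saw is traffic that bypassed the chokepoint. A minimal sketch, assuming both sides log a shared request ID:

```typescript
// Reconcile application-side request IDs against the gateway log.
// An ID present in an application log but absent from the gateway log
// indicates traffic that reached a provider without passing through
// the control point.
function reconcile(gatewayIds: string[], appIds: string[]): string[] {
  const seenByGateway = new Set(gatewayIds);
  return appIds.filter((id) => !seenByGateway.has(id));
}
```

An empty result on each documented cadence is itself the evidence artifact: it demonstrates, rather than asserts, that the gateway is the complete record of AI traffic.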
None of this is novel. It is the same treatment regulated orgs have given API gateways, egress proxies, and secrets managers. The only thing different about an AI gateway is that the industry is still selling it on developer-experience grounds, and the compliance teams have not yet recognized it as the architectural primitive it actually is. The teams that see this early will not have to retrofit the control into their next authorization. The ones that do not, will.