Your API Key Doesn't Belong in the App

Apple just made Claude a drop-in language model on its platforms. The Claude for Foundation Models Swift package conforms Claude to the Foundation Models framework’s LanguageModel protocol, so you drive it with the same LanguageModelSession API you already use for Apple’s on-device model: respond(to:), streaming, structured output, and tool calling, all unchanged. The quick start is a few lines:

```swift
let model = ClaudeLanguageModel(
  name: .opus4_8,
  auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Plan a 4-day trip to Buenos Aires.")
```

It works on the first run. It is also a trap. That API key is now a string in your shipping binary, and the rule that decides this is not negotiable: a credential that crosses an untrusted boundary is already burned.

A shipped secret is a leaked secret

A mobile app is not a server. It is a file you hand to every user and every attacker, and they get to keep it. Anyone can pull the .ipa, run strings over the binary, decompile it, or sit a proxy in front of the network traffic. Obfuscation raises the cost of extraction; it never prevents it. Encrypting the secret on the device is the same bargain with a longer receipt: the key that decrypts it has to ship too, so the secret only changes hiding places and gains no real defense. If the secret reaches the device, assume it is extractable.

This isn’t theory, and it gets expensive. I pulled the evidence together across two decades of breaches, and the same failure repeats:

Lenovo Superfish, 2015. Every affected laptop shipped the same root CA private key inside the binary, its passphrase recoverable with strings in minutes, which let anyone forge trusted certificates and intercept HTTPS traffic unnoticed. The tab: an FTC settlement plus a 32-state action with a $3.5 million payment and 20 years of security audits.
Symantec, 2022. A scan found 1,859 apps shipping hardcoded AWS credentials; roughly three in four held live tokens. One banking SDK exposed over 300,000 biometric fingerprints.
CloudSEK, 2022. 3,207 apps were leaking Twitter API keys, and 230 of them carried all four OAuth credentials, enough for full account takeover.
Truffle Security, 2026. 2,863 live Google API keys scraped from public sites reached straight into Gemini, billable to whoever owned them. New model, same mistake.

The pattern doesn’t end; it only changes which key is worth stealing.

A token for a translation service should not unlock all of S3, and a key in an app should not be treated as private. Any secret shipped to a client, whether a compiled binary, an APK or IPA, or client-side JavaScript, must be treated as already compromised.

Anthropic says the same thing in its own documentation. The development-only .apiKey mode carries an explicit warning that a bundled key is extractable from the shipping binary, and the production answer is a proxy:

Your proxy receives standard Messages API requests, attaches the x-api-key header, and forwards them to https://api.anthropic.com.

The Claude key is this class of secret. So build that proxy, and make it earn its keep.

The proxy is your boundary: call it a BFF

The fix is structural. Move the secret to a server you control, and give the app nothing but a URL. The app talks to your endpoint; your endpoint talks to Anthropic. The key never leaves your infrastructure.

This pattern has a name: the Backend for Frontend, or BFF. A BFF is a thin backend dedicated to one client. It owns the credentials, shapes and authorizes the calls, and hands the frontend what it needs and nothing it shouldn’t have. For our purposes the job is narrow: hold the Anthropic key, decide who’s allowed to spend it, and proxy the Messages API.

Where you run that BFF matters. A regional origin server puts one city, and the round trip to it, between every user and every token. Cloudflare Workers run the same logic in 300-plus cities, so the credential injection happens a few milliseconds from the caller wherever they are. That is the difference between a proxy bolted onto one data center and a boundary that exists everywhere your users do. A BFF at the edge.

A regional origin sits one long round trip away. An edge BFF answers from a city beside the user, so the same request finishes many times over before the distant server replies once.

Gate the proxy with the purchase

Most proxy tutorials skip the next part. A bare proxy that injects your key is an open relay for your API bill. Anyone who finds the URL can spend your money. The proxy has to answer one question before it forwards anything: is this caller entitled to use Claude?

For a paid app, the answer is the in-app purchase. StoreKit 2 hands the app a cryptographically signed transaction: a JWS whose x5c header carries Apple’s certificate chain. The app forwards that JWS to the BFF in a header. The BFF verifies the signature against Apple’s root, confirms the transaction is for one of your apps, and checks that the entitlement is live. No round-trip to Apple in the request path; the signature is self-contained.
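
To make the shape of that token concrete, here is a minimal TypeScript sketch, not the walkthrough’s code, that splits a compact JWS and reads the x5c array from its header. The helper names base64UrlToString and readJwsHeader are hypothetical, and the sketch assumes an ASCII header. It parses only: reading a header proves nothing until the chain in x5c has been validated up to Apple’s root and the signature has been checked against the leaf certificate.

```typescript
// Subset of the JWS protected header this sketch cares about. x5c, when
// present, is an array of base64-encoded certificates, leaf first.
interface JwsHeader {
  alg: string;
  x5c?: string[];
}

// Decode one base64url segment to a string. Assumes the decoded bytes are
// ASCII, which is enough for a header holding alg and x5c.
function base64UrlToString(segment: string): string {
  const base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  return atob(padded);
}

// Hypothetical helper: parse (NOT verify) the header of a compact JWS.
function readJwsHeader(jws: string): JwsHeader {
  const parts = jws.split(".");
  if (parts.length !== 3) {
    throw new Error("expected three dot-separated JWS segments");
  }
  const header = JSON.parse(base64UrlToString(parts[0] ?? "")) as JwsHeader;
  if (typeof header.alg !== "string") {
    throw new Error("JWS header is missing alg");
  }
  return header;
}
```

A real verifier would go on to reject any header that lacks x5c, and would never read the payload before the signature check passes.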

Trial and paid collapse into the same check. A free trial is just a subscription whose expiresDate is in the future, so “is this entitlement current?” covers both. A trial user and a paying user clear the same gate; a lapsed or refunded one does not.
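
The “same check” can be sketched as one pure function. This is an illustration under stated assumptions, not the walkthrough’s module: decideEntitlement and its types are hypothetical names, the choice of 403 for every refusal is an assumption, and the sketch covers subscriptions only, so a purchase with no expiresDate is refused. The field names bundleId, expiresDate, and revocationDate follow Apple’s signed transaction payload, with dates in milliseconds since the Unix epoch.

```typescript
// Subset of the decoded transaction payload used for the decision.
interface TransactionClaims {
  bundleId: string;
  expiresDate?: number;    // ms since epoch; present on subscriptions
  revocationDate?: number; // ms since epoch; present after a refund or revocation
}

type EntitlementDecision =
  | { allowed: true }
  | { allowed: false; status: 401 | 403; reason: string };

// Hypothetical pure decision. Call it only AFTER the JWS signature has been
// verified, because it trusts every field it is given.
function decideEntitlement(
  claims: TransactionClaims,
  allowedBundleIds: readonly string[],
  nowMs: number,
): EntitlementDecision {
  if (!allowedBundleIds.includes(claims.bundleId)) {
    return { allowed: false, status: 403, reason: "unknown bundle" };
  }
  if (claims.revocationDate !== undefined) {
    return { allowed: false, status: 403, reason: "refunded or revoked" };
  }
  if (claims.expiresDate === undefined || claims.expiresDate <= nowMs) {
    return { allowed: false, status: 403, reason: "entitlement not current" };
  }
  // Trial and paid periods both land here: expiresDate is in the future.
  return { allowed: true };
}
```

Passing the clock in as nowMs keeps the function deterministic, which is what lets a check like this run in an offline test suite.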

The Worker

The BFF reads the signed transaction from the X-IAP-Transaction header, verifies it against Apple’s chain, checks the transaction’s bundleId against an allowlist you set in Worker configuration, confirms the entitlement is current, and only then injects x-api-key and forwards the request to an allowlisted upstream path. A failed check returns 401/403 and never touches Anthropic.
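
The last two steps, the path allowlist and the key injection, might look like the sketch below. It is a simplified illustration, not the walkthrough’s proxy module: isForwardable and buildUpstreamHeaders are hypothetical names, and the single allowlisted path /v1/messages and the POST-only rule are assumptions. Headers are modeled as a plain object so the logic stays independent of any runtime.

```typescript
// Assumed allowlist: the only upstream path this proxy will forward.
const ALLOWED_PATHS: ReadonlySet<string> = new Set(["/v1/messages"]);

// HTTP header names are case-insensitive, so compare them in lowercase.
const GATE_HEADER = "x-iap-transaction";
const STRIPPED_HEADERS: ReadonlySet<string> = new Set([
  GATE_HEADER,     // the signed transaction is for the BFF, not for Anthropic
  "x-api-key",     // never trust a key supplied by the caller
  "authorization",
  "host",
]);

function isForwardable(method: string, path: string): boolean {
  return method === "POST" && ALLOWED_PATHS.has(path);
}

// Build the headers sent upstream: drop the caller's gate header and any
// credential the caller tried to supply, then attach the server-side key.
function buildUpstreamHeaders(
  incoming: Record<string, string>,
  apiKey: string,
): Record<string, string> {
  const outgoing: Record<string, string> = {};
  for (const [name, value] of Object.entries(incoming)) {
    const lower = name.toLowerCase();
    if (STRIPPED_HEADERS.has(lower)) continue;
    outgoing[lower] = value;
  }
  outgoing["x-api-key"] = apiKey;
  return outgoing;
}
```

Stripping any caller-supplied x-api-key before attaching your own means a request can never choose which credential it spends.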

It reads as one small module per concern: the configuration surface, the pure entitlement decision, the verifiers, the revocation record, the spend controls, and the proxy. Each gate is optional and configuration-driven: turn on the purchase check, the attestation check, or both, and a Worker with neither configured refuses to run rather than relay your key for free. The configuration itself is a handful of public variables plus one secret that never appears in source control.
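
The refuse-to-run rule is easy to picture as a configuration loader that throws before any request is served. The sketch below is an assumption-laden illustration: apart from ANTHROPIC_API_KEY, which the article names, the variable names ALLOWED_BUNDLE_IDS and APP_ATTEST_APP_ID are invented for this example, as are WorkerEnv, GateConfig, and loadConfig.

```typescript
// Hypothetical configuration surface: public variables plus one secret.
interface WorkerEnv {
  ANTHROPIC_API_KEY?: string;  // the one secret, set outside source control
  ALLOWED_BUNDLE_IDS?: string; // comma-separated; enables the purchase gate
  APP_ATTEST_APP_ID?: string;  // enables the attestation gate
}

interface GateConfig {
  apiKey: string;
  allowedBundleIds: string[];
  purchaseGate: boolean;
  attestGate: boolean;
}

function loadConfig(env: WorkerEnv): GateConfig {
  const apiKey = (env.ANTHROPIC_API_KEY ?? "").trim();
  if (apiKey === "") {
    throw new Error("ANTHROPIC_API_KEY is not set");
  }
  const allowedBundleIds = (env.ALLOWED_BUNDLE_IDS ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id !== "");
  const purchaseGate = allowedBundleIds.length > 0;
  const attestGate = (env.APP_ATTEST_APP_ID ?? "").trim() !== "";
  // With neither gate on, the Worker would be an open relay: fail closed.
  if (!purchaseGate && !attestGate) {
    throw new Error("no gate configured; refusing to relay the key");
  }
  return { apiKey, allowedBundleIds, purchaseGate, attestGate };
}
```

Failing at load time rather than per request turns a misconfiguration into a loud deployment error instead of a quiet bill.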

The full, runnable code lives in the edge-proxy walkthrough: every module, the offline test suite, the crypto and configuration detail, and the steps to deploy it for real.

App Attest: the other half, and the fallback

In-app purchase answers “is this user entitled?” It does not answer “is this even my app talking?” Those are different questions, and the second one is where Apple’s App Attest fits. App Attest gives you a hardware-backed assertion that a request comes from a genuine, unmodified instance of your app on a real device, not a script replaying an extracted token.

The two gates complement each other. A paid app should want both: App Attest proves the caller is your real binary, and the IAP transaction proves that binary belongs to an entitled user. The BFF can check the attestation first and the entitlement second, and only a request that clears both gets a key.

App Attest is also the answer when there’s no purchase to gate. Plenty of apps are free, ad-supported, or aren’t monetizing the Claude feature, so there’s no IAP for the BFF to check. The goal there is to keep scripts and extracted-key replays off the endpoint, with no purchase in the picture. Same BFF, same edge, same injected secret. You swap the IAP check for an App Attest verification and gate on authenticity instead of entitlement.

Decide what you’re protecting. If you’re selling access, gate on the purchase. If you only need to keep non-app traffic out, attest the app. If both matter, do both. The BFF is the one place that can.

The app ships no key

On the client, this is a one-line change from the trap we started with. The package’s .proxied auth mode points the session at your Worker and sends your authorization headers on every request, and the app holds no API key at all. The client helper pulls the current entitlement’s signed representation from StoreKit, adds an App Attest assertion when the device supports it, and routes Claude through the Worker.

The LanguageModelSession on top of this is identical to the one you’d write against Apple’s on-device model. The proxy is invisible to the rest of the app: a base URL and a header. The secret lives at the edge, where it belongs.

What the BFF still can’t do

The gate decides who may call, and the Worker bounds what a call costs: a model allowlist, a max_tokens cap, and a per-caller daily token budget, metered in a Durable Object at the edge. It also learns about refunds from Apple directly, recording revocations the moment they happen instead of trusting the token in the caller’s hand. The walkthrough carries each of those as a working module.
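
The three spend controls can be sketched in a few lines. This is a hedged illustration, not the walkthrough’s module: SpendPolicy, applySpendPolicy, and DailyBudget are hypothetical names, the budget resets at midnight UTC by assumption, and the in-memory Map is only a stand-in for the Durable Object, which is what would make the count durable and shared across requests.

```typescript
// Minimal view of a Messages API request body: only the fields we police.
interface MessagesRequest {
  model: string;
  max_tokens: number;
  [key: string]: unknown;
}

interface SpendPolicy {
  allowedModels: readonly string[];
  maxTokensCap: number;
}

// Refuse models outside the allowlist (null) and clamp max_tokens to the cap.
function applySpendPolicy(body: MessagesRequest, policy: SpendPolicy): MessagesRequest | null {
  if (!policy.allowedModels.includes(body.model)) return null;
  const requested = Number.isFinite(body.max_tokens) ? body.max_tokens : policy.maxTokensCap;
  return { ...body, max_tokens: Math.max(1, Math.min(requested, policy.maxTokensCap)) };
}

// In-memory stand-in for the per-caller daily meter. Entries are never
// pruned here; a real store would expire old days.
class DailyBudget {
  private readonly used = new Map<string, number>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Key usage by caller and UTC date so each day starts from zero.
  private key(callerId: string, nowMs: number): string {
    return `${callerId}:${new Date(nowMs).toISOString().slice(0, 10)}`;
  }

  remaining(callerId: string, nowMs: number): number {
    const spent = this.used.get(this.key(callerId, nowMs)) ?? 0;
    return Math.max(0, this.limit - spent);
  }

  record(callerId: string, tokens: number, nowMs: number): void {
    const k = this.key(callerId, nowMs);
    this.used.set(k, (this.used.get(k) ?? 0) + tokens);
  }
}
```

A proxy built this way would check remaining before forwarding and call record with the token usage reported in the upstream response; the callerId is whatever pseudonym the gate established, such as a transaction or device key identifier.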

The account-level backstop sits at Anthropic. Workspace spend limits and per-key budget tracking cap the blast radius no matter what the edge forwards, and rotating the key takes one command, wrangler secret put ANTHROPIC_API_KEY, the moment a key looks tired. That is the real payoff of moving the secret to one place you control: the same place becomes where you meter, rate-limit, and rotate.

A Worker cannot turn anonymous access into accountability. Every caller here is a pseudonym, a purchase receipt or a device key, so the budget caps what one pseudonym can spend, and that cap is the ceiling on what you can control when you hand out an API you pay for without accounts. You cannot tell two devices of one abuser apart, ban a person rather than a receipt, or answer “who spent this?” with a name. When the feature is worth more than the ceiling, require authentication: Sign in with Apple or your own account system maps every request to a person you can meter, throttle, and offboard by name.

One pattern, every client

Claude on Apple’s framework is the worked example, but the shape is general, and that’s the point of this series. Any secret you’re tempted to ship in a client, whether a mobile binary, a single-page app, or a desktop bundle, belongs behind a BFF instead. One place holds the credential. One place decides who’s authorized. One place meters, rate-limits, and rotates. Run it at the edge and that one place is everywhere at once.

The key was never supposed to be in the app. Put it where you can defend it.