
Hardening that actually contains it

SSRF is rarely fixable in one place. The application has to pass the request, the network has to route it, and the destination has to honor it. Break any of those and the chain dies. Break all four below and you've contained the blast radius even if the next bug lands.

1. Make IMDSv2 required (or platform equivalent)

On AWS:

# Per launch template
HttpTokens: required
HttpPutResponseHopLimit: 1

# Or audit existing instances
aws ec2 describe-instances \
  --query 'Reservations[*].Instances[*].[InstanceId,MetadataOptions.HttpTokens]' \
  --output table
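Instances the audit flags can be brought into compliance in place. A sketch (the instance ID is a placeholder):

```
# Enforce token-required metadata access on a running instance
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-tokens required \
  --http-put-response-hop-limit 1
```

No reboot is required, but anything on the instance still making IMDSv1-style calls will start failing, so audit first, then flip.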

On GCP and Azure, the platform requires a custom header on every read (`Metadata-Flavor: Google` and `Metadata: true` respectively). That static header is a weaker cousin of IMDSv2's session-token model: it stops SSRF primitives that can't set arbitrary headers, and nothing more. The next layer is what catches the rest.

2. Egress firewall: link-local is non-egressable

Application processes have no business reaching 169.254.169.254. The metadata service is for the host, not for application code. Block at the host firewall:

# nftables on Linux — drop link-local egress from all non-root uids
nft add rule ip filter OUTPUT \
  meta skuid != 0 ip daddr 169.254.0.0/16 drop

# Or via iptables
iptables -A OUTPUT -m owner --uid-owner app \
  -d 169.254.0.0/16 -j DROP
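With either rule loaded, a quick check from the application account should fail fast rather than return metadata (this assumes the service runs as a user named app, as in the iptables rule above):

```
# Should be dropped / time out, never return metadata
sudo -u app curl -sS --max-time 2 http://169.254.169.254/latest/meta-data/ \
  && echo "NOT blocked" || echo "blocked"
```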

On Kubernetes, do it at the NetworkPolicy / Calico / Cilium layer:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: deny-link-local }
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
      - ipBlock:
          cidr: 0.0.0.0/0
          # Only list RFC 1918 ranges here if they don't overlap your
          # pod/service CIDRs, or you'll cut off in-cluster traffic.
          except: ["169.254.0.0/16", "127.0.0.0/8", "10.0.0.0/8"]

3. URL validation: resolve before checking

The mistake that ships in the vast majority of homegrown URL validators is this:

// BROKEN
function isAllowed(rawUrl: string) {
  const u = new URL(rawUrl);
  if (u.hostname === "169.254.169.254") return false;
  return true;
}

That misses every bypass the analyzer covers: decimal, hex, and octal IP encodings, alias hostnames, DNS rebinding. The fix is to resolve the hostname to an IP, validate the IP against your blocklist, and then make the actual request to that resolved IP, sending the original hostname in the Host header (and as the TLS server name for HTTPS). That last part — pinning the IP after validation — is what defeats DNS rebinding.

// Slightly better (still simplified)
import { promises as dns } from "node:dns";

async function safeFetch(rawUrl: string) {
  const u = new URL(rawUrl);
  if (!["http:", "https:"].includes(u.protocol)) throw new Error("bad scheme");
  const ips = await dns.resolve4(u.hostname);
  for (const ip of ips) {
    if (isPrivateOrMetadata(ip)) throw new Error("blocked address");
  }
  // Pin the request to the first validated IP so a second DNS answer
  // can't rebind it; keep the original hostname in the Host header.
  // Caveat: over HTTPS this breaks certificate checks unless the client
  // also lets you set the TLS server name (SNI) to u.hostname.
  return fetch(u.protocol + "//" + ips[0] + u.pathname + u.search, {
    headers: { host: u.hostname },
  });
}
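The snippet above leans on isPrivateOrMetadata, which it leaves undefined. A minimal IPv4-only sketch (a real implementation must also cover IPv6, IPv4-mapped addresses, and the decimal/hex/octal encodings mentioned above; normalize before checking):

```typescript
// Returns true for addresses application code should never reach.
// IPv4 dotted-quad only; fails closed on anything it can't parse.
function isPrivateOrMetadata(ip: string): boolean {
  const parts = ip.split(".").map(Number);
  if (
    parts.length !== 4 ||
    parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)
  ) {
    return true; // not a clean dotted quad — fail closed
  }
  const [a, b] = parts;
  return (
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254) ||          // link-local, incl. 169.254.169.254
    a === 127 ||                         // loopback
    a === 0                              // "this network"
  );
}
```

Note the fail-closed default: an unparseable string is treated as blocked, never allowed through.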

In practice: don't roll this. Use a battle-tested option: a fronting proxy like Stripe's Smokescreen, or an HTTP client that validates addresses at dial time (in Go, a custom net/http.Transport whose DialContext rejects private and link-local IPs; in Python, an SSRF-hardened wrapper around requests/urllib3). The bypass catalog is too long to maintain by hand.

4. Identity-layer scoping

The reason IMDS theft is catastrophic is that the IAM role attached to the instance is usually over-privileged. Even if layers 1–3 fail, a tightly scoped instance role limits the blast radius:

  • One role per instance type / per application. No shared "general-app" role.
  • No wildcard Resource: "*" on sensitive actions (S3 GetObject, Secrets Manager, KMS).
  • Use IAM session tags + request conditions to scope role usage by request context where possible.
  • Detective controls: CloudTrail alerts on any IMDS-derived credential used from outside the instance's expected egress range — Capital One would have been caught here.
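As a sketch of the second and third bullets, a role policy scoped to one bucket and conditioned on the request's source VPC (the bucket name and VPC ID are placeholders; the aws:SourceVpc condition assumes S3 traffic rides a VPC endpoint):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ScopedArtifactRead",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-app-artifacts/*",
      "Condition": {
        "StringEquals": { "aws:SourceVpc": "vpc-0123456789abcdef0" }
      }
    }
  ]
}
```

A credential exfiltrated via IMDS and replayed from an attacker's machine fails the aws:SourceVpc check outright, which also makes the CloudTrail alert in the last bullet high-signal rather than noisy.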

The ranking that matters

If you can only ship three things this quarter:

  1. Egress-block link-local from app processes. One firewall rule. Kills the highest-impact attack class regardless of validator quality.
  2. Set IMDSv2 required on every AWS instance. One config flag. Gigantic upgrade against header-stripping SSRF primitives.
  3. Stop hand-rolling URL validators. Use a fronting proxy or a battle-tested library. The bypasses are too many to maintain in-house.

The agent angle

Every URL-fetching AI agent inherits this entire threat model. If the agent can be prompt-injected into fetching a URL, your validation chain has to be at least as strict as if the user had submitted it. In most agent deployments today, it isn't.