Your AI Stack’s Attack Surface: What to Lock Down Now

TMLS Insights | Week of May 4, 2026

This is the first in a series of practitioner-focused posts by Graham Toppin.

Graham is a Co-founder and Analyst at Peerlabs.ai, a subscriber-funded intelligence firm focused on primary research on emerging technology.

No news roundups. No vendor hype. We want to give you concrete advice you can test this week, grounded in what we’re hearing from practitioners and seeing in our own work.

Two incidents from the last five weeks illustrate the same point: the security risk in your AI system is not (just) the model. It’s everything around it. Neither incident went after your prompts, and that orthogonal approach signals a level of sophistication beyond the more tangible zero-day bugs we’ve seen in the Claude Mythos hype cycle.

Incident 1: Trivy → LiteLLM supply chain cascade (March 19-24)

A threat actor (TeamPCP) compromised Trivy, an open-source vulnerability scanner, by rewriting Git tags in its GitHub Action repository. Multiple security firms (Snyk, Wiz, Kaspersky) have reconstructed a plausible chain linking this to the subsequent compromise of LiteLLM: the compromised Trivy action exfiltrated PyPI publishing credentials from LiteLLM’s CI/CD runner, and malicious LiteLLM versions were published directly to PyPI containing a multi-stage credential stealer.

LiteLLM’s own disclosure hedges: “we believe this incident may be linked”. The connection should be treated as strongly indicated but not definitively confirmed.

LiteLLM is present in 36% of cloud environments as a dependency of CrewAI, DSPy, Browser-Use, Mem0, Instructor, and Guardrails. A single compromised installation exposes API credentials for every model provider it connects to.

Incident 2: Cursor agent deletes production database (April 25)

A Cursor agent running Claude Opus 4.6 deleted a production database and all volume-level backups via a single Railway API call. The agent later produced a written explanation of the safety rules it had violated. The model didn’t hallucinate or jailbreak. It executed a legitimate API call through a legitimate tool with legitimate credentials that had no scope restrictions.

What you can do this week:

Dependency management (30 minutes):

Check whether your AI projects use lockfiles: poetry.lock, uv.lock, package-lock.json, yarn.lock. Repos using lockfiles were completely protected from the LiteLLM attack. Repos doing a bare pip install were not.
Run pip-audit or safety check against your current dependencies.
If you’re using LiteLLM, pin to a verified version. If you were on 1.82.7 or 1.82.8 between March 24-25, assume credential compromise and rotate all API keys.
If you’re using GitHub Actions, pin actions to known-good versions in your evaluation chain, ideally by full commit SHA, since the Trivy compromise worked by rewriting Git tags. Actively manage your vendored dependencies.
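Parts of the dependency checklist above can be scripted. What follows is a minimal Python sketch, not a replacement for a real scanner. The lockfile names and the flagged LiteLLM versions come from the list above. The function names are hypothetical, and treating “pinned” as a full 40-character commit SHA is our assumption, chosen because the Trivy compromise worked by rewriting Git tags.

```python
"""Quick dependency hygiene checks (illustrative sketch, not a full audit)."""
from __future__ import annotations

import re
from importlib import metadata
from pathlib import Path

# Lockfile names taken from the checklist above.
LOCKFILES = ("poetry.lock", "uv.lock", "package-lock.json", "yarn.lock")
# LiteLLM versions named above as compromised.
COMPROMISED_LITELLM = {"1.82.7", "1.82.8"}
# Matches the value of a `uses:` key in a GitHub Actions workflow file.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
# Assumption: only a full 40-character commit SHA counts as pinned.
SHA_RE = re.compile(r"@[0-9a-f]{40}$")


def find_lockfiles(repo: Path) -> list[str]:
    """Return the known lockfiles present at the top level of `repo`."""
    return [name for name in LOCKFILES if (repo / name).is_file()]


def litellm_is_compromised(version: str | None = None) -> bool:
    """True if the given (or currently installed) LiteLLM version is flagged."""
    if version is None:
        try:
            version = metadata.version("litellm")
        except metadata.PackageNotFoundError:
            return False  # LiteLLM is not installed in this environment.
    return version in COMPROMISED_LITELLM


def unpinned_actions(workflow_text: str) -> list[str]:
    """List `uses:` references that are not pinned to a full commit SHA."""
    refs = [ref.strip("'\"") for ref in USES_RE.findall(workflow_text)]
    # Local actions (./path) and docker:// images carry no Git ref to pin.
    return [
        ref
        for ref in refs
        if not ref.startswith(("./", "docker://")) and not SHA_RE.search(ref)
    ]


if __name__ == "__main__":
    repo = Path(".")
    print("lockfiles:", find_lockfiles(repo) or "none found")
    print("flagged LiteLLM installed:", litellm_is_compromised())
    for wf in sorted((repo / ".github" / "workflows").glob("*.y*ml")):
        for ref in unpinned_actions(wf.read_text(encoding="utf-8")):
            print(f"{wf}: not pinned to a commit SHA: {ref}")
```

Run it from the root of a repository. A clean result only means these three checks passed. It says nothing about whether credentials were exposed during the March 24-25 window.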

Agent credential scoping (1 hour):

Audit what API tokens your agents have access to. If an agent can call a destructive API endpoint (delete, drop, terminate), it will eventually call it.
Apply principle of least privilege: read-only tokens for agents doing read-only work. Separate tokens per environment. No production credentials in development harnesses.
If your agent framework supports tool filtering, whitelist the specific tools/endpoints it needs rather than granting blanket access.
Block destructive CLI commands. If your agent has shell access, it should not have curl access to destructive API endpoints without explicit confirmation gates. It’s worth noting: these protections aren’t really addressed by containerization. If you give your agent privileges, no matter where it runs, the blast radius is defined by those privileges.
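The tool filtering and confirmation gate described above might look something like the following Python sketch. Everything here is hypothetical: ToolGate, ToolCallDenied, and the tool names are illustrative and do not correspond to any particular agent framework’s API. The keyword match is a crude heuristic that will produce false positives, and it is not a security boundary. Scoped credentials remain the real control, since the blast radius is defined by the privileges you grant.

```python
"""Sketch of an allowlist plus confirmation gate around agent tool calls."""
from __future__ import annotations

import re
from typing import Callable

# Verbs from the checklist above; a substring match, so extend or tighten it
# for your own APIs. False positives only trigger an extra confirmation.
DESTRUCTIVE_RE = re.compile(r"delete|drop|terminate", re.IGNORECASE)


class ToolCallDenied(Exception):
    """Raised when a tool call is blocked by policy."""


class ToolGate:
    """Wraps tool functions so only allowlisted, confirmed calls run."""

    def __init__(
        self,
        allowed_tools: set[str],
        confirm: Callable[[str, str], bool] | None = None,
    ) -> None:
        self._allowed = set(allowed_tools)
        # Default: refuse every destructive call unless a human callback approves.
        self._confirm = confirm or (lambda tool, arg: False)
        self._tools: dict[str, Callable[[str], str]] = {}

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self._tools[name] = fn

    def call(self, name: str, arg: str) -> str:
        # 1. Allowlist: anything not explicitly granted is denied.
        if name not in self._allowed or name not in self._tools:
            raise ToolCallDenied(f"tool not allowlisted: {name}")
        # 2. Confirmation gate for anything that looks destructive.
        if DESTRUCTIVE_RE.search(name) or DESTRUCTIVE_RE.search(arg):
            if not self._confirm(name, arg):
                raise ToolCallDenied(
                    f"destructive call needs confirmation: {name}({arg!r})"
                )
        return self._tools[name](arg)


if __name__ == "__main__":
    gate = ToolGate(allowed_tools={"read_logs"})
    gate.register("read_logs", lambda service: f"logs for {service}")
    gate.register("delete_volume", lambda vol: f"deleted {vol}")
    print(gate.call("read_logs", "api"))
    try:
        gate.call("delete_volume", "prod-db")
    except ToolCallDenied as exc:
        print("blocked:", exc)
```

The design choice worth copying is the default: a destructive call fails closed unless a confirmation callback explicitly approves it, and a tool that was registered but never allowlisted cannot run at all.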

Broader assessment (ongoing):

None of this means you should ignore prompt injection.

If your security review process tests only prompt injection and jailbreaks, it’s covering one layer of a multi-layer problem, and the methodology is worth reviewing.

Jason Haddix (Arcanum Information Security) has published an open-source prompt injection taxonomy and a seven-point methodology for assessing AI-enabled systems (https://arcanum-sec.github.io/ai-sec-resources/ and https://github.com/arcanum-sec/ai-sec-resources).

His argument, backed by practitioner assessments: roughly 90-95% of prompt injections can be trained against; however, getting full coverage requires layered mitigations across the full system, not just the model.

The assessment checklist covers: APIs, data pipelines, RAG sources, function-calling interfaces, agent-to-agent communication channels, and traditional web vulnerabilities (SSRF, XSS, IDOR in chat UIs) amplified by AI context.

If you’re running open-weight models in production, or if you’ve built internal tooling around agent context management, we’d like to hear from you. We’re collecting practitioner accounts (attributed or anonymous, your choice) for a more systematic write-up.

Reach out at [email protected]

TMLS Insights is produced by the TMLS Steering Committee and Peerlabs. We produce practitioner-focused analysis for the Toronto Machine Learning Society and MLOps World communities. Aspects of our research pipeline use AI; all claims are human-reviewed and sourced.
