On March 24, 2026, between 10:39 and 16:00 UTC, two versions of LiteLLM appeared on PyPI that nobody on the LiteLLM team had published. Versions 1.82.7 and 1.82.8 looked legitimate. They had the right package name, plausible version numbers, and installed without complaint. What they also did, silently, on every Python process startup, was harvest credentials from the machine and ship them to an attacker-controlled server.
By the time PyPI quarantined the packages roughly three hours later, they had accounted for an unknown fraction of the 3.4 million installs LiteLLM typically receives in a day. Over 20,000 repositories had transitive exposure. The attack was not a typosquatting campaign or a fake package. It was a direct takeover of the real LiteLLM distribution on PyPI, made possible by credentials that the group calling itself TeamPCP stole from LiteLLM's own build pipeline through a compromised security scanner.
But here is the part of the story that matters most for business leaders: organizations with mature supply chain security practices were not affected. The controls that would have prevented this exposure are well-understood and implementable. The gap is not technical knowledge - it is organizational prioritization.
Why AI Infrastructure Is a High-Value Target - and What That Means for Security Strategy
LiteLLM is the standard abstraction layer for calling large language models in production Python applications. It provides a unified API for OpenAI, Anthropic, Google, Cohere, Mistral, and dozens of other providers. At 95 million downloads per month, it is one of the most widely installed AI packages in the Python ecosystem.
That architecture creates a concentrated credential footprint. A typical enterprise LiteLLM deployment sits in an environment that holds API keys for multiple AI providers, cloud storage credentials for training data, database connections for retrieval-augmented generation (using stored documents to improve AI responses), and Kubernetes access for deployment orchestration. One compromised install yields not one key but a portfolio of keys spanning multiple systems and providers.
This is not a reason to avoid AI abstraction layers. It is a reason to treat them with the same security rigor as any other credential-bearing infrastructure. In practice, the organizations we work with that get this right apply three principles: they isolate credentials from install-time processes, they verify the provenance of every dependency, and they restrict what the network allows during builds. Each of these would have neutralized this attack entirely.
How the Attack Chain Worked
Understanding the mechanics is important because it reveals the specific control points where the chain could have been broken.
The chain began on February 27, 2026, when an attacker exploited a "Pwn Request" vulnerability in Aqua Security's Trivy, the widely used open-source vulnerability scanner. Trivy's GitHub Actions workflow used pull_request_target, a trigger type that runs code from incoming pull requests with elevated repository privileges, including access to CI/CD secrets. The attacker submitted a crafted pull request that exfiltrated a service account token with write access to Trivy's repositories.
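The shape of a Pwn Request is worth seeing concretely. The workflow below is an illustrative sketch, not Trivy's actual configuration: pull_request_target runs in the context of the base repository, with access to its secrets, and explicitly checking out the pull request's head commit hands that privileged context to attacker-controlled code.

```yaml
# Illustrative sketch of the vulnerable pattern - not Trivy's real workflow.
on: pull_request_target        # runs with base-repo permissions and secrets
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Checking out the PR head means later steps execute
          # attacker-controlled code while secrets are in scope.
          ref: ${{ github.event.pull_request.head.sha }}
      - run: make lint         # an attacker-supplied Makefile runs here
```

The safe variants are the plain pull_request trigger, which withholds secrets from fork pull requests, or a pull_request_target workflow that never checks out or executes untrusted code.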
Aqua Security discovered the breach and rotated credentials. But the rotation was not atomic. During the rotation window, TeamPCP retained access long enough to inject credential-stealing payloads into Trivy's GitHub Actions and Docker Hub distributions.
On March 19, the compromised Trivy Actions went live. On March 23, the same technique was used to compromise Checkmarx's KICS and AST GitHub Actions. Each compromise harvested new credentials that enabled the next attack.
Here is where it connects to LiteLLM. LiteLLM's CI/CD pipeline ran Trivy as part of its build process, pulling it without a pinned version. When the compromised Trivy binary executed inside LiteLLM's GitHub Actions runner, it operated with the full permissions of that runner, including access to the PYPI_PUBLISH token stored as a repository secret. The malicious Trivy exfiltrated that token to a typosquatted domain.
With LiteLLM's PyPI publishing credentials in hand, TeamPCP published litellm 1.82.7 at 10:39 UTC and 1.82.8 at 10:52 UTC on March 24. No maintainer account password was needed. No two-factor authentication was bypassed. The attackers had a valid publishing token extracted from the build pipeline itself.
The critical insight: this attack succeeded because a security tool itself became the vector, and because the publishing token was a long-lived, broadly scoped credential. Both of these are addressable with practices that mature security teams already follow.
The Three-Stage Payload
The malicious code inside versions 1.82.7 and 1.82.8 operated in three distinct stages.
Stage 1 - Collection. The attackers embedded a malicious .pth file called litellm_init.pth inside the package. Python processes .pth files automatically at startup, before any user code runs. This file ran on every Python process on the affected machine, not just in applications that imported LiteLLM. It harvested SSH keys, .env files, AWS credentials, GCP service account files, Azure credentials, Kubernetes configuration files, database passwords for MySQL, PostgreSQL, MongoDB, and Redis, .gitconfig files, shell history, Solana crypto wallets, Slack and Discord webhook tokens, CI/CD configuration files, and all environment variables.
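The startup hook relies on documented Python behavior, not an exploit: any line in a .pth file that begins with "import" is executed by site.py when the interpreter processes a site directory, before user code runs. A minimal, harmless demonstration, using site.addsitedir to simulate what happens automatically at startup for site-packages:

```python
import os
import site
import tempfile

# Create a directory containing a .pth file whose "import" line carries code.
site_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    # site.py exec()s any .pth line that starts with "import" - the same
    # mechanism litellm_init.pth abused to run on every interpreter start.
    f.write("import os; os.environ['PTH_HOOK_RAN'] = '1'\n")

# Simulate interpreter-startup processing of the directory's .pth files.
site.addsitedir(site_dir)
print(os.environ["PTH_HOOK_RAN"])  # -> 1
```

When the directory is a real site-packages location, no explicit call is needed; the code runs in every Python process on the machine, exactly as the payload did.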
Stage 2 - Exfiltration. The harvested data was packaged as an encrypted archive using a hybrid encryption scheme: AES-256-CBC for the data, with the session key wrapped in 4096-bit RSA. The archive was sent via HTTP POST to https://models.litellm.cloud/. That domain is a typosquatted replica of legitimate LiteLLM infrastructure, designed to blend into network logs.
Stage 3 - Lateral movement. If Kubernetes credentials were present, the malware attempted to create privileged pods in the kube-system namespace on every cluster node, mounting the host filesystem and installing a persistent systemd backdoor that polled for additional binaries.
Each stage has a corresponding defensive control. Credential isolation stops Stage 1 from finding anything valuable. Egress filtering stops Stage 2 from reaching the attacker. Network segmentation and Kubernetes admission policies stop Stage 3 from spreading.
How It Was Discovered - and What That Tells Us About Detection Gaps
The attack was not caught by a security scanner or a registry audit. It was caught because it broke something.
Engineers at FutureSearch noticed the malicious code when LiteLLM was pulled in as a transitive dependency by an MCP plugin running inside Cursor. The malicious .pth file triggered a fork bomb that crashed the machine. The crash was anomalous enough to prompt investigation, and the investigation led back to the new LiteLLM versions published that morning.
The fork bomb was almost certainly unintentional on the attackers' part. A credential harvester that crashes machines gets noticed. The attack was designed to run silently, and for most of the three hours it was live, it presumably did.
This highlights an important gap in current detection capabilities: none of the standard package scanners, registry audits, or monitoring systems that organizations rely on flagged the attack. Organizations with runtime behavior monitoring - watching for unexpected file access patterns and network connections during package installation - would have caught this even without the accidental crash. That kind of behavioral detection is a meaningful layer in a defense-in-depth approach to supply chain security.
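To give a rough sense of what such monitoring looks like: CPython's audit hook mechanism (sys.addaudithook, available since Python 3.8) can observe every file open in a process. The sketch below flags reads of credential-like paths; production tooling does this at the OS or eBPF level, but the principle is the same, and the marker list here is illustrative only.

```python
import sys

alerts = []
SENSITIVE_MARKERS = (".ssh", ".aws", ".env", ".kube", ".gitconfig")

def audit(event, args):
    # The "open" audit event fires before any file is opened in CPython.
    if event == "open" and any(m in str(args[0]) for m in SENSITIVE_MARKERS):
        alerts.append(str(args[0]))

sys.addaudithook(audit)

# A harvester touching a credential path trips the hook, even when the
# file does not exist on this machine.
try:
    open("/tmp/nonexistent_home/.aws/credentials")
except FileNotFoundError:
    pass

print(alerts)
```

In a real deployment the hook would feed an alerting pipeline rather than a list, and would run inside the build and install environments where package code first executes.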
A Cascading Campaign - and the Control Points That Break It
LiteLLM was not TeamPCP's only target. It was one node in a cascading supply chain campaign that has now crossed five ecosystems: GitHub Actions, Docker Hub, npm, OpenVSX, and PyPI.
The timeline shows how each compromise enabled the next:
February 27, 2026: TeamPCP exploits a Pwn Request vulnerability in Aqua Security's Trivy CI/CD pipeline, exfiltrating a service account token with write access. Aqua discovers the breach and rotates credentials, but the rotation is incomplete.
March 19: TeamPCP uses retained access to inject credential-stealing payloads into Trivy's GitHub Actions and Docker Hub distributions. The compromised scanner begins harvesting CI/CD secrets from every project that runs it.
March 23: Using credentials harvested from Trivy's downstream users, TeamPCP compromises Checkmarx's KICS and AST GitHub Actions. They forcibly override 76 out of 77 version tags, redirecting pinned versions to malicious code.
March 24: The compromised Trivy binary running in LiteLLM's CI/CD pipeline exfiltrates LiteLLM's PyPI publishing token. TeamPCP publishes backdoored versions 1.82.7 and 1.82.8.
March 27: TeamPCP uses the same playbook to compromise the Telnyx telephony SDK on PyPI, publishing malicious versions 4.87.1 and 4.87.2, with the real payload hidden in the audio frame data of a WAV file fetched at runtime.
The pattern is instructive. Each compromise yielded credentials that unlocked the next target. But the chain is only as strong as its weakest link - and there were multiple points where the right controls would have broken it. Pinned and hash-verified build dependencies would have prevented the compromised Trivy from entering LiteLLM's pipeline. Short-lived, narrowly scoped publishing tokens (or PyPI's Trusted Publishers feature for keyless publishing) would have made the stolen credential useless. A private registry mirror with provenance verification would have rejected the tampered package before it reached any downstream user.
Four Organizational Practices That Would Have Prevented This
The security industry's analysis of this attack has focused on the technical chain. That analysis is correct but incomplete. The more important question is organizational: what decisions left so many companies exposed, and what do the organizations that were not affected do differently?
The specific practices that separate protected organizations from exposed ones are worth examining because they recur across nearly every security assessment we conduct:
1. Dependency provenance verification. LiteLLM's CI/CD pipeline pulled Trivy without a pinned version. When the compromised Trivy was served, it was accepted without verification. Organizations that pin dependencies to specific hashes and verify signatures before execution would have rejected the compromised binary at the gate. This is not exotic - it is the same practice that mature teams apply to any build-time dependency.
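The mechanics of pinning are simple. In pip, it is --require-hashes with --hash=sha256:... entries in a requirements file; in GitHub Actions, it is pinning an action to a full commit SHA rather than a mutable tag. The underlying check is just the following - a hypothetical gate that refuses to run a build tool whose digest does not match the pinned value:

```python
import hashlib
import tempfile

def sha256_of(path: str) -> str:
    """Stream the file so large binaries don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_pinned(path: str, pinned_sha256: str) -> bool:
    # A tampered artifact - like the compromised Trivy binary - yields a
    # different digest and is rejected before anything executes it.
    return sha256_of(path) == pinned_sha256

# Demonstration with a stand-in artifact:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"trusted build")
    artifact = f.name

pinned = sha256_of(artifact)
print(verify_pinned(artifact, pinned))   # True

with open(artifact, "wb") as f:
    f.write(b"tampered build")           # simulate the supply chain swap
print(verify_pinned(artifact, pinned))   # False
```

The point is not the ten lines of Python; it is that the check runs before execution, so a swapped binary never gets the chance to run inside the pipeline.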
2. Short-lived, scoped publishing tokens. The PYPI_PUBLISH token that TeamPCP stole was a static credential with unlimited scope and no expiration. PyPI's Trusted Publishers feature enables keyless publishing that eliminates this entire class of attack. For organizations that must use tokens, short-lived credentials with narrow scope dramatically reduce the window and impact of a compromise.
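For reference, a publishing job under Trusted Publishers looks like the sketch below, using PyPA's official publish action. It assumes the repository has been registered as a trusted publisher on PyPI; the point is that no long-lived secret exists to steal.

```yaml
# Sketch of keyless publishing via PyPI Trusted Publishers (OIDC).
jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      id-token: write          # lets the job mint a short-lived OIDC token
    steps:
      - uses: actions/checkout@v4
      - run: python -m pip install build && python -m build
      # No PYPI_PUBLISH secret is configured anywhere, so there is
      # nothing for a compromised build step to exfiltrate and reuse.
      - uses: pypa/gh-action-pypi-publish@release/v1
```

A compromised step inside this same job could still publish during that one run, which is why dependency pinning remains necessary alongside token elimination.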
3. Credential isolation at install time. When LiteLLM was installed in development and CI/CD environments, those environments had live credentials for production cloud services, databases, and AI providers sitting in environment variables and configuration files. The malicious .pth file harvested them because they were there. An environment that isolates install-time processes from production credentials yields nothing worth stealing - the attack succeeds technically but fails operationally.
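One concrete form of this isolation is refusing to pass the parent environment to install-time subprocesses. A minimal sketch - the allowlist and the variable name are illustrative, not a standard:

```python
import os
import subprocess
import sys

# Illustrative allowlist: only variables an installer genuinely needs.
# Everything else - provider keys, DB passwords, cloud tokens - is withheld.
INSTALL_ALLOWLIST = ("PATH", "HOME", "LANG", "TMPDIR")

def install_env() -> dict:
    return {k: v for k, v in os.environ.items() if k in INSTALL_ALLOWLIST}

# Even with a provider key set in the parent process, the child sees nothing.
os.environ["OPENAI_API_KEY"] = "sk-demo-not-a-real-key"  # stand-in value
result = subprocess.run(
    [sys.executable, "-c", "import os; print('OPENAI_API_KEY' in os.environ)"],
    env=install_env(),
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # -> False
```

The same idea scales up to running installs in throwaway containers or dedicated build accounts: the install step simply never shares an environment with production secrets.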
4. Network-level controls on package installations. The exfiltration to models.litellm.cloud succeeded because nothing restricted outbound network access from the install environment. Organizations that route package installations through a private registry and restrict egress block both the compromised install and the data exfiltration.
Each of these is an organizational decision, not a technical limitation. They represent the difference between treating the AI dependency chain as experimental software and treating it as the critical infrastructure it has become.
The Governance Gap: Why AI Infrastructure Gets Less Scrutiny Than It Should
Consider the paradox. Your payments processor goes through SOC 2 audits, penetration testing, and annual vendor reviews. Your AI abstraction layer, which holds API keys for every model provider you use alongside your cloud credentials and database passwords, gets installed with a single command from a public registry with no verification. The risk surface of the second is arguably larger than that of the first, and yet almost no organization subjects it to equivalent scrutiny.
This is not because security teams are negligent. It is because AI infrastructure grew up outside the traditional IT governance structure. AI projects often start as experiments, grow into production systems, and accumulate credentials along the way without ever passing through the procurement, vendor review, or security architecture processes that govern other critical software. By the time the AI stack is load-bearing, the credentials are already embedded and the dependency graph is already deep.
The pattern we see across organizations that have closed this gap is straightforward: they made a deliberate decision to bring their AI stack under the same governance framework as their other critical systems. Not a separate framework. Not a lighter-weight version. The same one. The December 2024 Ultralytics attack - where a compromised version of the popular computer vision library turned 60 million downloads into cryptomining bots via a nearly identical GitHub Actions compromise - showed that this is a persistent and recurring threat pattern, not a one-time event.
Several improvements are underway across the ecosystem. Package registries are investing in provenance attestation and code signing. PyPI's Trusted Publishers feature eliminates the need for static publishing tokens entirely. The Python ecosystem is gradually adopting standards that let publishers sign packages cryptographically. SBOM requirements are pushing organizations to inventory dependencies more rigorously. These are meaningful improvements, and organizations that adopt them early gain a concrete security advantage.
Immediate Remediation for Affected Environments
If your environment may have been affected, the remediation steps are straightforward.
First, check which version you have: pip show litellm. If the output shows 1.82.7 or 1.82.8, remove the package (pip uninstall litellm) and purge caches so a cached copy cannot be reinstalled. For uv users: rm -rf ~/.cache/uv. For pip: pip cache purge.
Second, search for persistence indicators. The file to look for is ~/.config/sysmon/sysmon.py. Also check for a systemd service named sysmon.service. If either exists, the backdoor installation was attempted, and a full incident response is warranted.
Third, rotate every credential that was present on the affected machine. That means SSH keys, cloud access tokens for AWS, GCP, and Azure, database passwords for any system the machine could reach, Kubernetes service account tokens, and all API keys including those for AI providers. Rotation is not optional if the machine installed the malicious versions during the exposure window.
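These checks can be scripted for fleet-wide triage. A minimal sketch - the version list and the sysmon.py path come from the steps above, while the systemd unit-file locations are assumed standard paths (the writeup names only the service, not where it was installed), and the function names are ours:

```python
import os
from importlib import metadata

COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}

def litellm_status() -> str:
    """Report whether the locally installed litellm is a known-bad version."""
    try:
        version = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return "not installed"
    return "COMPROMISED" if version in COMPROMISED_VERSIONS else "ok (" + version + ")"

def persistence_indicators(home: str = "") -> list:
    """List any backdoor artifacts found at the known indicator paths."""
    home = home or os.path.expanduser("~")
    candidates = [
        os.path.join(home, ".config/sysmon/sysmon.py"),
        # Assumed standard systemd unit locations, system-wide and per-user:
        "/etc/systemd/system/sysmon.service",
        os.path.join(home, ".config/systemd/user/sysmon.service"),
    ]
    return [p for p in candidates if os.path.exists(p)]

print(litellm_status())
print(persistence_indicators())
```

Any non-empty indicator list means the backdoor installation was attempted on that host, and the machine should move straight into incident response rather than routine cleanup.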
Five Questions to Assess Your AI Supply Chain Security
Whether or not you were affected by this specific incident, these questions will help you evaluate where your organization stands:
1. What does our AI dependency graph look like? If nobody can produce a complete list of the packages your AI systems depend on, including transitive dependencies, that is the first gap to close. You cannot secure what you cannot see.
2. Where do our AI provider credentials live? If the answer includes environment variables on developer workstations, CI/CD runners, or production servers without isolation, those credentials are one compromised dependency away from exfiltration. The fix is credential isolation - ensuring that install-time and build-time processes cannot access production secrets.
3. Do we verify the provenance of what we install? If your build pipeline pulls packages from public registries without hash pinning, signature verification, or a private mirror, the integrity of your software depends entirely on every maintainer account and publishing token in your dependency chain remaining uncompromised. Provenance verification removes that assumption.
4. Would we detect a compromised dependency within hours? If your monitoring would not flag a malicious .pth file executing on Python startup, or encrypted data being POSTed to an unfamiliar domain, behavioral monitoring for your build and install environments is worth evaluating.
5. Does our AI stack go through the same security review as our other critical infrastructure? If not, the governance gap is the single highest-leverage item to address. Bringing AI dependencies under the same review process as payments or identity infrastructure closes the most common exposure pattern we see.
The broader lesson of the TeamPCP campaign is that AI infrastructure is software infrastructure. It has dependencies. It handles credentials. It processes sensitive data. The organizations that treat it accordingly - with dependency verification, credential isolation, network controls, and governance oversight - were not affected by this attack and will not be affected by the next one. The security practices are known, they are implementable, and the cost of implementing them is a fraction of the cost of responding to a supply chain compromise after the fact.