uv lockfiles and malware on PyPI
A potentially surprising interplay between these two systems means you might get compromised by deleted packages.
I’m pretty sure that saying “software supply chain security is important” won’t win anyone a prize, yet despite all the industry attention we see quite an alarming ramp-up in malicious campaigns that take advantage of various gaps in the space. Some problems take time to fix, but much of what we could be doing today is held back by organizational friction rather than a lack of technology. Anyway, we’re not here to discuss that right now.
The LiteLLM compromise was interesting to watch, especially the
impact on downstream consumers. Essentially, everyone who installed or happened
to update the LiteLLM package (or something that depended on it) got
compromised, as the newest (malicious) version was pulled into their environment.
One way to reduce the risk of this kind of thing happening is to use lock files
in the package managers that support them. Lock files are a record of all
dependencies for a given package, including versions and cryptographic hashes.
Updating the lock file happens intentionally, during development, but deploying
the package in production uses the lock file as-is, installing known-good
versions as specified. This also has some other non-security related properties,
like making sure the versions of packages you’re deploying won’t magically break
things because of upstream changes. So you do your dev work, add deps as needed,
update the lock file, and ship that. It’s also best practice to pin your
dependencies to a specific version (like ==X.Y.Z) instead of using softer
constraints that may allow for updates (e.g., >=X.Y.0 installing X.Y.23).
This pinning usually happens in a configuration file adjacent to the lock file,
like pyproject.toml or Cargo.toml.
The surprise
Let’s talk concretely about Python:
- Define dependencies (and other stuff) in pyproject.toml.
- Use uv because it actually lets you create and use a lock file, uv.lock (and it’s fast).
- As needed, run uv sync and record all package versions, hashes, and download URLs in the lock file.
- When building for prod, use uv sync --frozen, which will only use what’s in the lock file, installing the same package versions regardless of upstream changes. This uses the URLs directly, and won’t check for any newer versions of packages, even if the constraints would allow for updates.
PyPI is the free package hosting and distribution solution for Python, and it’s
a wonderful service to be grateful for. It’s where uv (and pip) look by
default for any Python dependency. So usually uv.lock will contain quite a few
entries for things hosted there, and download URLs like
https://files.pythonhosted.org/packages/12/34/big-hash/my_pkg-X.Y.Z-py3-none-any.whl
which provide the location of my-pkg at version X.Y.Z.
But what if version X.Y.Z is compromised and malicious? Well, PyPI or a
maintainer can delete it, which removes the package from the index, and it won’t
be installable anymore.
Well, not quite.
The package does get removed from the index, true, but the URL remains active
as a result of how storage works, and the PyPI CDN will happily continue
serving it. Which means subsequent uv sync --frozen invocations will install
that malicious package, since this form doesn’t consult the registry anymore,
and just directly fetches the URL.
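One way to notice this situation is to ask the index whether a locked file is still listed, which is exactly the check that a frozen install skips. The sketch below assumes the index serves the JSON form of the Simple API (PEP 691), as PyPI does. The function names and the sample index page are hypothetical. Files are matched by filename, on the assumption that filenames are unique within a project.

```python
import json
import urllib.request


def fetch_index_page(project: str, index: str = "https://pypi.org/simple") -> dict:
    """Fetch the PEP 691 JSON listing for a project (needs network access).

    A fully removed project is expected to raise urllib.error.HTTPError (404).
    """
    request = urllib.request.Request(
        f"{index}/{project}/",
        headers={"Accept": "application/vnd.pypi.simple.v1+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def listing_status(locked_url: str, index_page: dict) -> str:
    """Return "listed", "yanked", or "missing" for a URL taken from a lock file."""
    filename = locked_url.rsplit("/", 1)[-1]
    for entry in index_page.get("files", []):
        if entry.get("filename") == filename:
            # "yanked" may be a boolean or a string holding the reason.
            return "yanked" if entry.get("yanked", False) is not False else "listed"
    return "missing"


# Hypothetical index page in which version 1.2.3 has been deleted.
SAMPLE_PAGE = {
    "name": "my-pkg",
    "files": [
        {"filename": "my_pkg-1.2.2-py3-none-any.whl", "url": "https://example.invalid/a", "yanked": False},
    ],
}
LOCKED_URL = "https://files.pythonhosted.org/packages/12/34/big-hash/my_pkg-1.2.3-py3-none-any.whl"
print(listing_status(LOCKED_URL, SAMPLE_PAGE))
```

A "missing" result for a URL that is still in a lock file is a strong hint that the release was pulled after the lock file was written and deserves a closer look.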
So, if one or more lock files get updated to point at the malicious URL during the window in which an upstream package is compromised, the bad package will keep getting installed, even after it has been removed from the index. There can be a delay before a defender sees these lock files deployed, especially given the nuance of unpinned versions and other things, so an effective, queryable inventory of all installed packages is very important for addressing this properly. Package aging (delaying updates) and other mitigations are also part of the defensive story here, but ultimately it’s worth knowing what gets installed and where.
This isn’t really the fault of PyPI or uv; it’s just how these two things
interact. uv is thinking about the problem and how to fix it.