Manifest Confusion in PyPI
How some Python tools interpret dependencies differently.
This is my fourth blog post on PyPI, the others are:
Previous blog posts covered in depth how a package manager like Pip resolves a package name into a file to download and install. Manifest Confusion is caused by a mismatch between the metadata in PyPI and the metadata in the files downloaded from PyPI. Because of this mismatch, package managers and security tools can resolve dependencies differently. As a result, malicious or vulnerable packages could be installed even though the tools or manual reviews report them as safe.
Specifically, the metadata returned by pypi.org/pypi/<project>/json (JSON API) can differ from the dependencies declared inside the distribution file. As described in a previous post, there are two main types of distributions: source distributions (‘.tar.gz’, ‘.zip’) and binary distributions (‘.whl’). While there should only be one source distribution per release, there can be more than 20 binary distributions per release. Each distribution could specify different dependencies, and any of them could differ from what the PyPI JSON API reports. While there could be legitimate use cases for varying the dependencies between distributions (e.g. OS-specific dependencies), for the most part it’s reasonable to assume that the dependencies should be consistent.
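To make the mismatch concrete, here is a minimal sketch that reads the Requires-Dist entries from inside a wheel and compares them with the requires_dist list from a JSON API response. The function names are my own, the wheel is passed in as bytes so that nothing here touches the network, and the comparison is a naive string comparison that does not normalize equivalent requirement spellings.

```python
import io
import zipfile
from email.parser import Parser


def wheel_requires_dist(wheel_bytes: bytes) -> set:
    """Return the Requires-Dist entries declared inside a wheel."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as wheel:
        # A wheel stores its core metadata in <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata_text = wheel.read(metadata_name).decode("utf-8")
    # Core metadata uses an email-header style format, one Requires-Dist per line.
    headers = Parser().parsestr(metadata_text)
    return {value.strip() for value in headers.get_all("Requires-Dist", [])}


def json_api_requires_dist(json_api_response: dict) -> set:
    """Return the dependencies listed in a parsed JSON API response."""
    # "requires_dist" is null when the JSON API has no dependencies recorded.
    return set(json_api_response["info"].get("requires_dist") or [])


def only_in_wheel(wheel_bytes: bytes, json_api_response: dict) -> set:
    """Dependencies the wheel would pull in that the JSON API does not list."""
    return wheel_requires_dist(wheel_bytes) - json_api_requires_dist(json_api_response)
```

A non-empty result from only_in_wheel means that a tool reading only the JSON API would miss dependencies that an installer reading the wheel would pull in.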
None of the tools tested appear to look at both the manifest (pypi.org/pypi/<project>/json) and all files for a given version (pypi.org/simple/<project>) to make sure that the listed dependencies are consistent. If security decisions, such as flagging outdated dependencies or malicious packages, are based on the reported dependencies, then it’s important that what is actually installed matches what the tools report. Please note that an attacker exploiting this would need write permissions to the target project.
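The kind of consistency check described above could look roughly like the sketch below. It assumes the dependency sets have already been extracted, one from the JSON API and one per distribution file. The function name and the file and package names in the usage example are hypothetical.

```python
def find_inconsistencies(api_requirements: set, file_requirements: dict) -> dict:
    """Map each inconsistent filename to (missing_from_file, extra_in_file).

    api_requirements: dependencies reported by the JSON API.
    file_requirements: {filename: set of dependencies declared in that file}.
    """
    mismatches = {}
    for filename, requirements in file_requirements.items():
        missing = api_requirements - requirements  # listed by the API, absent in the file
        extra = requirements - api_requirements    # declared in the file, unknown to the API
        if missing or extra:
            mismatches[filename] = (missing, extra)
    return mismatches


# Hypothetical release with one sdist and one wheel that disagree.
report = find_inconsistencies(
    {"requests"},
    {
        "example_pkg-1.0.tar.gz": {"requests"},
        "example_pkg-1.0-py3-none-any.whl": {"requests", "urllib3<2"},
    },
)
```

In this example only the wheel is reported, with urllib3<2 as a dependency the JSON API does not know about. Sets are compared as plain strings, so equivalent requirements written differently would show up as false positives.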
I created the test projects baffled-v4 and baffled-v5 to see how third parties indexed them. Each project has a source distribution and a binary distribution. I modified the distributions to declare different dependencies, and I modified Twine so that the metadata uploaded to PyPI (which becomes available in the JSON API) could differ from the distributions. The differences in dependencies are listed in the following table:
Pip and Poetry resolve dependencies differently:
Pip and pip-tools only use what's inside the resolved file, so different distributions in the same release can have different dependencies.
Dependency and Security Tools
Some security tools look at the packages actually being downloaded rather than trying to resolve the dependencies. Those types of tools are not affected by Manifest Confusion.
Snyk and socket.dev fetch the dependencies from the JSON API, while deps.dev appears to fetch the dependencies from the binary distribution (same as Pip would install). pip-audit resolves dependencies in a virtual environment using Pip.
Links to scan results:
The Distribution Confusion technique from the previous blog post can be combined with Manifest Confusion to further complicate how dependencies are resolved. For example, Distribution Variants could differ only in their dependencies, enabling very subtle downgrade attacks.
Time of Scan
While I did not look closely at how the tools index a package over time, I noticed that the tools were slow to index. It’s also unclear if any of them would update results over time if there are changes to the distributions. Given that distributions can be added, deleted and changed (Distribution Confusion) there are a number of ways the initial scan results could differ from the actual dependencies.
This research was shared with the PyPI maintainers in July 2023. They pointed out that the potential for Manifest Confusion was already publicly known. They also said they don’t recommend using the JSON API to resolve dependencies; PEP 658 should be used instead. I would also like to thank Stig Palmquist and others at Hackeriet for inspiring and giving feedback on this work.
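As a rough illustration of the PEP 658 route the maintainers recommended: with PEP 658, an index can serve a distribution's core metadata as a separate file at the distribution's URL with ".metadata" appended. The sketch below assumes the index is queried in its JSON form (PEP 691) and that files advertise their metadata under the "core-metadata" key or the older "dist-info-metadata" key. The function names are my own.

```python
import json
import urllib.request

# Content type that asks the Simple API for its JSON representation (PEP 691).
SIMPLE_JSON = "application/vnd.pypi.simple.v1+json"


def fetch_simple_index(project: str) -> dict:
    """Fetch the Simple API listing for a project as parsed JSON."""
    request = urllib.request.Request(
        f"https://pypi.org/simple/{project}/",
        headers={"Accept": SIMPLE_JSON},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def metadata_urls(simple_index: dict) -> dict:
    """Map filename -> URL of its PEP 658 metadata file, where advertised."""
    urls = {}
    for file in simple_index.get("files", []):
        # Newer responses use "core-metadata"; older ones use "dist-info-metadata".
        advertised = file.get("core-metadata") or file.get("dist-info-metadata")
        if advertised:
            # PEP 658: the metadata lives at the file URL plus ".metadata".
            urls[file["filename"]] = file["url"] + ".metadata"
    return urls
```

Calling metadata_urls(fetch_simple_index("<project>")) would give one metadata URL per file that advertises it, so the dependencies of every distribution in a release can be read without downloading the distributions themselves.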
Tools that list outdated or malicious dependencies might interpret dependencies differently than the tool used to install them, for example when Snyk is used with Pip or deps.dev is used with Poetry.
Tools might have an incorrect view of the dependencies because they don’t verify consistency between different ways to resolve dependencies.