What is requirements.txt?
The requirements.txt file is how Python developers have long specified their dependencies. It lets you list dependencies by package name and version in an easy-to-read text file, like so:
certifi==2020.4.5.1
chardet==3.0.4
idna==2.9
requests==2.23.0
urllib3==1.25.9
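To make the format concrete, here is a minimal sketch of reading such a file into a name-to-version mapping. The function name parse_pinned is hypothetical, and the sketch assumes every line is an exact name==version pin; it does not handle the rest of the requirement syntax that pip accepts (version ranges, extras, markers, options).

```python
def parse_pinned(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            # This sketch only understands exact pins.
            raise ValueError(f"not an exact pin: {line!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


REQUIREMENTS = """\
certifi==2020.4.5.1
chardet==3.0.4
idna==2.9
requests==2.23.0
urllib3==1.25.9
"""

pins = parse_pinned(REQUIREMENTS)
print(pins["requests"])
```

Note that the mapping holds only a name and a version string for each package; nothing in it describes the contents of the file that gets downloaded.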
Problems with requirements.txt
For some developers, the requirements.txt format fails to meet several needs. In particular, a file like the one above does not record a hash for each installed package. This means that if a malicious user gained access to the requests PyPI account and managed to publish different code under version 2.23.0, pip would happily install it onto your or another developer’s system as soon as they ran pip install -r requirements.txt. This is a security vulnerability we would like to avoid.
The solution: lock files
Lock files allow us to create deterministic Python builds. As with requirements.txt, we can specify a version or version range for a given package, but lock files also include a hash of the installed package, so we know that the version number we specified always corresponds to the same package code we originally installed.
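The hash check described above can be sketched in a few lines. This is only an illustration of the idea, not how pipenv or poetry implement it: the function name matches_locked_hash and the byte strings standing in for package archives are made up for the example.

```python
import hashlib


def matches_locked_hash(data, allowed_hashes):
    """Return True if the SHA-256 of `data` is one of the locked hashes."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    return digest in allowed_hashes


# Stand-in for the archive that was originally installed and locked.
original = b"contents of the originally published archive"
locked = ["sha256:" + hashlib.sha256(original).hexdigest()]

print(matches_locked_hash(original, locked))              # True
print(matches_locked_hash(b"tampered contents", locked))  # False
```

Because the hash is derived from the file contents, code published under the same version number but with different contents no longer matches the locked value.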
There are two popular types of lock files in use right now: Pipfile.lock and poetry.lock. They are used by the pipenv and poetry tools, respectively.
I am generally a big fan of both projects. I use pipenv for my Django development and poetry for package development.
Here is an example of a Pipfile.lock that describes the same packages as our requirements.txt above:
{
    "_meta": {
        "hash": {
            "sha256": "acbc8c4e7f2f98f1059b2a93d581ef43f4aa0c9741e64e6253adff8e35fbd99e"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.8"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "certifi": {
            "hashes": [
                "sha256:1d987a998c75633c40847cc966fcf5904906c920a7f17ef374f5aa4282abd304",
                "sha256:51fcb31174be6e6664c5f69e3e1691a2d72a1a12e90f872cbdb1567eb47b6519"
            ],
            "version": "==2020.4.5.1"
        },
        "chardet": {
            "hashes": [
                "sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
                "sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
            ],
            "version": "==3.0.4"
        },
        "idna": {
            "hashes": [
                "sha256:7588d1c14ae4c77d74036e8c22ff447b26d0fde8f007354fd48a7814db15b7cb",
                "sha256:a068a21ceac8a4d63dbfd964670474107f541babbd2250d61922f029858365fa"
            ],
            "version": "==2.9"
        },
        "requests": {
            "hashes": [
                "sha256:43999036bfa82904b6af1d99e4882b560e5e2c68e5c4b0aa03b655f3d7d73fee",
                "sha256:b3f43d496c6daba4493e7c431722aeb7dbc6288f52a6e04e7b6023b0247817e6"
            ],
            "index": "pypi",
            "version": "==2.23.0"
        },
        "urllib3": {
            "hashes": [
                "sha256:3018294ebefce6572a474f0604c2021e33b3fd8006ecd11d62107a5d2a963527",
                "sha256:88206b0eb87e6d677d424843ac5209e3fb9d0190d0ee169599165ec25e9d9115"
            ],
            "version": "==1.25.9"
        }
    },
    "develop": {}
}
Notice how all the same packages and package versions are present, but they now have hashes attached to them.
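Since Pipfile.lock is plain JSON, the versions and hashes can be read back with the standard library. The sketch below uses a hypothetical helper, locked_packages, and an inline sample trimmed to the requests entry from the lock file above; with a real project you would load the file with json.load instead. It assumes every version is an exact "==" pin, as in this example.

```python
import json


def locked_packages(lock, section="default"):
    """Yield (name, version, hashes) for each package in a lock file section."""
    for name, entry in sorted(lock.get(section, {}).items()):
        # Versions are stored as "==X.Y.Z"; strip the leading "==".
        version = entry.get("version", "").lstrip("=")
        yield name, version, entry.get("hashes", [])


# Trimmed sample: only the "requests" entry from the lock file above.
SAMPLE_LOCK = """
{
    "default": {
        "requests": {
            "hashes": [
                "sha256:43999036bfa82904b6af1d99e4882b560e5e2c68e5c4b0aa03b655f3d7d73fee",
                "sha256:b3f43d496c6daba4493e7c431722aeb7dbc6288f52a6e04e7b6023b0247817e6"
            ],
            "index": "pypi",
            "version": "==2.23.0"
        }
    },
    "develop": {}
}
"""

lock = json.loads(SAMPLE_LOCK)
for name, version, hashes in locked_packages(lock):
    print(f"{name} {version}: {len(hashes)} hashes")
```

A package usually lists more than one hash because each published file for a version (for example a wheel and a source archive) has its own hash.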
There are other advantages to using poetry or pipenv over plain pip, but that discussion is for another time.