GitHub - sethmlarson/pypi-data: Data about packages and maintainers on PyPI

PyPI Data

Mostly up-to-date data about almost every package on PyPI

Get access to the database via GitHub releases.

$ gunzip pypi.db.gz
$ sqlite3 'pypi.db' 'SELECT * FROM packages LIMIT 10 OFFSET 1000;'

acid-vault|1.3.2|>=3.6|1|0|2021-01-21 04:37:10
acidcli|1.0.1|>=3.6|0|0|2021-01-21 04:37:10
acidfile|1.2.1||0|0|2021-01-21 04:37:10
acidfs|1||0|0|2021-01-21 04:37:10
acidoseq|1.3.7||0|0|2021-01-21 04:37:10
acinonyx|0.1.0|>=3.6.0|0|0|2021-01-21 04:37:10
aciops|2.0.0|>=3.6|0|0|2021-01-21 04:37:10
acitoolkit|0.4||0|0|2021-01-21 04:37:10
ackeras|0.1.1||0|0|2021-01-21 04:37:10
ackg|0.0.5||0|0|2021-01-21 04:37:10
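The same kind of query can be run from Python with the standard-library sqlite3 module. This is a minimal sketch, not part of the project: the helper name `fetch_page` is made up for illustration, and the demo uses a small in-memory table with invented rows so that it runs without the real file.

```python
import sqlite3

def fetch_page(conn, limit=10, offset=0):
    """Return one page of (name, version) rows from the packages table."""
    # ORDER BY keeps paging stable; without it SQLite may return rows in any order.
    query = "SELECT name, version FROM packages ORDER BY name LIMIT ? OFFSET ?"
    return conn.execute(query, (limit, offset)).fetchall()

# Stand-in database with invented rows so the sketch runs without pypi.db.
# Against the real file, use sqlite3.connect("pypi.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name STRING, version STRING, PRIMARY KEY (name))")
conn.executemany(
    "INSERT INTO packages (name, version) VALUES (?, ?)",
    [("example-c", "0.3.1"), ("example-a", "1.0.0"), ("example-b", "2.1.4")],
)
page = fetch_page(conn, limit=2, offset=1)
print(page)
```

Passing `limit` and `offset` as bound parameters avoids building SQL strings by hand.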

Data being tracked

Database Schemas

-- Packages --
CREATE TABLE packages (
    name STRING,
    version STRING,
    requires_python STRING,
    yanked BOOLEAN DEFAULT FALSE,
    has_binary_wheel BOOLEAN,
    has_vulnerabilities BOOLEAN,
    first_uploaded_at TIMESTAMP,
    last_uploaded_at TIMESTAMP,
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    downloads INTEGER,
    scorecard_overall FLOAT,
    in_google_assured_oss BOOLEAN,
    PRIMARY KEY (name)
);
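One detail of these declarations matters when reading results: SQLite gives a column declared as STRING numeric affinity, so text that looks like a number is stored as a number. The sketch below shows the effect on an in-memory table with invented rows; only the column declarations are taken from the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same declared types as the packages table, trimmed to two columns.
conn.execute("CREATE TABLE packages (name STRING, version STRING, PRIMARY KEY (name))")
conn.executemany(
    "INSERT INTO packages (name, version) VALUES (?, ?)",
    [("example-a", "1.0"), ("example-b", "1.3.2")],  # invented rows
)
# "1.0" parses as a number and is stored as the integer 1; "1.3.2" stays text.
stored = conn.execute(
    "SELECT name, version, typeof(version) FROM packages ORDER BY name"
).fetchall()
print(stored)

# CAST gives a string back, but the original spelling ("1.0") is not recoverable.
as_text = conn.execute(
    "SELECT CAST(version AS TEXT) FROM packages ORDER BY name"
).fetchall()
print(as_text)
```

This would be consistent with the bare `1` shown as a version in the sample output, though that is an inference rather than something the README states.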

-- Dependencies --
CREATE TABLE deps (
    package_name STRING,
    extra STRING DEFAULT NULL,
    dep_name STRING,
    dep_specifier STRING,
    PRIMARY KEY (package_name, dep_name, dep_specifier)
);
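A common use of the deps table is a reverse-dependency lookup. The following is a sketch under assumptions: the helper `reverse_deps` and all rows are invented, and name matching is exact because the README does not say whether package names are normalized.

```python
import sqlite3

def reverse_deps(conn, dep_name):
    """Return sorted names of packages that declare a dependency on dep_name."""
    rows = conn.execute(
        "SELECT DISTINCT package_name FROM deps "
        "WHERE dep_name = ? ORDER BY package_name",
        (dep_name,),
    )
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deps (package_name STRING, extra STRING DEFAULT NULL, "
    "dep_name STRING, dep_specifier STRING, "
    "PRIMARY KEY (package_name, dep_name, dep_specifier))"
)
conn.executemany(
    "INSERT INTO deps VALUES (?, ?, ?, ?)",
    [  # invented rows, not real PyPI data
        ("example-app", None, "example-http", ">=2.0"),
        ("example-app", "test", "example-runner", ""),
        ("example-cli", None, "example-http", "<3"),
    ],
)
dependents = reverse_deps(conn, "example-http")
print(dependents)
```

DISTINCT is used because the primary key includes dep_specifier, so one package can list the same dependency more than once.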

-- Wheel data --
CREATE TABLE wheels (
    package_name STRING,
    filename STRING,
    build STRING,
    python STRING,
    abi STRING,
    platform STRING,
    uploaded_at TIMESTAMP,
    PRIMARY KEY (package_name, filename)
);
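The wheels table can be summarized by platform. This sketch assumes that the python, abi, and platform columns hold the corresponding tags from each wheel filename, which the README does not state explicitly; the helper name and rows are invented.

```python
import sqlite3

def packages_per_platform(conn):
    """Count distinct packages that ship at least one wheel per platform tag."""
    rows = conn.execute(
        "SELECT platform, COUNT(DISTINCT package_name) AS n "
        "FROM wheels GROUP BY platform ORDER BY n DESC, platform"
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wheels (package_name STRING, filename STRING, build STRING, "
    "python STRING, abi STRING, platform STRING, uploaded_at TIMESTAMP, "
    "PRIMARY KEY (package_name, filename))"
)
conn.executemany(
    "INSERT INTO wheels (package_name, filename, python, abi, platform) "
    "VALUES (?, ?, ?, ?, ?)",
    [  # invented rows, with the tag columns assumed to mirror the filename
        ("example-a", "example_a-1.0.0-py3-none-any.whl", "py3", "none", "any"),
        ("example-b", "example_b-2.0.0-py3-none-any.whl", "py3", "none", "any"),
        ("example-b", "example_b-2.0.0-cp311-cp311-win_amd64.whl",
         "cp311", "cp311", "win_amd64"),
    ],
)
counts = packages_per_platform(conn)
print(counts)
```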

-- Maintainer data --
CREATE TABLE maintainers (
    name STRING,
    package_name STRING
);
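Since the maintainers table is a plain list of (maintainer, package) pairs, it supports questions such as who maintains the most packages. A minimal sketch with an invented helper name and invented rows:

```python
import sqlite3

def busiest_maintainers(conn, limit=10):
    """Return (maintainer, package_count) pairs, largest count first."""
    rows = conn.execute(
        "SELECT name, COUNT(DISTINCT package_name) AS n FROM maintainers "
        "GROUP BY name ORDER BY n DESC, name LIMIT ?",
        (limit,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE maintainers (name STRING, package_name STRING)")
conn.executemany(
    "INSERT INTO maintainers VALUES (?, ?)",
    [  # invented rows, not real PyPI data
        ("alice-example", "example-a"),
        ("alice-example", "example-b"),
        ("bob-example", "example-a"),
    ],
)
ranking = busiest_maintainers(conn)
print(ranking)
```

The table declares no primary key, so COUNT(DISTINCT ...) guards against duplicate pairs.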

-- Package URLs --
CREATE TABLE package_urls (
    package_name STRING,
    name STRING,
    url STRING,
    public_suffix STRING
);

-- OpenSSF Scorecard --
CREATE TABLE scorecard_checks (
    package_name STRING,
    name STRING,
    score INTEGER
);
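The scorecard_checks table can show which checks pull a package's score down. This sketch assumes that a negative score marks an inconclusive check, as in OpenSSF Scorecard's own output; whether this database stores such rows is not stated. The helper name and all scores are invented.

```python
import sqlite3

def weakest_checks(conn, package_name, limit=3):
    """Return the lowest-scoring conclusive checks for one package."""
    rows = conn.execute(
        "SELECT name, score FROM scorecard_checks "
        "WHERE package_name = ? AND score >= 0 "  # assumption: negative = inconclusive
        "ORDER BY score, name LIMIT ?",
        (package_name, limit),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scorecard_checks (package_name STRING, name STRING, score INTEGER)"
)
conn.executemany(
    "INSERT INTO scorecard_checks VALUES (?, ?, ?)",
    [  # invented scores, not real Scorecard results
        ("example-a", "Code-Review", 3),
        ("example-a", "Maintained", 10),
        ("example-a", "Fuzzing", 0),
        ("example-a", "Signed-Releases", -1),
        ("example-b", "Maintained", 5),
    ],
)
weakest = weakest_checks(conn, "example-a", limit=2)
print(weakest)
```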

-- Trove Classifiers --
CREATE TABLE classifiers (
    package_name TEXT,
    name TEXT,
    PRIMARY KEY (package_name, name),
    FOREIGN KEY (package_name) REFERENCES packages(name)
);
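Because classifiers references packages(name), the two tables can be joined to list packages that declare a given classifier. A minimal sketch with an invented helper name and invented rows, and with the packages table trimmed to two columns:

```python
import sqlite3

def packages_with_classifier(conn, classifier):
    """Return (name, requires_python) for packages declaring the classifier."""
    rows = conn.execute(
        "SELECT p.name, p.requires_python "
        "FROM classifiers AS c JOIN packages AS p ON p.name = c.package_name "
        "WHERE c.name = ? ORDER BY p.name",
        (classifier,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (name STRING, requires_python STRING, PRIMARY KEY (name))"
)
conn.execute(
    "CREATE TABLE classifiers (package_name TEXT, name TEXT, "
    "PRIMARY KEY (package_name, name), "
    "FOREIGN KEY (package_name) REFERENCES packages(name))"
)
# Invented rows; parent rows go in first so the foreign key is satisfied.
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [("example-a", ">=3.8"), ("example-b", None)],
)
conn.executemany(
    "INSERT INTO classifiers VALUES (?, ?)",
    [
        ("example-a", "Typing :: Typed"),
        ("example-b", "Typing :: Typed"),
        ("example-a", "Framework :: Pytest"),
    ],
)
typed = packages_with_classifier(conn, "Typing :: Typed")
print(typed)
```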

Download data

Download counts come from https://github.com/hugovk/top-pypi-packages and are only available for the top 5,000 packages.
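Queries on the downloads column need to allow for packages that have no count. The sketch below assumes such packages have NULL in downloads; the README does not say how missing counts are stored. The helper name and the numbers are invented.

```python
import sqlite3

def top_downloaded(conn, limit=5):
    """Return (name, downloads) for the most-downloaded packages with a count."""
    rows = conn.execute(
        "SELECT name, downloads FROM packages "
        "WHERE downloads IS NOT NULL "  # assumption: missing counts are NULL
        "ORDER BY downloads DESC, name LIMIT ?",
        (limit,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name STRING, downloads INTEGER, PRIMARY KEY (name))")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [("example-a", 500), ("example-b", None), ("example-c", 9000)],  # invented
)
top = top_downloaded(conn, limit=2)
print(top)

# COUNT(column) skips NULLs, so this compares rows with a count to all rows.
with_count, total = conn.execute(
    "SELECT COUNT(downloads), COUNT(*) FROM packages"
).fetchone()
print(with_count, total)
```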

Running locally

$ docker build -t pypi-data .
$ docker run --rm pypi-data

License

Apache-2.0