Courtès: What’s in a package

Post Syndicated from original https://lwn.net/Articles/870047/rss

Over at the Guix-HPC blog, Ludovic Courtès writes about trying to package the PyTorch machine-learning library for the Guix distribution. Building from source in a user-verifiable manner is part of the philosophy behind Guix, but there were a number of problems that were encountered:

The first surprise when starting packaging PyTorch is that, despite being on PyPI, PyTorch is first and foremost a large C++ code base. It does have a setup.py as commonly found in pure Python packages, but that file delegates the bulk of the work to CMake.

The second surprise is that PyTorch bundles (or “vendors”, as some would say) source code for no less than 41 dependencies, ranging from small Python and C++ helper libraries to large C++ neural network tools. Like other distributions such as Debian, Guix avoids bundling: we would rather have one Guix package for each of these dependencies. The rationale is manifold, but it boils down to keeping things auditable, reducing resource usage, and making security updates practical.