Making a Single-binary Release with pp

#perl #cpan #compiling

pp comes with PAR::Packer. It is a tool for "compiling" a bunch of modules and code into a single binary.

perldoc pp already contains a good amount of documentation one can refer to.

While it works with system perl, I found it even easier to first prepare a local::lib directory, then just package that entire directory.

pp tries its best to determine the list of dependencies of the given program in various ways, but none of those are guaranteed to be 100% accurate. As a matter of fact, that guarantee is impossible to make. This is partially because a running program can load modules in ways that cannot be easily determined by reading the program source code, or in a relatively hidden code path that cannot be easily captured.

That is a good amount of flexibility, but definitely a pricey one. Arguably, though, it is also an issue brought on by the tools themselves (pp, perlcc, perl2exe, etc.). I guess that is because the dynamic nature is so convenient as long as all the installation is done right, so a tool that can automatically compile all dependencies together was not needed that much. It was definitely needed to some extent, which is why we have those tools now, but in the last mile of their completion lies an undecidable problem.
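To make that limitation concrete, here is a small sketch of why reading the source alone is not enough. The scanner below is a deliberately naive stand-in (a grep for literal use/require lines) and not how pp works internally. The sample Perl program and the function name naive_scan are made up for illustration.

```shell
#!/bin/bash
# A hypothetical Perl program: one static dependency, and one module
# whose name is only known at runtime.
sample=$(mktemp)
cat > "$sample" <<'EOF'
use Getopt::Long;
my $format = $ENV{FORMAT} || "PP";
my $class  = "JSON::" . $format;
eval "require $class" or die $@;
EOF

# naive_scan: list module names that appear literally after a leading
# `use` or `require`. Illustrative only; smarter scanners still cannot
# know the value of $class without running the program.
naive_scan() {
    grep -E '^[[:space:]]*(use|require)[[:space:]]+[A-Z]' \
        | sed -E 's/^[[:space:]]*(use|require)[[:space:]]+([A-Za-z0-9_:]+).*/\2/' \
        | sort -u
}

found=$(naive_scan < "$sample")
rm -f "$sample"
echo "$found"
```

Only Getopt::Long is reported. The JSON::* module loaded through the computed name never shows up, and that is exactly the kind of dependency that has to be passed to pp by hand.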

So we will need to manually add those missing dependencies to the pp command, which is fine only when the list is small. Since we wish to just pack all the declared dependencies together, we don't care that much even if that's going to make the result a little bigger than it has to be. If we can have one build script that works everywhere, that would be as happy as Christmas. (A purely metaphorical expression. Personally I feel nothing special on Dec 25.)

Anyway... it turns out to be much easier to decide the dependencies at installation time, since they are all well-declared and tools like cpm or cpanm already do this perfectly. If we install dependencies in a self-contained directory, we can just archive the entire directory together with the program we are packing, and that should be good to go.

Let's say we cd into the source code of foo and we are trying to compile the program foo as a binary. The executable is at bin/foo, while its own modules such as Foo.pm and Foo/Bar.pm are put under the conventional directory lib.

Given that, this script should produce foo as a single binary, as if all dependencies were "statically linked" inside:

#!/bin/bash

# Prepare local/ 
cpanm -L local -q --installdeps .
# or: cpm install

perlversion=$(perl -MConfig -e 'print $Config{version}')
pp -B \
    -I ./local/lib/perl5 \
    -a "./local/lib/perl5/;$perlversion/" \
    -a lib \
    -o foo \
    bin/foo

Alas, this is almost perfect -- except that core modules (those listed in corelist) might still be missing. They won't be inside local/, and if pp somehow does not discover them, they'll be missing from the end result. We won't know this until we manually test the resulting foo thoroughly. Basically, we should always add a bunch of -M flags in the build script instead of assuming pp will do the right thing.

For example, like so, when Getopt::Long, JSON::PP, and Sys::Hostname are all required:

pp -B \
    -M Getopt::Long:: \
    -M JSON::PP:: \
    -M Sys::Hostname:: \
    -I local/lib/perl5 \
    -a "./local/lib/perl5/;$perlversion/" \
    -a lib \
    -o foo \
    bin/foo

This is a rather tedious modification, as the list of dependent modules now exists in two places in the repo. Surely there is some way to refactor this.
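One possible refactoring, sketched here under assumptions: if the dependencies are declared in a cpanfile in its plain one-requirement-per-line form, the -M arguments can be derived from it, so the list lives in one place only. The function name m_flags_from_cpanfile and the sample cpanfile content are hypothetical, and the final pp command is only echoed (and abbreviated) so the sketch runs without PAR::Packer installed.

```shell
#!/bin/bash
# m_flags_from_cpanfile: read a cpanfile on stdin and print one
# "-M Module::Name::" line per simple `requires 'Module::Name'` entry.
# Assumption: plain one-per-line requirements; conditionals or feature
# blocks would need a real cpanfile parser instead of sed.
m_flags_from_cpanfile() {
    sed -nE "s/^[[:space:]]*requires[[:space:]]+['\"]([A-Za-z0-9_:]+)['\"].*/\1/p" \
        | grep -v '^perl$' \
        | while read -r module; do
              printf '%s %s::\n' -M "$module"
          done
}

# Hypothetical cpanfile content, written to a temp file so the sketch
# is self-contained.
cpanfile=$(mktemp)
cat > "$cpanfile" <<'EOF'
requires 'perl', '5.010';
requires 'Getopt::Long';
requires "JSON::PP", '4.00';
requires 'Sys::Hostname';
EOF

# Join the per-module lines into a single space-separated string.
flags=$(m_flags_from_cpanfile < "$cpanfile" | paste -sd' ' -)
rm -f "$cpanfile"

# Echoed rather than executed. In a real build script, $flags would be
# passed unquoted on purpose: module names never contain whitespace.
echo "pp -B $flags -I ./local/lib/perl5 -a lib -o foo bin/foo"
```

The minimum-perl requirement is filtered out because it is not a module. Whether every declared dependency should get the trailing :: (which pulls in the whole namespace) is a judgment call; it errs on the side of a bigger but more complete binary.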

I've verified that the following script, build-minicpan.sh, downloads the tarball of CPAN::Mini and builds a working minicpan out of it:

#!/bin/bash
set -e

curl --silent -O https://cpan.metacpan.org/authors/id/R/RJ/RJBS/CPAN-Mini-1.111016.tar.gz

tar -xzf CPAN-Mini-1.111016.tar.gz

cd CPAN-Mini-1.111016

cpanm -n -q -L local --installdeps .

perlversion=$(perl -MConfig -e 'print $Config{version}')

pp -B \
   -M Getopt::Long:: \
   -I ./local/lib/perl5 \
   -a "./local/lib/perl5/;$perlversion/" \
   -a lib \
   -o ../minicpan \
   bin/minicpan

echo "DONE: minicpan"

To me this seems to be an area worth exploring... I've been experimenting with automating this in a side project: pau, which is a collection of shell functions that can install apps into their own self-contained directories and expose just the program itself. It is very similar to what pipx does. Support for pp was added not long ago, but there is still no good way to figure out all the missing modules and automatically add them as -M arguments.
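One partial way to close that loop, offered as a sketch and not as something pau does: exercise the packed binary, collect Perl's "Can't locate Foo/Bar.pm in @INC" errors, and turn each one into a -M argument for the next build. The function name missing_to_m_flags and the sample error log are hypothetical, and the approach only finds modules on code paths that actually get run.

```shell
#!/bin/bash
# missing_to_m_flags: read error output on stdin and print one
# "-M Module::Name" line per distinct "Can't locate X/Y.pm in @INC" message.
missing_to_m_flags() {
    sed -nE "s#.*Can't locate ([A-Za-z0-9_/]+)\.pm in @INC.*#\1#p" \
        | sed 's|/|::|g' \
        | sort -u \
        | sed 's/^/-M /'
}

# Hypothetical stderr captured from test runs of the packed binary.
log=$(mktemp)
cat > "$log" <<'EOF'
Can't locate Sys/Hostname.pm in @INC (you may need to install the Sys::Hostname module) (@INC contains: ...) at script/foo line 3.
Can't locate JSON/PP.pm in @INC (you may need to install the JSON::PP module) (@INC contains: ...) at script/foo line 4.
Can't locate Sys/Hostname.pm in @INC (you may need to install the Sys::Hostname module) (@INC contains: ...) at script/foo line 3.
EOF

suggested=$(missing_to_m_flags < "$log")
rm -f "$log"
echo "$suggested"
```

Each failing run reveals at most the first missing module on that path, so the build has to be repeated until the log comes back clean. That is tedious to do by hand but straightforward to wrap in a loop.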

Maybe, as a lazy solution, we should just always produce a heavy pack that includes the whole core lib directory (usually $INC[-1]), regardless of whether any of it is used.

Maybe.