Reproducible Debian-live images

This page describes my work in progress on making Debian-live images reproducible. I will update it as I make progress.

Update 2020-12-06

I’ve finished MR 216 https://salsa.debian.org/live-team/live-build/-/merge_requests/216. Next up: MR 218, the reproducibility fixes.

Update 2020-11-18

Thanks for the responses so far. I’m preparing merge requests which can be applied to the repository piece-by-piece.

Detailed HOWTO

The basic script/commands:

# Running from ramdisk
# Running as root
export LIVE_BUILD=/home/roland/git/live-build
export SOURCE_DATE_EPOCH=918014706
cd /dev/shm
mkdir live
cd live
mount /dev/shm -odev,exec,remount

# A very basic configuration
lb clean
lb config --apt-http-proxy http://localhost:3142
lb build

# The comparison is made with diffoscope
mv live-image-amd64.hybrid.iso /somewhere_else/live-918014706-vXX.iso
diffoscope --html-dir /somewhere_else/html /somewhere_else/live-918014706-vXX-1.iso /somewhere_else/live-918014706-vXX.iso
  • SOURCE_DATE_EPOCH must be after 1980-01-01 for the EFI images
  • SOURCE_DATE_EPOCH is generated by date +%s --date 1999-02-03T04:05:06Z, which is a recognizable timestamp
  • Running from a ramdisk is an enormous speedup
  • Current focus: building the image, not the content or runnability of the image
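
The two notes about SOURCE_DATE_EPOCH can be checked before starting a build. A minimal sketch, assuming GNU date; MIN_EPOCH is just an illustrative name for the 1980-01-01 limit mentioned in the first bullet:

```shell
# Derive SOURCE_DATE_EPOCH from a recognizable UTC timestamp (GNU date syntax).
SOURCE_DATE_EPOCH=$(date --utc --date '1999-02-03T04:05:06Z' +%s)
export SOURCE_DATE_EPOCH

# MIN_EPOCH (illustrative name): earliest timestamp accepted for the EFI images.
MIN_EPOCH=$(date --utc --date '1980-01-01T00:00:00Z' +%s)

if [ "$SOURCE_DATE_EPOCH" -lt "$MIN_EPOCH" ]; then
    echo "SOURCE_DATE_EPOCH is before 1980-01-01, pick a later date" >&2
else
    echo "SOURCE_DATE_EPOCH=$SOURCE_DATE_EPOCH"
fi
```

With the timestamp above this prints SOURCE_DATE_EPOCH=918014706, the value used in the script.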

Development machine setup

  • Debian/sid
  • apt-cacher-ng running on http://localhost:3142

Ideas

  • Turn off the Internet access while building an image, to ensure that a minimum amount of network traffic is generated
  • Use a snapshot instead of Debian Stable for the image, to make it easier to compare older images -> done
  • Inside the image: use auto-apt-proxy (or squid-deb-proxy-client)
  • Does the image need the file /var/cache/ldconfig/aux-cache? It is generated by ldconfig and changes between builds. If the file is removed and ldconfig is run again, the regenerated contents are identical, even after some time.

List of files in scripts/build

With git commit:
binary_checksums
binary_disk -> needs review for values of LB_DEBIAN_INSTALLER
binary_grub_cfg
binary_grub-efi
binary_iso
binary_loadlin
binary_manifest
binary_rootfs
binary_syslinux
binary_win32-loader
chroot
efi-image
installer_debian-installer

No change required:
binary
binary_chroot

Under Review:

Local modifications:
binary_grub-pc -> Not in use for the default config
binary_memtest -> Not in use for the default config --memtest memtest86+
chroot_hacks -> Remove files from the chroot

Not investigated yet:
binary_grub-legacy
binary_hdd
binary_hooks
binary_includes
binary_linux-image
binary_netboot
binary_onie
binary_package-lists
binary_tar
binary_zsync
bootstrap
bootstrap_archives
bootstrap_cache
bootstrap_debootstrap
build
chroot_apt
chroot_archives
chroot_cache
chroot_debianchroot
chroot_devpts
chroot_dpkg
chroot_firmware
chroot_hooks
chroot_hostname
chroot_hosts
chroot_includes
chroot_install-packages
chroot_interactive
chroot_linux-image
chroot_package-lists
chroot_prep
chroot_preseed
chroot_proc
chroot_resolv
chroot_selinuxfs
chroot_sysfs
chroot_sysv-rc
chroot_tmpfs
config
grub-cpmodules
installer
installer_preseed
source
source_checksums
source_debian
source_disk
source_hdd
source_hooks
source_iso
source_live
source_tar

Earlier work:

Pass along SOURCE_DATE_EPOCH to the chroot: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=832998 -> will be extended with the merge request

A summary of the current state of the Debian ISO images: https://lists.reproducible-builds.org/pipermail/rb-general/2020-August/002018.html

This blog is mentioned in the September 2020 Reproducible Builds blog: https://reproducible-builds.org/reports/2020-09/

Merge requests

The foundation: https://salsa.debian.org/live-team/live-build/-/merge_requests/209

Merge request 209 is superseded by https://salsa.debian.org/live-team/live-build/-/merge_requests/210 which also contains the changes from 209.
I just noticed the comment in 209; I’ll look into how I can avoid a lot of ‘touch’ commands.

Using lb build, I am now able to give all files and folders on the image the SOURCE_DATE_EPOCH timestamp.

The files whose content still differs between builds are: .disk/archive_trace, var/cache/apt/pkgcache.bin, var/cache/apt/srcpkgcache.bin, var/cache/ldconfig/aux-cache, sha256sum.txt
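
A list like this can also be produced without diffoscope, by comparing checksums of two extracted trees. A minimal sketch; list_differing_files is a hypothetical helper of mine, not part of live-build, and it assumes both images have already been unpacked into directories:

```shell
# list_differing_files (hypothetical helper): print the relative paths of regular
# files whose content differs between two trees, or that exist in only one of them.
list_differing_files() {
    tree_a=$1
    tree_b=$2
    workdir=$(mktemp -d)
    # One checksum line per file, sorted by path so both lists line up.
    (cd "$tree_a" && find . -type f -exec sha256sum {} + | sort -k 2) > "$workdir/a.sums"
    (cd "$tree_b" && find . -type f -exec sha256sum {} + | sort -k 2) > "$workdir/b.sums"
    # Keep only the path of every checksum line that is not identical in both lists.
    { diff "$workdir/a.sums" "$workdir/b.sums" || true; } \
        | sed -n 's/^[<>] [0-9a-f]\{64\}  \.\///p' \
        | sort -u
    rm -r "$workdir"
}
```

Usage would be something like list_differing_files /somewhere_else/build1 /somewhere_else/build2. Only file content is compared here; timestamps and permissions still need diffoscope or a similar tool.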

Issues found

Reported

binary_syslinux: The sed -e ‘d’ commands that use ‘#’ as delimiter will not work; a slash is needed. This is a fix for 7ffd2288d944840937f556bd56703ba381f4edcc (2015-01-15) and 578dbee516a370935e1b2e49205c524370e1f8d0 (2015-01-29) -> https://salsa.debian.org/live-team/live-build/-/merge_requests/211

Not reported yet

Package apt-cacher-ng: the file /etc/apt-cacher-ng/acng.conf needs the following line to allow the file .disk/archive_trace to be filled properly:

VfilePatternEx: /project/trace/ftp-master\.debian\.org$
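
To add that line without duplicating it on a second run, something like the following could be used. A minimal sketch; ensure_line is a hypothetical helper, and the path of the configuration file is passed in as an argument:

```shell
# ensure_line (hypothetical helper): append a line to a file unless the exact
# same line is already present.
ensure_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example call (as root, on the apt-cacher-ng configuration file):
# ensure_line "$CONF" 'VfilePatternEx: /project/trace/ftp-master\.debian\.org$'
```

The single quotes keep the backslashes and the trailing $ intact. apt-cacher-ng needs a restart afterwards to pick up the change.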

When preserving the timestamps for loadlin, the question arises: are these install.bat files correct? The path seems incorrect to me. When running loadlin.exe on Windows 10 2004 (64-bit), the 16-bit executable was rejected by Windows.

When running the installer in expert mode in Dutch on a GB-Windows 10, in text mode, I get the graphical installer, not the text installer.

2020-09-30 Using the snapshot repository

lb config --parent-mirror-bootstrap http://localhost:3142/snapshot.debian.org/archive/debian/20200919T085932Z --parent-mirror-binary http://localhost:3142/snapshot.debian.org/archive/debian/20200919T085932Z --security false --updates false

This needs the modified acng.conf file (as mentioned above). The URLs need the localhost:3142 part because debootstrap does not use the apt-http-proxy value. Security and updates have been disabled to minimize external influences for this run.

Open point: Inside the image, the direct link to snapshot.debian.org should be used

2020-09-30 Using --interactive

When using --interactive, a shell is created before the binary image is constructed. In that shell, I removed the cache files:

rm /var/cache/apt/*.bin
rm /var/cache/ldconfig/aux-cache

Results:

  • live/filesystem.packages does not have the correct timestamp any more.
  • root/.bash_history gets created in the squashfs image
  • Several folders get a new ‘size’
  • /var/cache/apt/pkgcache.bin is still in the squashfs image (probably due to the installation of mkisofs)

Conclusion:

--interactive helps, but is not the final solution.

2020-09-30 Using chroot_hacks

chroot_hacks is the last step in the chroot structure. An additional step between binary_checksums and binary_iso might be needed to delete the files from /var/cache.
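
Such a cleanup step could look roughly like this. A minimal sketch only; remove_unstable_caches is a hypothetical helper, not an existing live-build script, and the file list is the set of cache files that differed between my builds:

```shell
# remove_unstable_caches (hypothetical helper, not part of live-build):
# delete the cache files that differed between builds from a chroot tree.
remove_unstable_caches() {
    chroot_dir=$1
    # -f: do not complain when a file is already gone.
    rm -f "$chroot_dir/var/cache/apt/pkgcache.bin" \
          "$chroot_dir/var/cache/apt/srcpkgcache.bin" \
          "$chroot_dir/var/cache/ldconfig/aux-cache"
}
```

It takes the chroot directory as its only argument, so it can be tried on a copy of the tree first.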

2020-10-05 Basic ‘lb build’ with snapshot builds reproducibly

Success. The minimal lb build (with a local cache) results in a reproducible image.
The image contains the regular snapshot repository URL.

lb config --apt-http-proxy http://localhost:3142 --parent-mirror-bootstrap http://localhost:3142/snapshot.debian.org/archive/debian/20200919T085932Z --parent-mirror-binary http://snapshot.debian.org/archive/debian/20200919T085932Z --security true --parent-mirror-chroot-security http://localhost:3142/snapshot.debian.org/archive/debian-security/20200919T085932Z --parent-mirror-binary-security http://snapshot.debian.org/archive/debian-security/20200919T085932Z --updates true --apt-options "-y -o Acquire::Check-Valid-Until=false" --distribution buster
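
Because the snapshot timestamp appears four times in that command, it is easy to get one of the URLs wrong. A sketch that builds the same options from variables; all variable names and the function name are illustrative, and the options are the ones from the command above:

```shell
# All names below are illustrative.
SNAPSHOT=20200919T085932Z
PROXY=localhost:3142
ARCHIVE=snapshot.debian.org/archive

# Used while building: routed through apt-cacher-ng by putting the proxy in the URL.
BOOTSTRAP_MIRROR="http://$PROXY/$ARCHIVE/debian/$SNAPSHOT"
CHROOT_SECURITY_MIRROR="http://$PROXY/$ARCHIVE/debian-security/$SNAPSHOT"
# Written into the image: the regular snapshot URLs, without the proxy.
BINARY_MIRROR="http://$ARCHIVE/debian/$SNAPSHOT"
BINARY_SECURITY_MIRROR="http://$ARCHIVE/debian-security/$SNAPSHOT"

# Wrapper around lb config; call it from inside the build directory.
configure_snapshot_build() {
    lb config \
        --apt-http-proxy "http://$PROXY" \
        --parent-mirror-bootstrap "$BOOTSTRAP_MIRROR" \
        --parent-mirror-binary "$BINARY_MIRROR" \
        --security true \
        --parent-mirror-chroot-security "$CHROOT_SECURITY_MIRROR" \
        --parent-mirror-binary-security "$BINARY_SECURITY_MIRROR" \
        --updates true \
        --apt-options "-y -o Acquire::Check-Valid-Until=false" \
        --distribution buster
}
```

Switching to another snapshot then only requires changing SNAPSHOT.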

Info from 2020-10-06

The next step: finding the date of the 10.6 live image

There are many timestamps in the live image (created by live-wrapper).

cat .disk/info
Debian Live 10.6 official: 2020-09-26T10:36
TZ="UTC" unsquashfs -lls filesystem.squashfs | sort -k 4,5
-rw-r--r-- root/root 228576 2020-09-26 09:14 squashfs-root/var/lib/apt/lists/local-mirror.cdbuilder.debian.org_debian_dists_buster_contrib_binary-amd64_Packages
-rw-r--r-- root/root 29374021 2020-09-26 09:15 squashfs-root/var/lib/apt/lists/local-mirror.cdbuilder.debian.org_debian_dists_buster_main_i18n_Translation-en
-rw-r--r-- root/root 40370452 2020-09-26 09:15 squashfs-root/var/lib/apt/lists/local-mirror.cdbuilder.debian.org_debian_dists_buster_main_source_Sources
-rw-r--r-- root/root 44651743 2020-09-26 09:15 squashfs-root/var/lib/apt/lists/local-mirror.cdbuilder.debian.org_debian_dists_buster_main_binary-amd64_Packages
-rw-r--r-- root/root 121480 2020-09-26 10:06 squashfs-root/var/lib/apt/lists/local-mirror.cdbuilder.debian.org_debian_dists_buster_InRelease
unsquashfs -mkfs-time filesystem.squashfs
1601116542
root@silent:/dev/shm/x/squashfs-root/var/lib/apt/lists# grep "Date: " *
local-mirror.cdbuilder.debian.org_debian_dists_buster_InRelease:Date: Sat, 26 Sep 2020 09:54:48 UTC

It turns out that the local-mirror was slightly faster than the snapshot mirror. Kernel 4.19.0-11 got added in snapshot.debian.org on 20200926T104248Z

date +%s --date 2020-09-26T10:42:48Z
1601116968
lb config --apt-http-proxy http://localhost:3142 --parent-mirror-bootstrap http://localhost:3142/snapshot.debian.org/archive/debian/20200926T104248Z --parent-mirror-binary http://snapshot.debian.org/archive/debian/20200926T104248Z --security true --parent-mirror-chroot-security http://localhost:3142/snapshot.debian.org/archive/debian-security/20200926T104248Z --parent-mirror-binary-security http://snapshot.debian.org/archive/debian-security/20200926T104248Z --updates true --apt-options "-y -o Acquire::Check-Valid-Until=false" --distribution buster
echo "live-task-standard" > config/package-lists/desktop.list.chroot
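
The conversion from a snapshot timestamp to a SOURCE_DATE_EPOCH value can be wrapped in a small function. A minimal sketch, assuming GNU date; snapshot_to_epoch is a hypothetical helper name:

```shell
# snapshot_to_epoch (hypothetical helper): convert a snapshot.debian.org timestamp
# such as 20200926T104248Z into seconds since the epoch (GNU date).
snapshot_to_epoch() {
    # Rewrite YYYYMMDDTHHMMSSZ as YYYY-MM-DDTHH:MM:SSZ; prints nothing on a mismatch.
    iso=$(printf '%s\n' "$1" \
        | sed -n 's/^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)T\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)Z$/\1-\2-\3T\4:\5:\6Z/p')
    if [ -z "$iso" ]; then
        echo "not a snapshot timestamp: $1" >&2
        return 1
    fi
    date --utc --date "$iso" +%s
}

snapshot_to_epoch 20200926T104248Z
```

For 20200926T104248Z this prints 1601116968, the same value as the date command above.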

Diffoscope is having a hard time comparing the live-wrapper official live image and my local live image, which is generated by live-build:
diffoscope debian-live-10.6.0-amd64-standard.iso live-st02.iso --html-dir htmlout --max-diff-block-lines-saved 1000 --max-diff-input-lines 1000 --max-diff-block-lines 300 --max-page-size 1000000

2020-10-08

The basic image is reproducible, but still contains some hacks. Next step: working on --debian-installer live

lb config --apt-http-proxy http://localhost:3142 --parent-mirror-bootstrap http://localhost:3142/snapshot.debian.org/archive/debian/20200926T104248Z --parent-mirror-binary http://snapshot.debian.org/archive/debian/20200926T104248Z --security true --parent-mirror-chroot-security http://localhost:3142/snapshot.debian.org/archive/debian-security/20200926T104248Z --parent-mirror-binary-security http://snapshot.debian.org/archive/debian-security/20200926T104248Z --updates true --apt-options "-y -o Acquire::Check-Valid-Until=false" --distribution buster --debian-installer live

git push: https://salsa.debian.org/rclobus-guest/live-build/-/merge_requests/new?merge_request%5Bsource_branch%5D=rclobus%2Freproducible-foundation

Some major differences between the official Debian Live Standard image and the lb build image:

  • udeb location: pool vs pool-udeb
  • installer location: d-i vs installer
  • /etc/machine-id in official image
  • /etc/default/console-setup: iso-8859-15 vs utf-8
  • /etc/apt/sources.list: only deb.debian.org vs deb-src + updates + security
  • isolinux/menu.cfg: many languages vs none
  • efi vs EFI
  • _DISK vs .disk
  • No Joliet vs Joliet

2020-10-31

/etc/shadow: The last-password-change field (the third field) holds ‘days since 1970-01-01’, so it changes with the build date. This should be adjusted.
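
One possible way to pin the day count in /etc/shadow to SOURCE_DATE_EPOCH is sketched below. normalize_shadow_dates is a hypothetical helper and not necessarily the fix that ends up in live-build; it assumes SOURCE_DATE_EPOCH is set:

```shell
# normalize_shadow_dates (hypothetical helper): set the third field of every
# shadow entry that has a numeric value there to the day number derived from
# SOURCE_DATE_EPOCH.
normalize_shadow_dates() {
    shadow_file=$1
    days=$((SOURCE_DATE_EPOCH / 86400))
    awk -F: -v OFS=: -v days="$days" '$3 ~ /^[0-9]+$/ { $3 = days } { print }' \
        "$shadow_file" > "$shadow_file.new"
    # Write back through the existing file so its owner and mode stay untouched.
    cat "$shadow_file.new" > "$shadow_file"
    rm "$shadow_file.new"
}
```

With SOURCE_DATE_EPOCH=918014706 the field becomes 10625. Entries with an empty third field are left alone.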

installer_debian-installer: I’ve found the reason why the files /dev/lock and /dev/lock-frontend got added to the squashfs image. A MR is pending.

I’m getting close to publishing the result to the mailing lists. Cleanup of some local hacks is required.

2020-11-11

The mail is finally sent. The standard image looks as close as possible to the current official live image.

Non-reproducible: /var/lib/systemd/catalog/database.

Testing with Windows 10 1909 (in a VM): The image is called ‘Debian buster 20’ (16 characters). /firmware is an empty directory. /debian does not exist, and /dists only contains buster. All these issues are explained by the lack of Rock Ridge support on Windows.

The image contains deb.debian.org for regular updates, so no references to snapshot.debian.org are inside the image.

The installer doesn’t work: win32-loader.ini refers to the folder install, which got renamed to d-i to match the live-wrapper image.

2020-11-18

Many thanks to the people who have responded so far. As per request, I’ve prepared a few merge requests, which each contain an isolated feature/bugfix.

Merge Request 1: https://salsa.debian.org/live-team/live-build/-/merge_requests/215
Bugfix for LB_DERIVATIVE

Merge Request 2: https://salsa.debian.org/live-team/live-build/-/merge_requests/216
Bugfix for the files lock and lock-frontend

Merge Request 3: https://salsa.debian.org/live-team/live-build/-/merge_requests/217
Bugfix: Live installer can run without LB_CACHE_PACKAGES

Merge Request 4: https://salsa.debian.org/live-team/live-build/-/merge_requests/218
Reproducible framework

2020-12-06

MR215 and 217 have been added to upstream/master. MR216 was finished 2020-12-03 and is pending a merge. Next: MR218

MR218 (reproducibility)

The following scenarios need to be investigated, each with 2 builds:

  1. Set SOURCE_DATE_EPOCH, run lb config, run lb build
  2. Run lb config, set SOURCE_DATE_EPOCH, run lb build
  3. Run lb config, run lb build

For each scenario, the effect of the timestamp must be explainable.

  1. This is the scenario for a reproducible build
    The configuration file and the ISO file use the same timestamp
  2. The LB_ISO_VOLUME variable will be fixed to the timestamp of the moment the configuration is created -> not good. The timestamp of lb build should have been used. Fixed in c23105b1e3c8d11fc8fc92bd84f0156d280fc54b
    Next issue: run lb config twice -> date: invalid date ‘@’. Instead of placing a timestamp in the config file, I would use a similar strategy as with @LB_VERSION@.
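
The message date: invalid date ‘@’ is what GNU date reports when it is called with --date @ followed by an empty value, so presumably the variable was empty at that point. A sketch of evaluating the timestamp at build time with a fallback; build_timestamp is a hypothetical helper and the output format is only an example, not the format live-build uses:

```shell
# build_timestamp (hypothetical helper): format SOURCE_DATE_EPOCH when it is set,
# and fall back to the current time otherwise, so date never sees a bare '@'.
build_timestamp() {
    epoch=${SOURCE_DATE_EPOCH:-$(date +%s)}
    # %M is minutes; %m would be the month.
    date --utc --date "@$epoch" '+%Y%m%d-%H:%M'
}

SOURCE_DATE_EPOCH=918014706
build_timestamp
```

With the value above this prints 19990203-04:05. Without SOURCE_DATE_EPOCH it prints the current UTC time, which matches scenario 3.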

New merge requests:

https://salsa.debian.org/live-team/live-build/-/merge_requests/220: Bugfix: use minutes instead of month in the time of the modification date field