Proposal for making Multi-Arch:same binNMU-safe




Hello,

I am reaching out to the wanna-build team (where I see changes needed), the reproducible builds team (as a driving force for deterministic build artifacts), the sbuild team (where I see changes needed) and cross builders (as the main consumers of packages marked Multi-Arch: same).

Problem

When a binNMU is requested, the changelog entry is generated on the build daemon. The timestamp used is the time the build was started. When building a Multi-Arch: same package for multiple architectures, this timestamp typically differs between architectures. For the sake of reproducible builds, this timestamp is then transferred to SOURCE_DATE_EPOCH, where various build systems use it to create reproducible artifacts. Often, it ends up in e.g. C header files. Those end up being reproducible on each architecture, but their contents vary between architectures. This tends to break coinstallation, because files shared between the instances of a Multi-Arch: same package must be identical.
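To illustrate the failure mode, here is a minimal sketch in Python. The render_header function, the BUILD_DATE macro and the epoch values are made up for illustration; only the role of SOURCE_DATE_EPOCH is taken from the description above.

```python
import time


def render_header(source_date_epoch: int) -> str:
    """Render a generated C header that embeds a date (hypothetical example)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(source_date_epoch))
    return f'#define BUILD_DATE "{stamp}"\n'


# Assumed values: the arm64 buildd starts the binNMU 90 minutes after amd64.
amd64_epoch = 1_700_000_000
arm64_epoch = amd64_epoch + 90 * 60

amd64_header = render_header(amd64_epoch)
arm64_header = render_header(arm64_epoch)

# Each architecture reproduces its own header bit for bit ...
same_arch_reproducible = render_header(amd64_epoch) == amd64_header
# ... but the two architectures disagree on the shared file.
headers_conflict = amd64_header != arm64_header

# With one timestamp shared by all architectures, the contents agree.
shared_epoch = amd64_epoch
shared_headers = {arch: render_header(shared_epoch) for arch in ("amd64", "arm64")}
headers_agree = len(set(shared_headers.values())) == 1
```

The last part corresponds to what the proposal below aims for: every architecture sees the same timestamp, so date-derived contents coincide.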

Earlier, SOURCE_DATE_EPOCH was seeded from the most recent non-binNMU timestamp, but that happened to break Ian's backup system. It used to make Multi-Arch: same work though.

Proposed solution

Use the timestamp of when the binNMU was requested in the binNMU changelog.

When running wanna-build --binNMU (e.g. via the wb wrapper), a row is inserted into the transactions table for each architecture, setting the action to --binNMU. The first change here is using the same timestamp for all those rows. The time column is already not unique, and this would merely coalesce a few more timestamps to the same value.
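The idea can be sketched as follows. This is not wanna-build code: it uses Python with an in-memory SQLite database as a stand-in, and the column names other than action and time, as well as the request_binnmu helper, are assumptions made for illustration.

```python
import sqlite3
import time


def request_binnmu(db, package, version, distribution, architectures, now=None):
    """Insert one --binNMU transaction per architecture, all sharing one time."""
    # The timestamp is taken once, before iterating over the architectures.
    requested_at = int(time.time()) if now is None else now
    db.executemany(
        "INSERT INTO transactions"
        " (package, version, distribution, architecture, action, time)"
        " VALUES (?, ?, ?, ?, '--binNMU', ?)",
        [(package, version, distribution, arch, requested_at)
         for arch in architectures],
    )
    return requested_at


# Hypothetical, simplified schema; the real table has a different layout.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE transactions (package TEXT, version TEXT, distribution TEXT,"
    " architecture TEXT, action TEXT, time INTEGER)"
)

request_binnmu(db, "libfoo", "1.0-1", "sid", ["amd64", "arm64", "riscv64"])
row_count = db.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
distinct_times = db.execute(
    "SELECT COUNT(DISTINCT time) FROM transactions"
).fetchone()[0]
```

All three rows end up with the same value in the time column, regardless of how long the loop takes.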

When a build daemon "takes" a build using API version 1, it receives a YAML object containing metadata about the build being performed. This happens to include the binNMU increment and the changelog text. This should be augmented with a new field named binNMU-timestamp. Existing autobuilders tolerate unknown extra fields and ignore them.

There are two possible approaches to getting the timestamp during the take action. The take action mainly draws its information from the packages table, which does not record the relevant timestamp. The first approach is therefore another SQL query against the transactions table for the most recent time a --binNMU was requested for the same package/version/architecture/distribution combination. This works given that a newer binNMU cancels out an earlier one.
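Such a lookup could look roughly like this. Again, this is a sketch in Python against an in-memory SQLite stand-in, with assumed column names and a hypothetical latest_binnmu_time helper, not the actual wanna-build schema or code.

```python
import sqlite3


def latest_binnmu_time(db, package, version, architecture, distribution):
    """Return the most recent --binNMU request time, or None if there is none."""
    row = db.execute(
        "SELECT MAX(time) FROM transactions"
        " WHERE action = '--binNMU' AND package = ? AND version = ?"
        " AND architecture = ? AND distribution = ?",
        (package, version, architecture, distribution),
    ).fetchone()
    # MAX() over zero rows yields a single NULL, which sqlite3 maps to None.
    return row[0]


# Hypothetical, simplified schema and data for illustration only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE transactions (package TEXT, version TEXT, distribution TEXT,"
    " architecture TEXT, action TEXT, time INTEGER)"
)
db.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("libfoo", "1.0-1", "sid", "amd64", "--binNMU", 100),  # superseded request
        ("libfoo", "1.0-1", "sid", "amd64", "--binNMU", 200),  # newest request
        ("libfoo", "1.0-1", "sid", "amd64", "--take", 300),    # other action (name assumed)
        ("libfoo", "1.0-1", "sid", "arm64", "--binNMU", 200),
    ],
)

newest = latest_binnmu_time(db, "libfoo", "1.0-1", "amd64", "sid")
missing = latest_binnmu_time(db, "libfoo", "1.0-1", "s390x", "sid")
```

Taking the maximum relies on the property stated above: a later binNMU request replaces an earlier one, so only the newest request time matters.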

Alternatively, the packages table could be augmented with a new column recording the timestamp when the binNMU was requested. This would be readily available in the subroutine add_one_building and could be emitted from there. In my opinion, this would be slightly cleaner at the cost of a schema change.

From here we leave the wanna-build source code and continue in sbuild. It already has --binNMU-timestamp and all that the buildd code would have to do here is check for the newly added field and pass it along using the new option. When the field is not provided, the option is skipped. Thus, this change would also be backwards compatible.
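The buildd side of this could be sketched as below. This is Python rather than the actual buildd code, and the sbuild_binnmu_args helper as well as the field names "binNMU" and "extra-changelog" are assumptions; only binNMU-timestamp is the field proposed here. The sbuild options --binNMU, --make-binNMU and --binNMU-timestamp are existing ones.

```python
def sbuild_binnmu_args(build_info: dict) -> list:
    """Translate the metadata received on take into sbuild binNMU options."""
    args = []
    if "binNMU" in build_info:
        # Field names "binNMU" and "extra-changelog" are assumed for this sketch.
        args.append(f"--binNMU={build_info['binNMU']}")
        args.append(f"--make-binNMU={build_info['extra-changelog']}")
        # Only pass the timestamp when wanna-build provided it; an old
        # wanna-build omits the field and sbuild keeps its current behaviour.
        timestamp = build_info.get("binNMU-timestamp")
        if timestamp is not None:
            args.append(f"--binNMU-timestamp={timestamp}")
    return args


# Metadata as an updated wanna-build might send it (values made up).
new_style = {
    "binNMU": 1,
    "extra-changelog": "Rebuild against libbar2",
    "binNMU-timestamp": 1_700_000_000,
}
# Metadata from an old wanna-build without the new field.
old_style = {"binNMU": 1, "extra-changelog": "Rebuild against libbar2"}

new_args = sbuild_binnmu_args(new_style)
old_args = sbuild_binnmu_args(old_style)
```

With the field present, the option list gains --binNMU-timestamp=1700000000; without it, the list is the same as what would be passed today.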

An updated wanna-build would work with old buildds, and updated buildds would work with an old wanna-build. Only when both are updated would the timestamp change, and an entire class of Multi-Arch: same file conflicts would go away.

Discussion

What do you think about this solution?

buildd team:

reproducible team:

sbuild team:

all:

Let me answer for cross builders that this would fix somewhere between 10 and 100 M-A:same file conflicts. This is always a latent problem that only surfaces once a package is binNMUed, and therefore the full scale is difficult to gauge.

Ian's backup system would be happy with this as far as I can tell.

Helmut

