Skip to content

Platform package repository snapshots (and support for build metadata in package versions) #796

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 8, 2025

Conversation

dzuelke
Copy link
Contributor

@dzuelke dzuelke commented May 7, 2025

This handles packages such as "php-8.4.6+build2", which will allow re-building of existing packages without overwriting them in a published repository (a pre-requisite for stable repository snapshots).

Because semver ignores build metadata for version comparison, if multiple packages with the same version but different build metadata exists, mkrepo.sh will only include the "highest" build metadata variant (sorted using "natural", version-like sorting, so "+build11" is higher than "+build9" etc).

For snapshots, a hash is calculated based on the sorted list of formulae in support/build/. If a formula is renamed or added, the hash changes. During repository upload, mkrepo.sh will create a packages-${hash}.json in addition to packages.json (which still serves as the "bleeding edge" repository, as before).

In bin/compile, the same hash is used to calculate a repository URL.

Together with the changes for build metadata variants in package formulae versions described above, this allows stable repository snapshots, including re-builds of existing published package versions, that are not picked up by published buildpack releases, since they would reference a different snapshot hash in the URL to packages-${hash}.json.

CI now requires platform repo snapshots to be present on the develop bucket/prefix for pull request runs. This still allows manual workflow_dispatch runs without a snapshot, e.g. during development/debugging. Tags and main run against the default repo, as usual, which now uses snapshots. The Prepare Release workflow enforces the presence of a snapshot on the stable bucket/prefix as an additional measure.

Syncing from develop to stable creates the snapshot in the stable bucket, too. A sync is rejected if the destination already has the calculated snapshot. This helps prevents accidental sync runs using e.g. main instead of the branch with the formulae changes; this concern is going away longer term once we no longer treat "all present packages" as the canonical form of a repository, but instead explicitly build repository snapshots from a known list of packages that are expected to be present.

GUS-W-18461573
GUS-W-18463967

@dzuelke dzuelke requested a review from a team as a code owner May 7, 2025 17:56
@dzuelke dzuelke force-pushed the platform-repo-snapshots branch 2 times, most recently from 452f93a to a8d530f Compare May 8, 2025 00:12
Copy link
Member

@Malax Malax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, I'm missing some context for the finer details but the code has comments to explain these. But I did not verify that the comments were correct.

The only thing that stood out to me are the inconsistencies in the test fixture .json files when it comes to trailing newlines. They probably should have them. I'll approve this PR nonetheless to avoid unnecessary re-review loops. Please consider fixing those before merging. :)

dzuelke added 11 commits May 8, 2025 14:12
Really only expected_platform_composer.json should have no trailing newline at EOF because that is how they get generated and we imported them verbatim from the generator calls.
Less confusing if the two s3:// URLs are indented the same.
This handles packages such as 'php-8.4.6+build2', which will allow re-building of existing packages without overwriting them in a published repository (a pre-requisite for stable repository snapshots).

Because semver ignores build metadata for version comparison, if multiple packages with the same version but different build metadata exists, mkrepo.sh will only include the "highest" build metadata variant (sorted using "natural", version-like sorting, so "+build11" is higher than "+build9" etc).

GUS-W-18461573
easiest to tell apart which is getting installed using build metadata in the version string
just a bit tidier, they are not formulae
A hash is calculated based on the sorted list of formulae in support/build/. If a formula is renamed or added, the hash changes. During repository upload, mkrepo.sh will create a packages-${hash}.json in addition to packages.json (which still serves as the "bleeding edge" repository, as before).

In bin/compile, the same hash is used to calculate a repository URL.

Together with the previous changes for build metadata variants in package formulae versions, this allows stable repository snapshots, including re-builds of existing published package versions, that are not picked up by published buildpack releases, since they would reference a different snapshot hash in the URL to packages-${hash}.json.

CI now requires platform repo snapshots to be present on the develop bucket/prefix for pull request runs. This still allows manual workflow_dispatch runs without a snapshot, e.g. during development/debugging. Tags and main run against the default repo, as usual, which now uses snapshots. The Prepare Release workflow enforces the presence of a snapshot on the stable bucket/prefix as an additional measure.

GUS-W-18463967
@dzuelke dzuelke force-pushed the platform-repo-snapshots branch from a8d530f to f4db1b4 Compare May 8, 2025 12:12
@dzuelke
Copy link
Contributor Author

dzuelke commented May 8, 2025

Yes, well spotted @Malax, I tidied those up (also in any other fixtures that were missing them, except for all the expected_platform_composer.jsons, which are verbatim generated like that without a trailing newline).

dzuelke added 5 commits May 8, 2025 14:30
…has nothing to do otherwise

Useful e.g. if obsolete formulae (that were never built in the destination) got removed, say, on stack EOL. Or if the way the snapshot hash is computed ever changes, but not the formulae themselves. Both of these of course require manual copying of the packages-SNAPSHOT.json files in the source bucket, but their existence is already guaranteed by the CI rules that force snapshot usage on Pull Requests.
Now that we have snapshots, packages will never be removed; the platform-remove workflow is only for iterating during development.
We definitely do not want to cancel in-progress syncs for other stacks if one fails for whatever reason, that could leave a bucket/repo in a bad state.
Just internal release tooling in there.
@dzuelke dzuelke force-pushed the platform-repo-snapshots branch from f4db1b4 to 0f0a507 Compare May 8, 2025 12:30
@dzuelke
Copy link
Contributor Author

dzuelke commented May 8, 2025

Sync (only to create snapshots on dist-${STACK}-stable/): https://github.com/heroku/heroku-buildpack-php/actions/runs/14906653805

@dzuelke dzuelke merged commit 5f00b09 into main May 8, 2025
22 checks passed
@dzuelke dzuelke deleted the platform-repo-snapshots branch May 8, 2025 12:42

When updating the build for an existing, published version of a package (e.g. with updated compile options), it is recommended to use *build metadata* in the version string of the updated package in order to distinguish it from the existing, published version. For example, `php-8.4.6` would become `php-8.4.6+build2`.

Per Semver specifications, any build metadata in a version string is ignored during version comparison. This means it could not be guaranteed that `php-8.4.6+build2` is picked, if it exists in the same repository as `php-8.4.6`. For this reason, when [(Re-)generating Repositories](re-generating-repositories), only the package/version combination with the "highest" (compared using lenient version string comparison) build metadata will be included, meaning that in the example above, `php-8.4.6` would not be included in the generated repository, only `php-8.4.6+build2`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link is broken:

$ curl -I -s https://github.com/heroku/heroku-buildpack-php/blob/main/support/build/re-generating-repositories | grep HTTP
HTTP/2 404
$ curl -I -s https://github.com/heroku/heroku-buildpack-php/blob/0f0a5075ae34b471ca17bbc2c05bb88e8c01ca3e/support/build/re-generating-repositories | grep HTTP
HTTP/2 404

I think you need a # so it's an anchor link. That's what the link in the ### Repository Snapshots section uses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sent a PR to fix #800

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants