[8.3.0] Repo contents cache #26129

Merged 5 commits on May 28, 2025

Wyverald
Member

Original commits:

Wyverald and others added 4 commits May 22, 2025 14:00
`FileSystemLock` now exposes two methods to get a lock:

*   `tryGet`, which returns immediately and throws an exception if the lock cannot be acquired
*   `get`, which blocks until the lock can be acquired

This CL also uses `tryGet` everywhere that `FileSystemLock` was used before.
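
The difference between the two acquisition modes can be sketched with plain `java.nio` file locks. This is only an illustration under assumptions: `FileLockSketch` and its helper names are hypothetical and are not Bazel's `FileSystemLock`, whose implementation is not shown in this PR description.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for illustration; this is not Bazel's FileSystemLock. */
final class FileLockSketch implements AutoCloseable {
  private final FileChannel channel;
  private final FileLock lock;

  private FileLockSketch(FileChannel channel, FileLock lock) {
    this.channel = channel;
    this.lock = lock;
  }

  private static FileChannel open(Path lockFile) throws IOException {
    return FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
  }

  /** Non-blocking: returns at once with the lock, or throws if it is already held. */
  static FileLockSketch tryGet(Path lockFile) throws IOException {
    FileChannel channel = open(lockFile);
    try {
      FileLock lock = channel.tryLock(); // null when another process holds the lock
      if (lock == null) {
        throw new IOException("lock is held by another process: " + lockFile);
      }
      return new FileLockSketch(channel, lock);
    } catch (OverlappingFileLockException e) {
      channel.close();
      throw new IOException("lock is already held in this JVM: " + lockFile, e);
    } catch (IOException e) {
      channel.close();
      throw e;
    }
  }

  /** Blocking: waits until any other process releases the lock. */
  static FileLockSketch get(Path lockFile) throws IOException {
    FileChannel channel = open(lockFile);
    try {
      // lock() blocks across processes, but still throws
      // OverlappingFileLockException if this same JVM already holds the lock.
      return new FileLockSketch(channel, channel.lock());
    } catch (OverlappingFileLockException | IOException e) {
      channel.close();
      throw e;
    }
  }

  @Override
  public void close() throws IOException {
    try {
      lock.release();
    } finally {
      channel.close();
    }
  }
}
```

A caller that cannot afford to wait would use `tryGet` and handle the exception, while a caller that must eventually proceed would use `get`.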

PiperOrigin-RevId: 757839607
Change-Id: Iada28f0ccd8b3fb80847b7e71c0438577e9372de
As the title suggests, this PR does two things:

1. It refactors `RepositoryDirectoryValue`.
   - This class used to have a `repositoryExists()` boolean method, and various other methods that throw at runtime depending on whether `repositoryExists()` is true or false. This PR changes it so that `RepositoryDirectoryValue` is a sealed interface with two impls -- `Success` and `Failure`.
   - The "success" case used to hold on to a digest. This digest is effectively a hash of the marker file, and it's not actually used anywhere. However, removing it causes tests to mysteriously start failing. Upon closer inspection, this is because the digest affects change pruning: the marker file changing is almost always equivalent to the fetched contents changing. I can't think of a counterexample, but relying on this seems slightly wrong, so I changed `RepositoryDirectoryValue` to never use change pruning instead (by implementing `NotComparableSkyValue`).
      - Note that this is potentially a subtle behavior change, but should improve correctness, if anything.
2. It renames `RepositoryCache` to `DownloadCache`. This is admittedly a mostly unrelated change... If any reviewer requests it, I'm happy to spend the time splitting this part into a separate PR.
   - It also adds a new `RepositoryCache` that holds onto the `DownloadCache` and the soon-to-be-introduced `RepoContentsCache`. The latter is just a "readonly" trivial implementation for now.
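
As a rough illustration of the first refactoring, the sketch below shows the general shape of a sealed interface with `Success` and `Failure` implementations. The fields (`path`, `errorMessage`), the wrapper class `RepoDirSketch`, and the `describe` helper are assumptions made for the example; the real `RepositoryDirectoryValue` carries more state and also implements `NotComparableSkyValue`.

```java
import java.nio.file.Path;

/** Hypothetical simplification for illustration; not the actual Bazel classes. */
final class RepoDirSketch {
  /** Each state is its own type, so no accessor has to throw at runtime. */
  sealed interface RepositoryDirectoryValue permits Success, Failure {}

  /** The repo was fetched; only this case exposes a directory. */
  record Success(Path path) implements RepositoryDirectoryValue {}

  /** The repo could not be produced; only this case exposes an error. */
  record Failure(String errorMessage) implements RepositoryDirectoryValue {}

  static String describe(RepositoryDirectoryValue value) {
    if (value instanceof Success s) {
      return "fetched into " + s.path();
    }
    if (value instanceof Failure f) {
      return "failed: " + f.errorMessage();
    }
    // Unreachable: the sealed interface permits only the two cases above.
    throw new AssertionError("unexpected value: " + value);
  }
}
```

Compared with a `repositoryExists()` flag, callers here have to match on the concrete type before they can reach `path()` or `errorMessage()`, so the wrong accessor cannot be called by accident.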

Work towards #12227.

Closes #25919.

PiperOrigin-RevId: 750740171
Change-Id: Id87180b75b4a856fa0734952105c178b7d41b74e
This mostly follows the design doc at https://docs.google.com/document/d/1ZScqiIQi9l7_8eikbsGI-rjupdbCI7wgm1RYD76FJfM/edit?pli=1&tab=t.0#heading=h.5mcn15i0e1ch

- Adds a new flag `--repo_contents_cache` to specify the path of the repo contents cache, where we'll store fetched contents of repos that can be cached safely across workspaces.
  - The flag defaults to `<value of --repository_cache>/contents`.
  - If this path ends up being inside the main repo, we throw an error. This is because files inside the source tree are considered immutable during the lifetime of a Bazel invocation, which means writing into the repo contents cache can cause spurious failures.
- A repo rule can indicate readiness for caching by returning `repository_ctx.repo_metadata(reproducible=True)` from its implementation function, similarly to module extensions.
  - Note that reproducibility/cacheability is not per repo rule (like "local"-ness), but per repo. Example: `http_archive` is only cacheable if the checksum is provided.
- Before we fetch a repo, we first check if matching entries exist in the repo contents cache under the key of the hash of all its predeclared inputs (`HVal(Pre(R))` in the doc). If not, fetch as normal.
  - The predeclared inputs include repo attrs, repo rule impl .bzl hashes, Starlark semantics, etc.
- If matching entries exist, we go through each of them and check whether it's up to date by examining its "recorded inputs file" (analogous to the marker file in `outputBase/external`). If we find an up-to-date entry, we set up a symlink and declare victory. Otherwise, fetch as normal.
- After fetching, if the repo rule indicates cacheability, we move the fetched contents into the repo contents cache by appending a new entry under the predeclared inputs hash key.
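
The flow in the list above can be sketched with a small in-memory model. Everything here is an assumption made for illustration: the class `RepoContentsCacheSketch`, the `CacheEntry` record, the use of SHA-256 for the key, and representing inputs as string maps are not taken from Bazel's implementation, which works on files and symlinks.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical in-memory model for illustration; not Bazel's implementation. */
final class RepoContentsCacheSketch {
  /** A cached fetch result plus the input values recorded while fetching it. */
  record CacheEntry(String contentsDir, Map<String, String> recordedInputs) {}

  private final Map<String, List<CacheEntry>> entriesByKey = new HashMap<>();

  /** Applies the flag default and rejects a cache located inside the main repo. */
  static Path resolveCacheDir(Path flagValue, Path repositoryCache, Path mainRepo) {
    Path dir = flagValue != null ? flagValue : repositoryCache.resolve("contents");
    dir = dir.toAbsolutePath().normalize();
    if (dir.startsWith(mainRepo.toAbsolutePath().normalize())) {
      throw new IllegalArgumentException("repo contents cache is inside the main repo: " + dir);
    }
    return dir;
  }

  /** Hashes the predeclared inputs in sorted order so the key is deterministic. */
  static String predeclaredInputsKey(Map<String, String> predeclaredInputs) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      for (Map.Entry<String, String> e : new TreeMap<>(predeclaredInputs).entrySet()) {
        digest.update((e.getKey() + "=" + e.getValue() + "\n").getBytes(StandardCharsets.UTF_8));
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every Java platform
    }
  }

  /** Returns the first entry under the key whose recorded inputs all still match. */
  Optional<CacheEntry> lookup(String key, Function<String, String> currentValueOf) {
    for (CacheEntry entry : entriesByKey.getOrDefault(key, List.of())) {
      boolean upToDate =
          entry.recordedInputs().entrySet().stream()
              .allMatch(in -> in.getValue().equals(currentValueOf.apply(in.getKey())));
      if (upToDate) {
        return Optional.of(entry);
      }
    }
    return Optional.empty(); // the caller fetches as normal
  }

  /** After a fetch, appends a new entry only if the repo reported itself reproducible. */
  void store(String key, CacheEntry entry, boolean reproducible) {
    if (reproducible) {
      entriesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(entry);
    }
  }
}
```

Several entries can share one predeclared-inputs key because the inputs recorded during a fetch (for example, environment variables read by the rule) may differ between fetches; the lookup therefore checks each entry in turn.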

RELNOTES: Added a new flag `--repo_contents_cache` (defaults to the `contents` directory under the `--repository_cache`) where Bazel stores fetched contents of repos that can be safely cached across workspaces. A repo rule can indicate cacheability by returning `repository_ctx.repo_metadata(reproducible=True)` from its implementation function.

Work towards #12227.

Closes #25938.

PiperOrigin-RevId: 758682171
Change-Id: Ie703152a98745f7382c3d095a1dbf4b35c3408eb
Very simple GC implementation for the repo contents cache. Entry access is logged by "touching" the recorded inputs file. GC tasks then delete old entries that haven't been accessed within the duration given by `--repo_contents_cache_gc_max_age`.
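
A minimal sketch of this touch-then-sweep scheme follows, using file modification times as the access log. The class name, the `.recorded_inputs` file suffix, and the flat directory layout are assumptions for the example only; the actual on-disk layout is not described here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch for illustration; file naming and layout are assumed. */
final class RepoContentsCacheGcSketch {
  /** Logs an access by bumping the file's modification time. */
  static void touch(Path recordedInputsFile, Instant now) throws IOException {
    Files.setLastModifiedTime(recordedInputsFile, FileTime.from(now));
  }

  /** Deletes recorded inputs files last touched more than maxAge ago and returns them. */
  static List<Path> collectGarbage(Path cacheDir, Duration maxAge, Instant now)
      throws IOException {
    Instant cutoff = now.minus(maxAge);
    List<Path> expired = new ArrayList<>();
    try (DirectoryStream<Path> files =
        Files.newDirectoryStream(cacheDir, "*.recorded_inputs")) {
      for (Path file : files) {
        if (Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
          expired.add(file);
        }
      }
    }
    for (Path file : expired) {
      Files.delete(file); // a real GC would also remove the entry's fetched contents
    }
    return expired;
  }
}
```

Using the modification time as the access record keeps the bookkeeping inside the cache directory itself, so no separate index has to be kept consistent across concurrent Bazel invocations.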

Work towards #12227.

Closes #26080.

PiperOrigin-RevId: 761930313
Change-Id: I6c0b92771f57d9949380fb08698385c4a96fe7d4
@Wyverald Wyverald requested a review from a team as a code owner May 22, 2025 18:35
@github-actions github-actions bot added the team-Android, team-Performance, team-Configurability, team-ExternalDeps, team-Remote-Exec, and awaiting-review labels May 22, 2025
@Wyverald Wyverald enabled auto-merge May 22, 2025 18:35
@Wyverald Wyverald force-pushed the wyv-830-repocache branch from abed451 to 3775086 Compare May 27, 2025 19:24
@iancha1992 iancha1992 added this to the 8.3.0 release blockers milestone May 27, 2025
@Wyverald Wyverald force-pushed the wyv-830-repocache branch from 3775086 to eff6477 Compare May 27, 2025 21:17
@Wyverald Wyverald force-pushed the wyv-830-repocache branch from eff6477 to d929084 Compare May 28, 2025 00:55
@Wyverald Wyverald added this pull request to the merge queue May 28, 2025
Merged via the queue into release-8.3.0 with commit 33ee3b5 May 28, 2025
49 checks passed
@github-actions github-actions bot removed the awaiting-review label May 28, 2025
@Wyverald Wyverald deleted the wyv-830-repocache branch May 28, 2025 03:22
2 participants