fix(fr32): correctly calculate the todo size under low core counts #12884


Open
wants to merge 3 commits into
base: fix/unpadreader-panics

Conversation

tediou5
Contributor

@tediou5 tediou5 commented Feb 10, 2025

Related Issues

Fixes #9324
Based on #12491.

Proposed Changes

Currently, the unpadReader's work buffer is sized as MTTresh * mtChunkCount(sz). When the core count is low, the resulting buffer can be small enough that the todo length exceeds len(work).
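
To make the failure mode concrete, here is a minimal standalone sketch with made-up numbers (the MTTresh and mtChunkCount stand-ins are assumptions for illustration only; the real values live in readers.go):

// illustrative sketch, not part of the PR
package main

import "fmt"

func main() {
  const mtThresh = 4 // stand-in for MTTresh, illustrative only
  chunkCount := 1    // what mtChunkCount might yield on a low-core machine

  work := make([]byte, mtThresh*chunkCount)
  outTwoPow := 16 // output window, already rounded down to a power of two

  todo := outTwoPow             // without a clamp against len(work)...
  fmt.Println(todo > len(work)) // ...this prints true, and work[:todo] would
                                // fail with "slice bounds out of range"

  todo = min(todo, len(work)) // the clamp this PR adds (Go 1.21+ builtin min)
  fmt.Println(work[:todo])    // now safe: [0 0 0 0]
}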

In the current implementation, the length of out is expanded automatically by the caller (for example, io.ReadAll grows the buffer it passes to Read via append), which makes adjusting it quite troublesome:

// io.go
func ReadAll(r Reader) ([]byte, error) {
  b := make([]byte, 0, 512)
  for {
    n, err := r.Read(b[len(b):cap(b)])
    b = b[:len(b)+n]
    if err != nil { /* ... */ }

    if len(b) == cap(b) {
      // Add more capacity (let append pick how much).
      b = append(b, 0)[:len(b)] // <-- here
    }
  }
}
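
For context, a small standalone demonstration (not from the PR) of what the marked line does: append allocates a larger backing array, and the trailing re-slice keeps the length unchanged, so only the capacity handed to the next Read grows.

// illustration only
package main

import "fmt"

func main() {
  b := make([]byte, 0, 512)
  b = b[:cap(b)] // pretend Read has filled the whole buffer

  // append must allocate a bigger backing array; slicing back to len(b)
  // drops the extra element, leaving len unchanged but cap enlarged.
  b = append(b, 0)[:len(b)]

  fmt.Println(len(b), cap(b)) // 512 and a larger capacity; the exact growth is runtime-dependent
}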

There are two feasible solutions:

  1. Increase the minimum mtChunkCount to 16.
  2. Ensure that todo does not exceed the size of work.

I'm not entirely sure if other conditions could also lead to similar issues, so for now, I've chosen the second approach. I added a check when calculating todo to prevent overflow:

todo := min(abi.PaddedPieceSize(outTwoPow), abi.PaddedPieceSize(len(r.work)))

Additionally, I've fixed the CI.

Additional Info


@rjan90 rjan90 added the skip/changelog This change does not require CHANGELOG.md update label Feb 10, 2025
@rjan90 rjan90 requested a review from magik6k February 10, 2025 07:46
@BigLep BigLep requested a review from Copilot May 27, 2025 06:25
@BigLep
Member

BigLep commented May 27, 2025

I'm not up on the specifics of this PR, but I know it's been open for a long time. What is the worst case that could happen if this code is "wrong"? I'm trying to gauge whether we'd be better off just merging it than keeping it open.

Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR ensures the unpadReader never schedules more work than its buffer can hold under low core counts by clamping the todo size, and it bolsters the test suite with reliability checks and a new edge-case test.

  • Clamp the todo chunk size in readInner to the work buffer length
  • Add TestUnpadReaderBufWithSmallWorkBuf to verify behavior with very small buffers (a sketch of what such a test could look like follows this list)
  • Strengthen existing tests by validating the number of bytes read from rand.Read
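
For readers without the diff open, here is a hedged sketch of what such an edge-case test could look like. NewUnpadReaderBuf's exact signature and any minimum work-buffer size it enforces are assumptions inferred from the test name; the real test in readers_test.go may differ.

// Sketch only; not the actual test from this PR.
package fr32_test

import (
  "bytes"
  "crypto/rand"
  "io"
  "testing"

  "github.com/filecoin-project/go-state-types/abi"
  "github.com/stretchr/testify/require"

  "github.com/filecoin-project/lotus/storage/sealer/fr32"
)

func TestUnpadReaderBufWithSmallWorkBuf(t *testing.T) {
  ps := abi.PaddedPieceSize(1 << 20) // 1 MiB padded piece

  raw := make([]byte, ps.Unpadded())
  n, err := rand.Read(raw)
  require.NoError(t, err)
  require.Equal(t, len(raw), n) // the "validate rand.Read" tightening mentioned above

  padded := make([]byte, ps)
  fr32.Pad(raw, padded)

  // A deliberately tiny work buffer, so todo must be clamped to len(work)
  // (assuming NewUnpadReaderBuf does not enforce a larger minimum).
  work := make([]byte, 16<<10)
  r, err := fr32.NewUnpadReaderBuf(bytes.NewReader(padded), ps, work)
  require.NoError(t, err)

  got, err := io.ReadAll(io.LimitReader(r, int64(ps.Unpadded())))
  require.NoError(t, err)
  require.Equal(t, raw, got)
}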

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

storage/sealer/fr32/readers.go: clamp todo to len(r.work) in readInner to prevent overflow
storage/sealer/fr32/readers_test.go: add new buffer-size edge-case test and tighten randomness reads in tests
Comments suppressed due to low confidence (1)

storage/sealer/fr32/readers_test.go:40

  • [nitpick] The test name is a bit verbose; renaming it to TestUnpadReaderWithSmallWorkBuf would align better with existing test naming patterns and improve readability.
func TestUnpadReaderBufWithSmallWorkBuf(t *testing.T) {

Co-authored-by: Copilot <[email protected]>
@tediou5
Contributor Author

tediou5 commented May 28, 2025

I'm not up on the specifics of this PR, but I know it's been open for a long time. What is the worst case that could happen if this code is "wrong"? I'm trying to gauge whether we'd be better off just merging it than keeping it open.

There's less data to read each time, which affects performance, I guess.

@rvagg
Member

rvagg commented May 28, 2025

I've found it hard to context switch deep enough to grok either this or the previous attempt to edit this code, so it's hard for me to weigh in.

There's less data to read each time, which affects performance, I guess.

This seems right but I don't have time to validate it. I think if we just go ahead and roll this out, we'd find out about problems with it if we started hearing about CommP mismatches. We might just have to 🤞 and go ahead unless we have someone else who has the brainspace to drop in here and grok what this code is doing and give a proper 👍 . Maybe Kuba could do that when he's back from his break?

@BigLep BigLep requested a review from Kubuxu May 28, 2025 04:46
Labels
skip/changelog This change does not require CHANGELOG.md update
Projects
Status: 🔎 Awaiting Review
4 participants