Investigate why distance from head is more than expected #960
Comments
2025-04-30
We're going to look into both of these. There is a gap between instances 5 and ~30. Looks like a coordinated drop in participation. Curio thread
The concern was that the distance from head is now 10 vs. in passive testing it was 9 5% of the time. We're focused on "our ship" first (manifest differences).
2025-04-29
Going down the path of getting an observer set up so we can see who isn't participating.
Hypothesis 1: instances upgraded to the retracted version that won't be activated by the contract.
Checked pubsub settings in lotus, in relation to the network name change in the activation manifest. The only difference I see in terms of code path execution in lotus is how the list of allowed topics is compiled here.
Per filecoin-project/f3-activation-contract#22 (comment), let's also capture a snapshot of the minerIds that are participating so we can take diffs against it after future changes.
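A minimal sketch of the snapshot diff suggested above, assuming each snapshot is simply a list of miner ID strings. The function name `diff_participants` and the sample IDs are hypothetical and are not taken from go-f3 or lotus; how the snapshot itself is captured is left open here.

```python
def diff_participants(before, after):
    """Compare two snapshots of participating miner IDs.

    Returns (dropped, joined): IDs present only in `before`,
    and IDs present only in `after`, each sorted for stable output.
    """
    before_set, after_set = set(before), set(after)
    dropped = sorted(before_set - after_set)
    joined = sorted(after_set - before_set)
    return dropped, joined


if __name__ == "__main__":
    # Hypothetical snapshots, for illustration only.
    snapshot_pre = ["f01000", "f01001", "f01002"]
    snapshot_post = ["f01001", "f01002", "f01003"]
    dropped, joined = diff_participants(snapshot_pre, snapshot_post)
    print("dropped:", dropped)
    print("joined:", joined)
```

Keeping the snapshots as plain ID lists means any two points in time can be compared, not just consecutive ones.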
Time spent checkpointing settled to a small value and is unlikely to be causing issues here. The initial delay in checkpointing was only observed during instance restart, which is expected since the node was slightly behind on syncing the chain. After that, checkpointing time dropped to a few milliseconds at the 99th percentile.
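For reference, one way to compute a 99th-percentile figure from raw timing samples is the nearest-rank method, sketched below. This is an assumption about how such a number could be derived; the dashboards referenced in this issue may use a different estimator, and `percentile_nearest_rank` is a hypothetical helper name.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering at least p% of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

With samples collected only after the restart window, the slow initial checkpoints would not skew the result.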
2025-05-01 standup update: A key thread is the drop in participation. That is the main thread we'll pull on. To do that we will...
Next steps
2025-05-06 standup: we agreed the actions in #960 (comment) should be done, but those are lower priority than other work items that have emerged. This issue is still part of https://github.com/filecoin-project/go-f3/milestone/7
This is a tracking issue for investigating why F3 participation post-activation is lower than what we observed in passive testing hours before.
We went from 5 epochs behind on average to ~8 epochs behind on average.
Before activation:
https://grafana.f3.eng.filoz.org/d/edsu1k5s7gtfkb/f3-passive-testing?orgId=1&var-network=mainnet&var-instance=ida.f3.eng.filoz.org%3A80&from=1745798400000&to=1745884800000&viewPanel=56
After activation:
https://grafana.f3.eng.filoz.org/d/edsu1k5s7gtfkb/f3-passive-testing?orgId=1&var-network=mainnet&var-instance=ida.f3.eng.filoz.org%3A80&from=1745928000000&to=1746014400000&viewPanel=56
(Note: I'm not showing one contiguous graph since there is a bootstrap phase that dramatically scales up the y-axis.)