What's Changed
- Sow top activations based on absolute value. by @copybara-service in #4670
- Add support for layer-specific rope scale factors. by @copybara-service in #4672
- Automatic model selection for Gemma 3 models. by @copybara-service in #4671
- Make LoRA's dtype arg useful by @IvyZX in #4681 (usage sketch after this list)
- [NVIDIA] Support FP8 Einsum Op by @kaixih in #4686
- [nnx] remove deprecated APIs by @cgarciae in #4627
- Add attention_bias parameter to MultiHeadDotProductAttention. by @copybara-service in #4694
- Unit tests for attention_bias parameter to MultiHeadDotProductAttention. Add parameter to all overloads to make pytype happy. by @copybara-service in #4702
- Rollback of attention_bias parameter, because the change overrides the attention bias for injected attention functions. by @copybara-service in #4703
- Add custom einsum op to Einsum() by @IvyZX in #4705 (sketch after this list)
- [nnx] refactor GraphDef by @cgarciae in #4630
- Make fully replicated array before saving checkpoints for examples that use pmap. by @copybara-service in #4707
- Fix CI by @cgarciae in #4716
- remove "nnx" collection in ToLinen by @copybara-service in #4708
- [nnx] flaxlib types by @cgarciae in #4639
- v0.10.6 by @cgarciae in #4724
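
A minimal usage sketch for the LoRA dtype change (#4681), assuming the `flax.nnx.LoRA` constructor takes `(in_features, lora_rank, out_features)` plus `dtype`/`param_dtype` keywords; the intent, as read from the title, is that `dtype` now actually governs the computation dtype while `param_dtype` governs parameter storage.

```python
import jax.numpy as jnp
from flax import nnx

# Sketch only: argument order and keyword names follow the nnx.LoRA docs,
# but treat them as assumptions rather than the exact API.
lora = nnx.LoRA(
    3, 2, 4,                  # in_features, lora_rank, out_features
    dtype=jnp.bfloat16,       # compute dtype (the arg this release makes effective)
    param_dtype=jnp.float32,  # storage dtype for the LoRA params
    rngs=nnx.Rngs(0),
)
y = lora(jnp.ones((1, 3)))
print(y.dtype)  # expected: bfloat16, if dtype controls the compute dtype
```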
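
And a sketch for the custom einsum op added to Einsum() (#4705). The shapes follow the standard `nnx.Einsum` example; the `einsum_op=` keyword and the callable's expected signature (same as `jnp.einsum`) are assumptions about the new hook.

```python
import jax.numpy as jnp
from flax import nnx

def logged_einsum(*args, **kwargs):
  # Hypothetical drop-in for jnp.einsum: print the contraction spec at
  # trace time, then defer to the standard implementation. A real use
  # case might swap in a quantized or sharded einsum here.
  print('einsum spec:', args[0])
  return jnp.einsum(*args, **kwargs)

layer = nnx.Einsum(
    'nta,hab->nthb',          # inputs x kernel -> outputs
    (8, 2, 4),                # kernel shape
    (8, 4),                   # bias shape
    einsum_op=logged_einsum,  # assumed keyword name for the custom op
    rngs=nnx.Rngs(0),
)
y = layer(jnp.ones((16, 11, 2)))
print(y.shape)  # (16, 11, 8, 4)
```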
Full Changelog: v0.10.5...v0.10.6