**README.md**

More models, more fixes
* TinyNet models added by [rsomani95](https://github.com/rsomani95)
* LCNet added via MobileNetV3 architecture
## Introduction
Py**T**orch **Im**age **M**odels (`timm`) is a collection of image models, layers, utilities, optimizers, schedulers, data-loaders / augmentations, and reference training / validation scripts that aim to pull together a wide variety of SOTA models with the ability to reproduce ImageNet training results.
**docs/archived_changes.md**
# Archived Changes
### Nov 22, 2021
* A number of updated weights and new model defs (a loading sketch follows below)
    * `eca_halonext26ts` - 79.5 @ 256
    * `resnet50_gn` (new) - 80.1 @ 224, 81.3 @ 288
    * `resnet50` - 80.7 @ 224, 80.9 @ 288 (trained at 176; not replacing the current a1 weights as default since these don't scale as well to higher res, [weights](https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-rsb-weights/resnet50_a1h2_176-001a1197.pth))
* Groundwork in place for FX feature extraction thanks to [Alexander Soare](https://github.com/alexander-soare) (a usage sketch follows after this list)
    * models updated for tracing compatibility (almost full support with some distilled transformer exceptions)
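A minimal sketch of loading one of the updated weights through `timm`'s registry; the model name is taken from the list above, the rest is standard `timm` usage:

```python
import torch
import timm

# 'resnet50_gn' is one of the new model defs above; pretrained weights download on first use.
model = timm.create_model('resnet50_gn', pretrained=True).eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```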
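For the FX feature extraction groundwork, a hedged sketch using torchvision's FX helpers on a timm model; the tapped node names ('layer2', 'layer4') are assumptions to verify against `get_graph_node_names`:

```python
import torch
import timm
from torchvision.models.feature_extraction import (
    create_feature_extractor,
    get_graph_node_names,
)

model = timm.create_model('resnet50', pretrained=False)
train_nodes, eval_nodes = get_graph_node_names(model)  # all tappable graph nodes

# Tap two intermediate stages; the names below are illustrative, check eval_nodes first.
extractor = create_feature_extractor(model, return_nodes={'layer2': 'feat2', 'layer4': 'feat4'})
feats = extractor(torch.randn(1, 3, 224, 224))
print({k: v.shape for k, v in feats.items()})
```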
### Oct 19, 2021
* [ResNet strikes back](https://arxiv.org/abs/2110.00476) weights added, plus any extra training components used. Model weights and some more details [here](https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-rsb-weights)
* BCE loss and Repeated Augmentation support for the RSB paper (a sketch of the BCE idea follows after this list)
* 4 series of ResNet-based attention model experiments being added (implemented across byobnet.py / byoanet.py). These include all sorts of attention, from channel attn like SE and ECA to 2D QKV self-attention layers such as Halo, Bottleneck, Lambda. Details [here](https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-attn-weights), and a model-creation sketch below
    * Working implementations of the following 2D self-attention modules (likely to be differences from paper or eventual official impl):
* A RegNetZ series of models with some attention experiments (being added to). These do not follow [the paper](https://arxiv.org/abs/2103.06877) in any way other than block architecture; details of the official models are not available. See more [here](https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-attn-weights)
* freeze/unfreeze helpers by [Alexander Soare](https://github.com/alexander-soare) (a usage sketch follows below)
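The RSB recipe swaps standard cross-entropy for binary cross-entropy over per-class targets. A minimal sketch of the idea, not timm's exact loss implementation:

```python
import torch
import torch.nn.functional as F

def bce_classification_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Multi-class classification via BCE: every class is an independent binary target."""
    onehot = F.one_hot(target, num_classes=logits.shape[-1]).to(logits.dtype)
    return F.binary_cross_entropy_with_logits(logits, onehot)

loss = bce_classification_loss(torch.randn(8, 1000), torch.randint(0, 1000, (8,)))
print(loss.item())
```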
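Creating one of the attention-experiment models works like any other timm model. The concrete names below are assumptions (check `timm.list_models` for what your version exposes):

```python
import timm

# Discover attention-experiment model names available in this timm version.
print(timm.list_models('*halo*'))
print(timm.list_models('*lambda*'))

model = timm.create_model('halonet26t', pretrained=False)  # name is an assumption
```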
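For the freeze/unfreeze helpers, a hedged usage sketch; the `timm.utils` import path and signature are assumptions to verify against your timm version, with the manual equivalent as a fallback:

```python
import timm

model = timm.create_model('resnet50', pretrained=False)

try:
    from timm.utils import freeze, unfreeze  # import path is an assumption
    freeze(model, ['layer1', 'layer2'])  # freeze params (and BN stats) of named submodules
    unfreeze(model, ['layer2'])          # selectively thaw one of them again
except ImportError:
    # Manual equivalent: disable grads on the same submodules.
    for name, p in model.named_parameters():
        if name.startswith(('layer1.', 'layer2.')):
            p.requires_grad = False
```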
### Aug 18, 2021
* Optimizer bonanza! (a factory usage sketch follows below)
    * Add LAMB and LARS optimizers, incl trust ratio clipping options. Tweaked to work properly in PyTorch XLA (tested on TPUs w/ `timm bits` [branch](https://github.com/rwightman/pytorch-image-models/tree/bits_and_tpu/timm/bits))
    * Add MADGRAD from FB research w/ a few tweaks (decoupled decay option, step handling that works with PyTorch XLA)
    * Some cleanup on all optimizers and factory. No more `.data`, a bit more consistency, unit tests for all!
    * SGDP and AdamP still won't work with PyTorch XLA but others should (have yet to test Adabelief, Adafactor, Adahessian myself).
* EfficientNet-V2 XL TF ported weights added, but they don't validate well in PyTorch (L is better). The pre-processing for the V2 TF training is a bit different and the fine-tuned 21k -> 1k weights are very sensitive and less robust than the 1k weights.
* Added PyTorch trained EfficientNet-V2 'Tiny' w/ GlobalContext attn weights. Only .1-.2 top-1 better than the SE variant, so more of a curiosity for those interested.
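A hedged sketch of constructing the new optimizers through timm's factory; the `opt` strings and the `trust_clip` keyword are assumptions to check against `timm.optim` in your version:

```python
import timm
from timm.optim import create_optimizer_v2

model = timm.create_model('resnet50', pretrained=False)

# LAMB with trust ratio clipping; the kwarg is assumed to pass through to the optimizer.
opt_lamb = create_optimizer_v2(model, opt='lamb', lr=5e-3, weight_decay=0.02, trust_clip=True)

# MADGRAD with the usual factory-level settings.
opt_madgrad = create_optimizer_v2(model, opt='madgrad', lr=1e-2, weight_decay=0.02)
```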
### July 12, 2021
* Add XCiT models from [official facebook impl](https://github.com/facebookresearch/xcit). Contributed by [Alexander Soare](https://github.com/alexander-soare)