Why is there no normal categorical cross-entropy loss? #4668
-
This question concerns Optax just as much as Flax, but since the issue seems to stem from expectations set by Flax models, I am asking it here. I was following this example in the JAX AI stack documentation and noticed something odd. When implementing a classifier for images of digits, the model does not apply a softmax to its output logits by default. Instead, the softmax is invoked when calling the loss. Given that …
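To illustrate the pattern I mean, here is a rough sketch (my own, not the tutorial's exact code; I'm using flax.linen here, and the class and function names are just placeholders): the model returns raw logits, and the softmax only shows up inside the loss.

```python
import flax.linen as nn
import optax

class Classifier(nn.Module):  # hypothetical model, for illustration only
    num_classes: int = 10

    @nn.compact
    def __call__(self, x):
        x = nn.Dense(128)(x)
        x = nn.relu(x)
        return nn.Dense(self.num_classes)(x)  # raw logits, no softmax here

def loss_fn(params, model, images, labels):
    logits = model.apply(params, images)
    # The softmax only appears here, inside the loss applied to the logits.
    return optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()
```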
Replies: 1 comment
-
Computing cross-entropy directly from logits rather than from probabilities avoids redundant computation and improves numerical stability. PyTorch takes a similar approach (see CrossEntropyLoss and NLLLoss), so this isn't unique to optax or flax.
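As a hypothetical sketch (not from this discussion), the snippet below contrasts the two approaches with `optax.softmax_cross_entropy_with_integer_labels`; the extreme logit values are chosen only to make the underflow visible.

```python
import jax
import jax.numpy as jnp
import optax

logits = jnp.array([[120.0, 0.0, -120.0]])  # raw model outputs for one example
labels = jnp.array([2])                     # true class index

# Stable: the loss works directly on the logits
# (log-softmax / log-sum-exp internally).
stable = optax.softmax_cross_entropy_with_integer_labels(logits, labels)

# Naive: apply softmax first, then take the log of the selected probability.
# The probability of class 2 underflows to 0 in float32, so the log blows up.
probs = jax.nn.softmax(logits)
naive = -jnp.log(probs[jnp.arange(labels.shape[0]), labels])

print(stable)  # [240.] -- the correct cross-entropy
print(naive)   # [inf]  -- the information was lost in the softmax
```

Computing the loss from logits keeps everything in log space, which is why the stable version recovers the correct value where the naive softmax-then-log version cannot.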