Open
Description
Can you explain whether teacher forcing is used in the validation phase, even though no parameters are updated there? That is, do we compute the cross-entropy loss on the validation set the same way we do in training, in order to measure model performance (overfitting, underfitting, generalization), rather than using autoregressive decoding on the validation set? Is autoregressive decoding used ONLY in the test phase and not in validation?
THANKS
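
To make the two options in the question concrete, here is a minimal sketch of both evaluation modes. Everything in it is hypothetical and for illustration only: the toy vocabulary, the lookup-table "model" `toy_next_token_probs`, and the function names are assumptions, not any library's API. Neither function updates parameters, so either one (or both) could in principle run inside a validation loop.

```python
import math

VOCAB = ["<bos>", "a", "b", "<eos>"]

def toy_next_token_probs(prefix):
    """Hypothetical stand-in for a trained decoder: P(next token | last token)."""
    table = {
        "<bos>": [0.0, 0.7, 0.2, 0.1],
        "a":     [0.0, 0.1, 0.6, 0.3],
        "b":     [0.0, 0.1, 0.1, 0.8],
        "<eos>": [0.0, 0.0, 0.0, 1.0],
    }
    return dict(zip(VOCAB, table[prefix[-1]]))

def teacher_forced_loss(target):
    """Mean cross-entropy where the ground-truth prefix is fed at every step."""
    prefix = ["<bos>"]
    total = 0.0
    for gold in target:
        probs = toy_next_token_probs(prefix)
        total += -math.log(probs[gold])
        prefix.append(gold)  # teacher forcing: feed the gold token, not the prediction
    return total / len(target)

def greedy_decode(max_len=10):
    """Autoregressive decoding: each step consumes the model's own previous output."""
    prefix = ["<bos>"]
    for _ in range(max_len):
        probs = toy_next_token_probs(prefix)
        next_token = max(probs, key=probs.get)
        prefix.append(next_token)  # feed back the prediction
        if next_token == "<eos>":
            break
    return prefix[1:]

target = ["a", "b", "<eos>"]
val_loss = teacher_forced_loss(target)   # loss-style validation signal
generated = greedy_decode()              # generation-style output, e.g. for sequence-level metrics
print(val_loss, generated)
```

In this sketch the teacher-forced pass yields a per-token loss that can be compared directly with the training loss, while the autoregressive pass yields a generated sequence that would be scored with a sequence-level metric instead.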