My output from the GPT-2 model is:
which differs from the output on page 120 of the book. The code I am using is the one from @rasbt.
Answered by casinca on Apr 21, 2025
Answer selected by Jessen-Li
I believe this is expected; it's the same situation as in #607.
I get the same output as yours on Windows, and the same output as the notebook and p. 120 of the book on Mac (Sebastian is on Mac).
The dummy model, just like GPT-2, uses `nn.Dropout()`, which explains the discrepancy that Sebastian and @d-kleine discussed in #607 earlier.
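
To illustrate the `nn.Dropout()` point, here is a minimal sketch, assuming PyTorch is installed. The dropout rate of 0.5, the seed, and the tensor shape are arbitrary choices for the demo, not values taken from the book. It only shows that a dropout layer in training mode draws a random mask on every call (so its output depends on the random number stream), while in evaluation mode it passes the input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(123)           # arbitrary seed, for illustration only

drop = nn.Dropout(p=0.5)         # hypothetical rate chosen for the demo
x = torch.ones(2, 8)

# Training mode: each call samples a fresh random mask.
# Kept elements are scaled by 1 / (1 - p) = 2.0, dropped elements become 0.
drop.train()
train_out_a = drop(x)
train_out_b = drop(x)
print(train_out_a)
print(train_out_b)

# Evaluation mode: dropout is a no-op, so the output equals the input.
drop.eval()
eval_out = drop(x)
print(eval_out)

# Every training-mode value is either 0 (dropped) or 2.0 (kept and rescaled).
train_values_ok = bool(((train_out_a == 0) | (train_out_a == 2.0)).all())
eval_is_identity = torch.equal(eval_out, x)
```

If the goal is to take dropout out of the picture when comparing outputs across machines, calling `.eval()` on the model before the forward pass should remove it as a source of randomness. Whether that alone makes the numbers match the book on every platform is not something this sketch verifies.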