Open
Description
Based on the results you showed in gits, it looks like losses from training and valid is diverging instead of converging as number of epoch increasing. So, what's the problem here? I saw other text stigmatization model, they build their own dictionary and input something like one-hot encoding vector for each word. Does this process make difference? Thank you!
I'm glad if you want to talk more by WeChat (ID: XHansonX)
Metadata
Metadata
Assignees
Labels
No labels