Causes of an exploding NLL loss

Time: 2020-10-27 06:30:34

Tags: pytorch transformer

I have been experimenting with a Transformer-based language model, for which I implemented the negative log-likelihood (NLL) loss function. For some reason, after a few iterations the reconstruction loss increases sharply. Here is the log produced by the training procedure:

root - WARNING - Loss: 203.81146240234375
root - WARNING - Loss: 124.32596588134766
root - WARNING - Loss: 62.59440612792969
root - WARNING - Loss: 59.84109115600586
root - WARNING - Loss: 59.247005462646484
root - WARNING - Loss: 48.832725524902344
root - WARNING - Loss: 57.592288970947266
root - WARNING - Loss: 50.18443298339844
root - WARNING - Loss: 46.474849700927734
root - WARNING - Loss: 52.12908172607422
root - WARNING - Loss: 50.090736389160156
root - WARNING - Loss: 66.04253387451172
root - WARNING - Loss: 49.094024658203125
root - WARNING - Loss: 36.69044494628906
root - WARNING - Loss: 48.54591369628906
root - WARNING - Loss: 60.71137237548828
root - WARNING - Loss: 40.35478591918945
root - WARNING - Loss: 49.070556640625
root - WARNING - Loss: 54.33742141723633
root - WARNING - Loss: 47.14014434814453
root - WARNING - Loss: 55.043060302734375
root - WARNING - Loss: 47.63726043701172
root - WARNING - Loss: 46.314571380615234
root - WARNING - Loss: 41.330291748046875
root - WARNING - Loss: 48.85242462158203
root - WARNING - Loss: 50.59345245361328
root - WARNING - Loss: 48.508975982666016
root - WARNING - Loss: 43.35681915283203
root - WARNING - Loss: 45.875431060791016
root - WARNING - Loss: 51.701438903808594
root - WARNING - Loss: 39.1783561706543
root - WARNING - Loss: 30.14274024963379
root - WARNING - Loss: 44.33928680419922
root - WARNING - Loss: 40.88005447387695
root - WARNING - Loss: 62.682804107666016
root - WARNING - Loss: 45.18329620361328
root - WARNING - Loss: 39.7137451171875
root - WARNING - Loss: 47.31813049316406
root - WARNING - Loss: 50.755348205566406
root - WARNING - Loss: 40.52918243408203
root - WARNING - Loss: 49.48160934448242
root - WARNING - Loss: 58.29778289794922
root - WARNING - Loss: 45.660675048828125
root - WARNING - Loss: 55.13115692138672
root - WARNING - Loss: 50.72150421142578
root - WARNING - Loss: 33.377098083496094
root - WARNING - Loss: 48.404151916503906
root - WARNING - Loss: 60.24494934082031
root - WARNING - Loss: 46.290470123291016
root - WARNING - Loss: 9.493173539216099e+24

After the weight update in that last iteration, the loss becomes NaN and stays NaN from then on. What are the possible causes of this?
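
For context, the setup I have in mind looks roughly like the sketch below, which computes the NLL via the fused, numerically stable call and clips the gradient norm; the toy model, sizes, and optimizer here are hypothetical placeholders rather than my actual training code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Hypothetical sizes, standing in for the real configuration.
    vocab_size, d_model, seq_len, batch_size = 1000, 64, 32, 8

    # Toy stand-in for the Transformer language model and its output head.
    model = nn.Sequential(
        nn.Embedding(vocab_size, d_model),
        nn.Linear(d_model, vocab_size),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    tokens = torch.randint(0, vocab_size, (batch_size, seq_len))
    targets = torch.randint(0, vocab_size, (batch_size, seq_len))

    logits = model(tokens)  # (batch, seq_len, vocab_size)

    # cross_entropy fuses log_softmax and nll_loss; computing
    # log(softmax(x)) by hand can underflow to -inf and produce NaN gradients.
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))

    optimizer.zero_grad()
    loss.backward()

    # Bound the size of a single update so one bad batch cannot push the
    # weights (and the next loss value) into the 1e+24 range seen above.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()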

0 Answers:

There are no answers yet.