ULMFit Fast AI language model has low accuracy

Asked: 2020-10-28 22:50:45

Tags: python deep-learning nlp lstm fast-ai

The dataset "review_data" contains Tripadvisor reviews and customer ratings. I am building a ULMFit language model to predict the text sequences in the "Review" column.

The data frame looks like this:

    Review  Rating
0   nice hotel expensive parking got good deal sta...   4
1   ok nothing special charge diamond member hilto...   2
2   nice rooms not 4* experience hotel monaco seat...   3
3   unique, great stay, wonderful time hotel monac...   5
4   great stay great stay, went seahawk game aweso...   5


review_data.shape
(20491, 2)

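from fastai.text import *  # fastai v1 API: TextLMDataBunch, language_model_learner, AWD_LSTM
from sklearn.model_selection import train_test_split

# Label column first, text column second (the default column positions for from_df in fastai v1)
review_data = review_data[['Rating', 'Review']]
# Split into train and validation data, stratified by rating
df_trn, df_val = train_test_split(review_data, stratify = review_data['Rating'], test_size = 0.2, random_state = 12)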

print(df_trn.shape, df_val.shape)
(16392, 2) (4099, 2)


# Language model data
data_lm = TextLMDataBunch.from_df(train_df = df_trn, valid_df = df_val, path = "")

#Building the language model 
learn = language_model_learner(data_lm,arch = AWD_LSTM,  drop_mult=0.3)

I am training the language model with the learning rate suggested by the learning rate finder:

learn.lr_find()
learn.recorder.plot(suggestion = True)
min_grad_lr = learn.recorder.min_grad_lr

learn.fit_one_cycle(3,min_grad_lr)

epoch   train_loss  valid_loss  accuracy    time
0   5.987871    5.842046    0.163807    02:13
1   5.675927    5.674133    0.173539    02:13
2   5.304801    5.632321    0.176002    02:13

The accuracy after training is very low (about 0.18). Unfreezing and fine-tuning the model did not improve it:

learn.unfreeze()
learn.fit_one_cycle(5, 1e-3)

epoch   train_loss  valid_loss  accuracy    time
0   5.180523    5.562634    0.181098    02:39
1   5.121043    5.504951    0.185985    02:39
2   4.919733    5.491002    0.187887    02:39
3   4.678843    5.540877    0.187085    02:39
4   4.506824    5.582721    0.184676    02:39
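
To help read the two tables above, here is a minimal sketch that converts the reported validation losses to perplexity and finds the epoch with the lowest validation loss in each run. It assumes the loss is a cross-entropy measured in nats (the usual convention for a PyTorch-based language model), so perplexity is exp(loss). The helper names `perplexity` and `best_epoch` are illustrative only and are not part of fastai.

```python
import math

# (epoch, train_loss, valid_loss, accuracy), copied from the two tables above
frozen_run = [
    (0, 5.987871, 5.842046, 0.163807),
    (1, 5.675927, 5.674133, 0.173539),
    (2, 5.304801, 5.632321, 0.176002),
]
unfrozen_run = [
    (0, 5.180523, 5.562634, 0.181098),
    (1, 5.121043, 5.504951, 0.185985),
    (2, 4.919733, 5.491002, 0.187887),
    (3, 4.678843, 5.540877, 0.187085),
    (4, 4.506824, 5.582721, 0.184676),
]


def perplexity(loss):
    """Perplexity for a cross-entropy loss, assuming the loss is in nats."""
    return math.exp(loss)


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


for name, rows in (("frozen", frozen_run), ("unfrozen", unfrozen_run)):
    for epoch, train_loss, valid_loss, acc in rows:
        gap = valid_loss - train_loss  # grows when the model starts to overfit
        print(f"{name} epoch {epoch}: valid perplexity {perplexity(valid_loss):.1f}, "
              f"valid - train loss {gap:+.3f}, accuracy {acc:.3f}")
    print(f"{name}: lowest valid_loss at epoch {best_epoch(rows)[0]}")
```

In the unfrozen run the validation loss is lowest at epoch 2 and rises afterwards while the training loss keeps falling, a pattern usually associated with overfitting. Note also that the accuracy column here is next-token accuracy over the whole vocabulary, which is generally much lower than the accuracy of a classifier trained on the same data.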

What could be the reason for this? Is there any way to improve the accuracy?

0 answers:

No answers yet