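I built an LSTM model for text classification using Keras. Now I have new data to train on. Instead of appending it to the original data and retraining the model from scratch, I thought of loading the saved model weights and continuing training on the new data only. However, no matter how much I train, the model does not predict the correct class (even when I give it the very same sentences to predict). What could be the reason? Please help.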
Answer 0 (score: 0)
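Are you saving the trained model to model.h5 and model_weights.h5, and then loading it like this?
from keras.models import load_model
model = load_model('model.h5')  # restores the architecture (and the weights too, if the file was written by model.save)
model.load_weights('model_weights.h5')  # sets the weights in place; it does not return the model, so do not reassign the result to model
# train on new data
model.compile(...)
model.fit(...)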
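The loaded model is then exactly the same as the one that was saved. If that is what you are doing, something in the new data must differ from the data the model was originally trained on.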
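One way the data can differ, offered here only as a possible cause and not something the question confirms, is that the word-to-index vocabulary was fitted again on the new data instead of being reused from the original training run. The sketch below uses plain Python with hypothetical helpers `fit_vocab` and `encode` (they are not Keras functions) and made-up example sentences to show how the same sentence turns into different integer sequences under the two vocabularies, which would feed the wrong rows of the embedding layer to the LSTM.

```python
from collections import Counter


def fit_vocab(texts):
    """Hypothetical helper: map each word to an integer index, most frequent first.

    Index 0 is reserved for padding and, in this sketch, for unknown words.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending frequency, then alphabetically, so the result is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i + 1 for i, word in enumerate(ordered)}


def encode(text, vocab, oov_index=0):
    """Hypothetical helper: turn a sentence into a list of word indices."""
    return [vocab.get(word, oov_index) for word in text.lower().split()]


# Made-up data, for illustration only.
original_texts = ["the movie was great", "the plot was boring"]
new_texts = ["great acting and great music", "boring music"]

vocab_old = fit_vocab(original_texts)  # vocabulary the model was first trained with
vocab_new = fit_vocab(new_texts)       # vocabulary re-fitted on the new data only

sentence = "the movie was great"
print(encode(sentence, vocab_old))  # indices the trained embedding layer expects
print(encode(sentence, vocab_new))  # different indices for the very same sentence
```

If this is what happened, saving the original vocabulary (or tokenizer object) next to the model and reusing it for both the new training data and prediction keeps the inputs consistent with what the weights were trained on.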