Question

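I created a model from one day's CSV file. Now, on the second day, new data has arrived and I want to train the same model without losing what it learned before. I have categorical data: during training its encoded values are stored in a .npy file, and the same encoding file is loaded at test time. New categorical data arrives every hour. How can I merge the two models?

Training side:

Encoding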

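import numpy as np
from sklearn.preprocessing import LabelEncoder

# `data` is assumed to be the pandas DataFrame read from the day's CSV file
y = data.iloc[:, -1].values    # target: last column
x = data.iloc[:, 0:11].values  # features: first 11 columns

# Encode categorical column 0 and persist its classes for use at test time
labelencoder_x_0 = LabelEncoder()
labelencoder_x_0.fit(x[:, 0])
x[:, 0] = labelencoder_x_0.transform(x[:, 0])
np.save('x0.npy', labelencoder_x_0.classes_)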

Load the previous model

from keras.models import model_from_json

# `filename` is the path to model1's saved JSON architecture
with open(filename, 'r') as json_file:
    loaded_model_json = json_file.read()
model1 = model_from_json(loaded_model_json)
model1_name = 'model1.h5'
model1.load_weights(model1_name)
print("Loaded model1 from disk")

Create a new model for the new data

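import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout

seed = 2016
np.random.seed(seed)
# `cols` is assumed to be the list of feature columns; each sample has shape (len(cols), 1)
model2 = Sequential()
# Keras 2 argument names: recurrent_activation (formerly inner_activation)
model2.add(LSTM(30, activation='tanh', recurrent_activation='hard_sigmoid', return_sequences=True, input_shape=(len(cols), 1)))
model2.add(Dropout(0.2))
model2.add(LSTM(30))
model2.add(Dropout(0.2))
# The first positional argument is the number of units (formerly output_dim)
model2.add(Dense(1, activation='linear'))
model2.compile(loss="mean_squared_error", optimizer="adam", metrics=['accuracy'])
# Save the architecture as JSON and the weights as HDF5
model2_json = model2.to_json()
with open("model2.json", "w") as json_file:
    json_file.write(model2_json)
model2.save_weights("model2.h5")
print(">>>> Model2 saved to model2.h5 on disk")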

Now I have two models (model1 and model2). How do I merge them? I have seen some answers but could not understand them.

newModel = Model([model1.input, model2.input], mergedOut???)
newModel.fit(x, y, batch_size=20, epochs=15, shuffle=False)

What about the new categorical data? Can I merge or append the categorical data from the two files into one file?

Answer 1

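From your question, I assume your main concern is not the neural network, since it uses an LSTM and can therefore take any value at any length. What you do need to update (merge) is the LabelEncoder classes, because you want to encode the new values as well without fitting on all the data again.

The only data a LabelEncoder stores is classes_, so two LabelEncoders can be merged simply by manipulating the classes_ attribute. Check the following code: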

import numpy as np
from sklearn import preprocessing


le1 = preprocessing.LabelEncoder()
le1.fit(["paris", "paris", "tokyo", "amsterdam"])
print(le1.transform(['paris']))
print(le1.classes_)

le2 = preprocessing.LabelEncoder()
le2.fit(["munich", "istanbul", "new york"])
print(le2.transform(['istanbul']))
print(le2.classes_)


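def combine_label_encoder(le1, le2):
    le_new = preprocessing.LabelEncoder()
    # transform() may look labels up by binary search, which requires classes_
    # to be sorted, so take the sorted union instead of just stacking the arrays.
    # Caveat: sorting can shift existing codes (paris: 1 -> 4 below), so data
    # encoded with le1 has to be re-encoded with the combined encoder.
    le_new.classes_ = np.unique(np.hstack((le1.classes_, le2.classes_)))
    return le_new


le_combined = combine_label_encoder(le1, le2)
combined_cities = ['munich', 'tokyo', 'istanbul', 'paris']
print(le_combined.transform(combined_cities))
print(le_combined.classes_)

Output:

[1]
['amsterdam' 'paris' 'tokyo']
[0]
['istanbul' 'munich' 'new york']
[2 5 1 4]
['amsterdam' 'istanbul' 'munich' 'new york' 'paris' 'tokyo']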
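
If the integer codes that the already-trained model has seen must stay fixed, one option is to treat the class list as append-only: existing labels keep their index and unseen labels are added at the end. The sketch below does this with plain NumPy and a dictionary lookup instead of LabelEncoder.transform, because transform can rely on classes_ being sorted. The helper names extend_classes and encode are illustrative assumptions and not part of scikit-learn.

```python
import numpy as np


def extend_classes(old_classes, new_values):
    """Append labels not seen before; existing labels keep their index."""
    known = set(old_classes.tolist())
    # dict.fromkeys de-duplicates new_values while keeping first-seen order
    unseen = [v for v in dict.fromkeys(new_values) if v not in known]
    return np.array(list(old_classes) + unseen)


def encode(values, classes):
    """Map each value to its position in `classes` via a dict lookup."""
    table = {label: i for i, label in enumerate(classes.tolist())}
    return np.array([table[v] for v in values])


# Hypothetical inputs: day1 plays the role of le1.classes_,
# day2 is the raw categorical column from the new file.
day1 = np.array(["amsterdam", "paris", "tokyo"])
day2 = ["munich", "istanbul", "new york", "paris"]

combined = extend_classes(day1, day2)
codes = encode(["munich", "tokyo", "istanbul", "paris"], combined)
print(combined)
print(codes)
```

With these inputs, paris and tokyo keep codes 1 and 2, while munich, istanbul and new york receive 3, 4 and 5. The trade-off is that the class list is no longer sorted, so it should be used with a lookup like encode rather than assigned to a LabelEncoder.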

Also check the sklearn documentation.
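
For the question about appending the hourly categorical data to one file, a possible approach is to keep a single .npy file of classes and update it whenever new values arrive. The following is a minimal sketch under that assumption; update_classes_file is a hypothetical helper, and the file name x0.npy only mirrors the one used in the question.

```python
import os
import tempfile

import numpy as np


def update_classes_file(path, new_values):
    """Load stored classes (if any), append unseen labels, save, and return them."""
    if os.path.exists(path):
        # allow_pickle=True is needed when the array was saved with object dtype
        classes = np.load(path, allow_pickle=True).tolist()
    else:
        classes = []
    known = set(classes)
    for value in new_values:
        if value not in known:
            classes.append(value)
            known.add(value)
    merged = np.array(classes)
    np.save(path, merged)
    return merged


# Demo in a temporary directory so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), "x0.npy")
first = update_classes_file(path, ["paris", "paris", "tokyo", "amsterdam"])
second = update_classes_file(path, ["munich", "paris", "istanbul"])
print(first)
print(second)
```

Because labels are only ever appended, the position of each label in the file stays stable across updates. Note that a file created this way stores labels in first-seen order, unlike the sorted order produced by LabelEncoder.fit.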

Python: Merging two models to create a new model with categorical data
