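I trained a model that showed all 4 predictions and saved it in JSON format. When I try to load it and make predictions, it shows only one prediction. What could be going wrong?
My code: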
# Imports for pandas and Keras are assumed; they were not shown in the original post
import re
from string import punctuation

import pandas as pd
from nltk import FreqDist
from nltk.stem import SnowballStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize
from keras.models import model_from_json
from keras.optimizers import Adam
from keras.preprocessing import sequence
from keras.preprocessing.text import Tokenizer

stemmer = SnowballStemmer('english')
lemma = WordNetLemmatizer()

# Load the test set and lowercase the phrases
test = pd.read_csv('./Data/test.tsv', sep="\t")
testing = test.Phrase.apply(lambda x: x.lower())

# Convert the phrases to padded integer sequences
tokenizer = Tokenizer(num_words=10000)
X_test = tokenizer.texts_to_sequences(testing.values)
X_test = sequence.pad_sequences(X_test, maxlen=48)

# Load the model architecture from JSON
with open('model1.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
# Load weights into new model
loaded_model.load_weights('model1.h5')
loaded_model.compile(loss='categorical_crossentropy',
                     optimizer=Adam(lr=0.001),
                     metrics=['accuracy'])
prediction = model.predict_classes(X_test, verbose=1)
model.summary()  # while training
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, None, 100)         1373200
_________________________________________________________________
lstm_1 (LSTM)                (None, None, 64)          42240
_________________________________________________________________
lstm_2 (LSTM)                (None, 32)                12416
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 165
=================================================================
Total params: 1,428,021
Trainable params: 1,428,021
Non-trainable params: 0
print(X_test.shape)
(66292, 48)
Answer 0: (score: 0)
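If I understand your question correctly, the problem is the call to predict_classes.
It returns the final predicted label for each sample, not the probabilities: a single class index, the one with the highest probability. If you need the probability of every class, you should probably use predict_proba or predict instead, which are equivalent, for example:
prediction = model.predict(X_test, verbose=1)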
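
To make the difference concrete, here is a minimal sketch of how class labels relate to per-class probabilities. The `probabilities` array is made-up data standing in for the output of `model.predict(X_test)`, and it assumes the final Dense(5) layer uses a softmax activation, which the summary in the question does not show. Newer TensorFlow releases no longer provide `predict_classes` and `predict_proba` on Sequential models, so taking the argmax of `predict` is the usual substitute there.

```python
import numpy as np

# Stand-in for model.predict(X_test): one row per sample, one column per
# class (5 columns, matching the Dense(5) layer). Values are illustrative.
probabilities = np.array([
    [0.05, 0.10, 0.60, 0.20, 0.05],
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.10, 0.60],
])

# Each row sums to 1, as a softmax output would.
assert np.allclose(probabilities.sum(axis=1), 1.0)

# The index of the largest value in each row is the predicted class label,
# which is the kind of result predict_classes reported.
labels = np.argmax(probabilities, axis=-1)

print(probabilities.shape)  # (3, 5)
print(labels)               # [2 0 4]
```

With a real model, the same pattern would be `np.argmax(model.predict(X_test), axis=-1)`.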
Answer 1: (score: 0)
The error is resolved: the tokenizer was not being fit on the test values. Adding the following command fixed it:
tokenizer.fit_on_texts(testing.values)
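
A likely reason for the single prediction is that an unfit tokenizer has an empty vocabulary, so every phrase becomes an empty sequence and is padded to all zeros, giving the model the same input for every row. The sketch below illustrates this with a simplified pure-Python stand-in, not the Keras API itself: `build_word_index`, `texts_to_sequences`, and `pad` are hypothetical helpers that only mimic frequency-ranked indexing and left padding, and the phrases are made up. It also shows that a vocabulary fit on the test phrases may assign different indices than one fit on the training phrases.

```python
from collections import Counter


def build_word_index(texts):
    """Rank words by frequency; the most frequent word gets index 1.

    Simplified stand-in for the vocabulary a tokenizer builds when fit;
    ties are broken by first appearance.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: rank
            for rank, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_sequences(word_index, texts):
    """Map each text to a list of indices, dropping unknown words."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]


def pad(sequences, maxlen):
    """Left-pad with zeros, keeping at most the last maxlen items."""
    return [[0] * (maxlen - len(s[-maxlen:])) + s[-maxlen:] for s in sequences]


train_texts = ["a good movie", "a bad movie", "a good good story"]
test_texts = ["bad story", "good movie"]

# 1) Empty vocabulary (nothing was fit): every phrase collapses to zeros.
unfit = pad(texts_to_sequences({}, test_texts), maxlen=4)

# 2) Vocabulary built from the test phrases themselves.
from_test = pad(
    texts_to_sequences(build_word_index(test_texts), test_texts), maxlen=4)

# 3) Vocabulary built from the training phrases.
from_train = pad(
    texts_to_sequences(build_word_index(train_texts), test_texts), maxlen=4)

print(unfit)       # [[0, 0, 0, 0], [0, 0, 0, 0]]
print(from_test)   # [[0, 0, 1, 2], [0, 0, 3, 4]]
print(from_train)  # [[0, 0, 4, 5], [0, 0, 2, 3]]
```

Because the embedding layer learned its weights for the indices used during training, it may be worth checking whether the tokenizer fit on the training phrases can be saved and reused at prediction time instead of fitting a new one on the test phrases.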