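My code is listed below. The training and test data can be found here:
However, when I save the predictions to a file and compare them with the results in the test data, the numbers do not match the reported accuracy of 0.55. By my count, only 72 of the 380 records were classified correctly. So how does Keras arrive at accuracy = 0.55?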
import seaborn as sns
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras import optimizers
def one_hot_encode_object_array(arr):
    '''One hot encode a numpy array of objects (e.g. strings)'''
    uniques, ids = np.unique(arr, return_inverse=True)
    return np_utils.to_categorical(ids, len(uniques))
fields = ['dataResult','HomeWin','Draw','AwayWin']
traindata =pd.read_csv('17-18.csv', usecols=fields)
train_X = traindata.values[:, 1:4]
train_Y = traindata.values[:, 0]
train_y_ohe = one_hot_encode_object_array(train_Y)
testdata =pd.read_csv('16-17.csv', usecols=fields)
test_X = testdata.values[:, 1:4]
test_Y = testdata.values[:, 0]
test_y_ohe = one_hot_encode_object_array(test_Y)
model = Sequential()
model.add(Dense(16, input_shape=(3,)))
model.add(Activation('relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(3))
model.add(Activation('softmax'))
model.compile(optimizers.Adam(lr=0.001), loss='categorical_crossentropy', metrics=["accuracy"])
model.fit(train_X, train_y_ohe, epochs=100, batch_size=1, verbose=1, validation_data=(test_X, test_y_ohe))
loss, accuracy = model.evaluate(test_X, test_y_ohe, verbose=1)
print("Accuracy = {:.2f}".format(accuracy))
prediction = model.predict(test_X)
print(prediction)
np.savetxt('prediction.csv',prediction ,delimiter=',')
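Update:
It turned out to be my own mistake. My fields are fields = ['dataResult','HomeWin','Draw','AwayWin']
When I passed train_Y to the one-hot encoding function, I assumed that [1,0,0] meant HomeWin, [0,1,0] meant Draw, and [0,0,1] meant AwayWin.
It turns out that [0,0,1] is HomeWin and [1,0,0] is AwayWin. Does anyone know a good way to convert a one-hot encoding back to labels?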
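One possible way to map one-hot rows back to labels is to keep the `uniques` array that `np.unique` returns and index it with the argmax of each row. `np.unique` returns the distinct values in sorted order, so the one-hot columns follow the sorted labels rather than the order of the `fields` list. The sketch below assumes the result column holds the strings 'H', 'D' and 'A' (an assumption, since the CSV contents are not shown), and it uses `np.eye` in place of `np_utils.to_categorical` so that it runs with NumPy alone. The helper names are hypothetical.

```python
import numpy as np

def one_hot_encode_with_labels(arr):
    """One-hot encode an array of labels and also return the label order."""
    # np.unique returns the distinct labels in sorted order, so column i
    # of the one-hot matrix corresponds to uniques[i].
    uniques, ids = np.unique(arr, return_inverse=True)
    one_hot = np.eye(len(uniques))[ids]
    return one_hot, uniques

def decode_one_hot(one_hot, uniques):
    """Map one-hot rows (or softmax outputs) back to the original labels."""
    return uniques[np.argmax(one_hot, axis=-1)]

# Assumed labels: H = home win, D = draw, A = away win.
labels = np.array(['H', 'D', 'A', 'H'])
one_hot, uniques = one_hot_encode_with_labels(labels)
print(uniques)                           # label order of the one-hot columns
print(one_hot[0])                        # encoding of the first label, 'H'
print(decode_one_hot(one_hot, uniques))  # round-trips to the input labels
```

The same `decode_one_hot` call should also work on the softmax output of `model.predict`, since it only looks at the position of the largest value in each row.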
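Answer 0 (score: 1)
Categorical accuracy in Keras measures how often the largest predicted value falls at the same position as the true value in the one-hot target, averaged over all samples: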
categorical_accuracy = K.mean(K.equal(K.argmax(y_true, axis=-1), K.argmax(y_pred, axis=-1)))
For further details, see here
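The same computation can be reproduced by hand with NumPy, which may help when checking a saved prediction file against the true labels. This is a minimal sketch with made-up arrays; the values are illustrative and are not taken from the question's data.

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    """Fraction of rows where the argmax of y_pred matches the argmax of y_true."""
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

# Illustrative one-hot targets and softmax-style predictions (made-up values).
y_true = np.array([[0, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[0.2, 0.3, 0.5],   # argmax 2, matches
                   [0.1, 0.2, 0.7],   # argmax 2, does not match
                   [0.3, 0.4, 0.3],   # argmax 1, matches
                   [0.6, 0.3, 0.1]])  # argmax 0, does not match
acc = categorical_accuracy(y_true, y_pred)
print(acc)  # 0.5
```

If a manual count disagrees with the number Keras reports, the column order assumed for the one-hot vectors is a good first thing to check.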
Answer 1 (score: 0)
There is a bug in your code: you are testing on the training data!
Change
test_X = traindata.values[:, 1:4]
test_Y = traindata.values[:, 0]
to
test_X = testdata.values[:, 1:4]
test_Y = testdata.values[:, 0]