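I am using sklearn's classification_report to report test statistics. It gives an accuracy of 42%, while model.evaluate gives an accuracy of 93%. Which one is the true accuracy, and what causes the difference?
Model evaluation: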
results = model.evaluate(test_ds.values, test_lb.values)
print(results)
Output:
7397/7397 [==============================] - 0s 28us/sample - loss: 0.2309 - acc: 0.9305
Classification report:
import numpy as np
from sklearn.metrics import classification_report
predictions = model.predict(test_ds)
print(classification_report(test_lb, np.argmax(predictions, axis=1)))
Output:
label         precision    recall  f1-score   support
0                  0.41      0.38      0.40      3700
1                  0.43      0.46      0.44      3697
accuracy                               0.42      7397
Answer 0 (score: 2)
Ideally, both metrics should report the same accuracy, apart from small rounding differences. The problem is probably in the data.
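One data-related cause that would produce near-chance scores on a balanced binary test set is a mismatch in row order between the labels and the predictions, for example if one of them is shuffled. The question does not confirm that this is what happened, so the following is only a minimal sketch of the effect. The labels and predictions are synthetic, and the 7% error rate is an assumption chosen to mimic the reported 93%.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 7397  # same test-set size as in the question; the data itself is synthetic

# Synthetic balanced binary labels and predictions that are wrong about 7% of the time
y_true = rng.integers(0, 2, size=n)
flip = rng.random(n) < 0.07
y_pred = np.where(flip, 1 - y_true, y_true)

# Labels and predictions in the same row order
aligned_acc = float(np.mean(y_pred == y_true))

# Same labels, but in a different row order than the predictions
shuffled_labels = rng.permutation(y_true)
misaligned_acc = float(np.mean(y_pred == shuffled_labels))

print(f"aligned: {aligned_acc:.3f}, misaligned: {misaligned_acc:.3f}")
```

If the two numbers differ like this on real data, checking that test_lb and the output of model.predict refer to the same rows in the same order is a reasonable first step.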
The following example compares the two metrics.
import tensorflow as tf
from sklearn.datasets import load_iris
import numpy as np
from tensorflow import keras
from sklearn.model_selection import train_test_split
iris = load_iris()
X = iris.data[:, (2, 3)] # petal length, petal width
y = (iris.target == 0).astype(int)  # 1 = setosa, 0 = other species; np.int was removed in NumPy 1.24
(X_train,X_test,y_train,y_test) = train_test_split(X,y,test_size=0.2)
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[2]),
    keras.layers.Dense(300, kernel_initializer="he_normal"),
    keras.layers.LeakyReLU(),
    keras.layers.Dense(100, kernel_initializer="he_normal"),
    keras.layers.LeakyReLU(),
    keras.layers.Dense(1, activation="sigmoid")
])
model.compile(loss="binary_crossentropy",
              optimizer=keras.optimizers.SGD(),
              metrics=["accuracy"])
model.fit(X_train,y_train,epochs=2)
Training accuracy:
Epoch 1/2
4/4 [==============================] - 0s 3ms/step - loss: 2.0655 - accuracy: 0.6333
Epoch 2/2
4/4 [==============================] - 0s 3ms/step - loss: 0.5199 - accuracy: 0.7333
<tensorflow.python.keras.callbacks.History at 0x7fdd4ed72048>
Evaluation results:
import pandas as pd
test_ds = pd.DataFrame(X_test)
test_lb = pd.DataFrame(y_test)
model.evaluate(test_ds.values, test_lb.values)
1/1 [==============================] - 0s 1ms/step - loss: 0.5510 - accuracy: 0.6667
[0.5510352253913879, 0.6666666865348816]
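Using sklearn metrics:
import numpy as np
from sklearn.metrics import classification_report
predictions = model.predict(X_test)
# predictions has shape (n_samples, 1) here, so argmax over axis=1 returns 0 for every row
print(classification_report(y_test, np.argmax(predictions, axis=1)))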
              precision    recall  f1-score   support
0                  0.67      1.00      0.80        20
1                  0.00      0.00      0.00        10
accuracy                               0.67        30
macro avg          0.33      0.50      0.40        30
weighted avg       0.44      0.67      0.53        30
You can see that both metrics report the same accuracy (66.7% and 67%).
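One caveat about this comparison: the model ends in a single sigmoid unit, so np.argmax over axis=1 can only ever return 0. The 0.67 in the report equals the share of class 0 in the test set (20 of 30), which matches the recall of 1.00 for class 0 and 0.00 for class 1. For a single sigmoid output, the usual conversion to labels is a 0.5 threshold, which is also what Keras uses by default for binary accuracy. The sketch below shows the difference on hypothetical probabilities, not on the model's actual outputs.

```python
import numpy as np

# Hypothetical sigmoid outputs with shape (n_samples, 1),
# the shape a Dense(1, activation="sigmoid") layer produces
probs = np.array([[0.9], [0.2], [0.7], [0.4]])
y_true = np.array([1, 0, 1, 0])

# Only one column exists, so the index of the maximum is always 0
argmax_labels = np.argmax(probs, axis=1)

# Thresholding at 0.5 turns each probability into a 0/1 label
threshold_labels = (probs > 0.5).astype(int).ravel()

argmax_acc = float(np.mean(argmax_labels == y_true))
threshold_acc = float(np.mean(threshold_labels == y_true))

print(argmax_labels, argmax_acc)
print(threshold_labels, threshold_acc)
```

np.argmax(predictions, axis=1) is the right conversion when the last layer has one unit per class (for example a two-unit softmax), which appears to be the case in the question since its report contains predictions for both classes.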