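I want to use scikit-learn in Python to compute and print several metrics (recall, F-score, precision).
I'm doing NLP. Basically, y_pred and y_test are lists of words, which I vectorize with pipe_vect.transform,
and then I use sklearn.metrics to print the metrics.
My code: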
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support as score
from sklearn.model_selection import train_test_split

# x, y, clf_params and pipe_vect are defined earlier (not shown)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.15, random_state=42)
x_train, x_valid, y_train, y_valid = train_test_split(x_train, y_train, test_size=0.1, random_state=42)

x_train = np.array(x_train)
y_train = np.array(y_train)
x_valid = np.array(x_valid)
y_valid = np.array(y_valid)
y_test = np.array(y_test)
x_test = np.array(x_test)

# merge the validation split back into the training set
x_train = np.concatenate((x_train, x_valid))
y_train = np.concatenate((y_train, y_valid))

model = RandomForestClassifier(n_estimators=int(clf_params['n_estimators']),
                               max_features=clf_params['max_features'])
model.fit(pipe_vect.transform(x_train), y_train)

x_test_vect = pipe_vect.transform(x_test)
y_pred = model.predict_proba(x_test_vect)  # per-class probabilities; shape is (417, 1) here
y_pred = y_pred.flatten()                  # shape becomes (417,)

print('y_pred', y_pred)
print('y_pred dimension: ', y_pred.shape)  # y_pred dimension: (417,)
print('y_test dimension: ', y_test.shape)  # y_test dimension: (417,)

precision, recall, fscore, support = score(y_test, y_pred)
print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(fscore))
print('support: {}'.format(support))
Output:
y_pred dimension: (417,)
y_test dimension: (417,)
precision: [0. 0.]
recall: [0. 0.]
fscore: [0. 0.]
support: [ 0. 417.]
I don't understand why all the printed scores are 0.
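
One possible explanation, offered as a sketch rather than a confirmed diagnosis: predict_proba returns probabilities, not class labels, and its (417, 1) shape suggests the model saw only one class during training. After flattening, every "prediction" would then be the value 1.0, which never equals a test label unless that label happens to be 1. The per-label metrics treat 1.0 and the real label as two separate classes, so every score comes out as 0. The snippet below mimics this with a simplified pure-Python re-implementation of per-label precision, recall, F-score and support. It is not sklearn's code, and the label value 2 and the helper name per_label_scores are made up for illustration.

```python
def per_label_scores(y_true, y_pred):
    """Simplified per-label (precision, recall, f1, support); illustration only."""
    # Labels are the sorted union of values seen in the truth and the predictions.
    labels = sorted(set(y_true) | set(y_pred))
    rows = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)  # times the label was predicted
        n_true = sum(1 for t in y_true if t == label)  # support: times it truly occurs
        precision = tp / n_pred if n_pred else 0.0     # undefined ratios reported as 0
        recall = tp / n_true if n_true else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows[label] = (precision, recall, f1, n_true)
    return rows


# Hypothetical data: all 417 test samples carry the same label, 2 (an assumption).
y_test = [2] * 417

# Flattened probabilities from a one-class model: P(only class) = 1.0 for each sample.
y_proba = [1.0] * 417

# Class labels, which is what the metrics expect as predictions.
y_label = [2] * 417

bad = per_label_scores(y_test, y_proba)
good = per_label_scores(y_test, y_label)

print(bad)   # two "classes" (1.0 and 2), all scores 0, support 0 and 417
print(good)  # one class, all scores 1.0, support 417
```

If this is indeed what is happening, two things seem worth checking: whether np.unique(y_train) contains more than one class, and whether model.predict(x_test_vect), which returns class labels, gives sensible scores in place of the flattened predict_proba output.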