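Over the past few weeks I have been practicing some machine learning algorithms and data science techniques, but I have run into a problem I cannot solve. I searched for similar questions but did not find any.
I am using the following kernel on the Bank Note Authentication dataset from UCI, and I do not understand why I get exactly the same score for precision, recall, and F1. Am I doing something I should not be doing?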
import os
import numpy as np
import pandas as pd
bank_df = pd.read_csv("../input/bank-note-authentication-uci-data/BankNote_Authentication.csv")
bank_df.head()
bank_df.info() # No null values.
bank_df.describe() # Already scaled because the dataset was built
# using Wavelets.
# Splitting 'bank_df' in attributes and label.
y_train = bank_df["class"]
X_train = bank_df.drop("class", axis=1)
from sklearn.linear_model import SGDClassifier
sgd_clf = SGDClassifier(random_state=42)
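# Note: this fit is not needed for cross_val_predict below, which fits
# its own clones of the estimator on each fold.
sgd_clf.fit(X_train, y_train)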
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_score, recall_score, f1_score
y_train_predictions = cross_val_predict(sgd_clf, X_train, y_train, cv=3)
print("Precision: {}".format(precision_score(y_train, y_train_predictions)))
print("Recall: {}".format(recall_score(y_train, y_train_predictions)))
print("F1 Score: {}".format(f1_score(y_train, y_train_predictions)))
# Result for all three: 0.9836065573770492
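One possible explanation, offered as a sketch rather than a confirmed diagnosis: precision is TP / (TP + FP) and recall is TP / (TP + FN), so the two coincide whenever the number of false positives equals the number of false negatives, and F1 (their harmonic mean) then takes the same value. The counts below are hypothetical; they were chosen only because they reproduce the reported score, and the post does not show the actual confusion matrix. The helper name `precision_recall_f1` is also made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Equivalent to 2 * precision * recall / (precision + recall).
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1


# Hypothetical counts (assumption, not taken from the post): FP == FN.
tp, fp, fn = 600, 10, 10
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)

# With unequal error counts, precision and recall no longer match.
p2, r2, f2 = precision_recall_f1(600, 5, 15)
print("Unequal FP/FN:", p2, r2, f2)
```

To check whether this is what happens with the real predictions, `confusion_matrix(y_train, y_train_predictions)` from `sklearn.metrics` would show whether the two off-diagonal entries are equal.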