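I am working on a machine learning project in a Jupyter notebook. I am running a random forest through GridSearchCV.
It executes without errors, but my accuracy = 0.0.
When I try a decision tree instead, accuracy = 99.99.
How can I fix this?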
#Training the RandomForest Algorithm
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
rfc = RandomForestClassifier(random_state=42)
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 20],
    'min_samples_leaf': [1, 2, 3, 4, 5, 10, 20]
}
CV_rfc = GridSearchCV(estimator=rfc, param_grid=param_grid, cv=5)
CV_rfc.fit(X_train, y_train)
CV_rfc.best_params_
rfc1 = RandomForestClassifier(random_state=42, n_estimators=50, max_depth=5, criterion='gini')
rfc1.fit(X_train, y_train)
Which gives the output:
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=5, max_features='auto', max_leaf_nodes=None,
min_impurity_split=1e-07, min_samples_leaf=1,
min_samples_split=2, min_weight_fraction_leaf=0.0,
n_estimators=50, n_jobs=1, oob_score=False, random_state=42,
verbose=0, warm_start=False)
pred=rfc1.predict(X_test)
print("Accuracy for Random Forest on CV data: ",accuracy_score(y_test,pred))
Accuracy for Random Forest on CV data:  0.0
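Note that rfc1 above hardcodes n_estimators=50 and max_depth=5 rather than reading them from CV_rfc.best_params_. A minimal sketch of reusing the search result directly, assuming the default refit=True. The real X_train and y_train are not shown, so synthetic make_classification data stands in for them; the reduced grid and the names search, best_rf and test_accuracy are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption: the real dataset is not shown in the post).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# A deliberately small grid so the sketch runs quickly.
param_grid = {"n_estimators": [10, 20], "max_depth": [3, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
search.fit(X_train, y_train)

# With refit=True (the default), best_estimator_ has already been retrained on
# all of X_train with best_params_, so nothing has to be copied by hand.
best_rf = search.best_estimator_
test_accuracy = accuracy_score(y_test, best_rf.predict(X_test))
print("Best parameters:", search.best_params_)
print("Test accuracy:", test_accuracy)
```

This only removes the manual copying step; it would not by itself explain a test accuracy of exactly 0.0.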
Input:
'''
Compute confusion matrix and print classification report.
'''
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, f1_score
# score the model
Ntest = len(y_test)
Ntestpos = len([val for val in y_test if val])
NullAcc = float(Ntest-Ntestpos)/Ntest
print("Mean accuracy on Training set: %s" %rfc1.score(X_train, y_train))
print("Mean accuracy on Test set: %s" %rfc1.score(X_test, y_test))
print("Null accuracy on Test set: %s" %NullAcc)
print(" ")
y_pred = rfc1.predict(X_test)
f1_score(y_test, y_pred, average='weighted')
y_true, y_pred = y_test, rfc1.predict(X_test)
cm = confusion_matrix(y_true, y_pred)
print("Confusion matrix:\ntn=%6d fp=%6d\nfn=%6d tp=%6d" %(cm[0][0],cm[0][1],cm[1][0],cm[1][1]))
print("\nDetailed classification report: \n%s" %classification_report(y_true, y_pred))
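Mean accuracy on Training set: 1.0
Mean accuracy on Test set: 0.0
Null accuracy on Test set: 0.0
I also get this warning:
UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)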
Confusion matrix:
tn= 0 fp= 0
fn=1745395 tp= 0
Detailed classification report:
             precision    recall  f1-score   support
          0       0.00      0.00      0.00         0
          1       0.00      0.00      0.00   1745395
          2       0.00      0.00      0.00    143264
          3       0.00      0.00      0.00     75044
          4       0.00      0.00      0.00     46700
          5       0.00      0.00      0.00     31568
          6       0.00      0.00      0.00     22966
          7       0.00      0.00      0.00     16903
          8       0.00      0.00      0.00     13188
          9       0.00      0.00      0.00     10160
          .
          .
          .
        119       0.00      0.00      0.00         2
        123       0.00      0.00      0.00         2
        124       0.00      0.00      0.00         1
        141       0.00      0.00      0.00         1
        165       0.00      0.00      0.00         1
avg / total       0.00      0.00      0.00   2148603
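In the report above, class 0 has a support of 0, so 0 occurs only among the predictions, while every true test label is 1 or higher. The confusion matrix likewise shows all 1745395 samples of class 1 predicted as 0. One thing that may be worth checking is whether y_train and y_test contain the same set of labels. Separately, the NullAcc formula counts truthy labels, which suits a 0/1 target; with labels from 1 to 165 every label is truthy, so it returns 0.0. A rough sketch of both checks, using hypothetical toy arrays and helper names (compare_label_sets, majority_class_accuracy) that are not part of the original code:

```python
import numpy as np


def compare_label_sets(y_train, y_test):
    """Return (labels only in training, labels only in test), each sorted."""
    train_labels = set(np.unique(y_train).tolist())
    test_labels = set(np.unique(y_test).tolist())
    return sorted(train_labels - test_labels), sorted(test_labels - train_labels)


def majority_class_accuracy(y):
    """Accuracy of always predicting the most frequent label in y."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / counts.sum()


# Toy arrays (hypothetical) that mimic the symptom: training labels are all 0,
# test labels never include 0.
y_train_toy = np.array([0, 0, 0, 0])
y_test_toy = np.array([1, 1, 1, 2, 3])

only_train, only_test = compare_label_sets(y_train_toy, y_test_toy)
print("Labels only in training:", only_train)
print("Labels only in test:", only_test)
print("Majority-class accuracy on test:", majority_class_accuracy(y_test_toy))
```

If the real y_train and y_test share no labels, a training accuracy of 1.0 alongside a test accuracy of 0.0 would be the expected outcome, and the train/test split or the target column would be the place to look.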