Random forest accuracy is 0.0

Time: 2019-03-30 03:40:31

Tags: algorithm machine-learning scikit-learn jupyter-notebook random-forest

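I am working on a machine learning project in a Jupyter notebook. I ran a random forest with GridSearchCV; it executes without errors, but the accuracy is 0.0.

When I try a decision tree instead, the accuracy is 99.99.

How can I fix this?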

Input:

# Training the RandomForest algorithm

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

rfc = RandomForestClassifier(random_state=42)

# Hyperparameter grid searched with 5-fold cross-validation
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 20],
    'min_samples_leaf': [1, 2, 3, 4, 5, 10, 20]
}
CV_rfc = GridSearchCV(estimator=rfc, param_grid=param_grid, cv=5)
CV_rfc.fit(X_train, y_train)
CV_rfc.best_params_

# Fit a second forest with fixed parameters
rfc1 = RandomForestClassifier(random_state=42, n_estimators=50, max_depth=5, criterion='gini')
rfc1.fit(X_train, y_train)

which gives the output:

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=5, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=50, n_jobs=1, oob_score=False, random_state=42,
            verbose=0, warm_start=False)
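
With the default `refit=True`, `GridSearchCV` refits the best parameter combination on the whole training set, so the tuned model can be read from `best_estimator_` instead of being rebuilt by hand. A minimal sketch, assuming synthetic data from `make_classification` as a stand-in for the question's `X_train`/`y_train` (which are not shown) and a deliberately tiny grid so it runs quickly:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): the question's real data is not shown.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Small illustrative grid, not the grid from the question.
param_grid = {"n_estimators": [10, 20], "max_depth": [3, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
search.fit(X_train, y_train)

# best_estimator_ is already fitted on all of X_train with best_params_.
best_model = search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Reusing `best_estimator_` avoids a mismatch between the parameters that were searched and the parameters typed into a second classifier.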

Input:

from sklearn.metrics import accuracy_score

pred = rfc1.predict(X_test)
print("Accuracy for Random Forest on CV data: ", accuracy_score(y_test, pred))

Output:

Accuracy for Random Forest on CV data:  0.0
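
An accuracy of exactly 0.0 means that not a single prediction matches its true label. One thing worth checking in that situation is whether the labels the model predicts occur in the test labels at all. A minimal sketch, assuming a hypothetical helper `label_overlap_report` and toy label lists (neither comes from the question):

```python
from collections import Counter

def label_overlap_report(y_train, y_test, y_pred):
    """Compare the label sets of the training data, test data and predictions."""
    train_labels, test_labels, pred_labels = set(y_train), set(y_test), set(y_pred)
    return {
        # Test labels the model never saw during training.
        "unseen_in_train": sorted(test_labels - train_labels),
        # Labels the model predicts that never occur in the test set.
        "predicted_but_absent_from_test": sorted(pred_labels - test_labels),
        # How often each label is predicted.
        "prediction_counts": Counter(y_pred),
    }

# Toy data (assumption): training labels are all 0, test labels never are.
y_train_demo = [0, 0, 0, 0]
y_test_demo = [1, 1, 2, 3]
y_pred_demo = [0, 0, 0, 0]

report = label_overlap_report(y_train_demo, y_test_demo, y_pred_demo)
print(report)
```

If `unseen_in_train` is non-empty or every prediction is a label absent from the test set, the way the data was split or the target column was built would be the first place to look.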

Input:

'''
Compute confusion matrix and print classification report.
'''
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, f1_score


# score the model
Ntest    = len(y_test)
Ntestpos = len([val for val in y_test if val])  # number of non-zero labels
NullAcc  = float(Ntest-Ntestpos)/Ntest          # share of label 0 in the test set
print("Mean accuracy on Training set: %s" %rfc1.score(X_train, y_train))
print("Mean accuracy on Test set:     %s" %rfc1.score(X_test, y_test))
print("Null accuracy on Test set:     %s" %NullAcc)
print(" ")
y_pred = rfc1.predict(X_test)
f1_score(y_test, y_pred, average='weighted')
y_true, y_pred = y_test, rfc1.predict(X_test)
cm             = confusion_matrix(y_true, y_pred)
# Only the top-left 2x2 corner of the confusion matrix is printed
print("Confusion matrix:\ntn=%6d  fp=%6d\nfn=%6d  tp=%6d" %(cm[0][0],cm[0][1],cm[1][0],cm[1][1]))
print("\nDetailed classification report: \n%s" %classification_report(y_true, y_pred))

Output:

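Mean accuracy on Training set: 1.0
Mean accuracy on Test set:     0.0
Null accuracy on Test set:     0.0

along with the warnings:

UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)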

Confusion matrix:
tn=     0  fp=     0
fn=1745395  tp=     0

Detailed classification report: 
             precision    recall  f1-score   support

          0       0.00      0.00      0.00         0
          1       0.00      0.00      0.00   1745395
          2       0.00      0.00      0.00    143264
          3       0.00      0.00      0.00     75044
          4       0.00      0.00      0.00     46700
          5       0.00      0.00      0.00     31568
          6       0.00      0.00      0.00     22966
          7       0.00      0.00      0.00     16903
          8       0.00      0.00      0.00     13188
          9       0.00      0.00      0.00     10160
                 .
                 .
                 .
        119       0.00      0.00      0.00         2
        123       0.00      0.00      0.00         2
        124       0.00      0.00      0.00         1
        141       0.00      0.00      0.00         1
        165       0.00      0.00      0.00         1

avg / total       0.00      0.00      0.00   2148603
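
The report lists many classes, and class 0 has a support of 0, so a null accuracy computed as the share of zero labels comes out as 0.0. For a multiclass target, the usual baseline is the share of the most frequent class; in the report above, class 1 accounts for 1745395 of 2148603 test samples, roughly 81%. A minimal sketch, assuming a hypothetical helper `majority_class_baseline` and toy labels:

```python
from collections import Counter

def majority_class_baseline(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)

# Toy labels (assumption): 8 samples of class 1, one each of classes 2 and 3.
y_demo = [1] * 8 + [2] + [3]
label, baseline = majority_class_baseline(y_demo)
print(label, baseline)
```

A model whose test accuracy falls well below this baseline is doing worse than always guessing the majority class.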

0 answers:

No answers