Random Forest in Python sklearn

Time: 2017-12-23 02:20:31

Tags: random-forest cross-validation

I am running a random forest and have a question.

Here is my code:

For IX = 1 To IXLastRow
    For IY = 1 To IXLastRow
        If Range("D" & IY).Value Like "*" & Range("B" & IX).Value & "*" Then
            Range("C" & IX).Value = Range("B" & IX).Value & Left(Range("D" & IY).Value, 2)
            Exit For
        End If
    Next IY
Next IX

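I ran this code 5 to 10 times, and the output was almost the same each time, within ±3. To avoid having to rerun it by hand every time, we can use a cross-validation (CV) function.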
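The repeated manual runs described here can be automated with a loop. The following is a minimal sketch, not the original code: it assumes a synthetic dataset from `make_regression` in place of the California housing data (so that it runs offline), a small forest of 20 trees to keep it fast, and a hypothetical helper named `repeated_holdout_scores`. It refits the forest on several random 70/30 splits and summarizes the spread of the test R2.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def repeated_holdout_scores(X, y, n_runs=5):
    """Fit a forest on n_runs different random 70/30 splits; return the test R2 scores."""
    scores = []
    for seed in range(n_runs):
        # A different seed per run gives a different split and a different forest.
        train_X, test_X, train_y, test_y = train_test_split(
            X, y, train_size=0.7, random_state=seed
        )
        model = RandomForestRegressor(n_estimators=20, random_state=seed)
        model.fit(train_X, train_y)
        scores.append(model.score(test_X, test_y))
    return np.array(scores)


# Synthetic stand-in data (an assumption, not the data used in the question).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
holdout_scores = repeated_holdout_scores(X, y)
print("Test R2 per run:", np.round(holdout_scores, 3))
print("Mean: %.3f, std: %.3f" % (holdout_scores.mean(), holdout_scores.std()))
```

The standard deviation gives a rough idea of how much of the run-to-run variation comes from the random split and the unseeded forest alone.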

However, when I use the cross_val_score function, the result is very different (I ran it 10 times, and the result only ever varied by about ±5):

from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score, learning_curve, train_test_split

# Load the California housing data (features, target, feature names)
housing = fetch_california_housing()
data_X = housing["data"]
data_Y = housing["target"]
data_names = housing["feature_names"]

# Hold out 30% of the samples; no random_state is set, so the split changes on every run
train_X, test_X, train_Y, test_Y = train_test_split(data_X, data_Y, train_size=0.7, test_size=0.3)

# Fit a random forest with default settings (also unseeded) and report R2 on both sets
RF = RandomForestRegressor()
RF.fit(train_X, train_Y)
print("Train R2 : Coefficient of Determination", RF.score(train_X, train_Y))
print("Test R2 : Coefficient of Determination", RF.score(test_X, test_Y))

===output====
Train R2 : Coefficient of Determination 0.960256248602
Test R2 : Coefficient of Determination 0.780840418591
=============
It outputs:

===output====
Accuracy: 0.62
=============
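The cross_val_score call itself is not shown in the question, so the following is only a minimal sketch of what such a call might look like, not the original code. It assumes synthetic data from `make_regression` in place of the California housing set, a forest of 20 trees, and R2 as the scoring metric. It contrasts two fold strategies: for a regressor, an integer `cv` uses `KFold` without shuffling, so each fold is a contiguous slice of the data, whereas `train_test_split` shuffles by default. If the rows of a dataset are ordered in some meaningful way, that difference alone may produce a noticeably lower score.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (an assumption, not the data used in the question).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
model = RandomForestRegressor(n_estimators=20, random_state=0)

# Integer cv on a regressor: 5 contiguous, unshuffled folds.
unshuffled_scores = cross_val_score(model, X, y, cv=5, scoring="r2")

# Explicit KFold with shuffling, closer to what train_test_split does.
shuffled_cv = KFold(n_splits=5, shuffle=True, random_state=0)
shuffled_scores = cross_val_score(model, X, y, cv=shuffled_cv, scoring="r2")

print("Unshuffled folds: mean R2 %.3f" % unshuffled_scores.mean())
print("Shuffled folds:   mean R2 %.3f" % shuffled_scores.mean())
```

On this synthetic data the rows have no meaningful order, so the two means should be close; the comparison is more informative on a dataset whose rows are sorted or grouped.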

My question is: why do the results differ so much between running the code manually and using cross_val_score, and which approach is correct?

0 answers:

No answers