Using joblib to dump a scikit-learn model on x86 and then load it on z/OS works for a decision tree but fails for GradientBoostingRegressor

Time: 2019-04-05 19:43:49

Tags: python python-3.x scikit-learn x86 zos

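I made some small tweaks to NumpyArrayWrapper in numpy_pickle.py so that a decision tree model can be loaded successfully into scikit-learn running on z/OS. The changes boil down to checking whether the byte order is correct and calling array.byteswap() if it is not. However, when I try to load a GradientBoostingRegressor model, it fails before it even reaches the byteswap fix.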
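For illustration, the kind of check described above could look like the following. This is only a minimal sketch and not the actual patch to NumpyArrayWrapper; the helper name `to_native_byteorder` is hypothetical, and it is shown for plain numeric arrays only.

```python
import sys
import numpy as np


def to_native_byteorder(array):
    """Return `array` in the machine's native byte order (hypothetical helper)."""
    # dtype.isnative is True when the data is already in native byte order.
    if array.dtype.isnative:
        return array
    # byteswap() reverses the bytes of every element in a copy; viewing the
    # result with the native-order dtype keeps the numeric values unchanged.
    return array.byteswap().view(array.dtype.newbyteorder('='))


# Build an array whose byte order is foreign to the current machine.
foreign_order = '>' if sys.byteorder == 'little' else '<'
foreign = np.array([1, 2, 3], dtype=foreign_order + 'i4')
fixed = to_native_byteorder(foreign)
print(foreign.dtype, '->', fixed.dtype, fixed.tolist())
```

The values are the same before and after; only the in-memory layout and the dtype's byte-order flag change.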

The error comes from this line, https://github.com/scikit-learn/scikit-learn/blob/0.18.1/sklearn/tree/_tree.pyx#L644, and is caused by the condition node_ndarray.dtype != NODE_DTYPE. This happens because the Gradient Boosting Regressor hits the following code, while the decision tree does not:

https://github.com/scikit-learn/scikit-learn/blob/0.18.1/sklearn/externals/joblib/numpy_pickle.py#L105
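To show why a dtype comparison like node_ndarray.dtype != NODE_DTYPE can trip on byte order alone, here is a small sketch. The two-field record is a made-up stand-in, not scikit-learn's real NODE_DTYPE, and the astype conversion is shown as one possible way to normalize a structured array, not as the fix used in the question.

```python
import sys
import numpy as np

# Hypothetical two-field record standing in for a tree node
# (NOT scikit-learn's actual NODE_DTYPE).
little = np.dtype([('feature', '<i8'), ('threshold', '<f8')])
big = np.dtype([('feature', '>i8'), ('threshold', '>f8')])

# Same field names and sizes, yet the dtypes compare unequal.
dtypes_match = (little == big)
print('dtypes equal:', dtypes_match)

# A structured dtype reports '|' at the top level, so byte order has to be
# checked per field (or through dtype.isnative).
print('top-level byteorder:', big.byteorder)
print('per-field native?', {n: big.fields[n][0].isnative for n in big.names})

# Build records in the byte order foreign to this machine, then convert.
foreign = big if sys.byteorder == 'little' else little
nodes = np.array([(2, 0.5), (-2, -2.0)], dtype=foreign)
# astype() converts every field's values into native order.
native_nodes = nodes.astype(nodes.dtype.newbyteorder('='))
print(native_nodes.dtype.isnative, native_nodes.tolist())
```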

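I am wondering whether there is something I should be doing differently, because the dtypes look fine for the DT model when it is loaded on z/OS, but not for the GBR model. This seems to come from the model.fit method, because when I remove that call I can load the pkl file on z/OS successfully.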
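As a diagnostic sketch related to the fit observation: a fitted GradientBoostingRegressor stores its trees in estimators_, a NumPy array of dtype object, which an unfitted estimator does not have. Whether this object array is what sends the load through the linked numpy_pickle.py code is an assumption here, not something confirmed in the question. The data below is synthetic and only serves to make the snippet self-contained.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Small synthetic regression problem (illustrative data only).
rng = np.random.RandomState(0)
X = rng.rand(30, 4)
y = rng.rand(30)

# Before fit there is no estimators_ attribute at all.
unfitted = GradientBoostingRegressor(n_estimators=3, max_depth=3)
has_trees_before_fit = hasattr(unfitted, 'estimators_')

# After fit the trees live in an object-dtype array.
fitted = GradientBoostingRegressor(n_estimators=3, max_depth=3).fit(X, y)
trees = fitted.estimators_
print('before fit:', has_trees_before_fit)
print('after fit:', trees.dtype, trees.shape, trees.dtype.hasobject)
```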

Code used to train the gradient boosting model

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn import datasets
from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier

from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)

gbr = GradientBoostingRegressor(max_depth=3)
model = Pipeline([('Gbr', gbr)])

model.fit(X_train, y_train)


from sklearn.externals import joblib
joblib.dump(model, 'GBTmodelx86.pkl')

Code used to train the decision tree model

from sklearn import datasets
from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier

dataset = datasets.load_iris()

model = DecisionTreeClassifier()
model.fit(dataset.data, dataset.target)

from sklearn.externals import joblib
joblib.dump(model, 'DTmodelX86.pkl')

Code used to load each model

from sklearn.externals import joblib

# Decision tree model
model = joblib.load('DTmodelX86.pkl')

# Gradient boosting model
model = joblib.load('GBTmodelx86.pkl')

0 answers:

No answers