Workaround for the 32-bit/64-bit serialization error with sklearn RandomForest models

Time: 2016-08-30 23:14:23

Tags: python scikit-learn pickle random-forest joblib

If we serialize a RandomForest model with joblib on a 64-bit machine and then load (unpickle) it on a 32-bit machine, an exception is raised:

ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long long'
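
The mismatch appears to come down to integer width: scikit-learn's tree code indexes nodes with a pointer-sized integer (SIZE_t), which is 64 bits wide on the training machine and 32 bits wide under a 32-bit interpreter, so the pickled 64-bit ('long long') arrays are rejected. As a quick sanity check, the snippet below reports which kind of interpreter is running. It uses only the standard library; the function name interpreter_bits is just illustrative.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running Python interpreter, in bits."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python is %d-bit, sys.maxsize = %d" % (interpreter_bits(), sys.maxsize))
```

Running this on both machines confirms whether the model is really crossing a 64-bit to 32-bit boundary.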

This has been asked before: Scikits-Learn RandomForrest trained on 64bit python wont open on 32bit python. But that question has remained unanswered since 2014.

Sample code for training the model (on the 64-bit machine):

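import joblib  # older scikit-learn releases: from sklearn.externals import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV  # sklearn.grid_search before 0.18

modelPath="../"
featureVec=...
labelVec = ...
forest = RandomForestClassifier()
# param_dict (the hyperparameter search space) is defined elsewhere
randomSearch = RandomizedSearchCV(forest, param_distributions=param_dict, cv=10, scoring='accuracy',
                                  n_iter=100, refit=True)
randomSearch.fit(X=featureVec, y=labelVec)
model = randomSearch.best_estimator_  # best forest, refit on all the data
joblib.dump(model, modelPath)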

Sample code for loading the model on the 32-bit machine:

modelPath="../"
model = joblib.load(modelPath) # ValueError thrown here

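My question is: if we have to train on a 64-bit machine and port the model to a 32-bit machine for prediction, is there a general workaround for this problem?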

Edit: I tried using pickle directly instead of joblib. The same error still occurs. The error is raised inside the core pickle library (for both joblib and pickle):

  File "/usr/lib/python2.7/pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1133, in load_reduce
    value = func(*args)
  File "sklearn/tree/_tree.pyx", line 585, in sklearn.tree._tree.Tree.__cinit__ (sklearn/tree/_tree.c:7286)
ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'long long'
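
One possible way around the problem, offered as a sketch rather than a confirmed fix, is to avoid unpickling the Cython Tree objects on the 32-bit machine altogether. On the 64-bit machine, copy each tree's node arrays into plain Python lists and write them out as JSON; on the 32-bit machine, read the JSON back and walk the trees in pure Python. The helper names export_forest and predict_one and the file name forest.json are made up for illustration. The sketch assumes a single-output RandomForestClassifier and relies on its fitted attributes estimators_, classes_ and tree_ (children_left, children_right, feature, threshold, value). scikit-learn casts inputs to float32 before comparing them with thresholds, so a value sitting exactly on a split boundary may be routed differently here.

```python
import json


def export_forest(forest):
    """Turn a fitted single-output RandomForestClassifier into plain lists."""
    trees = []
    for estimator in forest.estimators_:
        t = estimator.tree_
        trees.append({
            "left": t.children_left.tolist(),
            "right": t.children_right.tolist(),
            "feature": t.feature.tolist(),
            "threshold": t.threshold.tolist(),
            # value has shape (n_nodes, n_outputs, n_classes); keep output 0.
            "value": t.value[:, 0, :].tolist(),
        })
    return {"classes": forest.classes_.tolist(), "trees": trees}


def predict_one(exported, x):
    """Predict one sample by averaging the per-tree leaf class distributions."""
    totals = [0.0] * len(exported["classes"])
    for tree in exported["trees"]:
        node = 0
        while tree["left"][node] != -1:  # -1 marks a leaf in scikit-learn trees
            if x[tree["feature"][node]] <= tree["threshold"][node]:
                node = tree["left"][node]
            else:
                node = tree["right"][node]
        leaf = tree["value"][node]
        norm = float(sum(leaf)) or 1.0  # normalise counts (or fractions) to probabilities
        for i, v in enumerate(leaf):
            totals[i] += v / norm
    best = max(range(len(totals)), key=totals.__getitem__)
    return exported["classes"][best]


# On the 64-bit machine (forest.json is a hypothetical file name):
#     with open("forest.json", "w") as fh:
#         json.dump(export_forest(model), fh)
# On the 32-bit machine:
#     with open("forest.json") as fh:
#         exported = json.load(fh)
#     label = predict_one(exported, sample)
```

The trade-off is speed: walking the trees in pure Python is much slower than scikit-learn's compiled predict, but the JSON file carries no platform-dependent integer types.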

0 answers:

No answers