我遇到与unpickle sklearn.tree.DescisionTreeRegressor in python 2 from python3类似的问题,但stackoverflow.com/questions/1385096/上的链接解决方案似乎对我不起作用。
我在尝试使用python2加载时遇到了类似的问题 sklearn我在python3中保存的RandomForestClassifier。
我把它归结为核心问题,即加载一个结构化的numpy数组,保存它以表示树中的节点,这导致
ValueError: non-string names in Numpy dtype unpickling
。
我创建了一个MWE。
在python 3中
import numpy as np
import pickle
data = np.array(
[( 1, 26, 69, 5.32214928e+00, 0.69562945, 563, 908., 1),
( 2, 7, 62, 1.74883020e+00, 0.33854101, 483, 780., 1),
(-1, -1, -2, -2.00000000e+00, 0.76420451, 7, 9., -2),
(-1, -1, -2, -2.00000000e+00, 0. , 62, 106., -2)],
dtype=[('left_child', '<i8'), ('right_child', '<i8'),
('feature', '<i8'), ('threshold', '<f8'), ('impurity',
'<f8'), ('n_node_samples', '<i8'),
('weighted_n_node_samples', '<f8'), ('missing_direction',
'<i8')])
# Save using pickle
with open('data.pkl', 'wb') as file_:
# Use protocol 2 to support python2 and 3
pickle.dump(data, file_, protocol=2)
# Save with numpy directly
np.save('data.npy', data)
然后在python 2中
# Load with pickle
import pickle
with open('data.pkl', 'rb') as file_:
data = pickle.load(file_)
# This results in `ValueError: non-string names in Numpy dtype unpickling`
# Load with numpy directly
data = np.load('data.npy')
# This works, but wont help me load a pickled sklearn.ensemble.RandomForestClassifier
然而,这仍然不会让sklearn在2到3之间发挥出色。 那么,我们怎样才能让pickle正确加载这个numpy对象呢? 以下是链接中建议的修复:
from lib2to3.fixes.fix_imports import MAPPING
import sys
import pickle
# MAPPING maps Python 2 names to Python 3 names. We want this in reverse.
REVERSE_MAPPING = {}
for key, val in MAPPING.items():
REVERSE_MAPPING[val] = key
# We can override the Unpickler and loads
class Python_3_Unpickler(pickle.Unpickler):
def find_class(self, module, name):
if module in REVERSE_MAPPING:
module = REVERSE_MAPPING[module]
__import__(module)
mod = sys.modules[module]
klass = getattr(mod, name)
return klass
with open('data.pkl', 'rb') as file_:
data = Python_3_Unpickler(file_).load()
这仍然不起作用