Question

I am currently using Python 3 and want to load a pickle file from HDFS.

from pywebhdfs.webhdfs import PyWebHdfsClient
import pickle

hdfs = PyWebHdfsClient(host='...', user_name='...')
pickled_model = hdfs.read_file(pickle_path)
model = pickle.load(pickled_model)

TypeError: file must have 'read' and 'readline' attributes

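I get a TypeError at the pickle loading step. I found an option that uses pydoop to open the file and then unpickle it, but unfortunately I cannot use Python 2.7. Is there a similar option?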

Answer 1

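Per BluBb, pickle.load expects a Python file handle, that is, a file-like object with read and readline methods. In this case hdfs.read_file returns bytes, so pickle.loads, which accepts a bytes object, reads the model correctly.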
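The difference between the two calls can be sketched without an HDFS cluster. In the snippet below, the bytes are produced locally with pickle.dumps as a stand-in for whatever hdfs.read_file(pickle_path) returns, and the dictionary contents are purely illustrative assumptions, not part of the original question. It also shows that wrapping the bytes in io.BytesIO is an alternative if pickle.load has to be used.

```python
import io
import pickle

# Stand-in for the bytes returned by hdfs.read_file(pickle_path);
# built locally (hypothetical data) so the sketch runs without HDFS.
pickled_model = pickle.dumps({"weights": [0.1, 0.2], "bias": 0.5})

# Option 1: pickle.loads deserializes directly from a bytes object.
model = pickle.loads(pickled_model)

# Option 2: wrap the bytes in a file-like object so pickle.load accepts them.
model_from_stream = pickle.load(io.BytesIO(pickled_model))
```

Both options yield the same object; pickle.loads is the shorter choice when the whole payload is already in memory as bytes.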

Unpickling a file from HDFS
