Question

I built a custom sklearn pipeline, shown below:

pipeline = make_pipeline(
    SelectColumnsTransfomer(features_to_use),
    ToDummiesTransformer('feature_0', prefix='feat_0', drop_first=True,  dtype=bool), # Dummify customer_type
    ToDummiesTransformer('feature_1', prefix='feat_1'), # Dummify the feature
    ToDummiesTransformer('feature_2', prefix='feat_2'), # Dummify 
    ToDummiesTransformer('feature_3', prefix='feat_3'), # Dummify
)
pipeline.fit(df)

The classes SelectColumnsTransfomer and ToDummiesTransformer are custom sklearn steps that implement BaseEstimator and TransformerMixin. To serialize the object, I use

from sklearn.externals import joblib
joblib.dump(pipeline, 'data_pipeline.joblib')

But when I deserialize it

pipeline = joblib.load('data_pipeline.joblib')

I get AttributeError: module '__main__' has no attribute 'SelectColumnsTransfomer'.

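I have read other similar questions and followed the instructions in this blog post, but could not solve the problem. I am copy-pasting the classes and importing them in the code. If I create a simplified version of this exercise, the whole thing works. The problem appears because I am running some tests with pytest: when I run pytest, it does not seem to see my custom classes. In fact, another part of the error, self = <sklearn.externals.joblib.numpy_pickle.NumpyUnpickler object at 0x7f821508a588>, module = '__main__', name = 'SelectColumnsTransfomer', suggests that NumpyUnpickler cannot see SelectColumnsTransfomer even inside the test.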

My test code

import pytest
from sklearn.externals import joblib
from app.pipeline import * # the pipeline objects 
                          # SelectColumnsTransfomer and ToDummiesTransformer 
                          # are here!


@pytest.fixture(scope="module")
def clf():
    pipeline = joblib.load("persistence/data_pipeline.joblib")
    return pipeline  # return the loaded pipeline, not the fixture function

def test_fake(clf):
    assert True

Answer 1

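OK, I found the problem. It turned out to be unrelated to what is explained in the blog post Python: pickling and dealing with "AttributeError: 'module' object has no attribute 'Thing'", which was my first thought. You can easily solve it by pickling and unpickling the object from the same code. I was using a separate script (a Jupyter notebook) to pickle and a plain Python script to unpickle. When I did everything in the same class, it worked.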
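
For anyone wondering why that helps: pickle (and joblib, which builds on it) stores a class by reference, as a module name plus a class name, and looks the class up again under that name at load time. Below is a minimal stdlib-only sketch of that behaviour. The module name demo_main and the helper register_module are made up for illustration and are not part of the question's code.

```python
import pickle
import sys
import types


def register_module(name, with_thing):
    """Create a throwaway module and put it in sys.modules (hypothetical helper)."""
    module = types.ModuleType(name)
    if with_thing:
        # type() gives the class the qualified name 'Thing' and the chosen __module__.
        module.Thing = type("Thing", (object,), {"__module__": name})
    sys.modules[name] = module
    return module


# "Writer" side: the class lives in the module that does the pickling.
writer = register_module("demo_main", with_thing=True)
payload = pickle.dumps(writer.Thing())

# The pickle holds only the reference "demo_main" + "Thing", not the class body.
stores_reference = b"demo_main" in payload and b"Thing" in payload

# "Reader" side: a module with the same name that does not define the class.
register_module("demo_main", with_thing=False)
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc  # the lookup of 'Thing' on 'demo_main' failed

# Once the class is available under the recorded module path, loading works again.
reader = register_module("demo_main", with_thing=True)
restored = pickle.loads(payload)
```

Applied to the setup in the question, this presumably means the notebook should import the transformers from app.pipeline rather than define them in a cell, so that the pickle records app.pipeline instead of __main__ and the pytest run can resolve them.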

pickle / joblib AttributeError: module '__main__' has no attribute 'thing' in pytest
