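All estimators should implement fit and transform

Time: 2017-11-13 08:18:50

Tags: python

I am new to machine learning. Please help me resolve this error. My code is below: I am trying to create a custom class, catEncoder(), to transform my categorical variables.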

class DFSelector(BaseEstimator, TransformerMixin):
    def __init__(self, Attr):
        self.Attr = Attr
    def fit(self, X, y = None):
        return self
    def transform(self, X):
        print(self.Attr)
        return X[self.Attr].values

class catEncoder(BaseEstimator, TransformerMixin):
    def __init__(self):
        pass
    def fit(self, X, y = None):
        return self
    def transform(self, X):
        # Some code to encode the variables
        return X.values

numPipeline = [
    ('selector', DFSelector(numAttr)),
    ('imputer', Imputer(strategy = 'median'))
]
catPipeline = [
    ('selector', DFSelector(catAttr)),
    ('encoder', catEncoder())
]
fullPipe = FeatureUnion(transformer_list = [
    ('nPipe', numPipeline),
    ('cPipe', catPipeline)
])
Xtrain_ready = fullPipe.fit_transform(Xtrain)

I get the following error:

TypeError: All estimators should implement fit and transform. '[('selector', DFSelector(Attr=array(['SibSp', 'Parch', 'Fare'], dtype=object))), ('imputer', Imputer(axis=0, copy=True, missing_values='NaN', strategy='median', verbose=0))]' (type <class 'list'>) doesn't

1 answer:

Answer 0: (score: 0)

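Looking at the documentation, the problem seems to be that FeatureUnion expects a list of tuples of the form (string, transformer), where transformer is a single transformer object. In your case, however, you are passing tuples of the form (string, [several transformers]): numPipeline and catPipeline are plain Python lists, and a list has no fit or transform method, which is exactly what the TypeError is reporting. Without a full picture of your ML logic, my suggestion is to try breaking those lists of transformers up into individual transformers and passing those to FeatureUnion.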
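If the steps in each list are meant to run in sequence, another option (not the splitting suggested above) is to wrap each list in sklearn.pipeline.Pipeline, which is itself a transformer with fit and transform. The following is only a minimal sketch under some assumptions: the toy DataFrame is made up to stand in for Xtrain, SimpleImputer is used because Imputer has been removed from recent scikit-learn releases, and only the numeric branch is shown because the encoding code in catEncoder was not posted.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer  # stands in for the older Imputer class
from sklearn.pipeline import FeatureUnion, Pipeline


class DFSelector(BaseEstimator, TransformerMixin):
    """Select the given DataFrame columns and return them as a NumPy array."""

    def __init__(self, Attr):
        self.Attr = Attr

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X[self.Attr].values


# Hypothetical toy data standing in for Xtrain; the values are made up.
Xtrain = pd.DataFrame({
    "SibSp": [1, 0, 3],
    "Parch": [0, 2, 1],
    "Fare": [7.25, np.nan, 8.05],
})
numAttr = ["SibSp", "Parch", "Fare"]

# Wrapping the list of steps in Pipeline gives FeatureUnion a single object
# that implements fit and transform, instead of a plain list.
numPipeline = Pipeline([
    ("selector", DFSelector(numAttr)),
    ("imputer", SimpleImputer(strategy="median")),
])

# A categorical branch would be wrapped in Pipeline([...]) the same way
# and added here as a second (name, transformer) tuple.
fullPipe = FeatureUnion(transformer_list=[("nPipe", numPipeline)])

Xtrain_ready = fullPipe.fit_transform(Xtrain)
print(Xtrain_ready.shape)
```

One thing to watch once the TypeError is gone: DFSelector already returns a NumPy array, so a following step such as catEncoder receives an array rather than a DataFrame, and calling X.values on it would likely raise an AttributeError.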