My custom Python transformer doesn't work

Asked: 2017-09-28 14:33:30

Tags: python scikit-learn

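I wrote this custom transformer in Python. The goal is to use it in a Pipeline to sequence my data preprocessing steps. My dataset has 9 numeric columns, and the 10th column is categorical.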

from sklearn.base import BaseEstimator, TransformerMixin

class DataFrameSelector(BaseEstimator, TransformerMixin):
    def _init_(self, attribute_names):
       self.attribute_names = attribute_names
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return X[self.attribute_names].values

After defining this class, I get the error listed below when I try to run the following code.

FYI: datasets_num is a DataFrame containing only the numeric columns/attributes.

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Imputer, LabelBinarizer, StandardScaler
num_attributes = list(datasets_num)
cat_attributes = ["ocean_proximity"]

num_pipeline = Pipeline([
        ('selector', DataFrameSelector(num_attributes)),
        ('imputer', Imputer(strategy = "median")),
        ('std_scalar', StandardScaler()) 
        ])

cat_pipeline = Pipeline([
       ('selector', DataFrameSelector(cat_attributes)),
       ('label_binarizer', LabelBinarizer())
       ])

Error:

Traceback (most recent call last):

  File "<ipython-input-34-f509d02ccc6e>", line 7, in <module>
     ('selector', DataFrameSelector(num_attributes)),

 TypeError: object() takes no parameters

1 answer:

Answer 0 (score: 1)

Here:

class DataFrameSelector(BaseEstimator, TransformerMixin):
    def _init_(self, attribute_names):

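You want double underscores on each side. With single underscores, _init_ is just an ordinary method that Python never calls on construction, so the argument is passed to object's constructor, which rejects it: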

    def __init__(self, attribute_names):
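
To see the difference in isolation, here is a minimal sketch that does not depend on scikit-learn. The class names Broken and Fixed are hypothetical and stand in for the two spellings of DataFrameSelector; the exact wording of the TypeError message varies between Python versions, so the sketch only checks that one is raised.

```python
class Broken:
    # Single underscores: an ordinary method, never called on construction.
    def _init_(self, attribute_names):
        self.attribute_names = attribute_names


class Fixed:
    # Double underscores: Python calls this when the object is created.
    def __init__(self, attribute_names):
        self.attribute_names = attribute_names


try:
    Broken(["a", "b"])
    broken_raised = False
except TypeError:
    # Broken defines no __init__, so the argument reaches object's
    # constructor, which accepts no parameters.
    broken_raised = True

fixed = Fixed(["a", "b"])
print(broken_raised, fixed.attribute_names)
```

With the constructor spelled __init__, DataFrameSelector(num_attributes) should store the column list as intended, and the pipeline definition can proceed past that line.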