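Python: calling an inherited parent class method fails

Time: 2019-10-12 06:22:26

Tags: python pandas oop scikit-learn

I created a pass-through wrapper class around an existing class in sklearn, and it does not behave as expected: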

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

tiny_df = pd.DataFrame({'x': ['a', 'b']})

class Foo(OrdinalEncoder):

    def __init__(self, *args, **kwargs):
        super().__init__(self, *args, **kwargs)

    def fit(self, X, y=None):
        super().fit(X, y)
        return self


oe = OrdinalEncoder()
oe.fit(tiny_df) # works fine
foo = Foo()
foo.fit(tiny_df) # fails

The relevant part of the error message I get is:

~\.conda\envs\pytorch\lib\site-packages\sklearn\preprocessing\_encoders.py in _fit(self, X, handle_unknown)
     69                         raise ValueError("Unsorted categories are not "
     70                                          "supported for numerical categories")
---> 71             if len(self._categories) != n_features:
     72                 raise ValueError("Shape mismatch: if n_values is an array,"
     73                                  " it has to be of shape (n_features,).")

TypeError: object of type 'Foo' has no len()

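It seems that the parent's private attribute _categories is not being set correctly, even though I call the parent constructor in my class's __init__() method. I must be missing something simple here and would appreciate any help!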

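1 answer:

Answer 0: (score: 2)

You don't need to pass self again when calling through super(). super().__init__(...) is already bound to the instance, so the extra self is treated as an ordinary positional argument. In the scikit-learn version shown in your traceback it ends up as the encoder's first parameter, categories, which is why len(self._categories) is called on a Foo object. In addition, scikit-learn estimators should always specify their parameters in the signature of their __init__, and varargs (*args) are not allowed; otherwise you get a RuntimeError, so *args has to be removed. I have modified your code as follows: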

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

tiny_df = pd.DataFrame({'x': ['a', 'b']})

class Foo(OrdinalEncoder):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def fit(self, X, y=None):
        super().fit(X, y)
        return self


oe = OrdinalEncoder()
oe.fit(tiny_df) # works fine
foo = Foo()
foo.fit(tiny_df) # works fine too


Sample output

foo.transform(tiny_df)
array([[0.],
       [1.]])
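
To illustrate why the extra self breaks things, here is a minimal sketch that does not use scikit-learn at all. Parent is a hypothetical stand-in for OrdinalEncoder with a single categories parameter, and Broken and Fixed are illustrative names that do not come from any library.

```python
class Parent:
    """Hypothetical stand-in for an estimator with one constructor parameter."""

    def __init__(self, categories="auto"):
        self.categories = categories


class Broken(Parent):
    def __init__(self, *args, **kwargs):
        # Bug: super().__init__ is already bound to self, so this extra
        # self is passed on as the first positional argument (categories).
        super().__init__(self, *args, **kwargs)


class Fixed(Parent):
    def __init__(self, categories="auto"):
        # Explicit parameter in the signature, forwarded by keyword.
        super().__init__(categories=categories)


broken = Broken()
fixed = Fixed()

# The instance has been stored as its own "categories" value.
print(broken.categories is broken)

# The fixed subclass keeps the intended default.
print(fixed.categories)

# Calling len() on the stored instance reproduces the kind of error
# reported in the question.
try:
    len(broken.categories)
except TypeError as exc:
    print(exc)
```

If the subclass adds no parameters of its own, the simplest option is usually to leave __init__ out entirely and inherit the parent's constructor.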


A little extra, showing what happens if *args is kept in the signature:

class Foo(OrdinalEncoder):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def fit(self, X, y=None):
        super().fit(X, y)
        return self

When creating Foo:

foo = Foo()

RuntimeError: scikit-learn estimators should always specify their parameters in the signature of their __init__ (no varargs). <class '__main__.Foo'> with constructor (self, *args, **kwargs) doesn't  follow this convention.
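
The check behind this message can be approximated with the standard inspect module. The following is only a simplified sketch of that kind of signature inspection, not scikit-learn's actual code; has_varargs and the three example classes are hypothetical names introduced for illustration.

```python
import inspect


def has_varargs(cls):
    """Return True if cls.__init__ accepts *args (hypothetical helper)."""
    params = inspect.signature(cls.__init__).parameters.values()
    return any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params)


class WithVarargs:
    def __init__(self, *args, **kwargs):
        pass


class KeywordOnly:
    def __init__(self, **kwargs):
        pass


class Explicit:
    def __init__(self, categories="auto"):
        self.categories = categories


# Only the first class has a *args parameter in its constructor.
for cls in (WithVarargs, KeywordOnly, Explicit):
    print(cls.__name__, has_varargs(cls))
```

An explicit signature such as the one in Explicit is the form the error message asks for, because every parameter can then be discovered by name from the constructor.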

Hope this helps!