What is the most compact and cleanest way to override a function in a class?

Asked: 2018-11-09 14:41:00

Tags: python python-3.x class inheritance override

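I am trying to write a class for text manipulation. The idea is that the class supports basic text preprocessing, but anyone who wants a more sophisticated preprocessing function should be able to subclass the base class and override it. I tried the following; even if I could get it to work somehow, I don't think it is the right way to do it.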

class TextPreprocessor:
    def __init__(self, corpus):
        """Text Preprocessor base class.

            corpus: a list of sentences

        """
        self.corpus      = corpus
        self.word_tokens = [self.preprocess(sentence) for sentence in corpus]

    def preprocess(self, sentence):
        """Strip the sentence, lowercase it and split it on whitespace."""
        return sentence.strip().lower().split()

    def preprocess_transform(self,sentence):

        return self.preprocess(sentence)

Now, if I want to write a new preprocessing function, what is the best way to do it? I tried the following:

class SubPreprocess(TextPreprocessor):
    def __init__(self, corpus):
        #### dummy preprocess function
        def preprocess(self, sentence):
            return sentence.strip().split() + ['HELLOOOOOOOOOOLLLL']
        super.__init__(corpus)

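It does not work. What I basically want is for the (modified) preprocess function to override the one in the base class TextPreprocessor, so that when __init__ is called, self.word_tokens is built with the new preprocess function.

4 Answers:

Answer 0 (score: 5)

The following will do: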

class SubPreprocess(TextPreprocessor):
    def preprocess(self, sentence):
        return sentence.strip().split() + ['HELLOOOOOOOOOOLLLL']

If you now call the constructor of SubPreprocess, the new preprocess method will be used:

proc = SubPreprocess(some_corpus)  
# looks up any methods in the mro of SubPreprocess
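
As a minimal, self-contained sketch of why this works: the base `__init__` calls `self.preprocess`, and that lookup starts at the class of the instance, so the subclass version is found first. The base class below is condensed from the question, and the two-sentence `corpus` is made up for illustration.

```python
class TextPreprocessor:
    def __init__(self, corpus):
        self.corpus = corpus
        # self.preprocess is looked up on type(self) first, so a subclass override wins
        self.word_tokens = [self.preprocess(sentence) for sentence in corpus]

    def preprocess(self, sentence):
        return sentence.strip().lower().split()


class SubPreprocess(TextPreprocessor):
    # no __init__ needed: the inherited one runs and picks up this method
    def preprocess(self, sentence):
        return sentence.strip().split() + ['HELLOOOOOOOOOOLLLL']


corpus = ["  Hello World ", "Foo bar"]  # illustrative data, not from the question
base = TextPreprocessor(corpus)
sub = SubPreprocess(corpus)
mro_names = [cls.__name__ for cls in SubPreprocess.__mro__]

print(base.word_tokens)
print(sub.word_tokens)
print(mro_names)
```

The printed method resolution order lists `SubPreprocess` before `TextPreprocessor`, which is the order in which `preprocess` is searched.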

Answer 1 (score: 2)

class SubPreprocess(TextPreprocessor):
    def __init__(self, corpus):
        #this is how you initialise the superclass
        super(SubPreprocess, self).__init__(corpus)

    # the overridden function should be within the scope of the class, not under the initializer
    def preprocess(self, sentence):
        return sentence.strip().split() + ['HELLOOOOOOOOOOLLLL']
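
In Python 3 the call can be shortened to `super().__init__(corpus)`, and an `__init__` that only forwards to the superclass can be left out entirely. A constructor is really only needed when the subclass carries extra state. The sketch below assumes a hypothetical `marker` parameter and a `MarkedPreprocess` class, neither of which is in the question. Because the base `__init__` already calls `preprocess`, the attribute has to be set before `super().__init__` runs.

```python
class TextPreprocessor:
    # condensed from the question
    def __init__(self, corpus):
        self.corpus = corpus
        self.word_tokens = [self.preprocess(sentence) for sentence in corpus]

    def preprocess(self, sentence):
        return sentence.strip().lower().split()


class MarkedPreprocess(TextPreprocessor):
    def __init__(self, corpus, marker="END"):
        # hypothetical extra state; set it first, because the base
        # constructor calls self.preprocess(), which reads self.marker
        self.marker = marker
        super().__init__(corpus)  # Python 3 zero-argument form

    def preprocess(self, sentence):
        return sentence.strip().split() + [self.marker]


proc = MarkedPreprocess(["a b", " c "], marker="STOP")
print(proc.word_tokens)
```

Swapping the two lines in `__init__` would raise an `AttributeError`, since `preprocess` would run before `self.marker` exists.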

Answer 2 (score: 0)

If you want to inject behavior, just use a function:

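class TheAlgorithm:
    def __init__(self, preprocess):
        self.preprocess = preprocess  # any callable that takes the corpus

    # part_a and part_b are not defined in this answer; identity placeholders
    def part_a(self, corpus):
        return corpus
    def part_b(self, preprocessed):
        return preprocessed

    def process(self, corpus):
        after_a = self.part_a(corpus)
        preprocessed = self.preprocess(after_a)
        return self.part_b(preprocessed)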

Usage is very simple:

p = TheAlgorithm(lambda c: c.strip().split() + ['helllol'])  # concatenate a list with a list
p.process('the corpus')

In fact, if your class only stores a few functions, you can go fully functional:

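def part_b(text):
    # identity placeholder; part_b is not defined in this answer
    return text

def processor(preprocess):
    def algorithm(corpus):
        return part_b(preprocess(corpus))
    return algorithm  # return the closure so the caller can invoke it

p = processor(lambda c: "-".join(c.split(",")))
assert "a-b-c" == p("a,b,c")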
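
Applied to the class from the question, the same injection idea might look like the sketch below, where the constructor accepts an optional callable instead of relying on subclassing. The `preprocess` keyword argument and the `default_preprocess` helper are assumptions made for illustration; they are not part of the original code.

```python
def default_preprocess(sentence):
    # same behaviour as the base method in the question
    return sentence.strip().lower().split()


class TextPreprocessor:
    def __init__(self, corpus, preprocess=default_preprocess):
        self.corpus = corpus
        # a plain function stored on the instance is not bound,
        # so it is called with the sentence only (no self)
        self.preprocess = preprocess
        self.word_tokens = [self.preprocess(sentence) for sentence in corpus]


default = TextPreprocessor([" A b "])
custom = TextPreprocessor(["A,B", "c,D"], preprocess=lambda s: s.split(","))

print(default.word_tokens)
print(custom.word_tokens)
```

This trades the subclass for a parameter, which tends to suit one-off variations; an override is arguably clearer when the new behaviour needs its own state or several cooperating methods.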

Answer 3 (score: 0)

Try changing super.__init__(corpus) to super().__init__(corpus).
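
A short sketch of why the original line fails, and why this change is probably not sufficient by itself: `super` without parentheses is the built-in type, so `super.__init__(corpus)` raises a `TypeError`, and a `def` nested inside `__init__` only creates a local function that never becomes a method. The class name `StillNotOverridden` is made up for the demonstration.

```python
class TextPreprocessor:
    # condensed from the question
    def __init__(self, corpus):
        self.corpus = corpus
        self.word_tokens = [self.preprocess(sentence) for sentence in corpus]

    def preprocess(self, sentence):
        return sentence.strip().lower().split()


class StillNotOverridden(TextPreprocessor):
    def __init__(self, corpus):
        # local function: it is never attached to the class or the instance
        def preprocess(self, sentence):
            return sentence.strip().split() + ['HELLOOOOOOOOOOLLLL']
        super().__init__(corpus)  # corrected call, but nothing was overridden


# Calling __init__ on the bare `super` type, as in the question
raised = False
try:
    super.__init__(["some sentence"])
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

proc = StillNotOverridden(["Hello World"])
print(proc.word_tokens)  # still produced by the base-class preprocess
```

Moving `preprocess` out of `__init__` to class level, as the other answers do, is what makes the override take effect.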