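I have two functions that differ in only one line. To avoid duplicating code, I want to create a base class that holds the general form of these functions and then inherit from it in each class.
Function 1: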
def top_similar_traces(self, stack_trace, top=10):
    words_to_test = StackTraceProcessor.preprocess(stack_trace)
    words_to_test_clean = [w for w in np.unique(words_to_test).tolist() if w in model]
    # Cos-similarity
    all_distances = np.array(1.0 - np.dot(model.wv.syn0norm, model.wv.syn0norm[
        [model.wv.vocab[word].index for word in words_to_test_clean]].transpose()), dtype=np.double)
    for i, (doc_id, rwmd_distance) in enumerate(distances):
        doc_words_clean = [w for w in self.corpus[doc_id] if w in model]
        wmd = self.wmdistance(model, words_to_test_clean, doc_words_clean, all_distances)
    return sorted(similarities, key=lambda v: v[1])[:top]
Function 2:
def top_similar_traces(self, stack_trace, top=10):
    words_to_test = StackTraceProcessor.preprocess(stack_trace)
    words_to_test_clean = [w for w in np.unique(words_to_test).tolist() if w in model]
    # Cos-similarity
    all_distances = np.array(1.0 - np.dot(model.wv.syn0norm, model.wv.syn0norm[
        [model.wv.vocab[word].index for word in words_to_test_clean]].transpose()), dtype=np.double)
    for i, (doc_id, rwmd_distance) in enumerate(distances):
        doc_words_clean = [w for w in self.corpus[doc_id].words if w in model]
        wmd = self.wmdistance(model, words_to_test_clean, doc_words_clean, all_distances)
    return sorted(similarities, key=lambda v: v[1])[:top]
As you can see, the only difference is between these two lines:
doc_words_clean = [w for w in self.corpus[doc_id].words if w in model]
doc_words_clean = [w for w in self.corpus[doc_id] if w in model]
Answer 0 (score: 1)
You can define the function in the superclass, like this:
def top_similar_traces(self, stack_trace, t, top=10):
    words_to_test = StackTraceProcessor.preprocess(stack_trace)
    words_to_test_clean = [w for w in np.unique(words_to_test).tolist() if w in model]
    # Cos-similarity
    all_distances = np.array(1.0 - np.dot(model.wv.syn0norm, model.wv.syn0norm[
        [model.wv.vocab[word].index for word in words_to_test_clean]].transpose()), dtype=np.double)
    for i, (doc_id, rwmd_distance) in enumerate(distances):
        if t == "something":
            doc_words_clean = [w for w in self.corpus[doc_id] if w in model]
        else:
            doc_words_clean = [w for w in self.corpus[doc_id].words if w in model]
        wmd = self.wmdistance(model, words_to_test_clean, doc_words_clean, all_distances)
    return sorted(similarities, key=lambda v: v[1])[:top]
Here t is a string that selects the behavior you want. You then call this method from your subclass, like this:
def top_similar_traces(self, stack_trace, top=10):
    return super().top_similar_traces(stack_trace, "option", top)
A solution like this should work. t can be a variable of any type (integer, string, etc.).
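To make the flag idea concrete without gensim or numpy, here is a minimal runnable sketch. It only covers the one line that differs. The names TraceIndexBase, PlainIndex, WordsIndex and clean_words are made up for illustration, a plain set stands in for the model, and the model is stored on the instance rather than read from a global.

```python
from types import SimpleNamespace


class TraceIndexBase:
    def __init__(self, model, corpus):
        # Assumption: `model` is anything that supports `in` (a set here).
        self.model = model
        self.corpus = corpus

    def clean_words(self, doc_id, t):
        # `t` selects how a document's words are read from the corpus.
        if t == "plain":
            words = self.corpus[doc_id]
        else:
            words = self.corpus[doc_id].words
        return [w for w in words if w in self.model]


class PlainIndex(TraceIndexBase):
    def clean_words(self, doc_id):
        # The subclass hides the flag from its callers.
        return super().clean_words(doc_id, "plain")


class WordsIndex(TraceIndexBase):
    def clean_words(self, doc_id):
        return super().clean_words(doc_id, "words")


known = {"open", "read"}  # toy vocabulary
plain = PlainIndex(known, [["open", "foo", "read"]])
wrapped = WordsIndex(known, [SimpleNamespace(words=["read", "bar"])])
print(plain.clean_words(0))
print(wrapped.clean_words(0))
```

Note that the subclass method ends up with a different signature from the base method it overrides, which is a trade-off of this approach.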
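Answer 1 (score: 1)
Just extract the part that changes into a separate method. That way each subclass can override that part and affect the original method without having to copy the whole code.
Something like this: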
# Base class
def top_similar_traces(self, stack_trace, top=10):
    words_to_test = StackTraceProcessor.preprocess(stack_trace)
    words_to_test_clean = [w for w in np.unique(words_to_test).tolist() if w in model]
    # Cos-similarity
    all_distances = np.array(1.0 - np.dot(model.wv.syn0norm, model.wv.syn0norm[
        [model.wv.vocab[word].index for word in words_to_test_clean]].transpose()), dtype=np.double)
    for i, (doc_id, rwmd_distance) in enumerate(distances):
        # call another method here
        doc_words_clean = self.top_similar_traces_filter_words(doc_id)
        wmd = self.wmdistance(model, words_to_test_clean, doc_words_clean, all_distances)
    return sorted(similarities, key=lambda v: v[1])[:top]

# Subclass A
def top_similar_traces_filter_words(self, doc_id):
    return [w for w in self.corpus[doc_id].words if w in model]

# Subclass B
def top_similar_traces_filter_words(self, doc_id):
    return [w for w in self.corpus[doc_id] if w in model]
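By the way, I don't know where your model comes from, but it appears to be a global variable. You should avoid that and store it on your class instead (or pass it in).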
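Here is a small self-contained sketch of that structure with the model passed in through the constructor. It is not the original code: the class names, the doc_words hook and the Doc tuple are hypothetical, and a plain set stands in for the word-vector model so the example runs without gensim.

```python
from collections import namedtuple

# Hypothetical stand-in for a document object exposing a `.words` attribute.
Doc = namedtuple("Doc", ["words"])


class BaseTraceIndex:
    """Shared logic lives here; subclasses only supply the word lookup."""

    def __init__(self, model, corpus):
        # `model` is handed in instead of being read from a global.
        self.model = model
        self.corpus = corpus

    def doc_words(self, doc_id):
        # Hook method: each subclass decides how to fetch a document's words.
        raise NotImplementedError

    def clean_doc_words(self, doc_id):
        # Common filtering step, written once in the base class.
        return [w for w in self.doc_words(doc_id) if w in self.model]


class ListCorpusIndex(BaseTraceIndex):
    def doc_words(self, doc_id):
        return self.corpus[doc_id]


class ObjectCorpusIndex(BaseTraceIndex):
    def doc_words(self, doc_id):
        return self.corpus[doc_id].words


vocab = {"open", "read"}  # toy vocabulary standing in for the model
a = ListCorpusIndex(vocab, [["open", "foo", "read"]])
b = ObjectCorpusIndex(vocab, [Doc(words=["open", "foo", "read"])])
print(a.clean_doc_words(0))  # ['open', 'read']
print(b.clean_doc_words(0))  # ['open', 'read']
```

The hook here returns the raw words and the base class does the filtering, so even the `if w in model` part is written only once.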
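Answer 2 (score: 1)
You mentioned: "...I want to create a base class with the general form of these functions, and then inherit from it for each class."
I'd like to point out that you don't need to create a class for this. A single function works fine. In the following example I added a fourth parameter named words and set its default value to True. If you leave it as True, the function uses the line that reads self.corpus[doc_id].words. If you call the function with False, it uses the line that reads self.corpus[doc_id].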
def top_similar_traces(self, stack_trace, top=10, words=True):
    words_to_test = StackTraceProcessor.preprocess(stack_trace)
    words_to_test_clean = [w for w in np.unique(words_to_test).tolist() if w in model]
    # Cos-similarity
    all_distances = np.array(1.0 - np.dot(model.wv.syn0norm, model.wv.syn0norm[[model.wv.vocab[word].index for word in words_to_test_clean]].transpose()), dtype=np.double)
    for i, (doc_id, rwmd_distance) in enumerate(distances):
        if words:
            doc_words_clean = [w for w in self.corpus[doc_id].words if w in model]
        else:
            doc_words_clean = [w for w in self.corpus[doc_id] if w in model]
        wmd = self.wmdistance(model, words_to_test_clean, doc_words_clean, all_distances)
    return sorted(similarities, key=lambda v: v[1])[:top]
To have the function read self.corpus[doc_id].words, call it like this:
top_similar_traces(<stack_trace>)
To have the function read self.corpus[doc_id], call it like this:
top_similar_traces(<stack_trace>, words=False)
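If you want to try the boolean switch in isolation, a stripped-down sketch could look like the one below. The function name clean_doc_words and the toy data are invented for the example, corpus and model are passed in explicitly, and only the branching line is reproduced rather than the full distance computation.

```python
from types import SimpleNamespace


def clean_doc_words(corpus, model, doc_id, words=True):
    doc = corpus[doc_id]
    # words=True reads doc.words, words=False iterates the document itself.
    source = doc.words if words else doc
    return [w for w in source if w in model]


vocab = {"open", "read"}  # toy vocabulary standing in for the model
object_corpus = [SimpleNamespace(words=["open", "foo"])]
list_corpus = [["foo", "read"]]

with_attr = clean_doc_words(object_corpus, vocab, 0)
without_attr = clean_doc_words(list_corpus, vocab, 0, words=False)
print(with_attr, without_attr)
```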