Question

My Python code contains the following function:

def mk_standard_vectors(self, know):
    vector_type = 'k' if know else 'h'
    for name, se in self.sparse_entities.items():
        if name[-1] == vector_type:
            word = se.word
            for feature in (feature for c in se.contexts for feature in c.dlfs):
                self.vector_space.vectors[word][self.vector_space.contexts_to_id[feature]] += 1
                self.vector_space.vectors[feature][self.vector_space.contexts_to_id[word]] += 1

Codeclimate complains about code duplication in this function, specifically about these two lines:

self.vector_space.vectors[word][self.vector_space.contexts_to_id[feature]]+=1
self.vector_space.vectors[feature][self.vector_space.contexts_to_id[word]]+=1

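These two lines update specific values in numpy arrays. Basically, I have a vector_space class holding a dictionary of vectors, which indexes numpy arrays by word. When I run the function above, for every (word, feature) pair I update the position corresponding to feature in the array indexed by word, and then do the reverse. Is there a clever way to condense the code and satisfy Codeclimate?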
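To make the symmetric update concrete, here is a minimal sketch of it on a toy structure. The vocabulary ("cat", "purr") and the plain module-level dictionaries are assumptions for illustration only; in the real code both mappings live on self.vector_space.

```python
import numpy as np

# Hypothetical toy vocabulary; the real names come from sparse_entities.
names = ["cat", "purr"]

# Maps each name to its column position, like vector_space.contexts_to_id.
contexts_to_id = {name: i for i, name in enumerate(names)}

# Maps each name to its count vector, like vector_space.vectors.
vectors = {name: np.zeros(len(names), dtype=int) for name in names}

word, feature = "cat", "purr"

# The two "duplicated" lines: one co-occurrence is counted in both directions.
vectors[word][contexts_to_id[feature]] += 1
vectors[feature][contexts_to_id[word]] += 1

print(vectors["cat"])   # [0 1]
print(vectors["purr"])  # [1 0]
```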

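(Of course, I could write an extra function, but I can't see that it would make things noticeably cleaner or more efficient. Unless I'm missing something...)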

Answer 1

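In this case I would leave it as it is. What you are doing there is clear. Sometimes these guidelines should be ignored: not always, but if there is a valid reason to break them, you should be allowed to.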

But if you really want to avoid the warning, you could wrap it in a "sub-iteration":

# ...
for feature in (feature for c in se.contexts for feature in c.dlfs):
    for vec, context in ((word, feature), (feature, word)):
        self.vector_space.vectors[vec][self.vector_space.contexts_to_id[context]] += 1
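
As a quick sanity check, the sketch below runs the original two-line update and the sub-iteration variant on the same input and compares the results. The VectorSpace class, the function names, and the sample pairs are hypothetical stand-ins, not the asker's real code.

```python
import numpy as np


class VectorSpace:
    """Hypothetical stand-in for the asker's vector_space object."""

    def __init__(self, names):
        self.contexts_to_id = {name: i for i, name in enumerate(names)}
        self.vectors = {name: np.zeros(len(names), dtype=int) for name in names}


def update_duplicated(space, pairs):
    # Original form: two nearly identical lines.
    for word, feature in pairs:
        space.vectors[word][space.contexts_to_id[feature]] += 1
        space.vectors[feature][space.contexts_to_id[word]] += 1


def update_looped(space, pairs):
    # Sub-iteration form: a single update line, run once per direction.
    for word, feature in pairs:
        for vec, context in ((word, feature), (feature, word)):
            space.vectors[vec][space.contexts_to_id[context]] += 1


names = ["dog", "bark", "tail"]
pairs = [("dog", "bark"), ("dog", "tail"), ("dog", "bark")]

a = VectorSpace(names)
b = VectorSpace(names)
update_duplicated(a, pairs)
update_looped(b, pairs)

# True if both variants produced identical count vectors.
same = all(np.array_equal(a.vectors[n], b.vectors[n]) for n in names)
print(same)
```

The inner tuple is rebuilt on every pass, so the looped form is unlikely to be faster than the original; what it buys is a single update line for the linter to look at.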

Fixing code duplication when updating values in numpy arrays
