Question

I got this line of code from https://stevenloria.com/tf-idf/:

scores = {word: tfidf(word, blob, bloblist) for word in blob.words}

where the tfidf function is defined as:

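import math
from textblob import TextBlob as tb

# Term frequency: the share of the words in `blob` that are `word`.
def tf(word, blob):
    return blob.words.count(word) / len(blob.words)

# Number of documents in `bloblist` that contain `word`.
def n_containing(word, bloblist):
    return sum(1 for blob in bloblist if word in blob.words)

# Inverse document frequency; the 1 + in the divisor avoids division by zero.
def idf(word, bloblist):
    return math.log(len(bloblist) / (1 + n_containing(word, bloblist)))

# TF-IDF score: the product of the two measures above.
def tfidf(word, blob, bloblist):
    return tf(word, blob) * idf(word, bloblist)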
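To see what these functions compute without installing TextBlob, here is a minimal sketch of the same arithmetic, tf(word) * log(N / (1 + n)), on plain word lists. The corpus `docs` and its contents are made up for illustration, and each list only stands in for `blob.words`.

```python
import math

# Hypothetical toy corpus: each document is a plain list of words,
# standing in for the `blob.words` of a TextBlob.
docs = [
    ["python", "is", "fun"],
    ["python", "is", "popular"],
    ["snakes", "are", "long"],
    ["tea", "is", "warm"],
]

def tf(word, doc):
    # Fraction of the document's words that are `word`.
    return doc.count(word) / len(doc)

def n_containing(word, docs):
    # Number of documents that contain `word`.
    return sum(1 for doc in docs if word in doc)

def idf(word, docs):
    # Same formula as in the question: log(N / (1 + n)).
    return math.log(len(docs) / (1 + n_containing(word, docs)))

def tfidf(word, doc, docs):
    return tf(word, doc) * idf(word, docs)

fun_score = tfidf("fun", docs[0], docs)        # (1/3) * log(4 / 2)
python_score = tfidf("python", docs[0], docs)  # (1/3) * log(4 / 3)
is_score = tfidf("is", docs[0], docs)          # (1/3) * log(4 / 4)
```

With this formula, a word that appears in three of the four documents ("is") scores zero, while a word unique to one document ("fun") scores highest.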

To understand the process better, I would like to convert the shorthand loop into a regular-looking "for" loop.

Is this correct?

scores = []
for word in blob.words:
    scores.append(tfidf(word, blob, bloblist))

Also, what is the advantage of writing the loop in its short form?

Answer 1

Is this correct?

That code will give you a list, but your original code produces a dictionary. Try this instead:

scores = {}
for word in blob.words:
    scores[word] = tfidf(word, blob, bloblist)
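The difference between the two results can be checked with a small, self-contained sketch. Here `words` and `score` are hypothetical stand-ins for `blob.words` and `tfidf(word, blob, bloblist)`, chosen only so the snippet runs without TextBlob.

```python
# Hypothetical stand-ins: `words` plays the role of blob.words and
# `score` plays the role of tfidf(word, blob, bloblist).
words = ["to", "be", "or", "not", "to", "be"]

def score(word):
    return words.count(word) / len(words)

# Short form: a dict comprehension.
short_form = {word: score(word) for word in words}

# Long form: an explicit loop that builds the same dict.
long_form = {}
for word in words:
    long_form[word] = score(word)

# The list version from the question keeps one entry per occurrence
# and no longer records which word each score belongs to.
as_list = []
for word in words:
    as_list.append(score(word))
```

The two dictionaries compare equal and hold one entry per distinct word, whereas the list has one entry for every word occurrence.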

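Also, what is the advantage of writing the loop in its short form?

The main advantage of list comprehensions (and dictionary comprehensions, and so on) is that they are shorter than their long-form equivalents.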
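One place where the shorter form helps is that a comprehension is a single expression, so it can be passed straight into another call. The following is only a sketch with made-up data; `words` is a hypothetical stand-in for `blob.words`, and word length stands in for the score.

```python
words = ["to", "be", "or", "not", "to", "be"]

# A comprehension is an expression, so it can be used inline
# without naming the intermediate dictionary.
longest = max({w: len(w) for w in words}.items(), key=lambda item: item[1])

# The loop equivalent needs a name for the intermediate dictionary.
lengths = {}
for w in words:
    lengths[w] = len(w)
longest_loop = max(lengths.items(), key=lambda item: item[1])
```

Both variants pick the same (word, length) pair; the loop form simply takes more lines to get there.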

Answer 2

These are called comprehensions. List comprehensions are the most common kind, although the one in your example is a dictionary comprehension.
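As a quick illustration of the different kinds, the sketch below builds a list, a dictionary, and a set from the same input, plus a generator expression. The `words` list is made up for the example.

```python
words = ["to", "be", "or", "not", "to", "be"]

lengths_list = [len(w) for w in words]      # list comprehension
lengths_dict = {w: len(w) for w in words}   # dict comprehension
unique_words = {w for w in words}           # set comprehension
total_length = sum(len(w) for w in words)   # generator expression
```

The brackets decide the result type: square brackets give a list, curly braces with `key: value` give a dictionary, and curly braces with a single expression give a set.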

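Comprehensions are often simpler and easier to read, but if you find them confusing, you can write the loop out much as you did. The one difference is that the result has to be a dictionary rather than a list: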

scores = {}
for word in blob.words:
    scores[word] = tfidf(word, blob, bloblist)
