Question

I have the following code, which works, and I would like to know how to implement the same logic with a list comprehension.

def get_features(document, feature_space):
    features = {}
    for w in feature_space:
        features[w] = (w in document)
    return features

Can I improve performance by using a list comprehension?

The problem is that both feature_space and document are relatively large, and this will run for many iterations.

Edit: Sorry for not making this clear at the start: feature_space and document are both lists.

Answer 1

Like this, using a dict comprehension:

def get_features(document, feature_space):
    return {w: (w in document) for w in feature_space}

The features[key] = value expression becomes the key: value part at the start, and the remaining for loops and any if statements follow in their nesting order.
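To illustrate the nesting-order rule with more than one clause, here is a minimal sketch. The rows data and the names lengths_loop and lengths_comp are made up for this example and are not part of the question's code.

```python
# Hypothetical sample data, for illustration only.
rows = [["a", "bb"], ["ccc", "d"]]

# Loop form: two nested for loops and an if filter.
lengths_loop = {}
for row in rows:
    for word in row:
        if len(word) > 1:
            lengths_loop[word] = len(word)

# Comprehension form: the key: value part comes first, then the same
# for/if clauses in the same order as they were nested above.
lengths_comp = {word: len(word) for row in rows for word in row if len(word) > 1}

assert lengths_loop == lengths_comp
```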

Yes, this will give you a performance boost, because you have now removed all the features local name lookups and the dict.__setitem__ calls.

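Note that you need to make sure document is a data structure with fast membership testing. If it is a list, for example, convert it to a set() first, so that membership tests take O(1) (constant) time rather than the O(n) linear time of a list: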

def get_features(document, feature_space):
    document = set(document)
    return {w: (w in document) for w in feature_space}

With a set, this is now an O(K) loop instead of an O(KN) loop (where N is the size of document and K is the size of feature_space).
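If you want to check the difference on your own machine, a rough timing sketch along these lines might help. The function names get_features_list and get_features_set and the sample data are invented for the comparison, and it assumes the items are hashable (strings here), which set() requires. Actual timings will vary with your data sizes.

```python
import timeit

def get_features_list(document, feature_space):
    # Membership test against a list: O(N) per lookup.
    return {w: (w in document) for w in feature_space}

def get_features_set(document, feature_space):
    # Membership test against a set: O(1) on average per lookup.
    document = set(document)
    return {w: (w in document) for w in feature_space}

# Made-up sample data; roughly half of feature_space appears in document.
document = [f"word{i}" for i in range(2000)]
feature_space = [f"word{i}" for i in range(1000, 3000)]

# Both versions must produce the same mapping.
assert get_features_list(document, feature_space) == get_features_set(document, feature_space)

list_time = timeit.timeit(lambda: get_features_list(document, feature_space), number=3)
set_time = timeit.timeit(lambda: get_features_set(document, feature_space), number=3)
print(f"list: {list_time:.4f}s, set: {set_time:.4f}s")
```

The set() conversion itself costs O(N), but it is paid once per call rather than once per feature.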