Question

I have the following method, which takes a list of dictionaries and returns a new list containing only the dictionaries with unique phrases:

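from typing import Any, Dict, List

@staticmethod
def remove_duplicate_phrases(words: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # unique_phrases tracks phrases already seen; unique_words collects the result.
    unique_phrases, unique_words = set(), []
    for word in words:
        # Keep only the first dictionary encountered for each phrase.
        if word['phrase'] not in unique_phrases:
            unique_phrases.add(word['phrase'])
            unique_words.append(word)
    return unique_words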

Is there any way to make it faster?

Answer 1

This is the cleanest approach, and the one I usually choose:

>>> list_ = [
    {"phrase": 1},
    {"phrase": 1},
    {"phrase": 2},
    {"phrase": None}
]

>>> list(set(dict_['phrase'] for dict_ in list_))
[1, 2, None]

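The example above shows how to extract the unique phrases from a list of dictionaries, although the performance gain will not be significant. Note that it returns only the phrase values, not the dictionaries themselves, and that a set does not preserve the original order. How much any approach helps also depends on how many words you pass in.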

set() is useful when you need an unordered collection of unique elements.
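If the dictionaries themselves are needed rather than only the phrase values, one possible variation is to key a dict by phrase. The following is a minimal sketch, not a measured improvement; the name unique_by_phrase is hypothetical, and it assumes Python 3.7+ (where dicts keep insertion order) and hashable phrase values.

```python
from typing import Any, Dict, List


def unique_by_phrase(words: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the first dictionary seen for each distinct 'phrase' value."""
    first_seen: Dict[Any, Dict[str, Any]] = {}
    for word in words:
        # setdefault stores the dict only if this phrase is not already a key.
        first_seen.setdefault(word["phrase"], word)
    # Dicts preserve insertion order (Python 3.7+), so input order is kept.
    return list(first_seen.values())


list_ = [{"phrase": 1}, {"phrase": 1}, {"phrase": 2}, {"phrase": None}]
print(unique_by_phrase(list_))
# [{'phrase': 1}, {'phrase': 2}, {'phrase': None}]
```

This returns the same result as the method in the question; whether it is any faster would need to be timed on your own data.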

Run the solution in this answer and compare it with yours. With 2000 elements and 3 runs, the solution in this answer came out slightly faster.

# solution in answer
0.001382553018629551

# your solution
0.002490615996066481
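The answer does not show how these timings were taken. A comparison along these lines could be reproduced with timeit; the sketch below is an assumed setup (random integer phrases, 2000 dictionaries, 3 runs), and the function names dedupe_loop and unique_phrase_values are hypothetical. Absolute numbers will differ from machine to machine.

```python
import random
import timeit
from typing import Any, Dict, List


def dedupe_loop(words: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Approach from the question: keep the first dict for each phrase."""
    unique_phrases, unique_words = set(), []
    for word in words:
        if word["phrase"] not in unique_phrases:
            unique_phrases.add(word["phrase"])
            unique_words.append(word)
    return unique_words


def unique_phrase_values(words: List[Dict[str, Any]]) -> List[Any]:
    """Approach from this answer: unique phrase values only, unordered."""
    return list(set(word["phrase"] for word in words))


# Assumed test data: 2000 dicts drawn from 500 possible phrases.
random.seed(0)
sample = [{"phrase": random.randrange(500)} for _ in range(2000)]

# number=3 calls each function three times and returns the total seconds.
loop_time = timeit.timeit(lambda: dedupe_loop(sample), number=3)
set_time = timeit.timeit(lambda: unique_phrase_values(sample), number=3)
print(f"loop: {loop_time:.6f}s  set: {set_time:.6f}s")
```

Keep in mind that the two functions do not return the same thing: one returns dictionaries in input order, the other returns bare phrase values in arbitrary order, so the timing difference is not a like-for-like comparison.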

How can I remove duplicate phrases from a list of dictionaries?