Question

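I have a list containing many strings. Some of the strings are duplicated, so I want to count how many times each one occurs. For a string that appears only once, I just print it; for a duplicated string, I want to print how many times it appears, like this: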

hello appeared 2

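However, because it prints every instance of a duplicated string, there is a problem. For example, if the list contains two "hello" strings, it prints hello appeared 2 twice. Is there a way to skip checking every instance of a duplicated string? Thanks for the help.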

Answer 1

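Calling list.count inside a loop is expensive. It scans the entire list for every word, which is O(n²) complexity. You could iterate over a set of the words instead, but that is O(m * n), where m is the number of distinct words, which is still not great.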

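Instead, you can use collections.Counter to go through the list once, then iterate over the key-value pairs of the resulting dictionary. This has O(m + n) complexity.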

lst = ['hello', 'test', 'this', 'is', 'a', 'test', 'hope', 'this', 'works']

from collections import Counter

c = Counter(lst)

for word, count in c.items():
    if count == 1:
        print(word)
    else:
        print(f'{word} appeared: {count}')

hello
test appeared: 2
this appeared: 2
is
a
hope
works
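
To make the complexity point a bit more concrete, here is a rough sketch that builds the same output lines two ways: once with a single Counter pass, and once with list.count per distinct word. The function names report_with_counter and report_with_list_count are made up for illustration and are not part of the answer above.

```python
from collections import Counter

def report_with_counter(words):
    """Return one line per distinct word, in first-seen order: O(m + n)."""
    lines = []
    for word, count in Counter(words).items():
        lines.append(word if count == 1 else f"{word} appeared: {count}")
    return lines

def report_with_list_count(words):
    """Same result, but calls list.count once per distinct word: O(m * n)."""
    lines = []
    seen = set()
    for word in words:
        if word in seen:
            continue  # skip later instances of a word already reported
        seen.add(word)
        count = words.count(word)  # scans the whole list each time
        lines.append(word if count == 1 else f"{word} appeared: {count}")
    return lines

lst = ['hello', 'test', 'this', 'is', 'a', 'test', 'hope', 'this', 'works']
for line in report_with_counter(lst):
    print(line)
```

Both functions should return the same lines; the difference only matters once the list gets large.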

Answer 2

Use a set.

For example:

# `list` is the asker's variable name here (it shadows the built-in type)
for string in set(list):
    n = list.count(string)  # count once per distinct string
    if n > 1:
        print(string + " appeared: " + str(n))
    else:
        print(string)
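
One thing to keep in mind with the set approach is that a set has no defined order, so the strings may not come out in the order they appear in the list. If that matters, a possible variation (my own sketch, not part of this answer) is to deduplicate with dict.fromkeys, which keeps first-seen order on Python 3.7+. The name report_in_order and the sample data are hypothetical.

```python
def report_in_order(words):
    """Return one line per distinct string, in order of first appearance."""
    lines = []
    # dict keys are unique and keep insertion order (Python 3.7+)
    for string in dict.fromkeys(words):
        n = words.count(string)  # still O(m * n) overall, like the set version
        lines.append(f"{string} appeared: {n}" if n > 1 else string)
    return lines

sample = ['hello', 'world', 'hello']  # hypothetical sample data
for line in report_in_order(sample):
    print(line)
```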

Answer 3

Use Counter.

To create it:

In [166]: import collections

In [169]: d = collections.Counter(['hello', 'world', 'hello'])

To display it:

In [170]: for word, freq in d.items():
     ...:     if freq > 1:
     ...:         print('{0} appeared {1} times'.format(word, freq))
     ...:     else:
     ...:         print(word)
     ...:
hello appeared 2 times
world

Answer 4

You can use Python's collections.Counter like this:

import collections
result = dict(collections.Counter(list))

Another way to do this manually is:

result = {k: 0 for k in set(list)}  # a dict comprehension needs key: value
for item in list:
    result[item] += 1

Also, you should not name your list list, because that is the name of Python's built-in type. Both approaches give you something like this:

{"a": 3, "b": 1, "c": 4, "d": 1}

The keys are the unique values in the list, and the values are the number of times each key appears in the list.
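
For what it's worth, the manual version can also be done in a single pass without pre-filling the dict, by using dict.get with a default of 0. This is only a sketch under my own assumptions: count_manually and the letters data are made up for illustration, chosen so the result matches the example dict above.

```python
from collections import Counter

def count_manually(items):
    """Count occurrences in one pass using a plain dict."""
    result = {}
    for item in items:
        # get() returns 0 the first time an item is seen
        result[item] = result.get(item, 0) + 1
    return result

letters = ['a', 'c', 'a', 'b', 'c', 'c', 'a', 'd', 'c']  # hypothetical data
counts = count_manually(letters)
print(counts)
```

The result should compare equal to dict(Counter(letters)), so which one to use is mostly a matter of taste.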

Count duplicate strings in a list and print them