Python: number of occurrences of each character in a string

Time: 2012-10-01 13:24:06

Tags: python string collections

  

Possible duplicate:
  how to get the number of occurrences of each character using python

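What is the best way to get the count of each character in a string and store it? (I am using a dictionary. Can that choice make a big difference?) I have thought of a couple of ways: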

1

characterCountsDict = {}
# Single pass: bump the count for each character as it is seen
for character in string:
    if character in characterCountsDict:
        characterCountsDict[character] += 1
    else:
        characterCountsDict[character] = 1

2

characterCountsDict = {}
character = 0
while character < 127:
    # One full scan of the string per candidate code point (0-126)
    characterCountsDict[chr(character)] = string.count(chr(character))
    character += 1

I think the second approach is better... but is either of them any good? Is there a better way?
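To make the difference between the two approaches concrete, here is a minimal sketch in Python 3. The function names and the sample string are hypothetical, chosen only for illustration. It shows that the single-pass loop records exactly the characters that occur, while the per-code-point loop always stores 127 entries (mostly zeros) and never sees characters outside that range.

```python
def count_single_pass(text):
    """Approach 1: one pass over the text, one dict update per character."""
    counts = {}
    for ch in text:
        if ch in counts:
            counts[ch] += 1
        else:
            counts[ch] = 1
    return counts


def count_per_code_point(text, limit=127):
    """Approach 2: one full scan of the text for every code point below limit."""
    return {chr(code): text.count(chr(code)) for code in range(limit)}


text = "héllo"  # hypothetical sample containing one non-ASCII character
single = count_single_pass(text)
per_code = count_per_code_point(text)

print(len(single))                      # 4 distinct characters: h, é, l, o
print(len(per_code))                    # 127 entries, most of them zero
print("é" in single, "é" in per_code)   # True False
```

The second approach also does roughly 127 times as much scanning as the first, since `str.count` walks the whole string on every call.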

2 answers:

Answer 0: (score: 10)

>>> from collections import Counter
>>> Counter("asdasdff")
Counter({'a': 2, 's': 2, 'd': 2, 'f': 2})

Note that you can use a Counter object just like a dict.
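As a small illustration of that dict-like behaviour, here is a sketch using a hypothetical sample string ("banana" is not from the original answer). It relies only on standard `collections.Counter` behaviour: lookups by key, a zero default for missing keys, and `most_common`.

```python
from collections import Counter

counts = Counter("banana")  # hypothetical sample input

print(counts["a"])            # 3, ordinary dict-style lookup
print(counts["z"])            # 0, missing keys read as zero instead of raising KeyError
print("z" in counts)          # False, the lookup above did not insert the key
print(sorted(counts.items())) # [('a', 3), ('b', 1), ('n', 2)]
print(counts.most_common(1))  # [('a', 3)]
print(isinstance(counts, dict))  # True, Counter is a dict subclass
```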

Answer 1: (score: 2)

If you are interested in the most efficient way, it appears to be this:

from collections import defaultdict

def count_chars(s):
    res = defaultdict(int)
    for char in s:
        res[char] += 1
    return res

Timings:

from collections import Counter, defaultdict

def test_counter(s):
    return Counter(s)

def test_get(s):
    res = {}
    for char in s:
        res[char] = res.get(char, 0) + 1
    return res

def test_in(s):
    res = {}
    for char in s:
        if char in res:
            res[char] += 1
        else:
            res[char] = 1
    return res

def test_defaultdict(s):
    res = defaultdict(int)
    for char in s:
        res[char] += 1
    return res


s = open('/usr/share/dict/words').read()
#eof

import timeit

# Time each test_* function; the setup code is this script up to the marker comment
test = lambda f: timeit.timeit(f + '(s)', setup, number=10)
setup = open(__file__).read().split("#eof")[0]
results = ['%.4f %s' % (test(f), f) for f in dir() if f.startswith('test_')]
print('\n'.join(sorted(results)))

Results:

0.8053 test_defaultdict
1.3628 test_in
1.6773 test_get
2.3877 test_counter
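These figures come from one run on one machine and interpreter, so the ranking may well differ elsewhere, particularly on newer Python versions. The sketch below is one way to re-check two of the variants without depending on /usr/share/dict/words or on re-reading the script file. The function names and the synthetic sample text are assumptions made for illustration, and no expected timings are given because they vary by environment.

```python
import timeit
from collections import Counter, defaultdict


def count_with_counter(text):
    """Let Counter do the counting."""
    return Counter(text)


def count_with_defaultdict(text):
    """Count by hand with a defaultdict, as in the answer above."""
    counts = defaultdict(int)
    for ch in text:
        counts[ch] += 1
    return counts


# Synthetic input (an assumption; the original benchmark used a word list file)
sample = "the quick brown fox jumps over the lazy dog\n" * 1000

# Both variants must agree before their timings are worth comparing
assert dict(count_with_counter(sample)) == dict(count_with_defaultdict(sample))

for func in (count_with_counter, count_with_defaultdict):
    # Passing a callable avoids building the statement and setup as strings
    seconds = timeit.timeit(lambda: func(sample), number=10)
    print("%.4f %s" % (seconds, func.__name__))
```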