I wrote this Python program to count the occurrences of each character in a Python string.
def count_chars(s):
    counts = [0] * 65536
    for c in s:
        counts[ord(c)] += 1
    return counts

def print_counts(counts):
    for i, n in enumerate(counts):
        if n > 0:
            print(chr(i), '-', n)

if __name__ == '__main__':
    print_counts(count_chars('hello, world \u2615'))
Output:
- 2
, - 1
d - 1
e - 1
h - 1
l - 3
o - 2
r - 1
w - 1
☕ - 1
Can this program count the occurrences of any Unicode character? If not, what can be done to ensure that every possible Unicode character is handled?
Answer 0 (score: 7)
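Your code only handles characters in the Basic Multilingual Plane (code points up to U+FFFF); emoticons, for example, won't be handled, because ord() returns a value past the end of the 65536-element list and the lookup raises IndexError. You could remedy this by using a dictionary instead of a list with a fixed number of indices, and use the characters as keys.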
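A minimal sketch of the dictionary approach described above. The name `count_chars_dict` is only illustrative and is not part of the original code; the counts are printed by code point so that the non-BMP character is easy to spot.

```python
def count_chars_dict(s):
    """Count characters using a dict keyed by the character itself."""
    counts = {}
    for c in s:
        # dict.get supplies 0 the first time a character is seen
        counts[c] = counts.get(c, 0) + 1
    return counts


# U+1F60A lies outside the Basic Multilingual Plane (ord() > 0xFFFF),
# so the fixed-size list version would raise IndexError on this input.
sample_counts = count_chars_dict('hello, world \u2615 \U0001F60A')
for character in sorted(sample_counts):
    # Print the code point rather than the glyph to keep the output ASCII-only
    print(f'U+{ord(character):04X}', '-', sample_counts[character])
```

Because the keys are the characters themselves, the dict only grows with the number of distinct characters actually seen, and no upper bound on the code point has to be assumed.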
However, you should just use a collections.Counter() object:
from collections import Counter
counts = Counter(s)
for character, count in counts.most_common():
    print(character, '-', count)
It was, after all, designed for exactly this kind of use case.
Demo:
>>> from collections import Counter
>>> s = 'hello, world \u2615 \U0001F60A'
>>> counts = Counter(s)
>>> for character, count in counts.most_common():
...     print(character, '-', count)
...
- 3
l - 3
o - 2
r - 1
w - 1
e - 1
h - 1
d - 1
☕ - 1
, - 1
😊 - 1
Answer 1 (score: 0)
message = 'alpha beta gamma sudama'
z = list(message)
p = []  # characters that have already been reported
for x in range(len(z)):
    if z[x] not in p:
        p.append(z[x])
        # Scan the whole string to count occurrences of z[x]
        count = 0
        i = 0
        while i < len(z):
            if z[x] == z[i]:
                count = count + 1
            i = i + 1
        print(z[x], count)
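The nested loop above rescans the whole string once for every distinct character. As a hedged alternative (not the answerer's code), the same first-seen ordering can probably be had in a single pass with collections.Counter, assuming Python 3.7 or later, where Counter, like dict, preserves insertion order:

```python
from collections import Counter

message = 'alpha beta gamma sudama'

# One pass over the string; keys are stored in first-seen order (Python 3.7+)
counts = Counter(message)

for character, count in counts.items():
    print(character, count)
```

Iterating with `.items()` keeps the order in which characters first appear, whereas `.most_common()` would sort by descending count instead.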