Counting elements in a list produces unexpected results

Time: 2016-08-12 10:43:06

Tags: python list for-loop count

I am trying to count the occurrences of each character in a sentence. I am using the code below:

printed = False
sentence = "the quick brown fox jumps over the lazy dog"
chars = list(sentence)
count = 0
for char in chars:
    if char == ' ':
        chars.remove(char)
    if printed == False:    
        count = chars.count(char)
        print "char count: ", char, count
    else:
        printed = False  

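The problem is that, apart from the first word, the first letter of each word is not printed, and the counts are wrong (each time a new word starts, the count printed for the space starts at 7 and decreases by 1):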

['t', 'h', 'e', ' ', 'q', 'u', 'i', 'c', 'k', ' ', 'b', 'r', 'o', 'w', 'n', ' ', 'f', 'o', 'x', ' ', 'j', 'u', 'm', 'p', 's', ' ', 'o', 'v', 'e', 'r', ' ', 't', 'h', 'e', ' ', 'l', 'a', 'z', 'y', ' ', 'd', 'o', 'g']
char count:  t 2
char count:  h 2
char count:  e 3
char count:    7
char count:  u 2
char count:  i 1
char count:  c 1
char count:  k 1
char count:    6
char count:  r 2
char count:  o 4
char count:  w 1
char count:  n 1
char count:    5
char count:  o 4
char count:  x 1
char count:    4
char count:  u 2
char count:  m 1
char count:  p 1
char count:  s 1
char count:    3
char count:  v 1
char count:  e 3
char count:  r 2
char count:    2
char count:  h 2
char count:  e 3
char count:    1
char count:  a 1
char count:  z 1
char count:  y 1
char count:    0
char count:  o 4
char count:  g 1

When I use two for loops instead of one, it works better:

sentence = "the quick brown fox jumps over the lazy dog"
chars = list(sentence)
count = 0
for char in chars:
    if char == ' ':
        chars.remove(char)
print chars
printed = False

for char in chars:
    if printed == False:
        count = chars.count(char)
        print "char count: ", char, count
        printed = True
    else:
        printed = False

Here is the output:

['t', 'h', 'e', 'q', 'u', 'i', 'c', 'k', 'b', 'r', 'o', 'w', 'n', 'f', 'o', 'x', 'j', 'u', 'm', 'p', 's', 'o', 'v', 'e', 'r', 't', 'h', 'e', 'l', 'a', 'z', 'y', 'd', 'o', 'g']
char count:  t 2
char count:  e 3
char count:  u 2
char count:  c 1
char count:  b 1
char count:  o 4
char count:  n 1
char count:  o 4
char count:  j 1
char count:  m 1
char count:  s 1
char count:  v 1
char count:  r 2
char count:  h 2
char count:  l 1
char count:  z 1
char count:  d 1
char count:  g 1

The only problem is that the 'o' character appears twice in the output... Why is that? Also, why doesn't the single loop work?

3 Answers:

Answer 0: (score: 2)

Iterate over a copy of the list and use elif, or simply continue after the removal, since you don't want to count the spaces.

printed = False
sentence = "the quick brown fox jumps over the lazy dog"
chars = list(sentence)

for char in chars[:]:
    if char == ' ':
        chars.remove(char)
        continue
    if not printed:
        count = chars.count(char)
        print "char count: ", char, count
        printed = True
    else:
        printed = False

You can also split on whitespace and then str.join the pieces:

for char in "".join(sentence.split()):
    if not printed:
        count = chars.count(char)
        print "char count: ", char, count
        printed = True
    else:
        printed = False

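But neither of your own solutions actually works correctly, even when iterating over a copy of the list; letters are missing from your output: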

char count:  t 2
char count:  e 3
char count:  u 2
char count:  c 1
char count:  b 1
char count:  o 4
char count:  n 1
char count:  o 4
char count:  j 1
char count:  m 1
char count:  s 1
char count:  v 1
char count:  r 2
char count:  h 2
char count:  l 1
char count:  z 1
char count:  d 1
char count:  g 1

There are 26 unique letters in the string, but the output shows only ~17.
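The gap can be checked directly. The sketch below is written for Python 3 (an assumption; the thread itself uses Python 2 print statements) and assumes the flag flips on every iteration, as in the two-loop version of the question, so that in effect every other character is printed. The names `letters` and `every_other` are illustrative only.

```python
sentence = "the quick brown fox jumps over the lazy dog"

# Drop the whitespace, as the answer does with split/join.
letters = "".join(sentence.split())

# Every distinct letter that actually occurs in the sentence.
unique_letters = set(letters)

# A flag that flips on each iteration prints the characters at
# even positions only: indexes 0, 2, 4, ...
every_other = letters[::2]

print(len(unique_letters))                       # 26
print(len(every_other), len(set(every_other)))   # 18 17
```

The two 'o' lines in the question's output come from the same effect: two of the four 'o' characters happen to sit at even positions, and nothing records that 'o' was already printed.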

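What you need is to keep track of the letters you have already seen and print each count only once. Your code does not record which characters have been printed; it just flips a flag on each iteration, regardless of the character: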

sentence = "the quick brown fox jumps over the lazy dog"
chars = list(sentence)

printed = set()
for char in "".join(sentence.split()):
    if char not in printed:
        count = chars.count(char)
        print "char count: ", char, count
        printed.add(char)

Or, if the order in which characters are first seen does not matter, just call set on the string:

for char in set("".join(sentence.split())):
    count = chars.count(char)
    print "char count: ", char, count

Or, if you have a large amount of data, a Counter dict is the better choice:

from collections import Counter
for char, count in Counter("".join(sentence.split())).items():
    print(char, count)
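For reference, here is a self-contained version of the Counter approach, written for Python 3 (an assumption; most of the thread's snippets are Python 2). A Counter makes a single pass over the characters, whereas calling list.count once per character rescans the whole list each time. The variable name `counts` is illustrative.

```python
from collections import Counter

sentence = "the quick brown fox jumps over the lazy dog"

# Build the tally in one pass over the non-whitespace characters.
counts = Counter("".join(sentence.split()))

# Look up individual characters like in a dict.
print(counts["o"])  # 4
print(counts["e"])  # 3

# The two most frequent characters, highest count first.
for char, count in counts.most_common(2):
    print(char, count)
```

A character that never occurs yields 0 from `counts[...]` rather than raising KeyError, which is convenient when probing for arbitrary characters.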

Answer 1: (score: 1)

You are modifying the list chars while you are iterating over it:

l = range(10)

for i in l:
    l.remove(i)
    print i

This gives:

0
2
4
6
8
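Why every other element is skipped can be sketched with an explicit index. The loop below is only a rough model of how a for loop walks a list (it advances a position counter, and a removal shifts the remaining items to the left), not the interpreter's actual implementation. It is written for Python 3, where `range(10)` has to be wrapped in `list(...)` before `remove` can be called; `items`, `visited`, and `index` are illustrative names.

```python
items = list(range(10))
visited = []

index = 0
while index < len(items):
    current = items[index]
    visited.append(current)
    # Removing the current item shifts everything after it one slot left,
    # so the item that was next now sits at the position we just handled.
    items.remove(current)
    # Advancing the index therefore jumps over that shifted item.
    index += 1

print(visited)  # [0, 2, 4, 6, 8]
print(items)    # [1, 3, 5, 7, 9]
```

This is the same effect as in the question: each time a space is removed, the first letter of the next word slides into the space's position and is never visited.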

You should not modify a list while you are iterating over it. Just skip the elements you don't want to process:

for char in chars:
    if char == ' ':
        continue

    if printed == False:    
        ...     

Answer 2: (score: 0)

The problem occurs because you remove the space character while the loop is running, which disrupts the iteration.

When a space character is encountered, simply do nothing.
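A minimal sketch of that idea in Python 3 (an assumption; the question uses Python 2), combined with the "remember what was already printed" fix from the first answer. The function name `count_chars` and its return shape are hypothetical, chosen only for illustration.

```python
def count_chars(sentence):
    """Return (char, count) pairs in order of first appearance, ignoring spaces."""
    seen = set()
    results = []
    for char in sentence:
        # Do nothing for spaces, and nothing for characters already reported.
        if char == " " or char in seen:
            continue
        seen.add(char)
        results.append((char, sentence.count(char)))
    return results


for char, count in count_chars("the quick brown fox jumps over the lazy dog"):
    print("char count:", char, count)
```

Nothing is removed from the sequence being iterated, so no character is skipped, and each character is reported exactly once.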