def count_squences(string):
    i = 0
    total = 0
    total_char_list = []
    while i < len(string):
        print(string[i])
        if string[i] == "x":
            total += 1
        if string[i] == "y":
            total_char_list.append(total)
            total = 0
        i = i + 1
    return total_char_list

print(count_squences("xxxxyyxyxx"))
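I am trying to return the lengths of the consecutive runs of x characters as a list. For the call above, the function should return [4, 1, 2].
As another example, if the string is "xxxxxyxxyxxx", it should return [5, 2, 3].
My function does not return the correct list. Any help would be appreciated. Thanks.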
Answer 0 (score: 3)
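You are not resetting the counter when you encounter a y character, and you should only append to total_char_list if at least one x character was counted before that y (y characters can be repeated too):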
total = 0
while i < len(string):
    if string[i] == "x":
        total += 1
    if string[i] == "y":
        if total:
            total_char_list.append(total)
        total = 0
    i = i + 1
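Next, when the loop ends and total is not zero, you need to append that value as well, otherwise a run of 'x' characters at the end of the string is never counted: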
while ...:
    # ...
if total:
    # x characters at the end
    total_char_list.append(total)
Next, you really want to use a for loop to iterate over the sequence. That gives you the individual characters directly:
total = 0
for char in string:
    if char == 'x':
        total += 1
    if char == 'y':
        if total:
            total_char_list.append(total)
        total = 0
if total:
    # x characters at the end
    total_char_list.append(total)
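Assembled into a complete function, the loop above might look like the following minimal sketch. The name count_x_runs is made up here for illustration, and the sketch assumes, like the snippets above, that only y characters end a run.

```python
def count_x_runs(string):
    """Return the lengths of consecutive runs of 'x', in order of appearance."""
    total = 0
    total_char_list = []
    for char in string:
        if char == 'x':
            total += 1
        elif char == 'y':
            # A 'y' ends the current run; skip empty runs (e.g. "yy")
            if total:
                total_char_list.append(total)
            total = 0
    if total:
        # x characters at the end of the string
        total_char_list.append(total)
    return total_char_list


print(count_x_runs("xxxxyyxyxx"))    # [4, 1, 2]
print(count_x_runs("xxxxxyxxyxxx"))  # [5, 2, 3]
```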
You can use itertools.groupby() to make this faster:
from itertools import groupby

def count_squences(string):
    return [sum(1 for _ in group) for char, group in groupby(string) if char == 'x']
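groupby() splits an iterable input (such as a string) into groups, each exposed as its own iterator, where a group is any run of consecutive values that produce the same key(value) result. The default key() function just returns the value itself, so groupby(string) gives you groups of consecutive identical characters. char is the repeated character, and sum(1 for _ in group) counts the items in the iterator, i.e. the length of the run.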
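To see what groupby() actually yields, it may help to print the (character, run length) pairs for the string from the question. This is only an illustrative sketch; the variable names sample, runs and x_runs are not part of the answer's code.

```python
from itertools import groupby

sample = "xxxxyyxyxx"

# Materialise each group as a list so the (key, run) pairs can be inspected;
# each group iterator is only valid until groupby() advances to the next group.
runs = [(char, len(list(group))) for char, group in groupby(sample)]
print(runs)
# [('x', 4), ('y', 2), ('x', 1), ('y', 1), ('x', 2)]

# Keeping only the 'x' groups gives the list the question asks for.
x_runs = [length for char, length in runs if char == 'x']
print(x_runs)  # [4, 1, 2]
```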
You can then make this more generic and count all groups:
def count_all_sequences(string):
    counts = {}
    for char, group in groupby(string):
        counts.setdefault(char, []).append(sum(1 for _ in group))
    return counts
The same can be done with a regular expression:
import re

def count_all_sequences(string):
    counts = {}
    # (.)(\1*) finds repeated characters; (.) matching one, \1 matching the same
    # This gives us (first, rest) tuples, so len(rest) + 1 is the total length
    for char, group in re.findall(r'(.)(\1*)', string):
        counts.setdefault(char, []).append(len(group) + 1)
    return counts
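As a rough check, the two generic versions can be run side by side. In the sketch below they are renamed runs_groupby and runs_regex (names invented here so both can coexist in one script). One difference worth knowing: '.' does not match a newline unless re.DOTALL is passed, so the regex version skips runs of newlines.

```python
import re
from itertools import groupby


def runs_groupby(string):
    counts = {}
    for char, group in groupby(string):
        counts.setdefault(char, []).append(sum(1 for _ in group))
    return counts


def runs_regex(string):
    counts = {}
    for char, rest in re.findall(r'(.)(\1*)', string):
        counts.setdefault(char, []).append(len(rest) + 1)
    return counts


print(runs_groupby("xxxxyyxyxx"))  # {'x': [4, 1, 2], 'y': [2, 1]}
print(runs_regex("xxxxyyxyxx"))    # {'x': [4, 1, 2], 'y': [2, 1]}

# The regex version silently drops the newline run that groupby() reports.
print(runs_groupby("a\n\nb"))  # {'a': [1], '\n': [2], 'b': [1]}
print(runs_regex("a\n\nb"))    # {'a': [1], 'b': [1]}
```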
Answer 1 (score: 2)
You are not re-initialising the value of total between sequences, so it keeps on counting.
def count_squences(string):
    i = 0
    total = 0
    total_char_list = []
    while i < len(string):
        if string[i] == "x":
            total += 1
        if string[i] == "y":
            if total != 0:
                total_char_list.append(total)
                total = 0
        i = i + 1
    if total != 0:
        total_char_list.append(total)
    return total_char_list
Update (17:00): having fixed the original program, I thought of a better solution:
import re
my_str = "xxxxyyxyxx"
# "if z" skips the empty strings that a leading or trailing y would produce
[len(z) for z in re.split("y+", my_str) if z]
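One thing to watch with re.split() is how it behaves at the ends of the string: a leading or trailing y produces an empty piece, which would show up as a 0 in the result unless it is filtered out. The sketch below illustrates this; x_run_lengths is a hypothetical helper name, not part of the answer.

```python
import re

print(re.split("y+", "xxxxyyxyxx"))  # ['xxxx', 'x', 'xx']
print(re.split("y+", "yxxyyx"))      # ['', 'xx', 'x']
print(re.split("y+", "xxy"))         # ['xx', '']


def x_run_lengths(string):
    # Dropping empty pieces keeps a leading/trailing 'y' from adding a 0
    return [len(piece) for piece in re.split("y+", string) if piece]


print(x_run_lengths("yxxyyx"))  # [2, 1]
```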
Answer 2 (score: -1)
Edited to put it in function form:
import re

def count_sequences(string):
    # x+ matches each maximal run of consecutive x characters
    return [len(x) for x in re.findall(r"x+", string)]

count_sequences("xxxxyyxyxx")
returns [4, 1, 2]