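I am trying to write a function that finds the zero-based index of the longest run in a string. If several runs have the same length, the code should return the index of the first one.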
a = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]

def longestrun(myList):
    result = None
    prev = None
    size = 0
    max_size = 0
    for i in myList:
        if i == prev:
            print(i)
            size += 1
            if size > max_size:
                print('******* ' + str(max_size))
                max_size = size
        else:
            size = 0
        prev = i
    print(max_size + 1)
    return max_size + 1

longestrun(a)
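I did some research and found this code, which I think can be used to find the length of the longest run in my list, but I don't know how to use it to find the index of the first letter of the longest run. Can anyone help me or give me some advice on how to do this? When the program runs, the output should be the number 6, because the first 'd', at index 6, starts the longest run.
Please note that I am a beginner, so I would appreciate answers that are as simple as possible and clearly explained.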
Answer 0 (score: 2)
This should work:
def longestrun(myList):
    prev = None
    size = 0
    max_size = 0
    curr_pos = 0
    max_pos = 0
    for (index, i) in enumerate(myList):
        if i == prev:
            size += 1
            if size > max_size:
                max_size = size
                max_pos = curr_pos
        else:
            size = 0
            curr_pos = index
        prev = i
    return max_pos
Answer 1 (score: 1)
If you want the starting index of the longest run:
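from collections import defaultdict
from operator import itemgetter

def longest(l):
    od = defaultdict(int)
    prev = None
    out = []
    for ind, ele in enumerate(l):
        if ele != prev and prev in od:
            # a run just ended: store (end index, key, length)
            out.append((ind, prev, od[prev]))
            od[prev] = 0
        od[ele] += 1
        prev = ele
    if prev in od:  # the loop never closes the final run, so record it here
        out.append((len(l), prev, od[prev]))
    best = max(out, key=itemgetter(2))  # max by sequence length
    return best[0] - best[2]  # end index minus length gives the start

print(longest(a))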
I store all the keys and lengths in case you actually want all of that information.
Without imports:
def longest1(l):
    prev = None
    seq = 0
    best = 0
    indx = None
    for ind, ele in enumerate(l):
        if ele != prev:  # if we have a new char we have a new sequence
            # if current seq len is greater than our current best
            if seq > best:
                # update best to current len and set index to start of the sequence
                best = seq
                indx = ind - seq
            seq = 0  # reset seq count
        seq += 1
        prev = ele
    if seq > best:  # the final run is never compared inside the loop
        indx = len(l) - seq
    return indx

print(longest1(a))
Some timings show that the simple loop is actually the most efficient:
In [23]: timeit longestrun_index(a)
100000 loops, best of 3: 9.07 µs per loop
In [24]: timeit longestrun(a)
100000 loops, best of 3: 2.54 µs per loop
In [25]: timeit longest(a)
100000 loops, best of 3: 6.79 µs per loop
In [26]: timeit longest1(a)
100000 loops, best of 3: 3.06 µs per loop
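For anyone who wants to repeat this kind of comparison outside IPython, here is a minimal sketch using the standard `timeit` module. The two functions, `loop_index` and `groupby_index`, are hypothetical stand-ins written for this illustration, not the functions timed above, and the numbers will differ from machine to machine.

```python
import timeit
from itertools import groupby

a = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]

def loop_index(seq):
    """Plain loop: start index of the first longest run (hypothetical helper)."""
    best_start, best_len = None, 0
    start, prev = 0, object()  # sentinel that compares unequal to every element
    for i, item in enumerate(seq):
        if item != prev:       # a new run begins here
            start, prev = i, item
        if i - start + 1 > best_len:  # strict '>' keeps the earliest run on ties
            best_start, best_len = start, i - start + 1
    return best_start

def groupby_index(seq):
    """Single pass over groupby runs (hypothetical helper)."""
    best_start, best_len, pos = None, 0, 0
    for _, group in groupby(seq):
        length = sum(1 for _ in group)
        if length > best_len:
            best_start, best_len = pos, length
        pos += length
    return best_start

number = 10000  # calls per batch; an arbitrary choice for this sketch
for func in (loop_index, groupby_index):
    # repeat() returns the total seconds for each batch; keep the fastest batch
    best = min(timeit.repeat(lambda: func(a), number=number, repeat=3))
    print("%s: %.2f us per call" % (func.__name__, best / number * 1e6))
```

Taking the minimum of several batches mirrors the "best of 3" figure that IPython reports.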
Answer 2 (score: 1)
You can use itertools.groupby() together with max() and enumerate() for this:
from itertools import groupby
from operator import itemgetter

def longestrun_index(seq):
    groups = ((next(g), sum(1 for _ in g)+1) for k, g in groupby(enumerate(seq),
                                                                  key=itemgetter(1)))
    (index, item), length = max(groups, key=itemgetter(1))
    return index

a = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]
print(longestrun_index(a))
# 6
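How does this work?
itertools.groupby combined with enumerate(a) produces groups of equal adjacent items. However, because enumerate(a) yields the index along with each item of the list a (as (index, item) tuples), we need to tell groupby to group on the item only, which I did by passing operator.itemgetter(1) as the key to groupby(). groupby() then yields two things for each group: the key that was used for grouping, and the group itself as an iterator. We call next() on that iterator to get the first (index, item) pair of the group, and then count the remaining items with sum() and a generator expression: sum(1 for _ in g)+1. The +1 compensates for the item we already pulled from the group with next().
With the index, key and count in hand, we have a generator that yields ((index, key), length) on each iteration.
Finally we use the built-in max() with itemgetter again to specify which element to compare on (length here) and pick out the required index.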
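To make the intermediate values visible, the following sketch materialises each group as a list instead of consuming it lazily. It illustrates the same idea rather than replacing the function above; the names `runs`, `members` and `start_index` are introduced here only for the example.

```python
from itertools import groupby
from operator import itemgetter

a = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]

# Collect each run as (start_index, item, length) so it can be inspected.
runs = []
for key, group in groupby(enumerate(a), key=itemgetter(1)):
    members = list(group)         # [(index, item), ...] for one run
    start_index = members[0][0]   # index of the first element of the run
    runs.append((start_index, key, len(members)))

print(runs)
# [(0, 'a', 1), (1, 'b', 2), (3, 'c', 3), (6, 'd', 4), (10, 'c', 3), (13, 'b', 2), (15, 'a', 1)]

# max() returns the first maximal element, so ties go to the earliest run.
print(max(runs, key=itemgetter(2))[0])
# 6
```

Building the lists costs some memory compared with the generator version, but it shows exactly what max() is choosing between.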
Answer 3 (score: 0)
You can use itertools.groupby to get a list of runs; then you just find the longest run and add up the lengths of all the runs before it:
from itertools import groupby

a = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]

# Get list of runs, each in the form (character, length)
runs = [(x, len(list(y))) for x, y in groupby(a)]

# Identify longest run
maxrun = max(runs, key=lambda x: x[1])

# Sum length of all runs before the max
index = 0
for run in runs:
    if run == maxrun:
        break
    index += run[1]

print(index)
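The same steps can be wrapped in a reusable function. This is only a sketch under a few assumptions: the name `longest_run_start` is made up for the example, an empty input returns None, and the position of the longest run is found with list.index(), which returns the first match and therefore keeps the earliest run on ties.

```python
from itertools import groupby

def longest_run_start(seq):
    """Return the start index of the first longest run, or None if seq is empty."""
    # Length of every run, in order of appearance
    lengths = [len(list(group)) for _, group in groupby(seq)]
    if not lengths:
        return None
    # index() finds the first run with the maximum length
    longest = lengths.index(max(lengths))
    # The start index is the combined length of all earlier runs
    return sum(lengths[:longest])

a = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]
print(longest_run_start(a))
# 6
```

Because groupby accepts any iterable, the function also works directly on a string such as "aabbb".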
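Answer 4 (score: -1)
Use a defaultdict to build a dictionary holding a count of each item, then find the key with the highest value, then find the first occurrence of that item.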
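from collections import defaultdict
import operator

letters = ["a","b","b","c","c","c","d","d","d","d","c","c","c","b","b","a"]
d = defaultdict(int)
for letter in letters:
    d[letter] += 1  # total occurrences of each letter, not consecutive runs
# items() works on both Python 2 and 3 (iteritems() is Python 2 only)
highest_run = max(d.items(), key=operator.itemgetter(1))[0]
z_index = ''.join(letters).find(highest_run)
# For this list 'c' is the most frequent letter overall, so this prints 3,
# not the 6 that the question asks for.
print(z_index)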
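The benefits of using modules are simplicity and development efficiency, plus the "standing on the shoulders of giants" effect of relying on well-maintained, well-tested code. That is not to say you shouldn't take care, when using a module, to check that it really is well maintained and unit tested.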