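I want to find the index of a letter in a string, subject to a condition: I want the index of the letter g only if all parentheses before it are complete (every opening parenthesis has been closed).
This is what I have: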
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
This is what I did:
import re

lst = [(i.end()) for i in re.finditer('g', sen)]
# lst
# [7, 16, 20, 29, 32, 36, 40]
count_open = 0
count_close = 0
for i in lst:
    sent = sen[0:i]
    for w in sent:
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1
        if count_open == count_close and count_open != 0:
            c = i - 1
            break
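It gives me c as 39, which is the last index, but I think the correct answer should be 35, because the parentheses before the second-to-last g are complete.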
Answer 0: (score: 3)
You can skip the regex and just use a stack to keep track of whether your parens are balanced as you iterate over the characters:
In [4]: def find_balanced_gs(sen):
   ...:     stack = []
   ...:     for i, c in enumerate(sen):
   ...:         if c == "(":
   ...:             stack.append(c)
   ...:         elif c == ")":
   ...:             stack.pop()
   ...:         elif c == 'g':
   ...:             if len(stack) == 0:
   ...:                 yield i
   ...:
In [5]: list(find_balanced_gs(sen))
Out[5]: [31, 35, 39]
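Using a stack is the "classic" way to check for balanced parens. It's been a while since I implemented this from scratch, so there may be some edge cases I haven't considered, but it should be a good start. I wrote it as a generator, but you could make it a normal function that returns a list of indices, the first index, or the last index.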
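One edge case the generator does not handle is a stray closing paren: `stack.pop()` on an empty list raises `IndexError`. Here is a minimal sketch of a more defensive variant that returns a list. The name `find_balanced_indices`, the `letter` parameter, and the policy of stopping at a stray `)` are assumptions made for illustration, not part of the original answer. Since there is only one bracket type, an integer depth counter stands in for the stack.

```python
def find_balanced_indices(sen, letter="g"):
    """Return indices of `letter` that occur while no parenthesis is open.

    Hypothetical helper: a stray ")" is treated as making the rest of the
    string unbalanced, so scanning stops there (an assumed policy).
    """
    depth = 0   # number of currently open parentheses
    found = []
    for i, c in enumerate(sen):
        if c == "(":
            depth += 1
        elif c == ")":
            if depth == 0:
                break  # stray closing paren: stop instead of raising
            depth -= 1
        elif c == letter and depth == 0:
            found.append(i)
    return found


sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
print(find_balanced_indices(sen))     # [31, 35, 39]
print(find_balanced_indices("g)g"))   # [0]
```

If you would rather treat a stray `)` as an error, replace the `break` with a `raise ValueError(...)`.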
Answer 1: (score: 1)
Keeping your idea, only a couple of things were missing; see the comments:
import re

sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'
lst = [(i.end()) for i in re.finditer('g', sen)]
# lst
# [7, 16, 20, 29, 32, 36, 40]
for i in lst:
    # You have to reset the count for every i
    count_open = 0
    count_close = 0
    sent = sen[0:i]
    for w in sent:
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1
    # And iterate over all of sent before comparing the counts
    if count_open == count_close and count_open != 0:
        c = i - 1
        break
print(c)
# 31 - actually the right answer, not 35
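However, this is not very efficient, because you iterate over the same parts of the string multiple times. You can make it more efficient by iterating over the string only once: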
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'

def find(letter, string):
    count_open = 0
    count_close = 0
    # Iterate over the argument `string`, not the global `sen`
    for (index, char) in enumerate(string):
        if char == '(':
            count_open += 1
        elif char == ')':
            count_close += 1
        elif char == letter and count_close == count_open and count_open > 0:
            return index
    else:
        # for/else: reached only if the loop finished without returning
        raise ValueError('letter not found')

find('g', sen)
# 31
find('a', sen)
# ...
# ValueError: letter not found
Answer 2: (score: 1)
import re

out = []     # collect the indices of all valid 'g'
ocount = 0   # only store the difference between open and closed
for m in re.finditer(r'[()g]', sen):        # use re to preselect
    L = m.group()
    ocount += {'(': 1, ')': -1, 'g': 0}[L]  # save a bit of typing
    assert ocount >= 0                      # enforce some grammar if you like
    if L == 'g' and ocount == 0:
        out.append(m.start())
out
# [31, 35, 39]
Answer 3: (score: 1)
Here is a simpler take on the code in the OP (it also keeps the condition count_open != 0):
def get_idx(f, sen):
    idx = []
    count_open = 0
    count_close = 0
    for i, w in enumerate(sen):
        if w == '(':
            count_open += 1
        if w == ')':
            count_close += 1
        # Record f only while the counts match and at least one paren was seen
        if count_open == count_close and count_open != 0:
            if w == f:
                idx.append(i)
    return idx

get_idx('g', sen)
Output:
[31, 35, 39]
Answer 4: (score: -1)
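You can use .index() to find the index of a substring within a string, or of an element within a list.
Call stringvar.index(string), and it will give you the offset (index) of that string.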
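For reference, here is a short sketch of what `.index()` does on the string from the question. It returns only the first match and knows nothing about parentheses, so on its own it does not answer the question. The variable names `first` and `second` are just for illustration.

```python
sen = 'abcd(fgji(l)jkpg((jgsdti))khgy)ghyig(a)gh'

first = sen.index('g')              # first occurrence only
print(first)                        # 6

# An optional start argument resumes the search from a given position.
second = sen.index('g', first + 1)
print(second)                       # 15

# str.index raises ValueError on a miss, whereas str.find returns -1.
try:
    sen.index('z')
except ValueError:
    print('not found')
```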