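I have a list of strings (see below). I want to extract elements from the list by locating two specific tokens (a start and an end) and saving all the strings that lie between them.
For example, given the list below, I want all the strings that occur between 'RATED' and 'Like'. Such subsequences may occur more than once.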
['RATED',
' Awesome food at a good price .',
'Delivery was very quick even on New Year\xe2\x80\x99s Eve .',
'Please try crispy corn and veg noodles From this place .',
'Taste maintained .',
'Like',
'1',
'Comment',
'0',
'Share',
'Divyansh Agarwal',
'1 Review',
'Follow',
'3 days ago',
'RATED',
' I have tried schezwan noodles and the momos with kitkat shake',
"And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone",
'Like']
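I have tried other approaches, such as regular expressions, but none of them solved the problem.
Answer 0 (score: 2)
You can use a regular expression. First, join the list using a delimiter that does not appear anywhere in the text: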
delimiter = "#$#"
bigString = delimiter + delimiter.join(yourList) + delimiter
Then extract the groups with a regular expression (this needs the re module):
import re
results = re.findall(r'#\$#RATED#\$#(.*?)#\$#Like#\$#', bigString)
Finally, iterate over the results and split each string on the delimiter to recover the individual list elements:
for s in results:
    print(s.split(delimiter))
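The three steps above can be combined into one function. This is only a sketch: the name groups_via_regex and the short sample list are made up for illustration, and it assumes the chosen delimiter never occurs inside any element (it raises an error otherwise). re.escape is used so that the delimiter and tokens are matched literally.

```python
import re

def groups_via_regex(items, start, end, delimiter="#$#"):
    """Join items with a delimiter, then pull out the spans between start and end."""
    # Assumption: the delimiter must not occur inside any item.
    if any(delimiter in item for item in items):
        raise ValueError("delimiter occurs in the data; pick another one")
    d = re.escape(delimiter)
    big_string = delimiter + delimiter.join(items) + delimiter
    pattern = d + re.escape(start) + d + "(.*?)" + d + re.escape(end) + d
    # re.DOTALL lets '.' also match newlines that may occur inside items.
    matches = re.findall(pattern, big_string, flags=re.DOTALL)
    return [m.split(delimiter) for m in matches]

# Hypothetical, shortened sample data
reviews = ["RATED", "good food", "quick delivery", "Like", "1", "Comment",
           "RATED", "best noodles", "Like"]
print(groups_via_regex(reviews, "RATED", "Like"))
# [['good food', 'quick delivery'], ['best noodles']]
```

Requiring the delimiter on both sides of each token means only whole elements equal to 'RATED' or 'Like' are treated as markers.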
Answer 1 (score: 2)
I suggest reading up on index lookup and slicing for sequence types.
Example:
def group_between(lst, start_token, end_token):
    while lst:
        try:
            # find opening token
            start_idx = lst.index(start_token) + 1
            # find closing token
            end_idx = lst.index(end_token, start_idx)
            # output sublist
            yield lst[start_idx:end_idx]
            # continue with the remaining items
            lst = lst[end_idx+1:]
        except ValueError:
            # begin or end not found, just skip the rest
            break
l = ['RATED',' Awesome food at a good price .', 'Delivery was very quick even on New Year’s Eve .', 'Please try crispy corn and veg noodles From this place .', 'Taste maintained .', 'Like',
'1', 'Comment', '0', 'Share', 'Divyansh Agarwal', '1 Review', 'Follow', '3 days ago',
'RATED', ' I have tried schezwan noodles and the momos with kitkat shake', "And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone", 'Like'
]
for i in group_between(l, 'RATED', 'Like'):
    print(i)
The output is:
[' Awesome food at a good price .', 'Delivery was very quick even on New Year’s Eve .', 'Please try crispy corn and veg noodles From this place .', 'Taste maintained .']
[' I have tried schezwan noodles and the momos with kitkat shake', "And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone"]
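A possible variation on the same idea, shown only as a sketch: instead of re-slicing the remaining list after every group, keep a running position and pass it as the start argument of list.index. The function name iter_groups and the short sample list are invented for this illustration.

```python
def iter_groups(items, start_token, end_token):
    """Yield each sublist found strictly between start_token and end_token."""
    pos = 0
    while True:
        try:
            # list.index(value, start) searches from position `start` onwards
            start_idx = items.index(start_token, pos) + 1
            end_idx = items.index(end_token, start_idx)
        except ValueError:
            # no further complete start/end pair
            return
        yield items[start_idx:end_idx]
        pos = end_idx + 1

# Hypothetical, shortened sample data
sample = ["RATED", "a", "b", "Like", "1", "Share", "RATED", "c", "Like"]
for group in iter_groups(sample, "RATED", "Like"):
    print(group)
# ['a', 'b']
# ['c']
```

Only the yielded sublists are copied here, which may matter for very long lists; for short lists either version should do.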
Answer 2 (score: 1)
You could try, for example:
rec = False
result = []
for s in lst:
    if s == 'Like':
        rec = False
    if rec:
        result.append(s)
    if s == 'RATED':
        rec = True
Result (all groups end up merged into one flat list):
#[' Awesome food at a good price .',
# 'Delivery was very quick even on New Year’s Eve .',
# 'Please try crispy corn and veg noodles From this place .',
# 'Taste maintained .',
# ' I have tried schezwan noodles and the momos with kitkat shake',
# "And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone"]
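If each review should stay in its own sublist, the same single pass can collect groups instead of one flat list. The following is a sketch under that assumption; collect_groups and the sample data are names made up here, not part of the answer above.

```python
def collect_groups(items, start_token="RATED", end_token="Like"):
    """Single pass over items; returns one sublist per start/end pair."""
    groups = []
    current = None  # None means "not currently recording"
    for s in items:
        if s == start_token:
            # a repeated start token restarts the current group
            current = []
        elif s == end_token:
            if current is not None:
                groups.append(current)
                current = None
        elif current is not None:
            current.append(s)
    return groups

# Hypothetical, shortened sample data
sample = ["RATED", "a", "b", "Like", "1", "Share", "RATED", "c", "Like"]
print(collect_groups(sample))
# [['a', 'b'], ['c']]
```

Using None for "not recording" replaces the separate boolean flag, and an end token with no preceding start token is simply ignored.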
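Answer 3 (score: 1)
def find_between(old_list, first_word, last_word):
    new_list = []
    flag = False
    for i in old_list:
        # compare by value with ==; "is" tests object identity and is unreliable for strings
        if i == last_word:
            # stops at the first end token, so only the first group is returned
            break
        if i == first_word:
            flag = True
            continue
        if flag:
            new_list.append(i)
    return new_list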
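One detail worth illustrating for this kind of function: string tokens should be compared by value (==), not by identity (is). Strings that arrive at runtime, for example from a scraped page or a file, are usually distinct objects from the literals in the code even when the characters match. A small demonstration with made-up data:

```python
# Elements created at runtime, as they would be when read from a page or file
raw = "RATED|great momos|Like"
items = raw.split("|")

token = "RATED"
print(items[0] == token)  # True: same characters
print(items[0] is token)  # identity check; not guaranteed to be True

# Value comparison finds the token reliably
matches = [i for i, s in enumerate(items) if s == token]
print(matches)  # [0]
```

Whether the identity check happens to succeed depends on interpreter details such as string interning, so it should not be relied on.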
Answer 4 (score: 1)
This can be done with a regular expression.
a= ['RATED',' Awesome food at a good price .',
'Delivery was very quick even on New Year’s Eve .',
'Please try crispy corn and veg noodles From this place .',
'Taste maintained .', 'Like', '1', 'Comment', '0',
'Share', 'Divyansh Agarwal', '1 Review', 'Follow',
'3 days ago', 'RATED',
' I have tried schezwan noodles and the momos with kitkat shake', "And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone",
'Like']
import re
string = ' '.join(a)
b = re.compile(r'(?<=RATED).*?(?=Like)').findall(string)
print(b)
Output (each group comes back as a single space-joined string, not as a list of elements):
[' Awesome food at a good price . Delivery was very quick even on New Year’s Eve . Please try crispy corn and veg noodles From this place . Taste maintained . ',
" I have tried schezwan noodles and the momos with kitkat shake And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone "]
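A caveat with joining on spaces, sketched with an invented three-element list: the pattern then matches 'RATED' and 'Like' anywhere in the text, including inside a review, so a review containing the word "Like" cuts the match short. Comparing whole elements does not have this problem.

```python
import re

# Hypothetical data: the review itself contains the word 'Like'
items = ["RATED", "I Like the noodles", "Like"]

# Space-joined approach: the embedded 'Like' ends the match early.
joined = " ".join(items)
naive = re.findall(r"(?<=RATED).*?(?=Like)", joined)
print(naive)  # [' I ']

# Whole-element comparison is not fooled by the embedded word.
start = items.index("RATED") + 1
end = items.index("Like", start)
print(items[start:end])  # ['I Like the noodles']
```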
Answer 5 (score: 1)
An option without a flag:
new_list = []
group = []  # not needed if the list starts with 'RATED'
for i in your_list:
    if i == 'RATED':
        group = []
    elif i == 'Like':
        new_list.append(group[:])
    else:
        group.append(i)
Answer 6 (score: 1)
Using a for loop:
l = ['RATED',' Awesome food at a good price .', 'Delivery was very quick even on New Year’s Eve .', 'Please try crispy corn and veg noodles From this place .', 'Taste maintained .', 'Like',
'1', 'Comment', '0', 'Share', 'Divyansh Agarwal', '1 Review', 'Follow', '3 days ago',
'RATED', ' I have tried schezwan noodles and the momos with kitkat shake', "And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone", 'Like'
]
st, ed, aa = None, None, []
for k, v in enumerate(l):
    if v == "RATED":
        st = k
    if v == "Like":
        ed = k
    if st is not None and ed is not None:
        aa.extend(l[st+1:ed])
        st = None
        ed = None
print(aa)
# [' Awesome food at a good price .', 'Delivery was very quick even on New Year’s Eve .', 'Please try crispy corn and veg noodles From this place .', 'Taste maintained .', ' I have tried schezwan noodles and the momos with kitkat shake', "And I would say just one word it's best for the best reasonable rates.... Gotta recommend it to everyone"]