I have a list of lines from a text file that looks like this:
page_text_list = ['.............', '.............','name: bill','name: bob','address: 123 main st','name : tim','address: 124' ,'main st','name:', '.......']
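If I find "name:" in a string, I want to read ahead to get the address for that name. However, as you can see, the pattern is inconsistent, and I can't always assume that the next line contains the full address.
I would like to use a simple loop over the list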
for line in page_text_list:
but that doesn't seem to be enough for the job. What is the best way to do this?
Answer 0 (score: 1)
Assuming you want the list of all lines after the name: ... line, up to the next name: ... line, you can do this:
from itertools import dropwhile, takewhile

page_text_list = ['.............', '.............','name: bill','name: bob','address: 123 main st','name: tim','address: 124' ,'main st','name:', '.......']

def get_address(name):
    # we drop all the lines that aren't 'name: bob'
    it = dropwhile(lambda line: line != "name: " + name, page_text_list)
    try:
        next(it)  # we drop the 'name: bob' line
    except StopIteration:  # if the name wasn't found, we exhausted the iterator
        pass
    # we return all the following lines, while they don't contain 'name:'
    return list(takewhile(lambda line: "name:" not in line, it))
Output:
print(get_address('bill')) # no address
# []
print(get_address('dude')) # not in our list
# []
print('\n'.join(get_address('tim')))
# address: 124
# main st
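Note that get_address compares each line against the exact string "name: " + name, so it would not find the 'name : tim' entry (with a space before the colon) from the list in the question. If you need every name at once, a single pass can group the lines instead. The following is only a sketch under some assumptions: group_by_name and NAME_RE are hypothetical names that are not part of the answer above, whitespace around the colon is treated as optional, and a bare 'name:' line is taken to end the current record.

```python
import re

# Hypothetical pattern: 'name', optional spaces, a colon, then the name itself.
NAME_RE = re.compile(r"^name\s*:\s*(.*)$")

def group_by_name(lines):
    """Map each non-empty name to the lines that follow it, up to the next name line."""
    records = {}
    current = None  # list being filled for the most recent name, or None
    for line in lines:
        match = NAME_RE.match(line)
        if match:
            name = match.group(1).strip()
            # Assumption: a bare 'name:' line carries no name, so nothing is collected after it.
            current = records.setdefault(name, []) if name else None
        elif current is not None:
            current.append(line)
    return records

page_text_list = ['.............', '.............', 'name: bill', 'name: bob',
                  'address: 123 main st', 'name : tim', 'address: 124',
                  'main st', 'name:', '.......']
records = group_by_name(page_text_list)
print(records['tim'])
```

A dictionary keeps the lookup simple, at the cost of merging records if the same name appears twice.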
Answer 1 (score: 0)
Iterate over the list by index using range, like this:
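def do_lookahead(list_rest):
    # return the first address line found before the next name line, else None
    for line in list_rest:
        if line.startswith('name'):
            return None
        if line.startswith('address'):
            return line
    return None

# the function must be defined before the loop that calls it
for index in range(len(page_text_list)):
    if page_text_list[index].startswith('name'):
        print(page_text_list[index], '->', do_lookahead(page_text_list[index+1:]))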
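The lookahead above returns a single line, so an address that spills onto the following line (as with 'address: 124' followed by 'main st') comes back incomplete. One possible variation joins the continuation lines. This is a sketch, not part of the answer: full_address is a hypothetical helper, and it assumes that any line after an address line, up to the next name line, continues that address.

```python
def full_address(lines, start):
    """Join the address lines that follow index `start`, stopping at the next name line."""
    parts = []
    for line in lines[start + 1:]:
        if line.startswith('name'):
            break
        if line.startswith('address'):
            # Keep only the text after the first colon.
            parts.append(line.partition(':')[2].strip())
        elif parts:
            # Assumption: a line right after an address line continues that address.
            parts.append(line.strip())
    return ' '.join(parts)

page_text_list = ['.............', '.............', 'name: bill', 'name: bob',
                  'address: 123 main st', 'name : tim', 'address: 124',
                  'main st', 'name:', '.......']

for index, line in enumerate(page_text_list):
    if line.startswith('name'):
        print(repr(line), '->', repr(full_address(page_text_list, index)))
```

Names without an address yield an empty string here; returning None instead would work equally well if the caller needs to tell the two cases apart.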