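I have code that prints the word following a keyword found in a line, and that part works well.
But I also need to go on and print the next line that contains that previously printed word.
I tried to code this, but it does not produce the expected output.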
qwer.txt contains, line by line:
/n i also have a appartment and./n
/nappartment good by the way/n
/nby good fine/n
/nfine is life/n
/nI have a bike/n
My code:
for i in cursor.fetchall():
    keywords.append(i[0])
with open('qwer.txt','r') as file:
    for line in file:
        for key in keywords:
            if key in line:
                line = line.split(" ")
                print line[line.index(key) + 1]
                line1 = line[line.index(key) + 1]
                if line1 in line:
                    print line
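Expected output: the first line contains 'appartment', which also appears in the second line. The second line contains 'good', which also appears in line 3, and line 3 contains 'fine', which also appears in line 4. Line 4 shares no word with line 5, so line 5 should not be printed.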
Answer 0 (score: 0)
You can tee the iterable, then compare the lines pairwise to check whether adjacent lines have any words in common, for example:
from itertools import tee, izip
import re

def grouper(iterable):
    fst, snd = tee(iterable)
    next(snd, '')
    for prev, curr in izip(fst, snd):
        if set(re.findall(r'\w+', prev)).intersection(re.findall(r'\w+', curr)):
            yield prev

with open('qwer.txt') as fin:
    print list(grouper(fin))
# ['/n i also have a appartment and./n\n', '/nappartment good by the way/n\n', '/nby good fine/n\n', '/nfine is life/n\n']
Then adapt it to whatever specific output you need.
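One possible adaptation, written for Python 3 as a rough sketch rather than a definitive solution. It assumes the `/n` markers shown in qwer.txt are literal characters in the file. If they are, `\w+` extracts an `n` token from every line, so a pairwise intersection check would match every adjacent pair, which is why the sketch strips them first. The helper names `words` and `chain` are made up for illustration, and the sample lines are inlined instead of being read from the file. The sketch also lowercases words and always emits the first line, both of which are assumptions about what is wanted.

```python
import re


def words(line):
    """Return the set of lowercased words in a line.

    Assumption: '/n' in the sample file is a literal marker, not a newline,
    so it is replaced by a space before the words are extracted.
    """
    return set(re.findall(r"\w+", line.replace("/n", " ").lower()))


def chain(lines):
    """Yield lines from the start for as long as each line shares at least
    one word with the line immediately before it."""
    prev = None
    for line in lines:
        current = words(line)
        if prev is not None and not (prev & current):
            break  # no common word with the previous line: the chain ends
        yield line
        prev = current


# Sample data copied from the question (normally read from qwer.txt).
sample = [
    "/n i also have a appartment and./n",
    "/nappartment good by the way/n",
    "/nby good fine/n",
    "/nfine is life/n",
    "/nI have a bike/n",
]

result = list(chain(sample))
for line in result:
    print(line)
```

With the sample data this prints the first four lines and stops before "I have a bike", because that line has no word in common with "fine is life". To read from the file instead, pass the open file object to `chain` in place of `sample`.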