For example:
a = ['The', 'man', 'is', 'eating', 'pear']
Here eating and pear are consecutive.
b = ['these', 'are', 'random', 'words', 'but', 'they', 'have', 'pear', 'and', 'eating']
This is a random list of words. I want to check whether two CONSECUTIVE words from a are both words in b.
How can I build a list like c = ['eating', 'pear']?
Answer 0: (score: 5)
c = [(x,y) for x, y in zip(a[0:], a[1:]) if x in b and y in b]
print(c)
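Note that the comprehension above produces a list of tuples, [('eating', 'pear')], rather than the flat list shown in the question. If a flat list is what you need, one possible sketch is below; the names set_b and pairs are my own, and converting b to a set is only an optional lookup optimization.

```python
a = ['The', 'man', 'is', 'eating', 'pear']
b = ['these', 'are', 'random', 'words', 'but', 'they', 'have', 'pear', 'and', 'eating']

# set_b is a hypothetical helper name; a set makes each membership test O(1) on average
set_b = set(b)

# pair every word with its successor and keep the pairs whose two words are both in b
pairs = [(x, y) for x, y in zip(a, a[1:]) if x in set_b and y in set_b]

# flatten the list of pairs into a single list of words
c = [word for pair in pairs for word in pair]

print(pairs)  # [('eating', 'pear')]
print(c)      # ['eating', 'pear']
```

If several matching pairs overlap, the flattened list will contain repeated words.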
Answer 1: (score: 1)
A loop would work:
a = ['The', 'man', 'is', 'eating', 'pear', "these", "words", "mean", "nothing", "but", "words"]
b = ['these', 'are', 'random', 'words', 'but', 'they', 'have', 'pear', 'and', 'eating']
#make b a set to improve lookup times
set_b = set(b)
#list for the words found
consec = []
for i, item in enumerate(a[:-1]):
    #check consecutive words
    if item in set_b and a[i + 1] in set_b:
        #append pair if both words are in b
        consec.extend(a[i:i + 2])
#remove double entries by converting the list to a set
print(set(consec))
#output is a set of words
#{'pear', 'words', 'eating', 'these', 'but'}
If the word order of a should be preserved, you can do the following:
a = ['The', 'man', 'is', 'eating', 'pear', "these", "mean", "nothing", "but", "words"]
b = ['these', 'are', 'random', 'words', 'but', 'they', 'have', 'pear', 'and', 'eating']
set_b = set(b)
consec = []
for i, item in enumerate(a[:-1]):
    if item in set_b and a[i + 1] in set_b:
        #first word already in list?
        if item in consec:
            #include only second word
            consec.append(a[i + 1])
        else:
            #add the pair of words
            consec.extend(a[i:i + 2])
print(consec)
#output
#['eating', 'pear', 'these', 'but', 'words']
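As a possible alternative for the order-preserving case, you could collect every matching pair first and then drop duplicates with dict.fromkeys, which keeps the first occurrence of each key in insertion order (guaranteed from Python 3.7 on). This is only a sketch; unique_in_order is a name I made up. Unlike the loop above, it also removes a word that appears again in a later pair.

```python
a = ['The', 'man', 'is', 'eating', 'pear', "these", "mean", "nothing", "but", "words"]
b = ['these', 'are', 'random', 'words', 'but', 'they', 'have', 'pear', 'and', 'eating']

set_b = set(b)
consec = []
# walk over adjacent pairs of a and collect both words of each matching pair
for first, second in zip(a, a[1:]):
    if first in set_b and second in set_b:
        consec.extend((first, second))

# dict keys are unique and keep insertion order, so this removes
# repeated words while preserving the order in which they first appeared
unique_in_order = list(dict.fromkeys(consec))

print(unique_in_order)
# ['eating', 'pear', 'these', 'but', 'words']
```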