I have the following list of lists.
mycookbook= [["i", "love", "tim", "tam", "and", "ice", "cream"], ["cooking",
"fresh", "vegetables", "is", "easy"], ["fresh", "vegetables", "are", "good",
"for", "health"]]
I also have the following list.
mylist = ["tim tam", "ice cream", "fresh vegetables"]
Now I want to find the consecutive words listed in mylist and combine them, updating mycookbook.
This is what I am currently doing.
for sentence in mycookbook:
    for sub in sentence:
        # compare strings with ==, not is
        if sub == (mylist[0].split(" ")[0]):
But I don't know how to check the next word, because there is no next() command. Please help me.
Answer 0 (score: 0)
You want to iterate over indices, looking ahead as far as necessary at each step. So, something like this:
new_sentence = []
index = 0
while index < len(sentence):
    for word in mylist:
        wordlist = word.split()
        # Take the first `len(wordlist)` elements from `index` and see if they match
        if sentence[index:][:len(wordlist)] == wordlist:
            new_sentence.append(word)
            index += len(wordlist)
            break
    else:
        # for/else: runs only when no entry of mylist matched at this index
        new_sentence.append(sentence[index])
        index += 1
You can try it here: Try it Online!
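The fragment above handles a single `sentence`. Below is a minimal sketch of how it might be wrapped so that it runs over every sentence in `mycookbook`. The function name `merge_phrases` and the guard against empty phrases are assumptions added for illustration; they are not part of the original answer.

```python
def merge_phrases(sentence, phrases):
    """Return a new list in which runs of words matching a phrase are merged."""
    new_sentence = []
    index = 0
    while index < len(sentence):
        for phrase in phrases:
            words = phrase.split()
            # Skip empty phrases (assumed guard), then compare the next
            # len(words) tokens against the phrase.
            if words and sentence[index:index + len(words)] == words:
                new_sentence.append(phrase)
                index += len(words)
                break
        else:
            # No phrase starts at this position: keep the word unchanged.
            new_sentence.append(sentence[index])
            index += 1
    return new_sentence


mycookbook = [["i", "love", "tim", "tam", "and", "ice", "cream"],
              ["cooking", "fresh", "vegetables", "is", "easy"],
              ["fresh", "vegetables", "are", "good", "for", "health"]]
mylist = ["tim tam", "ice cream", "fresh vegetables"]

new_cookbook = [merge_phrases(sentence, mylist) for sentence in mycookbook]
print(new_cookbook)
```

Because the function builds a new list, the original `mycookbook` is left untouched.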
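Answer 1 (score: 0)
You can iterate over each sentence in the original mycookbook. For each sentence, start with a pointer i at the first word.
Case 1: if sentence[i] + ' ' + sentence[i+1] is not in mylist (or i is the last index), simply add sentence[i] to the new sentence and move the pointer forward 1 step.
Case 2: if sentence[i] + ' ' + sentence[i+1] is in mylist, add it to the new sentence as a single word and move the pointer forward 2 steps.
An example follows.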
mycookbook= [["i", "love", "tim", "tam", "and", "ice", "cream"], ["cooking",
"fresh", "vegetables", "is", "easy"], ["fresh", "vegetables", "are", "good",
"for", "health"]]
mylist = ["tim tam", "ice cream", "fresh vegetables"]
mycookbook_new = []
for sentence in mycookbook:
    i = 0
    sentence_new = []
    while i < len(sentence):
        if (i == len(sentence)-1 or sentence[i] + ' ' + sentence[i+1] not in mylist):
            sentence_new.append(sentence[i])  # unchanged
            i += 1
        else:
            sentence_new.append(sentence[i] + ' ' + sentence[i+1])
            i += 2
    mycookbook_new.append(sentence_new)
print(mycookbook_new)
'''
[
['i', 'love', 'tim tam', 'and', 'ice cream'],
['cooking', 'fresh vegetables', 'is', 'easy'],
['fresh vegetables', 'are', 'good', 'for', 'health']
]
'''
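The two cases can also be packaged as a reusable function. The sketch below assumes every phrase consists of exactly two words, as in this answer; the name `merge_bigrams` and the use of a `set` for the membership test are illustrative choices, not part of the original answer.

```python
def merge_bigrams(sentence, phrases):
    """Merge adjacent word pairs that appear in phrases (two-word phrases only)."""
    phrase_set = set(phrases)  # average O(1) membership test instead of a list scan
    merged = []
    i = 0
    while i < len(sentence):
        pair = ' '.join(sentence[i:i + 2])
        if i + 1 < len(sentence) and pair in phrase_set:
            # Case 2: the pair is a known phrase, so consume both words.
            merged.append(pair)
            i += 2
        else:
            # Case 1: keep the single word and advance by one.
            merged.append(sentence[i])
            i += 1
    return merged


sentence = ["i", "love", "tim", "tam", "and", "ice", "cream"]
print(merge_bigrams(sentence, ["tim tam", "ice cream", "fresh vegetables"]))
```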
Answer 2 (score: 0)
mycookbook= [["i", "love", "tim", "tam", "and", "ice", "cream"], ["cooking",
"fresh", "vegetables", "is", "easy"], ["fresh", "vegetables", "are", "good",
"for", "health"]]
mylist = ["tim tam", "ice cream", "fresh vegetables"]
result_cookbook = []
for cb in mycookbook:
    cook_book = []
    need_continue = False
    for index, word in enumerate(cb):
        if need_continue:
            # this word was already merged into the previous entry
            need_continue = False
            continue
        if index < len(cb) - 1:
            # can combine with next word
            combine_word = "{} {}".format(cb[index], cb[index+1])
            if combine_word in mylist:
                cook_book.append(combine_word)
                need_continue = True
            else:
                cook_book.append(word)
        else:
            cook_book.append(word)
    result_cookbook.append(cook_book)
print(result_cookbook)
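Answer 3 (score: 0)
Use zip to iterate over each pair made of a word and the word that follows it. If the pair is in mylist, append it as a single string and skip the next iteration.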
out = []
for sentence in mycookbook:
    new_sentence = []
    skip = False
    for pairs in zip(sentence, sentence[1:]+['']):
        if skip:
            skip = False
            continue
        if ' '.join(pairs) in mylist:
            new_sentence.append(' '.join(pairs))
            skip = True
        else:
            new_sentence.append(pairs[0])
    out.append(new_sentence)
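To make the pairing step concrete, here is a small sketch that prints the pairs this zip call produces for one sentence. The variable name `pairs` holding the full list is only for illustration.

```python
sentence = ["cooking", "fresh", "vegetables", "is", "easy"]

# Pad the shifted copy with '' so that the last word still gets a pair.
pairs = list(zip(sentence, sentence[1:] + ['']))

for pair in pairs:
    # The joined string is what gets looked up in mylist.
    print(pair, '->', repr(' '.join(pair)))
```

The final pair joins to `'easy '` with a trailing space, so it does not match any entry of mylist and the last word is appended on its own.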
Answer 4 (score: 0)
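for sentence in mycookbook:  # note: this modifies mycookbook in place
    i = 0
    # stop at len(sentence) - 1 (not - 2) so the last pair is checked even
    # after an earlier merge has shortened the sentence
    while i < len(sentence) - 1:
        for m in mylist:
            words = m.split(' ')
            if sentence[i] == words[0]:
                # works for the two-word entries used here
                for j in range(1, len(words)):
                    if sentence[i + 1] != words[j]:
                        break
                    sentence[i] += ' ' + words[j]
                    sentence.pop(i + 1)
        i += 1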
Answer 5 (score: 0)
A more readable version, split into smaller functions. It avoids zip and range, as well as pop, append, and +=.
def as_pairs(iterable):
    """
    yields (current, next) pairs from iterable; the last item is paired
    with None so that it is not dropped
    """
    iterator = iter(iterable)
    try:
        current_item = next(iterator)
    except StopIteration:
        return
    for next_item in iterator:
        yield current_item, next_item
        current_item = next_item
    yield current_item, None

def merge_pairs(pair_words, word_list):
    """
    If the pair words are part of the word_list, merges them to one
    """
    pair_map = { tuple(pair_word.split(" ")) : pair_word for pair_word in pair_words }
    skip = False
    for pair in as_pairs(word_list):
        if skip:
            # this pair starts with the second word of a pair that was just merged
            skip = False
        elif pair in pair_map:
            yield pair_map.get(pair)
            skip = True
        else:
            first, second = pair
            yield first

def main():
    mycookbook= [
        ["i", "love", "tim", "tam", "and", "ice", "cream"],
        ["cooking", "fresh", "vegetables", "is", "easy"],
        ["fresh", "vegetables", "are", "good", "for", "health"]
    ]
    mylist = ["tim tam", "ice cream", "fresh vegetables"]
    return [ list(merge_pairs(mylist, sentence)) for sentence in mycookbook ]

print(main())
[['i', 'love', 'tim tam', 'and', 'ice cream'], ['cooking', 'fresh vegetables', 'is', 'easy'], ['fresh vegetables', 'are', 'good', 'for', 'health']]
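Answer 6 (score: 0)
Here is a solution. If you care about performance, mylist should be indexed in some way so that the match function can do better than a sequential lookup.
Bonus: entries in mylist can contain any number of words, not just two. Note the added "good for health".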
mycookbook= [["i", "love", "tim", "tam", "and", "ice", "cream"], ["cooking",
"fresh", "vegetables", "is", "easy"], ["fresh", "vegetables", "are", "good",
"for", "health"]]
mylist = ["tim tam", "ice cream", "fresh vegetables", "good for health"]
def transform(x):
    def match(i):
        # return the first entry of mylist that starts at position i, with its length in words
        for e in mylist:
            el = e.split()
            if x[i:i+len(el)] == el:
                return e, len(el)
        return x[i], 1
    i = 0
    while i < len(x):
        e, l = match(i)
        yield e
        i += l
answer = [list(transform(x)) for x in mycookbook]
print(answer)
'''
[['i', 'love', 'tim tam', 'and', 'ice cream'],
['cooking', 'fresh vegetables', 'is', 'easy'],
['fresh vegetables', 'are', 'good for health']]
'''
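As for the indexing mentioned at the start of this answer, one possible approach is to group the phrases by their first word, so that only phrases that could start at the current position are compared. This is a sketch under assumptions: the names `build_index` and `transform_indexed` are hypothetical, and trying longer phrases first is a design choice that the original code (which follows the order of mylist) does not make.

```python
from collections import defaultdict


def build_index(phrases):
    """Map the first word of each phrase to its (word list, phrase) candidates."""
    index = defaultdict(list)
    for phrase in phrases:
        words = phrase.split()
        if words:  # ignore empty phrases
            index[words[0]].append((words, phrase))
    # Assumed policy: prefer the longest phrase when several share a first word.
    for candidates in index.values():
        candidates.sort(key=lambda c: len(c[0]), reverse=True)
    return index


def transform_indexed(sentence, index):
    """Yield the words of sentence with indexed phrases merged."""
    i = 0
    while i < len(sentence):
        # Only phrases beginning with sentence[i] are compared.
        for words, phrase in index.get(sentence[i], ()):
            if sentence[i:i + len(words)] == words:
                yield phrase
                i += len(words)
                break
        else:
            yield sentence[i]
            i += 1


mycookbook = [["i", "love", "tim", "tam", "and", "ice", "cream"],
              ["cooking", "fresh", "vegetables", "is", "easy"],
              ["fresh", "vegetables", "are", "good", "for", "health"]]
mylist = ["tim tam", "ice cream", "fresh vegetables", "good for health"]

phrase_index = build_index(mylist)
answer = [list(transform_indexed(x, phrase_index)) for x in mycookbook]
print(answer)
```

The index is built once and reused for every sentence, so the cost per word depends on how many phrases share its first word rather than on the length of mylist.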