I have a list of sentences, and a second list with a few words:
sentences=['this is first', 'this is one', 'this is three','this is four','this is five','this is six']
exceptions=['one','four']
I want to iterate over the sentences and, if a sentence ends with one of the words in exceptions, concatenate it with the next sentence.
Result:
sentences2=['this is first', 'this is one this is three','this is four this is five','this is six']
I couldn't come up with any attempt that works.
I started with loops, then converted the list to an iterator:
myter = iter(sentences)
Then I tried to concatenate the sentences and append the concatenated sentence to sentences2.
All to no avail.
My last attempt was:
i=0
while True:
    try:
        if sentences[i].split(' ')[-1] in exceptions:
            newsentence = sentence[i] + sentence[i+1]
            sentences[i] = newsentence
            sentences.pop(i+1)
            i = i +1
        else:
            i=i+1
    except:
        break
print('\n-----\n'.join(sentences))
Somehow I get the impression that I'm going about this the wrong way.
Thanks.
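Answer 0 (score: 1)
You can use zip_longest from itertools to zip sentences with an offset slice of itself. That lets you do the check, concatenate when needed, and skip the next iteration after a concatenation.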
from itertools import zip_longest

sentences2 = []
skip = False
for s1, s2 in zip_longest(sentences, sentences[1:]):
    if skip:
        # s1 was already appended to the previous sentence
        skip = False
        continue
    # s2 is None for the last sentence, so there is nothing to join
    if s2 is not None and s1.split()[-1].lower() in exceptions:
        sentences2.append(f'{s1} {s2}')
        skip = True
    else:
        sentences2.append(s1)
sentences2
# returns:
['this is first', 'this is one this is three', 'this is four this is five', 'this is six']
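To make the pairing more concrete, here is a small sketch of what zip_longest yields when a list is zipped with its own offset slice. The shortened sentences list is just for illustration; the point is that each sentence is paired with its successor and the last one is paired with None.

```python
from itertools import zip_longest

sentences = ['this is first', 'this is one', 'this is three']

# Pair each sentence with the one that follows it.
# The offset slice is one element shorter, so zip_longest pads
# the final pair with None (its default fillvalue).
pairs = list(zip_longest(sentences, sentences[1:]))
print(pairs)
# [('this is first', 'this is one'), ('this is one', 'this is three'), ('this is three', None)]
```

That trailing None is why the last pair needs some care before formatting s2 into a string.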
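You also need to handle the case where several sentences in a row have to be joined. For that you can use a flag to track whether the next sentence should be joined. It's a bit messy, but here it is: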
sentences2 = []
join_next = False
candidate = None
for s in sentences:
    if join_next:
        # the previous sentence ended with an exception word
        candidate += (' ' + s)
        join_next = False
    if candidate is None:
        candidate = s
    if s.split()[-1].lower() in exceptions:
        join_next = True
        continue
    else:
        sentences2.append(candidate)
        candidate = None
# keep a trailing candidate if the last sentence ended with an exception word
if candidate is not None:
    sentences2.append(candidate)
sentences2
# returns:
['this is first',
 'this is one this is three',
 'this is four this is five',
 'this is six']
Here is an example with an extra sentence added that requires chained joining.
sentences3 = ['this is first', 'this is one', 'extra here four',
              'this is three', 'this is four', 'this is five', 'this is six']
sentences4 = []
join_next = False
candidate = None
for s in sentences3:
    if join_next:
        # the previous sentence ended with an exception word
        candidate += (' ' + s)
        join_next = False
    if candidate is None:
        candidate = s
    if s.split()[-1].lower() in exceptions:
        join_next = True
        continue
    else:
        sentences4.append(candidate)
        candidate = None
# keep a trailing candidate if the last sentence ended with an exception word
if candidate is not None:
    sentences4.append(candidate)
sentences4
# returns:
['this is first',
 'this is one extra here four this is three',
 'this is four this is five',
 'this is six']
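Answer 1 (score: 0)
You can append a newline character to each sentence that does not end with an exception word (and a space to the others), join everything into one string, and then split on the newlines: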
sentences=['this is first', 'this is one', 'this is three','this is four','this is five','this is six']
exceptions=['one','four']
result = "".join(s + "\n "[any(s.endswith(x) for x in exceptions)] for s in sentences).strip().split("\n")
print(result)
# ['this is first', 'this is one this is three', 'this is four this is five', 'this is six']
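One thing to be aware of: str.endswith matches any suffix of the string, not just a whole last word. If that matters for your data, a check on the last word could look like the sketch below. The helper name ends_with_exception_word and the sample sentence are made up for illustration.

```python
exceptions = ['one', 'four']

def ends_with_exception_word(sentence, exceptions):
    """Return True if the last whitespace-separated word is an exception word."""
    words = sentence.split()
    # Guard against empty strings, then compare the whole last word.
    return bool(words) and words[-1] in exceptions

s = 'call someone'
suffix_match = any(s.endswith(x) for x in exceptions)
word_match = ends_with_exception_word(s, exceptions)
print(suffix_match)  # True: 'someone' ends with the characters 'one'
print(word_match)    # False: the last word is 'someone', not 'one'
```

With the sentences in the question both checks give the same result, so this only matters if the real sentences can end in words like "someone".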
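Answer 2 (score: 0)
Your solution goes wrong in one case: when two sentences in a row end with an exception word. The fix is to not increment i after joining two sentences, so that the next iteration checks the last word of the combined sentence.
You also need to put a space between the sentences when you join them.
And don't use an exception to detect when you've reached the end; just bound i appropriately.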
sentences=['this is first', 'this is one', 'this is three','this is one', 'this is four','this is five','this is six']
exceptions=['one','four']
i=0
while i < len(sentences) - 1:
    if sentences[i].split(' ')[-1] in exceptions:
        newsentence = sentences[i] + " " + sentences[i+1]
        sentences[i] = newsentence
        sentences.pop(i+1)
    else:
        i=i+1
print('\n-----\n'.join(sentences))
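If you would rather not modify sentences in place, the same idea can be written as a function that builds a new list. This is only a sketch: the name merge_after_exceptions is made up, and it assumes that "ends with an exception word" means the last whitespace-separated word.

```python
def merge_after_exceptions(sentences, exceptions):
    """Return a new list in which every sentence ending with an
    exception word is joined (with a space) to the sentence after it."""
    exception_set = set(exceptions)
    merged = []
    join_next = False
    for s in sentences:
        if join_next:
            # The previous piece ended with an exception word: extend it.
            merged[-1] = merged[-1] + ' ' + s
        else:
            merged.append(s)
        # The last word of s is also the last word of the combined
        # sentence, so chains of exception words keep joining.
        words = s.split()
        join_next = bool(words) and words[-1] in exception_set
    return merged

sentences = ['this is first', 'this is one', 'this is three', 'this is one',
             'this is four', 'this is five', 'this is six']
exceptions = ['one', 'four']
result = merge_after_exceptions(sentences, exceptions)
print('\n-----\n'.join(result))
```

For the list above, result is ['this is first', 'this is one this is three', 'this is one this is four this is five', 'this is six'], and the original sentences list is left untouched.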