Merge list elements into a new list based on a condition

Time: 2019-06-02 19:54:52

Tags: python string list

I have a list of sentences, and a second list of words.

sentences=['this is first', 'this is one', 'this is three','this is four','this is five','this is six']
exceptions=['one','four']

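I want to iterate over the sentences and, whenever a sentence ends with one of the words in exceptions, concatenate it with the next sentence.

Expected result: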

sentences2=['this is first', 'this is one this is three','this is four this is five','this is six']

I have not managed to come up with any attempt that works.

I started with a loop, then converted the list to an iterator:

myter = iter(sentences)

Then I tried to concatenate the sentences and append each concatenated sentence to sentences2.

All to no avail.

My last attempt was:

i=0
while True:
    try:
        if sentences[i].split(' ')[-1] in exceptions:
            newsentence = sentence[i] + sentence[i+1]
            sentences[i] = newsentence
            sentences.pop(i+1)
            i = i +1
        else:
            i=i+1
    except:
        break

print('\n-----\n'.join(sentences))

Somehow I get the impression that I am approaching this the wrong way.

Thanks.

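3 answers:

Answer 0: (score: 1)

You can use zip_longest from itertools to zip sentences with a slice of itself offset by one. That lets you run the check on each sentence, concatenate it with the following one when needed, and skip the next iteration after a concatenation.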

from itertools import zip_longest

sentences2 = []
skip = False
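# Pair each sentence with its successor; the last pair is (last, None).
for s1, s2 in zip_longest(sentences, sentences[1:]):
    if skip:
        # This sentence was already merged into the previous one.
        skip = False
        continue
    # Only join when there is a next sentence to join with.
    if s2 is not None and s1.split()[-1].lower() in exceptions:
        sentences2.append(f'{s1} {s2}')
        skip = True
    else:
        sentences2.append(s1)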

sentences2
# returns:
['this is first', 'this is one this is three', 'this is four this is five', 'this is six']
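
To see what the loop iterates over, it may help to print the pairs that zip_longest produces. This is only a small sketch using the list from the question; the name `pairs` is introduced here for illustration.

```python
from itertools import zip_longest

sentences = ['this is first', 'this is one', 'this is three',
             'this is four', 'this is five', 'this is six']

# Pair each sentence with its successor. The last sentence has no successor,
# so zip_longest fills the gap with None (its default fillvalue).
pairs = list(zip_longest(sentences, sentences[1:]))

for s1, s2 in pairs:
    print(repr(s1), '->', repr(s2))
```

The final pair is ('this is six', None), so any code that joins s1 and s2 has to account for s2 being None when the last sentence ends in an exception word.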

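Edit:

You need to handle the case where several sentences in a row have to be joined. For that you can use a flag that tracks whether the next sentence should be joined to the current one. It is a bit messier, but here it is: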

sentences2 = []
join_next = False
candidate = None
for s in sentences:
    if join_next:
        candidate += (' ' + s)
        join_next = False
    if candidate is None:
        candidate = s
    if s.split()[-1].lower() in exceptions:
        join_next = True
        continue
    else:
        sentences2.append(candidate)
        candidate = None

sentences2
# returns:
['this is first',
 'this is one this is three',
 'this is four this is five',
 'this is six']

Here is an example that adds an extra sentence, so that a chained join is required.

sentences3 = ['this is first', 'this is one', 'extra here four', 
              'this is three', 'this is four', 'this is five', 'this is six']

sentences4 = []
join_next = False
candidate = None
for s in sentences3:
    if join_next:
        candidate += (' ' + s)
        join_next = False
    if candidate is None:
        candidate = s
    if s.split()[-1].lower() in exceptions:
        join_next = True
        continue
    else:
        sentences4.append(candidate)
        candidate = None

sentences4
# returns:
['this is first',
 'this is one extra here four this is three',
 'this is four this is five',
 'this is six']
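
The same chaining idea could also be wrapped in a reusable function. The following is only a sketch under stated assumptions: the name `merge_sentences` is made up for illustration, it collects pending sentences in a list instead of using a flag, and it keeps a leftover group when the input ends on an exception word, which the loop above does not do.

```python
def merge_sentences(sentences, exceptions):
    """Join each sentence ending in an exception word with what follows it."""
    exceptions = {word.lower() for word in exceptions}
    merged = []
    parts = []  # sentences collected for the entry currently being built
    for s in sentences:
        parts.append(s)
        words = s.split()
        # Keep collecting while the sentence ends in an exception word.
        if words and words[-1].lower() in exceptions:
            continue
        merged.append(' '.join(parts))
        parts = []
    if parts:
        # The input ended on an exception word; keep what was collected.
        merged.append(' '.join(parts))
    return merged


sentences3 = ['this is first', 'this is one', 'extra here four',
              'this is three', 'this is four', 'this is five', 'this is six']
print(merge_sentences(sentences3, ['one', 'four']))
# ['this is first', 'this is one extra here four this is three',
#  'this is four this is five', 'this is six']
```

Because it returns a new list, the function leaves its input unchanged and can be called on any sentence list without resetting variables first.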

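Answer 1: (score: 0)

You can append a newline character to every sentence that does not end with an exception word (and a space to the ones that do), join everything into one string, and then split it on newlines: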

sentences=['this is first', 'this is one', 'this is three','this is four','this is five','this is six']
exceptions=['one','four']

result = "".join(s + "\n "[any(s.endswith(x) for x in exceptions)] for s in sentences).strip().split("\n")
print(result)

# ['this is first', 'this is one this is three', 'this is four this is five', 'this is six']
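
The one-liner relies on indexing the two-character string "\n " with a boolean (False counts as 0, True as 1). A more explicit sketch of the same join-then-split idea follows. The helper name `separator_for` is hypothetical, and comparing the last whole word instead of calling endswith is a deliberate change, so that a sentence ending in "someone" is not treated as ending in "one". Like the original, it assumes the sentences contain no newline characters.

```python
sentences = ['this is first', 'this is one', 'this is three',
             'this is four', 'this is five', 'this is six']
exceptions = ['one', 'four']


def separator_for(sentence):
    """Return ' ' if the sentence must be joined to the next one, else a newline."""
    words = sentence.split()
    ends_with_exception = bool(words) and words[-1] in exceptions
    return ' ' if ends_with_exception else '\n'


joined = ''.join(s + separator_for(s) for s in sentences)
# Drop the trailing separator before splitting back into a list.
result = joined.strip().split('\n')
print(result)
# ['this is first', 'this is one this is three', 'this is four this is five', 'this is six']
```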

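Answer 2: (score: 0)

Your solution fails in only one case: when two sentences in a row end with an exception word. The fix is to not increment i after concatenating two sentences, so that the next iteration checks the last word of the combined sentence.

When concatenating the sentences, you also need to put a space between them.

Don't use an exception to detect when you have reached the end; just bound i appropriately.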

sentences=['this is first', 'this is one', 'this is three','this is one', 'this is four','this is five','this is six']
exceptions=['one','four']
i=0
while i < len(sentences) - 1:
    if sentences[i].split(' ')[-1] in exceptions:
        newsentence = sentences[i] + " " + sentences[i+1]
        sentences[i] = newsentence
        sentences.pop(i+1)
    else:
        i=i+1

print('\n-----\n'.join(sentences))
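
If the original list should stay untouched, the same loop can run on a copy instead. This is a minimal sketch; the function name `merge_in_copy` is invented here for illustration.

```python
def merge_in_copy(sentences, exceptions):
    """Apply the index-based merge to a copy and return it."""
    result = list(sentences)  # shallow copy; the caller's list is not modified
    i = 0
    while i < len(result) - 1:
        if result[i].split(' ')[-1] in exceptions:
            # Merge in place and stay on the same index so the combined
            # sentence is checked again on the next pass.
            result[i] = result[i] + ' ' + result.pop(i + 1)
        else:
            i += 1
    return result


sentences = ['this is first', 'this is one', 'this is three', 'this is one',
             'this is four', 'this is five', 'this is six']
print(merge_in_copy(sentences, ['one', 'four']))
# ['this is first', 'this is one this is three',
#  'this is one this is four this is five', 'this is six']
```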