Python: How to split a list on a specific element

Time: 2018-10-01 12:14:15

Tags: python list indexing

Suppose we have the following list in Python:

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

How can I split it into sublists that each end with a period? I would like to get the following lists:

["I","am","good","."]
["I","like","you","."]
["we","are","not","friends","."]

My attempt so far:

cleaned_sentence = []
a = 0
while a < len(sentence):
    current_word = sentence[a]
    if current_word == "." and len(cleaned_sentence) == 0:
        cleaned_sentence.append(sentence[0:sentence.index(".")+1])
        a += 1
    elif current_word == "." and len(cleaned_sentence) > 0:
        sub_list = sentence[sentence.index(".")+1:-1]
        sub_list.append(sentence[-1])
        cleaned_sentence.append(sub_list[0:sentence.index(".")+1])
        a += 1
    else:
        a += 1

for each in cleaned_sentence:
    print(each)

Running this on sentence produces:

['I', 'am', 'good', '.']
['I', 'like', 'you', '.']
['I', 'like', 'you', '.']

5 answers:

Answer 0: (score: 5)

You can use itertools.groupby:

from itertools import groupby
i = (list(g) for _, g in groupby(sentence, key='.'.__ne__))
print([a + b for a, b in zip(i, i)])

This outputs:

[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]

If the list does not always end with '.', you can use itertools.zip_longest instead:

from itertools import groupby, zip_longest

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends"]
i = (list(g) for _, g in groupby(sentence, key='.'.__ne__))
print([a + b for a, b in zip_longest(i, i, fillvalue=[])])

This outputs:

[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends']]
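
To see why pairing works here, it may help to look at the intermediate groups. The sketch below uses a shortened list and a made-up variable name (groups), neither of which comes from the answer. groupby produces alternating runs of words (key True) and periods (key False), and zip(i, i) pulls two consecutive runs from the same generator and joins them.

```python
from itertools import groupby

sentence = ["I", "am", "good", ".", "I", "like", "you", "."]

# ".".__ne__(x) is True for every word and False for "." itself,
# so consecutive words form one run and each period forms its own run.
groups = [(is_word, list(g)) for is_word, g in groupby(sentence, key=".".__ne__)]

for is_word, run in groups:
    print(is_word, run)
```

Note that this pairing assumes the list starts with a word and never contains two periods in a row: consecutive periods would fall into a single run, and a leading period would shift the pairs by one.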

Answer 1: (score: 1)

Use simple iteration.

Demo:

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]
last = len(sentence) - 1
result = [[]]
for i, v in enumerate(sentence):
    if v == ".":
        result[-1].append(".")
        if i != last:
            result.append([])
    else:
        result[-1].append(v)
print(result)

Output:

[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]

Answer 2: (score: 1)

sentence = ["I", "am", "good", ".", "I", "like", "you", ".", "we", "are", "not", "friends", "."]

output = []
temp = []
for item in sentence:
    temp.append(item)
    if item == '.':
        output.append(temp)
        temp = []
if temp:
    output.append(temp)

print(output)
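
If this is needed in more than one place, the same loop could be wrapped in a generator. This is only a sketch: the name split_after and its sep parameter are made up for illustration and do not come from the answer.

```python
def split_after(items, sep="."):
    """Yield sublists of items, each ending with sep (hypothetical helper)."""
    chunk = []
    for item in items:
        chunk.append(item)
        if item == sep:
            # The separator closes the current sublist.
            yield chunk
            chunk = []
    if chunk:
        # Keep trailing items that are not followed by a separator.
        yield chunk


sentence = ["I", "am", "good", ".", "I", "like", "you", ".",
            "we", "are", "not", "friends", "."]
for sub in split_after(sentence):
    print(sub)
```

Because it is a generator, the sublists are produced one at a time; wrap the call in list(...) if you need them all at once.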

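Answer 3: (score: 0)

We can do this in two stages: first compute the indices just past each period, then slice the list at those indices, for example: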

idxs = [i for i, v in enumerate(sentence, 1) if v == '.']   # calculating indices

result = [sentence[i:j] for i, j in zip([0]+idxs, idxs)]    # splitting accordingly

This yields:

>>> [sentence[i:j] for i, j in zip([0]+idxs, idxs)]
[['I', 'am', 'good', '.'], ['I', 'like', 'you', '.'], ['we', 'are', 'not', 'friends', '.']]

You can then print the individual sublists, for example with:

for sub in [sentence[i:j] for i, j in zip([0]+idxs, idxs)]:
    print(sub)

This prints:

>>> idxs = [i for i, v in enumerate(sentence, 1) if v == '.']
>>> for sub in [sentence[i:j] for i, j in zip([0]+idxs, idxs)]:
...     print(sub)
...
['I', 'am', 'good', '.']
['I', 'like', 'you', '.']
['we', 'are', 'not', 'friends', '.'] 
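
One thing to keep in mind with the slicing approach: if the list does not end with a period, the words after the last period are silently dropped, because no slice covers them. A possible variation, assuming such a trailing fragment should be kept as its own sublist (an assumption, not part of the answer above), is to add the list length as a final cut point:

```python
sentence = ["I", "am", "good", ".", "we", "are", "friends"]

# Positions just past each period (enumerate starts counting at 1).
idxs = [i for i, v in enumerate(sentence, 1) if v == "."]

# Assumption: a trailing fragment without a period should be kept,
# so add the end of the list as a final cut point when it is missing.
if sentence and (not idxs or idxs[-1] != len(sentence)):
    idxs.append(len(sentence))

result = [sentence[i:j] for i, j in zip([0] + idxs, idxs)]
print(result)
```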

Answer 4: (score: 0)

This answer aims to be the simplest one...

Data

sentences = ["I", "am", "good", ".",
            "I", "like", "you", ".",
            "We", "are", "not", "friends", "."]

We initialize the output list and set a flag saying that we are about to start a new sentence

l, start = [], 1

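We loop over the data list, using w to refer to the current word

  • If we are at the start of a new sentence, we clear the flag and append an empty list to the end of the output list
  • We append the current word to the last sublist (note that ① we are guaranteed to have at least one sublist at this point, and ② every word gets appended, including the period)
  • If we are at the end of a sentence, that is, we have just seen a ".", we raise the flag again.

Note the single comment...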

for w in sentences:
    if start: start = l.append([]) # l.append() returns None, that is falsey...
    l[-1].append(w)
    if w == ".": start = 1
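
For readers who find the start = l.append([]) trick too terse, the same flag logic can be spelled out with an explicit boolean. This is a sketch of the idea rather than the answer's own code; the names result and start_new are made up for illustration.

```python
sentences = ["I", "am", "good", ".",
             "I", "like", "you", ".",
             "We", "are", "not", "friends", "."]

result = []
start_new = True  # True means the next word begins a new sentence

for w in sentences:
    if start_new:
        # Open a new sublist and clear the flag.
        result.append([])
        start_new = False
    result[-1].append(w)
    if w == ".":
        # A period closes the sentence, so raise the flag again.
        start_new = True

print(result)
```

The behaviour is the same as the compact version: a new sublist is only created when a word actually follows a period, so no empty trailing sublist appears.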