In a list, how do I split each string (with mixed special characters) into individual words?

Time: 2018-10-16 07:56:14

Tags: python string python-3.x list nltk

Suppose I want to split the following list into individual words.

mylist = [('dog', 'camel'), ('horse'), ('List_of_people_saved_by_Oskar'), 'mouse_bear', 'lion tiger rabbit', 'ant']

This is what I have tried so far:

L1 = [animal for word in mylist for animal in word.split('_')]
print(L1)

The output should look like this:

`['dog', 'camel', 'horse', 'List', 'of', 'people', 'saved', 'by', 'Oskar', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']`

But I get an error:

AttributeError: 'tuple' object has no attribute 'split'

3 answers:

Answer 0: (score: 2)

You can use re.findall(r'[^_ ]+', word) to extract the words separated by underscores or spaces. Also add another comprehension layer to flatten any tuples of strings:

import re
L1 = [animal for item in mylist for word in (item if isinstance(item, (tuple, list)) else (item,)) for animal in re.findall(r'[^_ ]+', word)]

L1 becomes:

['dog', 'camel', 'horse', 'List', 'of', 'people', 'saved', 'by', 'Oskar', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
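
For readability, the same idea can be unrolled into a small helper. This is only a sketch: the function name flatten_words is made up for illustration and is not part of the answer above.

```python
import re

def flatten_words(items):
    """Flatten tuples/lists of strings and split each string on '_' or ' '."""
    words = []
    for item in items:
        # Treat a bare string as a one-element tuple so both cases share one loop.
        group = item if isinstance(item, (tuple, list)) else (item,)
        for text in group:
            # Each maximal run of characters other than '_' and ' ' is one word.
            words.extend(re.findall(r'[^_ ]+', text))
    return words

mylist = [('dog', 'camel'), ('horse'), ('List_of_people_saved_by_Oskar'),
          'mouse_bear', 'lion tiger rabbit', 'ant']
print(flatten_words(mylist))
```

Because findall matches the words rather than the separators, repeated delimiters such as 'a__b' should not produce empty strings.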

Answer 1: (score: 1)

You just mixed up what goes where.

[animal.split('_') for word in mylist for animal in word]

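There is one more problem: ("horse") is not a tuple; ("horse",) is. So ("horse") is just "horse" in parentheses, and for animal in word will enumerate the individual letters of "horse" instead of giving you back a single "horse" animal.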
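
A quick way to see the difference described above (the variable names here are only for illustration):

```python
single = ('horse')      # parentheses only group; this is a plain str
one_tuple = ('horse',)  # the trailing comma is what makes a one-element tuple

print(type(single).__name__)     # str
print(type(one_tuple).__name__)  # tuple

# Iterating a str yields its characters; iterating the tuple yields the string.
print(list(single))     # ['h', 'o', 'r', 's', 'e']
print(list(one_tuple))  # ['horse']
```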

If you want to split on characters other than _ as well, you can use re.split with a character class:

import re
[re.split(r'[_ ]', animal) for word in mylist for animal in word]

If you actually intend the unpaired animals to be plain strings rather than tuples, you will have to handle those cases specifically:

[re.split(r'[_ ]', animal)
    for word in mylist
    for animal in (word if isinstance(word, tuple) else (word,))]
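
Note that this comprehension returns one list per string, so the result is nested rather than flat. If a flat list is wanted, one possible follow-up (a sketch, not part of the original answer) is to chain the sublists together with itertools.chain.from_iterable:

```python
import re
from itertools import chain

mylist = [('dog', 'camel'), ('horse'), ('List_of_people_saved_by_Oskar'),
          'mouse_bear', 'lion tiger rabbit', 'ant']

# One sublist per string, e.g. [['dog'], ['camel'], ['horse'], ...]
nested = [re.split(r'[_ ]', animal)
          for word in mylist
          for animal in (word if isinstance(word, tuple) else (word,))]

# Concatenate the sublists into a single flat list.
flat = list(chain.from_iterable(nested))
print(flat)
```

Keep in mind that re.split keeps empty strings between consecutive delimiters, so input such as 'a__b' would need extra filtering.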

Answer 2: (score: 1)

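Here is a more readable version, since I really don't like the idea of cramming everything into inline code, no matter how efficient or fast it is. It may also be easier for you to understand, and it doesn't require importing a library.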

Code:

mylist = [('dog', 'camel'), ('horse'), ('List_of_people_saved_by_Oskar'), 'mouse_bear', 'lion tiger rabbit', 'ant']
new_list = []

for items in mylist:
    if isinstance(items, tuple):
        for animals in items:
            new_list.append(animals)
    elif '_' in items:
        new_animal = items.split('_')
        for animals in new_animal:
            new_list.append(animals)

    elif ',' in items:
        new_animal = items.split(',')
        for animals in new_animal:
            new_list.append(animals)

    elif ' ' in items:
        new_animal = items.split(' ')
        for animals in new_animal:
            new_list.append(animals)
    else:
        new_list.append(items)
print(new_list)

Output:

['dog', 'camel', 'horse', 'List', 'of', 'people', 'saved', 'by', 'Oskar', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
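
One caveat with the elif chain above: each string is split on only the first delimiter that matches, so 'lion_tiger rabbit' would come out as 'lion' and 'tiger rabbit', and strings inside tuples are not split at all. A possible import-free variant that handles mixed delimiters is sketched below; the function name split_words and its delimiters parameter are hypothetical and chosen only for this example.

```python
def split_words(items, delimiters='_, '):
    """Split every string on any of the given delimiter characters."""
    words = []
    for item in items:
        # Wrap bare strings so tuples and strings go through the same loop.
        group = item if isinstance(item, tuple) else (item,)
        for text in group:
            # Normalize every delimiter to a space first.
            for d in delimiters:
                text = text.replace(d, ' ')
            # split() with no argument splits on any whitespace run
            # (including tabs and newlines) and drops empty strings.
            words.extend(text.split())
    return words

mylist = [('dog', 'camel'), ('horse'), ('List_of_people_saved_by_Oskar'),
          'mouse_bear', 'lion tiger rabbit', 'ant']
print(split_words(mylist))
print(split_words(['lion_tiger rabbit,ant']))
```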