Question

I have the following input string:

text='''Although never is often better than *right* now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!'''

So far, I have split the text string into a list, as shown below:

list=['Although', 'never', 'is', 'often', 'better', 'than', '*right*', 'now.\n\nIf', 'the', 'implementation', 'is', 'hard', 'to', 'explain,', "it's", 'a', 'bad', 'idea.\n\nIf', 'the', 'implementation', 'is', 'easy', 'to', 'explain,', 'it', 'may', 'be', 'a', 'good', 'idea.\n\nNamespaces', 'are', 'one', 'honking', 'great','idea', '--', "let's", 'do', 'more', 'of', 'those!']

Now I want to use the strip function to remove unwanted characters from the list above, such as \n\n and --.

Can you help me?

Answer 1

Use the re module; its re.sub function lets you do this. We need to replace each run of multiple \n with a single \n and remove the -- string.

import re

code='''Although never is often better than right now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!'''


result = re.sub('\n{2,}', '\n', code)
result = re.sub(' -- ', ' ', result)

print(result)

Then call split() on the resulting text.
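The split step might look like the following sketch. It assumes the same `code` string as above; the name `words` is introduced here only for illustration.

```python
import re

code = '''Although never is often better than right now.

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea -- let's do more of those!'''

# Collapse runs of newlines, then drop the standalone " -- " separator.
result = re.sub(r'\n{2,}', '\n', code)
result = re.sub(r' -- ', ' ', result)

# str.split() with no arguments splits on any whitespace, including '\n'.
words = result.split()
print(words)
```

Because split() with no arguments already treats newlines as whitespace, the first re.sub call only matters if the cleaned string itself is needed; for the word list alone, removing -- is enough.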

Answer 2

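This splits the string on whitespace (including newlines) and on --, then drops the empty strings that are left over: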

import re

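# `code` is the string defined in Answer 1. \s already matches '\n', and
# {1:2} is not a valid quantifier, so the pattern reduces to \s+ or --.
output = [i for i in re.split(r'\s+|--', code) if i]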
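To see why the `if i` filter is needed, here is a small sketch on a hypothetical sample string (`sample`, `raw`, and `words` are illustrative names, and the pattern is simplified to a single whitespace character or --). Whenever two delimiters sit next to each other, re.split returns an empty string between them.

```python
import re

sample = "great idea -- let's do\n\nmore"

# Without filtering, adjacent delimiters leave empty strings in the result.
raw = re.split(r'\s|--', sample)
print(raw)

# Filtering on truthiness removes the empty strings.
words = [w for w in raw if w]
print(words)
```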

Answer 3

You can use a list comprehension to get rid of --:

>>> code='''Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!'''
>>> 
>>> [word for word in code.split() if word != '--']
['Although', 'never', 'is', 'often', 'better', 'than', 'right', 'now.', 'If', 'the', 'implementation', 'is', 'hard', 'to', 'explain,', "it's", 'a', 'bad', 'idea.', 'If', 'the', 'implementation', 'is', 'easy', 'to', 'explain,', 'it', 'may', 'be', 'a', 'good', 'idea.', 'Namespaces', 'are', 'one', 'honking', 'great', 'idea', "let's", 'do', 'more', 'of', 'those!']
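Since the question asks about strip specifically: str.strip accepts a string of characters and removes any of them from both ends of a word. The sketch below assumes that asterisks and hyphens are the only unwanted characters and uses a shortened version of the original text; the helper name `clean_words` and its `unwanted` parameter are hypothetical.

```python
def clean_words(text, unwanted='*-'):
    """Split text on whitespace and strip unwanted characters from each word's ends."""
    stripped = (word.strip(unwanted) for word in text.split())
    # A word made up only of unwanted characters (such as '--') becomes '' and is dropped.
    return [word for word in stripped if word]


text = '''Although never is often better than *right* now.

Namespaces are one honking great idea -- let's do more of those!'''

cleaned = clean_words(text)
print(cleaned)
```

Note that strip only touches the ends of each word, so the apostrophe inside "let's" is left alone, and trailing punctuation such as the period in 'now.' stays unless it is added to `unwanted`.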

How do I strip multiple unwanted characters from a list of strings in Python?
