I have the following input string:
text='''Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!'''
So far, I have split the text string into a list, as shown below:
list=['Although', 'never', 'is', 'often', 'better', 'than', '*right*', 'now.\n\nIf', 'the', 'implementation', 'is', 'hard', 'to', 'explain,', "it's", 'a', 'bad', 'idea.\n\nIf', 'the', 'implementation', 'is', 'easy', 'to', 'explain,', 'it', 'may', 'be', 'a', 'good', 'idea.\n\nNamespaces', 'are', 'one', 'honking', 'great','idea', '--', "let's", 'do', 'more', 'of', 'those!']
Now I want to use the strip function to remove unwanted characters, such as \n\n and --, from the list above.
Can you help me?
Answer 0 (score: 0)
With the re module, the re.sub function lets you do this.
We need to replace every run of multiple \n with a single \n, and remove the -- string:
import re
code='''Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!'''
result = re.sub('\n{2,}', '\n', code)
result = re.sub(' -- ', ' ', result)
print(result)
After that, call split() on the resulting text.
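Putting the two substitutions and the final split() together, a minimal sketch could look like the following. The shortened sample string is an assumption: it includes the blank line between sentences that the list in the question implies, and the names `cleaned` and `words` are only illustrative.

```python
import re

# Assumed sample: two sentences separated by a blank line, plus a standalone "--".
text = ("Although never is often better than *right* now.\n\n"
        "Namespaces are one honking great idea -- let's do more of those!")

# Collapse runs of two or more newlines into a single newline.
cleaned = re.sub(r'\n{2,}', '\n', text)
# Replace the space-delimited "--" with a single space.
cleaned = re.sub(r' -- ', ' ', cleaned)

# split() with no arguments splits on any whitespace, newlines included.
words = cleaned.split()
print(words)
```

Since split() with no arguments already treats any run of whitespace as one separator, the first substitution mainly matters if the cleaned text itself is needed, not only the word list.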
Answer 1 (score: 0)
This splits the string on whitespace (including newlines) or on --, then drops the empty strings left behind:
import re
# \s+ matches any run of whitespace, newlines included, so no separate \n pattern is needed.
output = [i for i in re.split(r'\s+|--', code) if i]
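As a rough illustration of splitting on whitespace or `--` in one pass, here is a small self-contained sketch. The helper name `tokenize` and the sample string are hypothetical, not part of the original answer.

```python
import re

def tokenize(text):
    """Split on runs of whitespace (newlines included) or on a literal '--'.

    Hypothetical helper for illustration; empty strings produced between
    adjacent separators are filtered out.
    """
    return [tok for tok in re.split(r'\s+|--', text) if tok]

# Assumed sample with a blank line and a standalone "--".
sample = "a good idea.\n\nNamespaces are great -- let's go"
tokens = tokenize(sample)
print(tokens)
```

One caveat with this pattern: `--` is treated as a separator wherever it appears, so a token such as `foo--bar` would also be split in two.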
Answer 2 (score: 0)
You can use a list comprehension to get rid of --:
>>> code='''Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!'''
>>>
>>> [word for word in code.split() if word != '--']
['Although', 'never', 'is', 'often', 'better', 'than', 'right', 'now.', 'If', 'the', 'implementation', 'is', 'hard', 'to', 'explain,', "it's", 'a', 'bad', 'idea.', 'If', 'the', 'implementation', 'is', 'easy', 'to', 'explain,', 'it', 'may', 'be', 'a', 'good', 'idea.', 'Namespaces', 'are', 'one', 'honking', 'great', 'idea', "let's", 'do', 'more', 'of', 'those!']
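If the list from the question has already been built, note that str.strip() only removes characters from the two ends of a string, so it cannot remove the \n\n sitting inside an element like 'now.\n\nIf'. A minimal sketch, assuming a shortened version of that list (the names `words` and `cleaned` are illustrative), re-splits each element instead:

```python
# Shortened, assumed excerpt of the list shown in the question.
words = ['better', 'than', '*right*', 'now.\n\nIf', 'the', 'idea', '--', "let's"]

# strip() leaves the embedded newlines untouched because they are not at the ends.
print(repr('now.\n\nIf'.strip()))

# Re-split every element on whitespace and drop the standalone "--".
cleaned = [part for word in words for part in word.split() if part != '--']
print(cleaned)
```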