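I have a list of words, some of which have punctuation attached at the beginning or end. I need to separate the punctuation from the words with a regular expression, like this: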
sample_input = ["I", "!Go", "I'm", "call.", "exit?!"]
sample_output = ["I", "!", "Go", "I'm", "call", ".", "exit", "?", "!"]
The original string is:
string = "It's a mountainous wonderland decorated with ancient glaciers, breathtaking national parks and sumptuous vineyards, but behind its glossy image New Zealand is failing many of its children."
Does anyone have an idea how to solve this?
Thanks.
Answer 0 (score: 0)
You can first tokenize each list item as follows:
import re
words = ["I", "!Go", "I'm", "call.", "exit?!"]
newwords = []
for i in words:
    # [\w']+ keeps word characters and apostrophes together; [\W] matches one punctuation mark
    newwords.append(re.findall(r"[\w']+|[\W]", i))
print(newwords)
# Output: [['I'], ['!', 'Go'], ["I'm"], ['call', '.'], ['exit', '?', '!']]
Then flatten the nested list to get the result:
result = [item for sublist in newwords for item in sublist]
print(result)
# Output: ['I', '!', 'Go', "I'm", 'call', '.', 'exit', '?', '!']
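The pattern has two alternatives. The group [\w']+ matches a run of word characters and apostrophes, so "I'm" stays in one piece. The group [\W] matches a single non-word character, so each punctuation mark becomes its own token. Splitting every string this way and flattening the result gives the final list in the desired form.
You can adapt this approach to the needs of your own code.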
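As a minimal sketch, the two steps above can be folded into one helper and applied to the words of a sentence. The function name split_punctuation and the constant TOKEN_RE are hypothetical names chosen for this illustration, the sentence is a shortened excerpt of the string in the question, and splitting it on whitespace first is an assumption about how the word list was produced.

```python
import re

# Same pattern as in the answer above: a run of word characters and
# apostrophes, or a single non-word character.
TOKEN_RE = re.compile(r"[\w']+|[\W]")


def split_punctuation(words):
    """Return one flat list with punctuation separated from each word."""
    return [token for word in words for token in TOKEN_RE.findall(word)]


sample_input = ["I", "!Go", "I'm", "call.", "exit?!"]
print(split_punctuation(sample_input))
# ['I', '!', 'Go', "I'm", 'call', '.', 'exit', '?', '!']

# Shortened excerpt of the original string, split on whitespace first
# (assumed to be how the word list is built).
text = "It's a mountainous wonderland, but New Zealand is failing many of its children."
print(split_punctuation(text.split()))
```

Because the apostrophe is part of the first alternative, a word wrapped in single quotes such as 'hello' would stay in one piece, so quotes would need separate handling if that matters for your data.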