Regex: split at the start of words ending with a given character

Asked: 2014-07-08 09:50:56

Tags: python regex

Once again I am trying to work out how to split a string in Python. The string has the following format:

'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'

into the following list items:

'aaaa bbbb' 
'cccc:dd'
'eeee:ff ggg hhhh'
'iiii:jjjj'
'kkkk:'
'llll:mm'
'nnn:ooo pppp'
'qqqq:rrr'

I want to split at the start of every word that ends with a colon (':').

Any suggestions would be greatly appreciated :)

1 answer:

Answer 0 (score: 0)

The following works for the example provided:

import re

string = 'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
result = []

# split the string at each word followed by a colon
# wrap regex pattern as group so it is added to result list
parts = re.split(r"(\w+:)", string)

# if there is any text before the first delimiter token,
# add it to the result
if parts[0]:
    result.append(parts[0].strip())

# pair each delimiter token (odd indices, starting at 1)
# with the text that follows it (even indices, starting at 2)
groups = zip(parts[1::2], parts[2::2])

# join each pair to one string and strip spacing
result.extend(["".join(group).strip() for group in groups])

print(result)
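
As an alternative to re-joining the captured delimiters, the split can be done directly on a zero-width lookahead. This is only a sketch under some assumptions: it needs Python 3.7 or later (where re.split accepts patterns that match the empty string), it treats a "word" as whatever \w+ matches, and the function name split_at_colon_words is made up for illustration.

```python
import re


def split_at_colon_words(text):
    """Split text at the start of every word that ends with a colon.

    Hypothetical helper, not part of the original answer.
    Requires Python 3.7+ because the pattern matches an empty string.
    """
    # (?=\b\w+:) matches the empty position right before a word
    # that is immediately followed by ':'; nothing is consumed.
    pieces = re.split(r"(?=\b\w+:)", text)
    # strip surrounding whitespace and drop empty pieces, which appear
    # when the text itself starts with a colon-terminated word
    return [piece.strip() for piece in pieces if piece.strip()]


string = 'aaaa bbbb cccc:dd eeee:ff ggg hhhh iiii:jjjj kkkk:llll:mm nnn:ooo pppp qqqq:rrr'
print(split_at_colon_words(string))
```

Because the lookahead consumes no characters, each colon-terminated word stays attached to the text that follows it, so no pairing step is needed afterwards.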