How do I split a string by a pattern?

Time: 2019-05-27 20:00:31

Tags: python regex string split regex-group

Say I have the following string:

"Hello {name} how are you doing {today}?"

I would like to split it into:

['Hello ', '{name}', ' how are you doing ', '{today}', '?']

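Which regular expression, or any other method, should I use?

In other words, the point is to use the text in curly braces as the delimiter.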

2 answers:

Answer 0 (score: 2)

You can use this regex to split the string on the brace-wrapped words while keeping those words in the result:

({[^{}]+})

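Because the pattern is wrapped in a capturing group, re.split splits the string on each {someword} and also returns the matched delimiter in the result list.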


Try this Python code:

import re

s = 'Hello {name} how are you doing {today}?'
print(re.split(r'({[^{}]+})', s))

Output:

['Hello ', '{name}', ' how are you doing ', '{today}', '?']
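To see why the parentheses matter, here is a small sketch comparing the same pattern with and without the capturing group. The variable names are only illustrative:

```python
import re

s = 'Hello {name} how are you doing {today}?'

# Without a capturing group, re.split discards the delimiters.
without_group = re.split(r'{[^{}]+}', s)

# With a capturing group, the matched delimiters are kept in the list.
with_group = re.split(r'({[^{}]+})', s)

print(without_group)  # ['Hello ', ' how are you doing ', '?']
print(with_group)     # ['Hello ', '{name}', ' how are you doing ', '{today}', '?']
```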

Alternatively, it may be easier to use the following regex with re.findall to collect all the pieces you need:

{[^{}]+}|[^{}]+


Python code using re.findall:

import re
arr = ['Hello {name} how are you doing {today}?', '{name}a{text}']

for s in arr:
    print(re.findall(r'{[^{}]+}|[^{}]+', s))

Output:

['Hello ', '{name}', ' how are you doing ', '{today}', '?']
['{name}', 'a', '{text}']
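One likely reason re.findall is more convenient for an input such as '{name}a{text}': when the string starts or ends with a delimiter, re.split returns empty strings at those positions, which then have to be filtered out. A minimal sketch of that behaviour (the variable names are illustrative):

```python
import re

s = '{name}a{text}'

# re.split emits an empty string before a leading delimiter
# and after a trailing one.
parts = re.split(r'({[^{}]+})', s)
print(parts)      # ['', '{name}', 'a', '{text}', '']

# Dropping the empty strings gives the same list as re.findall above.
non_empty = [p for p in parts if p]
print(non_empty)  # ['{name}', 'a', '{text}']
```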

Answer 1 (score: 1)

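This expression uses three capturing groups, joined by alternation, to separate the pieces:

([\w\s]+)|(\{.+?\})|([?!.;:]+)

The first capturing group matches words and whitespace:

([\w\s]+)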

The second matches substrings wrapped in {}:

(\{.+?\})

The last one matches punctuation:

([?!.;:]+)

We can add any other characters we need to the character class [].
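As a rough illustration of extending the character class: a character that none of the three groups covers (a comma, for instance) is silently skipped by the matcher, so it has to be added to the punctuation class to show up in the result. The tokens helper below is a hypothetical convenience function, not part of the original answer:

```python
import re

def tokens(pattern, text):
    # Hypothetical helper: return the full text of every match, in order.
    return [m.group() for m in re.finditer(pattern, text)]

original = r"([\w\s]+)|(\{.+?\})|([?!.;:]+)"
with_comma = r"([\w\s]+)|(\{.+?\})|([?!.;:,]+)"  # comma added to the last class

text = "Hello, {name}!"

print(tokens(original, text))    # ['Hello', ' ', '{name}', '!']
print(tokens(with_comma, text))  # ['Hello', ',', ' ', '{name}', '!']
```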

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"([\w\s]+)|(\{.+?\})|([?!.;:]+)"

test_str = "Hello {name} how are you doing {today}?"

# re.MULTILINE is not needed here: the pattern contains no ^ or $ anchors.
matches = re.finditer(regex, test_str)

for match_num, match in enumerate(matches, start=1):
    print("Match {} was found at {}-{}: {}".format(
        match_num, match.start(), match.end(), match.group()))

    # Only one of the three groups participates in each match; skip the rest,
    # which would otherwise be reported as None at position -1.
    for group_num, group in enumerate(match.groups(), start=1):
        if group is None:
            continue
        print("Group {} found at {}-{}: {}".format(
            group_num, match.start(group_num), match.end(group_num), group))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
