Say I have the following string:
"Hello {name} how are you doing {today}?"
I would like to split it into:
['Hello ', '{name}', ' how are you doing ', '{today}', '?']
Which regex, or any other method, should I use?
The point is to use the text in curly braces as the delimiter.
Answer 0 (score: 2)
You can use this regex to split the string on the braced words while keeping the braced strings in the result:
({[^{}]+})
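This regex splits the string on {someword}. Because the whole pattern sits inside a capturing group, re.split also keeps the words used for splitting in the result.
Try this Python code: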
import re
s = 'Hello {name} how are you doing {today}?'
print(re.split(r'({[^{}]+})', s))
Output
['Hello ', '{name}', ' how are you doing ', '{today}', '?']
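As a minimal sketch of why the capturing group matters, the snippet below compares re.split with and without it, and drops the empty strings that re.split produces when a placeholder sits at the very start or end of the input. The helper name split_keep_braces is hypothetical and not part of the original answer.

```python
import re

def split_keep_braces(s):
    # The capturing group makes re.split return the delimiters as well.
    parts = re.split(r'({[^{}]+})', s)
    # A placeholder at the start or end of the string yields '' entries; drop them.
    return [p for p in parts if p]

s = 'Hello {name} how are you doing {today}?'

# Without the capturing group the placeholders are discarded.
print(re.split(r'{[^{}]+}', s))
# ['Hello ', ' how are you doing ', '?']

# Unfiltered, leading and trailing placeholders leave empty strings behind.
print(re.split(r'({[^{}]+})', '{name}a{text}'))
# ['', '{name}', 'a', '{text}', '']

print(split_keep_braces('{name}a{text}'))
# ['{name}', 'a', '{text}']
```

Filtering with `if p` is only appropriate when empty pieces carry no meaning for the caller.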
Alternatively, it may be easier to use this regex with re.findall to get all the strings you want:
{[^{}]+}|[^{}]+
Python code using re.findall:
import re
arr = ['Hello {name} how are you doing {today}?', '{name}a{text}']
for s in arr:
    print(re.findall(r'{[^{}]+}|[^{}]+', s))
Output
['Hello ', '{name}', ' how are you doing ', '{today}', '?']
['{name}', 'a', '{text}']
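The two approaches agree on well-formed input but can differ when a brace is unbalanced. The following is a small sketch of that difference, assuming the same patterns as above. The function names split_tokens and findall_tokens are made up for this illustration.

```python
import re

BRACED = r'{[^{}]+}'

def split_tokens(s):
    # re.split keeps every character: anything that is not a placeholder stays in a piece.
    return [p for p in re.split('(' + BRACED + ')', s) if p]

def findall_tokens(s):
    # re.findall only returns what matches, so a stray '{' or '}' is silently skipped.
    return re.findall(BRACED + r'|[^{}]+', s)

well_formed = 'Hello {name} how are you doing {today}?'
print(split_tokens(well_formed) == findall_tokens(well_formed))
# True

unbalanced = 'a{b'
print(split_tokens(unbalanced))
# ['a{b']
print(findall_tokens(unbalanced))
# ['a', 'b']
```

If the input may contain stray braces, comparing `''.join(tokens)` with the original string is one way to detect that characters were lost.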
Answer 1 (score: 1)
This expression uses three capturing groups, one per alternative:
([\w\s]+)|(\{.+?\})|([?!.;:]+)
The first capturing group is for words and whitespace:
([\w\s]+)
The second is for substrings wrapped in {}:
(\{.+?\})
The last is for punctuation:
([?!.;:]+)
We can add any other characters we need to the character list [].
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re

regex = r"([\w\s]+)|(\{.+?\})|([?!.;:]+)"
test_str = "Hello {name} how are you doing {today}?"

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))
    # Groups that did not take part in a match report position -1 and value None.
    for groupNum in range(1, len(match.groups()) + 1):
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum), group=match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
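The remark about adding characters to the character list can be sketched as follows. Any character that none of the three groups covers (a comma, for example) is simply skipped by re.finditer, so it has to be added to the punctuation class to show up in the result. The helper tokenize and its punctuation parameter are hypothetical names introduced only for this illustration.

```python
import re

def tokenize(text, punctuation="?!.;:"):
    # Build the three-group pattern with a configurable punctuation class.
    pattern = r"([\w\s]+)|(\{.+?\})|([" + re.escape(punctuation) + r"]+)"
    # match.group() is the whole match, whichever of the three groups produced it.
    return [m.group() for m in re.finditer(pattern, text)]

print(tokenize("Hello {name} how are you doing {today}?"))
# ['Hello ', '{name}', ' how are you doing ', '{today}', '?']

# The comma is not in the default list, so it is dropped.
print(tokenize("Hi, {name}!"))
# ['Hi', ' ', '{name}', '!']

# Adding ',' to the character list keeps it.
print(tokenize("Hi, {name}!", punctuation="?!.;:,"))
# ['Hi', ',', ' ', '{name}', '!']
```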