Question

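I'm trying to use re.split to make it easier to check whether the formulas used in a program I've written are valid. I think I'm almost there, but in the second example I can't get the match to stop at the double quote that closes "hello":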

import re

s = """c2+"hello"+c4"""
x = re.split(r'(".+"|\+)', s)  # raw string avoids the invalid "\+" escape warning
# output is correct here ['c2', '+', '', '"hello"', '', '+', 'c4']


# but not here:
s = """c2+"hello""+"c4"""
x = re.split(r'(".+"|\+)', s)  # greedy .+ runs to the last double quote
# current output ['c2', '+', '', '"hello""+"', 'c4']
# desired output ['c2', '+', '', '"hello"', '"+"', 'c4']

Answer 1

You can use .+? to make the part inside the double quotes non-greedy:

import re

s = """c2+"hello""+"c4"""
x = re.split(r'(".+?"|\+)', s)  # lazy .+? stops at the nearest closing quote
print(x)
# ['c2', '+', '', '"hello"', '', '"+"', 'c4']

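Note that this differs from your expected output: there is an empty string between '"hello"' and '"+"'. This is intentional. Because the pattern is wrapped in a capturing group, re.split always places the delimiters at odd indices of the result list, with the remaining parts of the string at even indices.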
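
As a rough illustration of the odd/even index point above, the result list can be separated with slicing. This is only a sketch: the names TOKEN and split_formula are made up for this example and do not come from the question.

```python
import re

# Same pattern as in the answer: a lazily matched quoted string, or a plus sign.
# TOKEN and split_formula are hypothetical names used only for this sketch.
TOKEN = re.compile(r'(".+?"|\+)')

def split_formula(formula):
    """Split a formula and separate the delimiters from everything else."""
    parts = TOKEN.split(formula)
    delimiters = parts[1::2]  # odd indices: text captured by the group
    others = parts[0::2]      # even indices: text between delimiters
    return delimiters, others

delims, others = split_formula('c2+"hello""+"c4')
print(delims)  # ['+', '"hello"', '"+"']
print(others)  # ['c2', '', '', 'c4']
```

The empty strings at even indices mark places where two delimiters sit directly next to each other, which may be useful when checking whether a formula is valid.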

Python regular expression for simple text formulas
