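I am struggling to write a regular expression that parses lines containing an integer followed by that many values. I can get it mostly working, but not for the case where the integer is zero and no values follow.
For example: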
..... 2 "value1" "value2" "someother non-related text"
..... 0 "someother non-related text"
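The values come right after the integer. Alternatively, the integer may be followed by that many space-separated key/value pairs:
..... 3 key1 "value1" key2 "value2" key3 "value3"......
I am happy to stuff them all into one named group, but it may be useful later to put them into separate named groups.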
3 "value1" "value2" "value3" "someother non-related text"
(?<my_named_group>([0])|[0-9] (?<my_values>(".*"?)?))
my_named_group = 3
my_values = '"value1" "value2" "value3"'
When the integer is zero:
my_named_group = 0
my_values = ""
Second question/regex:
3 key1 "value1" key2 "value2" key3 "value3" "someother non-related text"
my_named_group = 3
my_values = 'key1 "value1" key2 "value2" key3 "value3"'
Answer 0 (score: 0)
If I understand correctly, we have some quoted strings following a number, and a simple expression can handle that:
([0-9]+).+?(\".*\")
The desired number is in the first capture group, ([0-9]+), and the other desired substring is in the second capture group, (\".*\").
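To see what the two groups actually hold, here is a quick check against the three sample lines. It is only a sketch: the names `pattern`, `samples` and `captures` are mine, and the samples are written without the leading "....." from the question. Note that `.*` is greedy, so the second group runs to the last double quote on the line.

```python
import re

# Same expression as above; inside a raw string the quotes need no escaping.
pattern = re.compile(r'([0-9]+).+?(".*")')

samples = [
    '2 "value1" "value2" "someother non-related text"',
    '0 "someother non-related text"',
    '3 key1 "value1" key2 "value2" key3 "value3"',
]

# groups() returns (number, quoted substring) for each line.
captures = [pattern.search(s).groups() for s in samples]
for number, quoted in captures:
    print(number, '->', quoted)
```

With these inputs the second group includes the trailing unrelated text, the zero line yields the unrelated text rather than an empty string, and on the key/value line the first key (key1) falls outside the group because the capture starts at the first double quote.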
# coding=utf8
# The line above declares the source encoding; it only matters for Python 2.x.
import re

regex = r"([0-9]+).+?(\".*\")"

test_str = ("2 \"value1\" \"value2\" \"someother non-related text\"\n"
            "0 \"someother non-related text\"\n"
            "3 key1 \"value1\" key2 \"value2\" key3 \"value3\"")

# Put the number on one line and the quoted substring on the next.
subst = "\\1\\n\\2"

# count=0 replaces every match; pass a positive count to limit the number of replacements.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
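If the values group must contain exactly as many items as the integer says (and be empty for zero), a single fixed pattern cannot tell the last wanted value from the unrelated quoted text that follows. One possible approach, sketched below, is to read the count first and then consume that many items. This is not the method of the answer above. The helper names `parse_counted` and `split_pairs` are hypothetical, and the sketch assumes that the count is the first token of the string passed in, that values are double-quoted without embedded quotes, and that keys are bare words.

```python
import re

# Assumption: the integer count is the first token of the string.
COUNT_RE = re.compile(r'\s*(?P<my_named_group>[0-9]+)')
# One double-quoted value (no embedded quotes assumed).
VALUE_RE = re.compile(r'\s+"[^"]*"')
# One bare-word key followed by one double-quoted value.
PAIR_RE = re.compile(r'\s+\w+\s+"[^"]*"')


def parse_counted(line, keyed=False):
    """Return (count, values_text), or None if the line does not fit."""
    head = COUNT_RE.match(line)
    if head is None:
        return None
    count = int(head.group('my_named_group'))
    item_re = PAIR_RE if keyed else VALUE_RE
    start = pos = head.end()
    # Consume exactly `count` items; zero iterations leaves the text empty.
    for _ in range(count):
        item = item_re.match(line, pos)
        if item is None:
            return None
        pos = item.end()
    return count, line[start:pos].strip()


def split_pairs(my_values):
    """Split 'key1 "value1" key2 "value2"' into a dict of key -> value."""
    return dict(re.findall(r'(\w+)\s+"([^"]*)"', my_values))


print(parse_counted('3 "value1" "value2" "value3" "someother non-related text"'))
print(parse_counted('0 "someother non-related text"'))
print(parse_counted('3 key1 "value1" key2 "value2" key3 "value3" "someother non-related text"', keyed=True))
```

Looping over the count keeps the patterns simple and also covers the later wish for separate groups: `split_pairs` turns the key/value text into a dictionary, which may be easier to work with than one named group per pair.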