Regex to capture an integer number of "key value" pairs, e.g. 3 key1 "value1" key2 "value2" key3 "value3"

Time: 2019-05-28 15:26:45

Tags: regex

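I'm struggling to write a regex that parses lines containing an integer followed by that many values. I can get it mostly working, but not for the case where the integer is zero and no values follow.

For example: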

..... 2 "value1" "value2" "someother non-related text"
..... 0 "someother non-related text"

or lines where the integer is followed by that number of space-separated key/value pairs:
..... 3 key1 "value1" key2 "value2" key3 "value3"......

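I'm happy to stuff all of the values into a single named group for now, although putting them into separate named groups may be useful later.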

3 "value1" "value2" "value3" "someother non-related text"

(?<my_named_group>([0])|[0-9] (?<my_values>(".*"?)?))

my_named_group = 3
my_values = '"value1" "value2" "value3"'

When the integer is zero:

my_named_group = 0
my_values = ""

Second question/regex:

3 key1 "value1" key2 "value2" key3 "value3" "someother non-related text"

my_named_group = 3
my_values = 'key1 "value1" key2 "value2" key3 "value3"'

1 Answer:

Answer 0 (score: 0)

If I understand correctly, we have a number followed by some quoted strings, and a simple expression may be enough to handle it:

([0-9]+).+?(\".*\")

The desired number is in the first capture group, ([0-9]+), and the other desired substring is in the second capture group, (\".*\").
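To make the two groups concrete, here is a minimal sketch that applies the same expression to the first sample line from the question. The variable names (`pattern`, `line`, `count`, `quoted`) are illustrative only.

```python
import re

# The expression from the answer; inside a single-quoted raw string the
# double quotes need no backslash.
pattern = re.compile(r'([0-9]+).+?(".*")')

line = '3 "value1" "value2" "value3" "someother non-related text"'
match = pattern.search(line)

count = match.group(1)   # the leading integer, still a string
quoted = match.group(2)  # from the first quote to the last quote on the line

print(count)
print(quoted)
```

Because `.*` is greedy, the second group runs to the last quote on the line, so for this input it also contains the trailing "someother non-related text".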

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"([0-9]+).+?(\".*\")"

test_str = ("2 \"value1\" \"value2\" \"someother non-related text\"\n"
    "0 \"someother non-related text\"\n"
    "3 key1 \"value1\" key2 \"value2\" key3 \"value3\"")

subst = "\\1\\n\\2"

# count=0 replaces every match; set a positive count to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
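If the result has to honor the count, including the zero case from the question, one option is a two-step parse: find the integer first, then collect at most that many items after it. The sketch below is one possible approach, not part of the original answer. The names `COUNT`, `QUOTED`, `PAIR` and `parse_counted` are hypothetical, and it assumes that values contain no embedded quotes and that the count is the first standalone integer on the line. Python's `re` module spells named groups `(?P<name>...)`, so the group name from the question is written that way here.

```python
import re

# Hypothetical names throughout; none of this appears in the original answer.
COUNT = re.compile(r'(?<!\S)(?P<my_named_group>[0-9]+)(?!\S)')  # standalone integer token
QUOTED = re.compile(r'"([^"]*)"')         # a quoted value without embedded quotes
PAIR = re.compile(r'(\w+)\s+"([^"]*)"')   # a bare key followed by a quoted value


def parse_counted(line, item_pattern):
    """Return (count, items): the first standalone integer in `line` and at
    most `count` matches of `item_pattern` found after it, or None when the
    line has no such integer."""
    m = COUNT.search(line)
    if m is None:
        return None
    count = int(m.group('my_named_group'))
    # Scan only the text after the integer and drop anything past `count`
    # items, so trailing unrelated quoted text is ignored.
    items = item_pattern.findall(line, m.end())[:count]
    return count, items


print(parse_counted('..... 2 "value1" "value2" "someother non-related text"', QUOTED))
print(parse_counted('..... 0 "someother non-related text"', QUOTED))
print(parse_counted('3 key1 "value1" key2 "value2" key3 "value3" "other text"', PAIR))
```

Slicing with `[:count]` is what enforces the limit; a single regex cannot easily repeat a group a number of times that is read from the input itself, which is why the counting is done in Python here.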
