I have the following input text:
@"This is some text @foo=bar @name=""John \""The Anonymous One\"" Doe"" @age=38"
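I want to parse out the values written in @name=value syntax as name/value pairs. Parsing the string above should produce the following named captures: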
name:"foo"
value:"bar"
name:"name"
value:"John \""The Anonymous One\"" Doe"
name:"age"
value:"38"
I tried the following regular expression, which gets me almost there:
@"(?:(?<=\s)|^)@(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>[A-Za-z0-9_-]+|(?="").+?(?=(?<!\\)""))"
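The main problem is that it captures the opening quote in "John \""The Anonymous One\"" Doe". I feel the check for the quote should be a lookbehind rather than a lookahead, but that does not seem to work at all.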
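The symptom can be reproduced in isolation with a reduced pattern. The following C# sketch is only an illustration: the class `LookaroundDemo`, the helper `Capture`, the group name `v`, and the sample string are all made up for the example. It compares a lookahead, a lookbehind, and a plainly consumed quote placed right after the `=`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns what group "v" captured, or null when nothing matches.
    public static string Capture(string pattern)
    {
        string sample = @"x=""abc"" tail";   // actual text: x="abc" tail
        Match m = Regex.Match(sample, pattern);
        return m.Success ? m.Groups["v"].Value : null;
    }

    public static void Main()
    {
        // Lookahead only asserts the quote; it is not consumed, so .+? starts on it.
        Console.WriteLine(Capture(@"=(?<v>(?="").+?)(?="")") ?? "(no match)");
        // Lookbehind right after '=' inspects the '=' itself, so it can never succeed here.
        Console.WriteLine(Capture(@"=(?<v>(?<="").+?)(?="")") ?? "(no match)");
        // Consuming the quote outside the group keeps it out of the capture.
        Console.WriteLine(Capture(@"=""(?<v>.+?)(?="")") ?? "(no match)");
    }
}
```

The first call yields the value with its opening quote attached, the second finds nothing because the character behind the current position is `=` rather than a quote, and the third yields the bare `abc`.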
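Here are the rules for the expression:
A name must begin with a letter and may contain any letters, digits, underscores, or hyphens.
An unquoted value must contain at least one character and may contain any letters, digits, underscores, or hyphens.
A quoted value may contain any characters, including whitespace and escaped quotes.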
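To make the three rules concrete, each one can be written as a separate anchored pattern. This is a sketch under assumptions, not part of the original attempt: the class and method names are hypothetical, the quoted-value pattern uses an escape-aware form (`\\.` or any character other than a quote or backslash) that differs from the lookaround approach above, and it accepts an empty quoted value because the rules do not say otherwise.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RuleChecks
{
    // Rule 1: a letter first, then any letters, digits, underscores or hyphens.
    private static readonly Regex Name = new Regex(@"^[A-Za-z][A-Za-z0-9_-]*\z");

    // Rule 2: at least one letter, digit, underscore or hyphen.
    private static readonly Regex Unquoted = new Regex(@"^[A-Za-z0-9_-]+\z");

    // Rule 3 (assumed form): a quote, then escaped characters or anything that is
    // neither a quote nor a backslash, then the closing quote.
    private static readonly Regex Quoted =
        new Regex(@"^""(?:\\.|[^""\\])*""\z", RegexOptions.Singleline);

    public static bool IsName(string s) { return Name.IsMatch(s); }
    public static bool IsUnquotedValue(string s) { return Unquoted.IsMatch(s); }
    public static bool IsQuotedValue(string s) { return Quoted.IsMatch(s); }

    public static void Main()
    {
        Console.WriteLine(IsName("foo"));            // True
        Console.WriteLine(IsName("9lives"));         // False: starts with a digit
        Console.WriteLine(IsUnquotedValue("38"));    // True
        Console.WriteLine(IsUnquotedValue("a b"));   // False: contains a space
        Console.WriteLine(IsQuotedValue(@"""John \""The Anonymous One\"" Doe"""));  // True
        Console.WriteLine(IsQuotedValue(@"""unterminated"));                        // False
    }
}
```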
Edit:
Here is the result from regex101.com:
(?:(?<=\s)|^)@(?<name>\w+[A-Za-z0-9_-]+?)\s*=\s*(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)"))
(?:(?<=\s)|^) Non-capturing group
@ matches the character @ literally
(?<name>\w+[A-Za-z0-9_-]+?) Named capturing group name
\s* match any white space character [\r\n\t\f ]
= matches the character = literally
\s* match any white space character [\r\n\t\f ]
  Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?<value>(?<!")[A-Za-z0-9_-]+|(?=").+?(?=(?<!\\)")) Named capturing group value
  1st Alternative: [A-Za-z0-9_-]+
    [A-Za-z0-9_-]+ match a single character present in the list below
      Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
      A-Z a single character in the range between A and Z (case sensitive)
      a-z a single character in the range between a and z (case sensitive)
      0-9 a single character in the range between 0 and 9
      _- a single character in the list _- literally
  2nd Alternative: (?=").+?(?=(?<!\\)")
    (?=") Positive Lookahead - Assert that the regex below can be matched
      " matches the characters " literally
    .+? matches any character (except newline)
      Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
    (?=(?<!\\)") Positive Lookahead - Assert that the regex below can be matched
      (?<!\\) Negative Lookbehind - Assert that it is impossible to match the regex below
        \\ matches the character \ literally
      " matches the characters " literally
Answer 0 (score: 1)
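You can take advantage of a very useful .NET regex feature: several capturing groups are allowed to share the same name. Also, your (?<name>) capture group has a problem: it allows a digit in the first position, which violates your first requirement.
So, I suggest: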
(?si)(?:(?<=\s)|^)@(?<name>[a-z][a-z0-9_-]*)\s*=\s*(?:(?<value>[a-z0-9_-]+)|(?:"")?(?<value>.+?)(?=(?<!\\)""))
See the demo.
Note that you cannot debug .NET-specific regexes on regex101.com; you need to test them in a .NET-compatible environment.
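As a rough illustration of how the shared group name plays out in code, here is a minimal C# sketch. The class and method names (`NameValueParser`, `Parse`) are made up for the example, and the name part is spelled `[a-z][a-z0-9_-]*` so that it has to start with a letter; that spelling is an assumption based on the first rule in the question. Both alternatives write to the same `value` group, and `Groups["value"].Value` returns whichever one actually captured.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NameValueParser
{
    // Verbatim string: "" stands for one literal double quote in the pattern.
    // The two (?<value>...) groups share a name, which .NET permits.
    private static readonly Regex Pair = new Regex(
        @"(?si)(?:(?<=\s)|^)@(?<name>[a-z][a-z0-9_-]*)\s*=\s*" +
        @"(?:(?<value>[a-z0-9_-]+)|(?:"")?(?<value>.+?)(?=(?<!\\)""))");

    // Returns the pairs in the order they appear in the input.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pair.Matches(input))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["name"].Value, m.Groups["value"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string input = @"This is some text @foo=bar @name=""John \""The Anonymous One\"" Doe"" @age=38";
        foreach (var pair in Parse(input))
        {
            Console.WriteLine("name: " + pair.Key + ", value: " + pair.Value);
        }
    }
}
```

On the sample input this should give three pairs: foo/bar, name/John \"The Anonymous One\" Doe (without the surrounding quotes), and age/38. One limitation to keep in mind: a quoted value that ends in a backslash directly before the closing quote will not be recognized, because the (?<!\\) check treats that quote as escaped.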
Answer 1 (score: 0)
Use string methods.
Split
string myLongString = @"This is some text @foo=bar @name=""John \""The Anonymous One\"" Doe"" @age=38";
string[] nameValues = myLongString.Split('@');
From there, split each piece on "=" or use IndexOf("=").
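Put together, the split-based idea might look roughly like the sketch below. The class and method names are hypothetical, and stripping the outer quotes is an added assumption so that the result lines up with the captures the question asks for. The approach has no validation of the naming rules, and a quoted value that itself contains an @ would be cut in two, so it only suits input as simple as the sample.

```csharp
using System;
using System.Collections.Generic;

public static class SplitParser
{
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        string[] pieces = input.Split('@');

        // pieces[0] is whatever precedes the first '@', so start at index 1.
        for (int i = 1; i < pieces.Length; i++)
        {
            int eq = pieces[i].IndexOf('=');
            if (eq <= 0) continue;   // no '=' found, or nothing before it

            string name = pieces[i].Substring(0, eq).Trim();
            string value = pieces[i].Substring(eq + 1).Trim();

            // Assumption: drop one pair of surrounding quotes, keep escapes as-is.
            if (value.Length >= 2 && value[0] == '"' && value[value.Length - 1] == '"')
            {
                value = value.Substring(1, value.Length - 2);
            }
            pairs.Add(new KeyValuePair<string, string>(name, value));
        }
        return pairs;
    }

    public static void Main()
    {
        string myLongString = @"This is some text @foo=bar @name=""John \""The Anonymous One\"" Doe"" @age=38";
        foreach (var pair in Parse(myLongString))
        {
            Console.WriteLine("name: " + pair.Key + ", value: " + pair.Value);
        }
    }
}
```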