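I want to match terms such as <word>~, <word>~0.1 and <word>~0.9 in my string.
They should not match when they appear inside double quotes, as in "<word>~0.5", or when the word itself is quoted, as in "<word>"~0.5.
Some examples: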
"World Terror"~10 AND music~0.5 --> should match music~0.5
"test~ my string" --> should not match
music~ AND "song remix" AND "world terror~0.5" --> should match music~
I am currently using the regex \w+~, but it also matches when the term is enclosed in quotes.
Can anyone help me with this?
Answer 0 (score: 2)
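This works for strings that do not contain escaped quotes (an escaped quote would throw off the even-quote count that the lookahead relies on):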
Regex regexObj = new Regex(
    @"\w+~[\d.]*      # Match an alnum word, tilde, optional digits/dots
    (?=               # only if there follows...
     [^""]*           # any number of non-quotes
     (?:              # followed by...
      ""[^""]*        # one quote, and any number of non-quotes
      ""[^""]*        # another quote, and any number of non-quotes
     )*               # any number of times, ensuring an even number of quotes
     [^""]*           # Then any number of non-quotes
     $                # until the end of the string.
    )                 # End of lookahead assertion",
    RegexOptions.IgnorePatternWhitespace);
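To check the pattern against the three examples from the question, a small console program along these lines can be used. This is only a sketch: the class name `TildeTermDemo` and the helper `FindTerms` are made up for illustration, and the pattern is written on one line without comments (the trailing `[^""]*` is left out because it is redundant before `$`).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TildeTermDemo
{
    // One-line form of the pattern above: a word, a tilde, optional digits/dots,
    // followed only by an even number of double quotes up to the end of the string.
    private static readonly Regex Term = new Regex(
        @"\w+~[\d.]*(?=[^""]*(?:""[^""]*""[^""]*)*$)");

    // Hypothetical helper: returns every matched term as a string.
    public static string[] FindTerms(string input)
    {
        return Term.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static void Main()
    {
        string[] samples =
        {
            @"""World Terror""~10 AND music~0.5",
            @"""test~ my string""",
            @"music~ AND ""song remix"" AND ""world terror~0.5"""
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("[" + string.Join(", ", FindTerms(sample)) + "]");
        }
        // Prints:
        // [music~0.5]
        // []
        // [music~]
    }
}
```

The lookahead never consumes text, so the matched value is only the term itself; everything after it is merely inspected for balanced quotes.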
If escaped quotes need to be handled as well, it gets more complicated:
Regex regexObj = new Regex(
    @"\w+~[\d.]*            # Match an alnum word, tilde, optional digits/dots
    (?=                     # only if there follows...
     (?:\\.|[^\\""])*       # any number of non-quotes (or escaped quotes)
     (?:                    # followed by...
      ""(?:\\.|[^\\""])*    # one quote, and any number of non-quotes
      ""(?:\\.|[^\\""])*    # another quote, and any number of non-quotes
     )*                     # any number of times, ensuring an even number of quotes
     (?:\\.|[^\\""])*       # Then any number of non-quotes
     $                      # until the end of the string.
    )                       # End of lookahead assertion",
    RegexOptions.IgnorePatternWhitespace);
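A short way to see what the escape-aware version changes is to run it on inputs where a backslash-escaped quote follows the candidate term. The sketch below assumes escapes are written as `\"`; the class name `EscapedQuoteDemo`, the helper `FindTerms` and the two sample strings are invented for illustration and do not come from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedQuoteDemo
{
    // One-line form of the escape-aware pattern: a backslash plus any character
    // is consumed as a unit, so \" is never counted as a quote.
    private static readonly Regex Term = new Regex(
        @"\w+~[\d.]*(?=(?:\\.|[^\\""])*(?:""(?:\\.|[^\\""])*""(?:\\.|[^\\""])*)*$)");

    // Hypothetical helper: returns every matched term as a string.
    public static string[] FindTerms(string input)
    {
        return Term.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static void Main()
    {
        // Input text: rock~0.3 AND "say \"hi"
        string outside = @"rock~0.3 AND ""say \""hi""";
        // Input text: "music~0.5 \" still quoted" rock~
        string inside = @"""music~0.5 \"" still quoted"" rock~";

        Console.WriteLine("[" + string.Join(", ", FindTerms(outside)) + "]");
        Console.WriteLine("[" + string.Join(", ", FindTerms(inside)) + "]");
        // Prints:
        // [rock~0.3]
        // [rock~]
    }
}
```

On these two inputs the simpler pattern would miscount: it sees three raw quotes after rock~0.3 in the first string and rejects it, and it sees two raw quotes after music~0.5 in the second string and accepts it even though the term is inside a quoted phrase.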