Regex to find strings in C#

Time: 2012-10-30 09:09:49

Tags: c# regex asp.net-4.0

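I want to match substrings such as <word>~, <word>~0.1, and <word>~0.9 in my string.

But they should not match when they involve double quotes, as in "<word>~0.5" or "<word>"~0.5.

Some examples: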

"World Terror"~10 AND music~0.5                --> should match music~0.5
"test~ my string"                              --> should not match
music~ AND "song remix" AND "world terror~0.5" --> should match music~

I am currently using the regex \w+~, but it also matches when the term is inside quotes.

Can anyone help me with this?

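1 Answer:

Answer 0: (score: 2)

The lookahead only allows a match when an even number of quotes follows it, which means the match itself sits outside any quoted section. This works as long as the string contains no escaped quotes (an escaped quote would throw off the even-quote count):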

Regex regexObj = new Regex(
    @"\w+~[\d.]*  # Match an alnum word, tilde, optional digits/dots
    (?=           # only if there follows...
     [^""]*       # any number of non-quotes
     (?:          # followed by...
      ""[^""]*    # one quote, and any number of non-quotes
      ""[^""]*    # another quote, and any number of non-quotes
     )*           # any number of times, ensuring an even number of quotes
     [^""]*       # Then any number of non-quotes
     $            # until the end of the string.
    )             # End of lookahead assertion", 
    RegexOptions.IgnorePatternWhitespace);
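
As a quick check, here is a minimal sketch that runs the pattern above against the three examples from the question. The class name TildeTerms and the helper FindUnquoted are made up for illustration, and the pattern is the same one written on a single line without comments:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class TildeTerms
{
    // Same pattern as above, collapsed to one line ("" is a literal quote
    // inside a C# verbatim string).
    static readonly Regex Unquoted = new Regex(
        @"\w+~[\d.]*(?=[^""]*(?:""[^""]*""[^""]*)*[^""]*$)");

    // Hypothetical helper: returns every word~ term that is followed
    // by an even number of quotes, i.e. that lies outside quotes.
    public static List<string> FindUnquoted(string input)
    {
        var results = new List<string>();
        foreach (Match m in Unquoted.Matches(input))
            results.Add(m.Value);
        return results;
    }

    static void Main()
    {
        string[] samples =
        {
            @"""World Terror""~10 AND music~0.5",
            @"""test~ my string""",
            @"music~ AND ""song remix"" AND ""world terror~0.5"""
        };

        foreach (string s in samples)
            Console.WriteLine("{0}  =>  [{1}]", s, string.Join(", ", FindUnquoted(s)));
    }
}
```

For the three inputs this should find music~0.5, nothing, and music~ respectively. Note that "World Terror"~10 is not matched either, because \w+ needs a word character directly before the tilde.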

If escaped quotes need to be handled, it gets more complicated:

Regex regexObj = new Regex(
    @"\w+~[\d.]*         # Match an alnum word, tilde, optional digits/dots
    (?=                  # only if there follows...
     (?:\\.|[^\\""])*    # any number of non-quotes (or escaped quotes)
     (?:                 # followed by...
      ""(?:\\.|[^\\""])* # one quote, and any number of non-quotes
      ""(?:\\.|[^\\""])* # another quote, and any number of non-quotes
     )*                  # any number of times, ensuring an even number of quotes
     (?:\\.|[^\\""])*    # Then any number of non-quotes
     $                   # until the end of the string.
    )                    # End of lookahead assertion", 
    RegexOptions.IgnorePatternWhitespace);
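
To see where the two versions differ, here is a rough sketch that feeds both patterns an input containing a backslash-escaped quote. The class name EscapedQuoteDemo, the helper Find, and the sample input are assumptions for illustration; the patterns are the two above, written on one line:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class EscapedQuoteDemo
{
    // First pattern: counts every quote, escaped or not.
    static readonly Regex Simple = new Regex(
        @"\w+~[\d.]*(?=[^""]*(?:""[^""]*""[^""]*)*[^""]*$)");

    // Second pattern: a backslash plus the following character is
    // consumed as one unit, so \" does not count as a quote.
    static readonly Regex EscapeAware = new Regex(
        @"\w+~[\d.]*(?=(?:\\.|[^\\""])*(?:""(?:\\.|[^\\""])*""(?:\\.|[^\\""])*)*(?:\\.|[^\\""])*$)");

    // Hypothetical helper: collects all match values for a given regex.
    public static List<string> Find(Regex regex, string input)
    {
        var results = new List<string>();
        foreach (Match m in regex.Matches(input))
            results.Add(m.Value);
        return results;
    }

    public static List<string> FindSimple(string input) { return Find(Simple, input); }
    public static List<string> FindEscapeAware(string input) { return Find(EscapeAware, input); }

    static void Main()
    {
        // Actual characters: "music~0.5 is \"loud" AND rock~0.8
        // music~0.5 is inside the quoted part, which contains one escaped quote.
        string input = @"""music~0.5 is \""loud"" AND rock~0.8";

        Console.WriteLine("simple:       [{0}]", string.Join(", ", FindSimple(input)));
        Console.WriteLine("escape-aware: [{0}]", string.Join(", ", FindEscapeAware(input)));
    }
}
```

With this input the simple pattern sees two quotes after music~0.5 and wrongly accepts it, while the escape-aware pattern sees only one real quote and returns just rock~0.8.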