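Regular expression to match only one of two kinds of quoted strings

Asked: 2019-07-17 19:57:04

Tags: c# regex double-quotes single-quotes

I need a regular expression that matches strings enclosed in double quotes. If a double-quoted string is itself enclosed in single quotes, it should not be matched: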

"string"
" 'xyz' "
"  `"    "
"  `" `"   "
"  `" `" `"  "
'  ' "should match" '  '
'   "should not match"   '

This is what I have so far (https://regex101.com/r/z5PayV/1):

(?:"(([^"]*`")*[^"]*|[^"]*)") 

It matches all of the lines, but the last line should not match. Is there a solution?
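The problem can be reproduced with a small C# sketch. It assumes the pattern is applied to each sample line separately with `Regex.IsMatch`; the class and member names (`QuestionPatternDemo`, `FindsMatch`, `Samples`) are made up for this illustration and are not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuestionPatternDemo
{
    // The pattern from the question, as a C# verbatim string ("" is one literal ").
    public const string Pattern = @"(?:""(([^""]*`"")*[^""]*|[^""]*)"")";

    // A subset of the sample lines from the question.
    public static readonly string[] Samples =
    {
        "\"string\"",
        "\" 'xyz' \"",
        "\"  `\"    \"",
        "'  ' \"should match\" '  '",
        "'   \"should not match\"   '",
    };

    // True if the pattern finds a double-quoted string anywhere in the line.
    public static bool FindsMatch(string line) => Regex.IsMatch(line, Pattern);

    static void Main()
    {
        foreach (var line in Samples)
            Console.WriteLine("{0,-32} -> {1}", line, FindsMatch(line));
    }
}
```

Every line reports a match, including the last one, because nothing in the pattern looks at the surrounding single quotes.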

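2 Answers:

Answer 0 (score: 3)

You have to match past the single-quoted strings so that they are excluded from the matches.

Update

For C#, it has to be done like this.
Just use a simple CaptureCollection to get all of the
quoted matches.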

(?:'[^']*'|(?:"(([^"]*`")*[^"]*|[^"]*)")|[\S\s])+

Expanded

 (?:
      ' [^']* '

   |  
      (?:
           "
           (                             # (1 start)
                ( [^"]* `" )*                 # (2)
                [^"]* 
             |  [^"]* 
           )                             # (1 end)
           "
      )
   |  
      [\S\s] 
 )+

C# code

var str =
"The two sentences are 'He said \"Hello there\"' and \"She said 'goodbye' and 'another sentence'\"\n" +
"\"  `\"    \"\n" +
"\"  `\"    \"\n" +
"\"  `\" `\"   \"\n" +
"\"  `\" `\" `\"  \"\n" +
"'   \"   \"   '\n" +
"\"string\"\n" +
"\" 'xyz' \"\n" +
"\"  `\"    \"\n" +
"\"  `\" `\"   \"\n" +
"\"  `\" `\" `\"  \"\n" +
"'  ' \"should match\" '  '\n" +
"'   \"should not match\"   '\n";

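// Requires: using System; using System.Text.RegularExpressions;
// Single-quoted runs are consumed by the first alternative, so double quotes
// inside them never start a match; [\S\s] consumes every other character.
var rx = new Regex( "(?:'[^']*'|(?:\"(([^\"]*`\")*[^\"]*|[^\"]*)\")|[\\S\\s])+" );

// A single match spans the whole input; group 1 records one capture for
// each double-quoted string found along the way.
Match M = rx.Match( str );
if (M.Success)
{
    CaptureCollection cc = M.Groups[1].Captures;
    for (int i = 0; i < cc.Count; i++)
        Console.WriteLine("{0}", cc[i].Value);
}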

Output

She said 'goodbye' and 'another sentence'
  `"
  `"
  `" `"
  `" `" `"
string
 'xyz'
  `"
  `" `"
  `" `" `"
should match

Sorry, this is how it is done in the PCRE engine (.NET does not support the (*SKIP)(*FAIL) verbs):

'[^']*'(*SKIP)(*FAIL)|(?:"(([^"]*`")*[^"]*|[^"]*)")

https://regex101.com/r/gMiVDU/1

   ' [^']* '
   (*SKIP) (*FAIL) 
|  
   (?:
        "
        (                             # (1 start)
             ( [^"]* `" )*                 # (2)
             [^"]* 
          |  [^"]* 
        )                             # (1 end)
        "
   )
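
Since .NET has no `(*SKIP)(*FAIL)`, a common way to get a similar effect there is to keep the "skip" alternative in the pattern and discard any match in which the capturing group did not participate. The following is a minimal sketch of that idea, not the answerer's code; the names `SkipSingleQuoted` and `DoubleQuoted` are hypothetical, and it assumes single quotes in the input are always balanced.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SkipSingleQuoted
{
    // Alternative 1 consumes a single-quoted run (no capture).
    // Alternative 2 matches a double-quoted string and captures its content in group 1.
    static readonly Regex Rx =
        new Regex(@"'[^']*'|""(([^""]*`"")*[^""]*|[^""]*)""");

    // Returns the content of every double-quoted string that is not
    // inside a single-quoted run.
    public static List<string> DoubleQuoted(string input)
    {
        var result = new List<string>();
        foreach (Match m in Rx.Matches(input))
        {
            // Group 1 only succeeds when the double-quoted alternative matched.
            if (m.Groups[1].Success)
                result.Add(m.Groups[1].Value);
        }
        return result;
    }

    static void Main()
    {
        var text = "'  ' \"should match\" '  '\n'   \"should not match\"   '\n";
        foreach (var s in DoubleQuoted(text))
            Console.WriteLine(s);
    }
}
```

Compared with the single-match `CaptureCollection` approach, this yields one `Match` per quoted string, so positions (`m.Groups[1].Index`) are also available if they are needed.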


Answer 1 (score: 0)

The answers look complicated; how about this:

^"(\d+|\D+)"$

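Is this too simple?

The idea here is to check whether the string starts and ends with a double quote ("); anything inside the double quotes, including single quotes, is allowed.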
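
A quick check of this anchored pattern against some of the question's sample lines, as a sketch: it assumes each line is tested on its own with `Regex.IsMatch`, and the names `AnchoredPatternDemo` and `Accepts` are invented for the illustration. The input `"abc123"` is not from the question; it is added here only to show the effect of `\d+|\D+`.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchoredPatternDemo
{
    // The pattern from this answer: the whole line must be one double-quoted
    // string whose content is either all digits or all non-digits.
    public const string Pattern = @"^""(\d+|\D+)""$";

    public static bool Accepts(string line) => Regex.IsMatch(line, Pattern);

    static void Main()
    {
        string[] lines =
        {
            "\"string\"",                    // whole line is double-quoted
            "\" 'xyz' \"",                   // single quotes inside are allowed
            "'  ' \"should match\" '  '",    // line starts with a single quote
            "'   \"should not match\"   '",  // line starts with a single quote
            "\"abc123\"",                    // hypothetical: letters and digits mixed
        };

        foreach (var line in lines)
            Console.WriteLine("{0,-32} -> {1}", line, Accepts(line));
    }
}
```

Under these assumptions the pattern accepts only lines that consist entirely of one double-quoted string, so it rejects the last line of the question as required, but it also rejects `'  ' "should match" '  '`, which the question expects to match.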