Question

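I want to write a regular expression that matches the following specification for string literals. I have spent the last 10 hours trying all sorts of regular expressions, and none of them seem to work. I have finally boiled it down to this one: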

([^"]|(\\[.\n]))*\"

Basically, the requirements are as follows:

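A string literal has to be matched, so I match everything up to the closing ". In between there may be a \", which should not end the string.
We should also be able to escape anything, including a newline, with a '\'.
Only an unescaped '"' character can end the match, nothing else.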

Some sample strings that I need to match correctly are as follows:

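\a\b\"\n" => should match the characters '\', 'a', '\', 'b', '\', '"', '\', 'n', '"'
\"this is still inside the string" => should match the whole text, including the last '"'
'm about to escape to a newline \'\n'" => this string contains a \n character, but the match should still cover everything from the starting 'm' to the closing '"'.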

Could someone please help me formulate such a regex? In my opinion the regex I have provided should do the job, but it fails for no apparent reason.

Answer 1

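Your regular expression is almost right. You just need to be aware that inside a character class the period . is a literal . rather than "any character except newline". The first alternative should also exclude the backslash, so that a backslash is only ever consumed together with the character it escapes. So: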

([^"\\]|\\(.|\n))*\"

Or:

([^"\\]|\\[\s\S])*\"
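To make the difference concrete, here is a minimal Python sketch that runs the question's pattern and both corrected patterns against the sample strings. The sample texts are my reading of the question's examples (in particular, the third one is assumed to contain a backslash followed by a real newline), and the helper name matches_whole is made up for this illustration.

```python
import re

# Inside a character class, "." is a literal dot, not a wildcard.
assert re.fullmatch(r'[.\n]', '.') is not None
assert re.fullmatch(r'[.\n]', 'a') is None

# The pattern from the question and the two corrected patterns, unchanged.
ORIGINAL = re.compile(r'([^"]|(\\[.\n]))*\"')
FIXED_DOT = re.compile(r'([^"\\]|\\(.|\n))*\"')
FIXED_CLASS = re.compile(r'([^"\\]|\\[\s\S])*\"')

# Sample bodies (the text after the opening quote). These are assumptions
# reconstructed from the question, not verbatim test data.
samples = [
    r'\a\b\"\n"',                               # backslash-n as two characters
    r'\"this is still inside the string"',
    "'m about to escape to a newline \\\n'\"",  # backslash + real newline
]


def matches_whole(pattern, text):
    """Return True if the pattern consumes the entire text."""
    return pattern.fullmatch(text) is not None


for s in samples:
    print(matches_whole(FIXED_DOT, s), matches_whole(FIXED_CLASS, s))

# The original pattern stops at the first escaped quote instead.
print(repr(ORIGINAL.match(samples[1]).group(0)))
```

With the question's pattern, [^"] is free to consume the backslash on its own, so the following " is taken as the terminator. Excluding \\ from the first alternative is what prevents that.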

Answer 2

I think this would be more efficient:

[^"\\]*(\\.[^"\\]*)*\"
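This shape is often called an "unrolled loop": a run of ordinary characters, then repeated (escape pair, run of ordinary characters) groups. Here is a small Python sketch of how it behaves. One thing to watch, at least in Python's re module, is that . does not match a newline by default, so an escaped newline needs re.DOTALL (or [\s\S] in place of the dot). The sample strings below are made up for this illustration.

```python
import re

# The pattern from this answer, with and without DOTALL.
UNROLLED = re.compile(r'[^"\\]*(\\.[^"\\]*)*\"')
UNROLLED_DOTALL = re.compile(r'[^"\\]*(\\.[^"\\]*)*\"', re.DOTALL)

# Hypothetical inputs: the text after the opening quote.
body = r'\"this is still inside the string"'
escaped_newline = 'line one \\\nline two"'  # backslash + real newline

# An escaped quote is consumed as a pair, so the whole body matches.
print(UNROLLED.fullmatch(body) is not None)

# Without DOTALL, "\\." cannot consume backslash + newline.
print(UNROLLED.fullmatch(escaped_newline) is not None)

# With DOTALL, the escaped newline is accepted.
print(UNROLLED_DOTALL.fullmatch(escaped_newline) is not None)
```

I have not benchmarked it against the alternation version, so treat the efficiency claim as something to measure on your own input.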

Answer 3

I assume your strings also start with a " (shouldn't your examples start with one?)

Lookaround constructs seem the most natural to me:

".*?"(?<!\\")

Given the input

"test" test2 "test \a test"  "test \"test" "test\""

This will match:

"test"
"test \a test"
"test \"test"
"test\""

The regular expression reads:

Match the character “"” literally «"»
Match any single character that is not a line break character «.*?»
   Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “"” literally «"»
Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!\\")»
   Match the character “\” literally «\\»
   Match the character “"” literally «"»
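The result listed above can be reproduced with a short Python sketch (Python's re module accepts this fixed-width lookbehind). The second input is a made-up case, not from the question, showing what happens when a string ends in an escaped backslash.

```python
import re

# The lookbehind pattern from this answer, unchanged.
LOOKBEHIND = re.compile(r'".*?"(?<!\\")')

text = r'"test" test2 "test \a test"  "test \"test" "test\""'
found = LOOKBEHIND.findall(text)
for m in found:
    print(m)

# Hypothetical edge case: the first literal ends with an escaped backslash.
# The lookbehind only sees the two characters \" and cannot tell that the
# backslash is itself escaped, so the match runs on to the next quote.
tricky = r'"a\\" "b"'
tricky_found = LOOKBEHIND.findall(tricky)
print(tricky_found)
```

So this approach looks fine when backslashes only ever escape quotes, but if \\ can appear right before the closing quote, the patterns that consume escape pairs explicitly are probably the safer choice.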

Regex to match a string literal with escaped characters
