Question

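I need to parse a large number of text files and replace every quoted string that contains Cyrillic characters. The strings may contain newlines, non-letter characters, and special symbols (for example '$' or escaped quotes). Can anyone help with a regular expression?

From the comments:

For example, PHP code: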

function hello($word) {
    $word2 = "ха-ха!";
    echo "Привет, $word $word2\n"; 
}
hello('Мир');

I need to match "ха-ха!", "Привет, $word $word2\n" and 'Мир'.

Answer 1

This should work:

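str = 'The cat is under the "таблица"'
# Matches a double-quoted string whose first character is Cyrillic
regex = /"\p{Cyrillic}+.*?\.?"/ui

# scan yields every match (match with a block yields only the first one, as MatchData)
str.scan(regex) { |s| do_stuff_with_each_matching s }

# or...

str.gsub!(regex) { |s| method_that_translates_russian s }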

See it live at http://rubular.com/r/0Mwbfinjvp. The Ruby Regexp documentation is at http://www.ruby-doc.org/core-1.9.3/Regexp.html
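
Note that the pattern above only matches double-quoted strings that begin with a Cyrillic character and stay on one line. Here is a rough sketch of one way to cover the cases from the question (single quotes, escaped quotes, newlines, Cyrillic anywhere in the string). It is not from the original answer: the names QUOTED, CYRILLIC and replace_cyrillic_literals are made up for illustration, it assumes a reasonably recent Ruby (2.3+ for the squiggly heredoc), and it knows nothing about PHP comments or heredocs, so treat it as a starting point only.

```ruby
# Any single- or double-quoted literal. \\. consumes an escaped character
# (including an escaped quote); the m flag lets it also consume a newline.
QUOTED = /"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'/m

# At least one Cyrillic character anywhere in the string.
CYRILLIC = /\p{Cyrillic}/u

# Hypothetical helper: passes each quoted literal that contains Cyrillic
# to the block and substitutes the block's result; other literals are kept.
def replace_cyrillic_literals(source)
  source.gsub(QUOTED) do |literal|
    literal =~ CYRILLIC ? yield(literal) : literal
  end
end

# The PHP sample from the question (single-quoted heredoc: no escapes processed).
php = <<~'PHP'
  function hello($word) {
      $word2 = "ха-ха!";
      echo "Привет, $word $word2\n";
  }
  hello('Мир');
PHP

# Collect every quoted literal, then keep only those containing Cyrillic.
found = php.scan(QUOTED).grep(CYRILLIC)
puts found
```

Matching all quoted literals first and checking for Cyrillic in a second step keeps each regex simple. On the sample, found should hold the three literals the question asks for.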

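Answer 2

".*[^a-zA-Z\d]+.*" matches any quoted sequence of characters that contains at least one non-alphanumeric character.

That is, it matches "aa$bb" and "a1$b1", but it does not match "aabb" or a$b.

Hope this is what you wanted (add escaping as required).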
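
A quick way to check those claims is to run the pattern over a few samples. This is only an illustrative sketch in Ruby: NON_ALNUM_QUOTED and the sample list are my own names, not part of the answer. The last sample is added to show that a Cyrillic word matches simply because its letters fall outside [a-zA-Z\d], so the pattern is not specific to Cyrillic.

```ruby
# The pattern from the answer, written as a Ruby regex literal.
NON_ALNUM_QUOTED = /".*[^a-zA-Z\d]+.*"/

samples = ['"aa$bb"', '"a1$b1"', '"aabb"', 'a$b', '"таблица"']

# Map each sample to true/false depending on whether the pattern matches.
results = samples.map { |s| [s, !!(s =~ NON_ALNUM_QUOTED)] }.to_h

results.each { |sample, matched| puts "#{sample} -> #{matched}" }
```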

How to match any quoted string containing Cyrillic symbols
