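I use the function __() to translate strings, and I added an interface that automatically finds all of these translations in all files. This is (or should be) done with the following regular expression: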
<?php
$pattern = <<<'LOD'
`
__\(
(?<quote> # GET THE QUOTE
(?<simplequote>') # catch the opening simple quote
|
(?<doublequote>") # catch the opening double quote
)
(?<param1> # the string will be saved in param1
(?(?=\k{simplequote}) # if condition "simplequote" is ok
(\\'|"|[^'"])+ # allow escaped simple quotes or anything else
| #
(\\"|'|[^'"])+ # allow escaped double quotes or anything else
)
)
\k{quote} # find the closing quote
(?:,.*){0,1} # catch any type of 2nd parameter
\)
# modifiers:
# x to allow comments :)
# m for multiline,
# s for dotall
# U for ungreedy
`smUx
LOD;
$files = array('/path/to/file1',);
foreach($files as $filepath)
{
$content = file_get_contents($filepath);
if (preg_match_all($pattern, $content, $matches))
{
foreach($matches['param1'] as $found)
{
// do things
}
}
}
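The regular expression does not work for some double-quoted strings containing escaped single quotes (\'). In fact, whether the string is single-quoted or double-quoted, the condition is always evaluated as false, so the "else" branch is always used.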
<?php
// content of '/path/to/file1'
echo __('simple quoted: I don\'t "see" what is wrong'); // does not work.
echo __("double quoted: I don't \"see\" what is wrong");// works.
For file1, I expect both strings to be found, but only the double-quoted one is.
Edit: added more PHP code to make testing easier.
Answer 0 (score: 3)
Use the following regular expression and take the string you need from group index 2.
__\((['"])((?:\\\1|(?!\1).)*)\1\)
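Explanation:
__\( matches the literal characters __(.
(['"]) captures the opening quote, either double or single.
(?:\\\1|(?!\1).)* matches, zero or more times, either an escaped quote (the same quote character as captured in group 1) or, through (?!\1)., any character that is not the quote captured in group 1.
\1 refers to the character captured in the first group, so it matches the closing quote.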
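As a usage illustration, the pattern above can be applied with preg_match_all. This is only a minimal sketch: it assumes backtick delimiters as in the question, the $content sample is copied from the question's file1 instead of being read from disk, and the variable names are hypothetical.

```php
<?php
// Sample input mirroring the two calls from the question (assumption:
// held in a string here rather than loaded with file_get_contents()).
$content = <<<'SRC'
echo __('simple quoted: I don\'t "see" what is wrong');
echo __("double quoted: I don't \"see\" what is wrong");
SRC;

// The pattern from this answer, in a nowdoc so no extra escaping is needed.
// Group 1 is the opening quote; group 2 is the text between the quotes.
$pattern = <<<'LOD'
`__\((['"])((?:\\\1|(?!\1).)*)\1\)`
LOD;

preg_match_all($pattern, $content, $matches);

// Print every captured string, one per line (escapes are kept as written).
foreach ($matches[2] as $found) {
    echo $found, PHP_EOL;
}
```

Note that this pattern expects the closing parenthesis right after the closing quote, so unlike the question's pattern it does not accept a second parameter.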
Answer 1 (score: 0)
Avinash Raj's solution is more elegant and probably more efficient (so I accepted it), but I found my mistake, so I am posting the solution here:
<?php
$pattern = <<<'LOD'
`
__\(
(?<quote> # GET THE QUOTE
(?<simplequote>') # catch the opening simple quote
|
(?<doublequote>") # catch the opening double quote
)
(?<param1> # the string will be saved in param1
(?(simplequote) # if condition "simplequote"
(\\'|[^'])+ # allow escaped simple quotes or anything else
| #
(\\"|[^"])+ # allow escaped double quotes or anything else
)
)
\k{quote} # find the closing quote
(?:,.*){0,1} # catch any type of 2nd parameter
\)
# modifiers:
# x to allow comments :)
# m for multiline,
# s for dotall
# U for ungreedy
`smUx
LOD;
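The difference between the two conditionals can be seen on a reduced pattern. The sketch below is not the full pattern: it keeps only the opening quote and the conditional, and the groups "then" and "else" as well as the helper branchTaken() are hypothetical names added for illustration. In PCRE, (?(?=\k{simplequote})...) is a lookahead condition that checks whether the text at the current position equals what the group captured, whereas (?(simplequote)...) only checks whether the group took part in the match.

```php
<?php
// Form used in the question: the condition is a lookahead assertion. It asks
// whether the text right after the opening quote equals the captured quote.
$lookaheadCondition = <<<'LOD'
`^(?:(?<simplequote>')|")(?(?=\k{simplequote})(?<then>.)|(?<else>.))`
LOD;

// Form used in the fix: the condition only asks whether the group
// "simplequote" was set by the match so far.
$groupCondition = <<<'LOD'
`^(?:(?<simplequote>')|")(?(simplequote)(?<then>.)|(?<else>.))`
LOD;

// Hypothetical helper: reports which branch of the conditional matched.
function branchTaken(string $pattern, string $subject): string
{
    if (!preg_match($pattern, $subject, $m)) {
        return 'no match';
    }
    return ($m['then'] ?? '') !== '' ? 'then' : 'else';
}

foreach (["'abc", '"abc'] as $subject) {
    echo $subject, ' lookahead: ', branchTaken($lookaheadCondition, $subject), PHP_EOL;
    echo $subject, ' group:     ', branchTaken($groupCondition, $subject), PHP_EOL;
}
```

With the lookahead form, both subjects take the "else" branch, because the character after the opening quote is not another quote. With the group form, the single-quoted subject takes the "then" branch and the double-quoted one takes "else".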