Unable to replace the value in a tag when the string variable starts with a digit

Asked: 2016-01-08 17:51:20

Tags: php regex string preg-replace

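I have some code in which the regex replacement works fine as long as the value of $subtitle1 contains only letters or spaces. When the $subtitle1 string starts with a digit (e.g. "3rd Edition"), preg_replace behaves unexpectedly. If I add a space to the replacement string, the value of $subtitle1 can start with a digit and the replacement works, but it puts an unwanted space before the 3 in "3rd Edition".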

$raw_xml    = '<property name="subtitle1" type="String">Linux is more than a shell</property>';
$subtitle1  = '3rd Edition';

$replacers  = array (
    '/(<property name="subtitle1" type="String">)([1-9A-Za-z ]+)(<\/property>)/'  => sprintf("$1%s$3",$subtitle1), //1
    '/(<property name="subtitle1" type="String">)([1-9A-Za-z ]+)(<\/property>)/'  => sprintf("$1 %s$3",$subtitle1), //2
    '/(<property name="subtitle1" type="String">)([1-9A-Za-z ]+)(<\/property>)/'  => sprintf("$1%s$3",$subtitle1), //3
);
echo preg_replace(array_keys($replacers), array_values($replacers), $raw_xml);        

//1 (when $subtitle1 = 'Third Edition', outputs: <property name="subtitle1" type="String">Third Edition</property>)
//2 (when $subtitle1 = '3rd Edition', outputs: <property name="subtitle1" type="String"> 3rd Edition</property>)
//3 (when $subtitle1 = '3rd Edition', outputs: rd Edition</property>)

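Given that $subtitle1 is always a string, is there something I can do differently so that it works the same way in both cases? I have tried the s and U modifiers, but got no further. Thanks for any insight into this.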

2 Answers:

Answer 0 (score: 2)

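In theory, your code fails because of how the backreferences $1 and $3 end up being read. Once sprintf has spliced in a value that starts with a digit, the replacement string contains $13, and the PCRE regex engine reads that as a reference to group 13 rather than as $1 followed by a literal 3.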

So to make it work, just replace the literal string part of the sprintf call:

sprintf("$1%s$3",$subtitle1) -> sprintf('${1}%s${3}',$subtitle1)
# Note the change from $1 to ${1}, which clearly delimits the backreference,
# and the single-quoted string '...' instead of "...":
# inside double quotes, PHP would try to interpolate ${1} as a variable.
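
To make the fix concrete, here is a minimal sketch that reuses the pattern and sample data from the question. The variable names $pattern, $replacement and $result are introduced here only for illustration.

```php
<?php
// Sample data taken from the question.
$raw_xml   = '<property name="subtitle1" type="String">Linux is more than a shell</property>';
$subtitle1 = '3rd Edition';

$pattern = '/(<property name="subtitle1" type="String">)([1-9A-Za-z ]+)(<\/property>)/';

// Single quotes keep ${1} and ${3} literal, so preg_replace receives them
// as explicitly delimited backreferences: "${1}3rd Edition${3}".
$replacement = sprintf('${1}%s${3}', $subtitle1);

$result = preg_replace($pattern, $replacement, $raw_xml);
echo $result, PHP_EOL;
```

Note that the value of $subtitle1 is still part of the replacement string, so a value that itself contained something like $1 or \1 would also be treated as a backreference.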

But for a reliable solution, avoid parsing XML with regular expressions and use a dedicated (simple yet powerful) parser, like this:

<?php
$xml = <<<XML
<properties> <!-- Added -->
    <property name="subtitle1" type="String">Linux is more than a shell</property>
</properties>
XML;

$properties = new SimpleXMLElement($xml);
$properties->property[0] = '3rd Edition';

echo $properties->asXML(); //Only the first is changed

See the Official Docs for details.
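
If the document holds several property elements, the one to change can be selected by its name attribute instead of by position. The following is a sketch under assumptions: the extra "title" property is hypothetical and was added only to show that other nodes are left alone.

```php
<?php
$xml = <<<XML
<properties>
    <property name="title" type="String">Linux</property>
    <property name="subtitle1" type="String">Linux is more than a shell</property>
</properties>
XML;
// The "title" property above is a hypothetical addition for illustration.

$properties = new SimpleXMLElement($xml);

// xpath() returns an array of the matching SimpleXMLElement nodes.
foreach ($properties->xpath('//property[@name="subtitle1"]') as $node) {
    // Assigning to index 0 replaces the text content of the node itself.
    $node[0] = '3rd Edition';
}

echo $properties->asXML();
```

Because the value is assigned as text rather than passed through a replacement string, a leading digit needs no special handling here.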

Answer 1 (score: 1)

The problem is caused by: sprintf("$1%s$3",$subtitle1)

which outputs: $13rd Edition$3

I guess the regex engine interprets that as capture group 13.
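
This can be checked with a short script built from the question's own data. It is only a sketch, and the names $pattern, $replacement and $broken are introduced here for illustration.

```php
<?php
$raw_xml   = '<property name="subtitle1" type="String">Linux is more than a shell</property>';
$subtitle1 = '3rd Edition';

$pattern = '/(<property name="subtitle1" type="String">)([1-9A-Za-z ]+)(<\/property>)/';

// "$1" is not interpolated by PHP (a variable name cannot start with a digit),
// so sprintf simply places the subtitle right after the literal "$1".
$replacement = sprintf("$1%s$3", $subtitle1);
echo $replacement, PHP_EOL; // $13rd Edition$3

// Group 13 does not exist in the pattern, so "$13" is replaced by an empty string.
$broken = preg_replace($pattern, $replacement, $raw_xml);
echo $broken, PHP_EOL; // rd Edition</property>
```

The second line of output matches case //3 in the question.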

The good news is that I found a solution for you.

Replace: $subtitle1 = '3rd Edition';

with: $subtitle1 = '>3rd Edition<';

and move the > and < out of your first and third capture groups, like this:

$replacers  = array (
    '/(<property name="subtitle1" type="String")>([1-9A-Za-z ]+)<(\/property>)/'  => sprintf("$1%s$3",$subtitle1), //1
    '/(<property name="subtitle1" type="String")>([1-9A-Za-z ]+)<(\/property>)/'  => sprintf("$1 %s$3",$subtitle1), //2
    '/(<property name="subtitle1" type="String")>([1-9A-Za-z ]+)<(\/property>)/'  => sprintf("$1%s$3",$subtitle1), //3
);

You can test it here: http://sandbox.onlinephpfunctions.com/code/05bf9a209bdcd6622bf494dc7f4887660e7a93a0
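
If changing the value of $subtitle1 is undesirable, another option (not proposed in either answer) is preg_replace_callback, which builds the replacement in PHP code so that no replacement string is parsed for backreferences. A minimal sketch using the question's data, with $pattern and $result as illustrative names:

```php
<?php
$raw_xml   = '<property name="subtitle1" type="String">Linux is more than a shell</property>';
$subtitle1 = '3rd Edition';

$pattern = '/(<property name="subtitle1" type="String">)([1-9A-Za-z ]+)(<\/property>)/';

$result = preg_replace_callback(
    $pattern,
    function (array $m) use ($subtitle1) {
        // $m[1] and $m[3] hold the opening and closing tags captured by the pattern.
        // The returned string is inserted as-is, without backreference parsing.
        return $m[1] . $subtitle1 . $m[3];
    },
    $raw_xml
);

echo $result, PHP_EOL;
```

With this approach a subtitle that starts with a digit, or that contains a $ sign, is inserted unchanged.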