Strip PHP tags with preg_replace

Asked: 2012-07-17 23:37:27

Tags: php preg-replace

I want to remove all PHP tags from external text so that it can be safely included in PHP.

Here is a sample input:

<?
?>
<html>
<?php ?>
<?= ?>
</html>
<?

or any other variation

and the expected output:

<html>
</html>

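The last PHP open tag might not have a closing tag!

2 answers:

Answer 0: (score: 3)

I don't think there is a good way to do exactly what you are asking, but if sending the PHP tags to the output (unparsed) is acceptable, you can use: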

<?php echo file_get_contents('input.html'); ?>

Otherwise, take a look at the token_get_all function:

http://www.php.net/manual/en/function.token-get-all.php

You can iterate over all of the resulting tokens and output only those of type T_INLINE_HTML:

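// token_get_all() returns arrays for typed tokens and plain strings for single-character tokens.
$toks = token_get_all( file_get_contents( 'input.html' ) );
foreach( $toks as $tok ) {
  // Keep only the text that lies outside PHP tags.
  if( is_array( $tok ) && $tok[0] === T_INLINE_HTML ) {
    print $tok[1];
  }
}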
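
The loop above can be wrapped in a function that returns the cleaned text instead of printing it. This is only a minimal sketch: the function name strip_php_with_tokenizer is hypothetical, and it assumes the tokenizer extension is available. Note that the tokenizer only treats a bare "<?" as an open tag when the short_open_tag ini setting is enabled, so results for the question's sample input may depend on configuration.

```php
<?php
// Hypothetical helper (not from the original answer): returns only the parts
// of $source that lie outside PHP tags.
function strip_php_with_tokenizer(string $source): string
{
    $html = '';
    foreach (token_get_all($source) as $token) {
        // Typed tokens are arrays of [id, text, line]; single-character
        // tokens are plain strings and never represent inline HTML.
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $html .= $token[1];
        }
    }
    return $html;
}

// An unterminated trailing tag is dropped as well.
echo strip_php_with_tokenizer('<p><?php echo 1; ?></p><?php echo 2;'), PHP_EOL;
```

Because the tokenizer understands PHP syntax, a "?>" inside a PHP string literal does not end the tag prematurely, which is where a purely textual approach can go wrong.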

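Answer 1: (score: 2)

The right way to do this is not to include the file at all, but to load it as a string with file_get_contents(). That preserves the PHP tags without executing them. However, the following regular expression will do exactly what you asked: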

#<\?.*?(\?>|$)#s

Here is a breakdown of what that string represents:

#       A delimiter marking the beginning and end of the expression - nearly anything will do (preferably something not in the regex itself)
<\?      Find the text "<?", which is the beginning of a PHP tag.  Note that a backslash before the question mark is needed because question marks normally do something special in regular expressions.
.*?     Include as much text as necessary (".*"), but as little as possible ("?").
(\?>|$)  Stop at an ending PHP tag ("?>"), OR the end of the text ("$").  This doesn't necessarily have to stop at the first one, but since the previous part is "as little as possible", it will.
#       The same delimiter, marking the end of the expression
s       The "dot-all" flag (PCRE_DOTALL), which lets "." match newline characters so the pattern can span multiple lines.  Without it, the regex would only match a PHP tag whose beginning and end are on a single line.
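
For completeness, here is one way the pattern might be applied with preg_replace. This is a sketch, not the answerer's own code: the function name strip_php_with_regex and the error handling are assumptions added for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): removes everything
// from "<?" up to the next "?>" or, failing that, the end of the text.
function strip_php_with_regex(string $source): string
{
    $result = preg_replace('#<\?.*?(\?>|$)#s', '', $source);
    if ($result === null) {
        // preg_replace() returns null on failure, e.g. when the backtrack
        // limit is hit; fail loudly rather than return unsanitized text.
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

// Handles both a closed tag and an unterminated trailing tag.
echo strip_php_with_regex('<p><?php echo 1; ?></p><? echo 2;'), PHP_EOL;
```

Two things to keep in mind: the pattern leaves any newlines around the removed tags in place, so the question's sample input would come out with blank lines around the HTML, and it also removes other "<?" constructs such as an "<?xml ... ?>" declaration.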