PHP: strip iframe and script tags (without htmlentities())

Asked: 2012-02-15 08:45:45

Tags: php regex iframe strip

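I'm processing XML files, and sometimes they contain "iframe" and "script" tags that I need to strip out before I even XML-parse them.

I've been trying some regular expressions, but I keep getting them wrong! :(

Test string: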

    $teststring = '<p><iframe src="http://www.facebook.com/plugins/like.php?href=abcdef&layout=standard&show_faces=false&width=450&action=like&colorscheme=dark&height=35" scrolling="no" frameborder="0" style="border:none; overflow:hidden; width:450px; height:35px;" allowtransparency="true"></iframe></p>';

    // TODO: clean this up. Found this function on the net; more legacy stuff.
    $Rules = array(
        '@<script[^>]*?>.*?</script>@si', // Strip out JavaScript
        '@&(cent|#162);@i',               // Cent
        '@&(pound|#163);@i',              // Pound
        '@&(copy|#169);@i',               // Copyright
        '@&(reg|#174);@i',                // Registered
        '@&#(\d+);@e',                    // Evaluate as PHP
        '@<iframe [^<]<.*?<\/iframe>@i',  // ---> PROBLEM: this is the rule I can't get right
    );

    $Replace = array(
        '',
        chr(162),
        chr(163),
        chr(169),
        chr(174),
        'chr(\1)',
        '',
    );

    // Expecting <p></p>
    $data = preg_replace($Rules, $Replace, $teststring);

    echo $data;

1 answer:

Answer 0 (score: 1)

Try this:

    '@<iframe(?:(?!>).)*>.*?<\/iframe>@i'
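
The rule in the question fails because `[^<]` matches exactly one character that is not `<`, and the pattern then requires a literal `<` right after it. In the test string `<iframe ` is followed by `src=...`, so the rule never matches. The pattern above instead consumes everything up to the first `>` of the opening tag. A minimal sketch of how it could be dropped in next to the existing script rule; `$html` is a shortened, hypothetical stand-in for the question's test string, and `$rules` and `$clean` are names chosen only for this example:

```php
<?php
// Shortened stand-in for the question's test string (hypothetical, for illustration only).
$html = '<p><iframe src="http://www.facebook.com/plugins/like.php?href=abcdef&layout=standard" scrolling="no" frameborder="0"></iframe></p>';

$rules = array(
    '@<script[^>]*?>.*?</script>@si',      // script rule taken from the question
    '@<iframe(?:(?!>).)*>.*?<\/iframe>@i', // iframe rule from this answer
);

// A single string replacement is applied to every pattern in the array.
$clean = preg_replace($rules, '', $html);

echo $clean, "\n";
```

If an iframe can span several lines, adding the `s` modifier (as the script rule already does) would let `.` match newlines as well. That modifier is an assumption about the input, not something the question states.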