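I want to remove all <br /> tags inside tables using PHP. I know I can use str_replace() to remove <br />, but that removes every <br /> in the string. I only want to remove the <br /> tags between <table> and </table>. I have several tables in one string.
The HTML is below. You can also see this fiddle.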
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>
I tried the following approach. Is this the best solution?
<?php
$input = '<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>';
$body = preg_replace_callback("~<table\b.*?/table>~si", "process_table", $input);
function process_table($match) {
return str_replace('<br />', '', $match[0]);
}
echo $body;
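Answer 0 (score: 1)
As stated here, "regular expressions are not a tool that can be used to correctly parse HTML". However, to offer a solution that works for this controlled case, I submit the following. It includes debug code that shows the before and after states.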
Note: I also tested your regex, written as /<table\b.*?<\/table>/si, and it works just as well with preg_match().
<?php
$search ='<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>';
$search = replacebr($search);
function replacebr($search){
$offset=0;
$anew=array();
$asearch=array();
$notdone = 1;
$i=0;
echo $search;
while ($notdone == 1) {
// \b also matches a bare <table>; the s and i flags allow multi-line tables and uppercase tags
$notdone = preg_match('/<table\b[^>]*>(.+?)<\/table>/si', $search, $amatch, PREG_OFFSET_CAPTURE, $offset);
if (count($amatch)>0){
echo "amatch: " ; var_dump($amatch);
// add part before match
$anew[] = substr($search,$offset,$amatch[0][1]-$offset);
echo "anew (before): " ; var_dump($anew[count($anew)-1]);
// add match with replaced text
$anew[] = str_replace("<br />","",$amatch[0][0]);
echo "anew (match): " ; var_dump($anew[count($anew)-1]);
// PREG_OFFSET_CAPTURE reports byte offsets, so advance with strlen(), not mb_strlen()
$offset = $amatch[0][1] + strlen($amatch[0][0]);
echo "OFFSET: " ; var_dump($offset);
}
else{
// add last part of string - we better be done
$anew[] = substr($search, $offset);
if ($notdone == 1){
die("Error - should be done");
}
}
if ($i==100){
// prevent endless loop
die("Endless Loop");
}
$i++;
}
$new = implode("",$anew);
echo "*******************";
echo $new;
return $new;
}
?>
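For comparison, the same table-by-table replacement can be written without manual offset tracking. This is only a minimal sketch, not a replacement for a real HTML parser: it reuses the /<table\b.*?<\/table>/si pattern from the note above, the function name remove_br_inside_tables is hypothetical, and the inner pattern assumes the line breaks may also be written as <br> or <br/>. Like any regex approach it does not handle nested tables.

```php
<?php
// Hypothetical helper: strips <br>, <br/> and <br /> only inside <table>...</table>.
function remove_br_inside_tables(string $html): string
{
    $result = preg_replace_callback(
        '/<table\b.*?<\/table>/si',          // each table, non-greedy, across newlines
        static function (array $match): string {
            // $match[0] is one whole table; remove the line breaks within it
            return preg_replace('/<br\s*\/?>/i', '', $match[0]) ?? $match[0];
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input
    return $result ?? $html;
}
```

Text outside the tables is passed through untouched, so a <br /> before, between, or after tables is kept.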
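Answer 1 (score: 0)
Parsing HTML with regular expressions is not recommended, but if you must, this might work.
Note: the test case is in Perl, but the regex can be used in PHP.
Just do a global replace with $1.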
# '~(?s)((?:(?!\A|<table\b)\G|<table\b)(?:(?!<br\s*/>|</table\b).)*)<br\s*/>(?=.*?</table\b)~'
(?s) # Dot-All
( # (1 start), Keep these
(?:
(?! \A | <table \b )
\G # Start match from end of last match
| # or,
<table \b # Start from '<table\b'
)
(?: # The chars before <br/ or </table end tags
(?!
<br \s* />
| </table \b
)
.
)*
) # (1 end)
<br \s* /> # Strip <br/>
(?= .*? </table \b ) # Must be </table end tag downstream
Perl test case
$/ = undef;
$str = <DATA>;
print "Before:\n$str\n\n";
$str =~ s~(?s)((?:(?!\A|<table\b)\G|<table\b)(?:(?!<br\s*/>|</table\b).)*)<br\s*/>(?=.*?</table\b)~$1~g;
print "After:\n$str\n\n";
__DATA__
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>
Output >>
Before:
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"><br /> <tbody><br /> <tr><br /> <td><br /> <p><strong>column1</strong></p> </td><br /> <td><br /> <p><strong>column2</strong></p> </td></tr><br /> <tr><br /> <td><br /> <p>1</p> </td><br /> <td><br /> <p>2</p> </td><br /> <br /> </tr><br /> </tbody><br /></table>
After:
<p>Some text before table:</p><table cellpadding="0" cellspacing="0"> <tbody> <tr> <td> <p><strong>column1</strong></p> </td> <td> <p><strong>column2</strong></p> </td></tr> <tr> <td> <p>1</p> </td> <td> <p>2</p> </td> </tr> </tbody></table>
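In PHP, the same pattern can be passed to preg_replace() with '$1' as the replacement, since preg_replace() already replaces globally. The following is a minimal sketch under that assumption; the function name strip_br_in_tables is hypothetical and not part of the Perl test case.

```php
<?php
// Hypothetical PHP port of the Perl substitution s~...~$1~g shown above.
function strip_br_in_tables(string $html): string
{
    // Single quotes keep \A, \G, \b and \s intact for PCRE.
    $pattern = '~(?s)((?:(?!\A|<table\b)\G|<table\b)(?:(?!<br\s*/>|</table\b).)*)<br\s*/>(?=.*?</table\b)~';

    // Keep group 1 (the text before each <br />) and drop the <br /> itself.
    $result = preg_replace($pattern, '$1', $html);

    // preg_replace() returns null on a PCRE error; fall back to the input
    return $result ?? $html;
}
```

Each match either starts at a <table tag or continues from the end of the previous match via \G, so a <br /> outside a table is never reached.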