How to scrape HTML tags from a div using PHP Simple HTML DOM or cURL

Date: 2017-07-01 15:02:00

Tags: php curl simple-html-dom

Here is an example of what I want to do:

<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>

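From the example above, I want to scrape both the tags and their text into arrays. The result should be one array containing the tag names: arr = [h1, p, h2]; and another array containing the text: arr2 = [This is a h1, This is a Paragraph, This is h2]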

4 answers:

Answer 0 (score: 2)

Assuming the elements are known in advance, you can use DOMDocument's getElementsByTagName, as follows:

usort($string_content_to_explode, function ($a, $b) {
    if ($a[1] == $b[1]) {
        return 0;
    }
    return ($a[1] < $b[1]) ? -1 : 1;
});

var_dump($string_content_to_explode);

Demo: https://eval.in/825860
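The getElementsByTagName approach this answer names could look roughly like the sketch below. It is not the answerer's code: the variable names ($html, $tags, $arr, $arr2) are illustrative, and it assumes the tag names are known in advance, as the answer states. Note that results come out grouped in the order of $tags, which only matches document order here because the sample lists them in the same order.

```php
<?php
// Sketch only: names are illustrative, sample markup is taken from the question.
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$dom = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$tags = array('h1', 'p', 'h2'); // assumed to be known beforehand
$arr = array();                 // tag names
$arr2 = array();                // text content

foreach ($tags as $tag) {
    foreach ($dom->getElementsByTagName($tag) as $node) {
        $arr[] = $node->nodeName;
        $arr2[] = trim($node->textContent);
    }
}

print_r($arr);
print_r($arr2);
```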

Answer 1 (score: 1)

Try this:

$str = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

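// Split the markup into lines; each child element sits on its own line.
$arr = explode(PHP_EOL, $str);

$res = array();
foreach ($arr as $row) {
    // Skip the opening and closing div lines.
    if (strpos($row, "div") === false) {
        // Key: the tag name between "<" and ">"; value: the text with tags stripped.
        $res[substr($row, 1, strpos($row, ">") - 1)] = strip_tags($row);
    }
}

var_dump($res);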

https://3v4l.org/8TkIT

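It loops over the markup one line at a time and builds an array keyed by tag name.

Edit: if there are multiple rooms, you can make the array multidimensional, like this: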
https://3v4l.org/DdXVd

$str = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
<div class='room2'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

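$arr = explode(PHP_EOL, $str);

$res = array();
$room = null;
foreach ($arr as $row) {
    if (strpos($row, "</div") !== false) {
        // Closing tag: nothing to record.
        continue;
    }
    if (strpos($row, "<div") !== false) {
        // Opening div: the room name is the text between the single quotes.
        $pos1 = strpos($row, "'") + 1;
        $room = substr($row, $pos1, strpos($row, "'", $pos1) - $pos1);
    } else {
        // Child element: key is the tag name, value is the trimmed text.
        $pos1 = strpos($row, "<") + 1;
        $res[$room][substr($row, $pos1, strpos($row, ">") - $pos1)] = trim(strip_tags($row));
    }
}

var_dump($res);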

Answer 2 (score: 1)

Try the following code.

$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$dom = new SimpleXMLElement( $html );

$values = array_filter( array_values( (array) $dom ), function ( $i ) { return ! is_array( $i ); } );
$keys = array_filter( array_keys( (array) $dom ), function ( $i ) { return $i != '@attributes'; } );

print_r( $values ); // This is a h1, This is a Paragraph, This is h2
print_r( $keys ); // h1, p, h2

I use array_filter to remove the div's own entry (its @attributes) from the results.
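If the same tag appears more than once inside the div, casting to an array folds the repeats into a nested array, which the is_array filter would then drop. A possible variation, sketched here as an assumption and not as part of the original answer, is to iterate over children() and read each name with getName(). Like the answer's code, it requires the input to be well-formed XML; the variable names are illustrative.

```php
<?php
// Sketch only: same sample markup as above, illustrative variable names.
$html = "<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>";

$dom = new SimpleXMLElement($html);

$keys = array();   // tag names, in document order
$values = array(); // text content, in document order

// children() yields only the direct child elements, so the div's
// attributes never enter the result and no filtering is needed.
foreach ($dom->children() as $child) {
    $keys[] = $child->getName();
    $values[] = trim((string) $child);
}

print_r($keys);
print_r($values);
```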

Answer 3 (score: 1)

$str = <<<EOF
<div class='room'>
<h1>This is a h1</h1>
<p>This is a Paragraph</p>
<h2>This is h2</h2>
</div>
EOF;

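// Requires the Simple HTML DOM library to be loaded, e.g. include 'simple_html_dom.php';
$html = str_get_html($str);

$arr = array();  // tag names
$arr2 = array(); // text content
foreach ($html->find('.room *') as $el) {
    $arr[] = $el->tag;
    $arr2[] = $el->text();
}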