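I have the following string and need to extract the text inside the divs (EDITOR'S PREFACE, MORE CONTENT, etc.) and put it into an array with PHP. How can I do this?
Thanks in advance.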
<div class='classit'><a href='site.php?site=1&filename=aname4'>EDITOR'S PREFACE</a></div>
<div class='classit'><a href='site.php?site=4&filename=aname3'>MORE CONTENT</a></div>
<div class='classit'><a href='site.php?site=3&filename=aname4'>LAST LINE</a></div>
Answer 0 (score: 3)
$html = <<<HTML
<div class='classit'><a href='site.php?site=1&filename=aname4'>EDITOR'S PREFACE</a></div>
<div class='classit'><a href='site.php?site=4&filename=aname3'>MORE CONTENT</a></div>
<div class='classit'><a href='site.php?site=3&filename=aname4'>LAST LINE</a></div>
HTML;
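// str_get_html() comes from the third-party Simple HTML DOM Parser library, which must be included first.
$src = str_get_html($html);
$elem = $src->find("div.classit a");
$links = array();
foreach ($elem as $link) {
    $links[] = $link->plaintext;
}
print_r($links);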
Answer 1 (score: 1)
You can use PHP's own DOM extension:
$string = '<div><a>Elem 1</a></div><div><a>Elem 2</a></div>...etc';
$dom = new DOMDocument();
$dom->loadHTML($string);
$elements = $dom->getElementsByTagName('a');
$textElements = array();
foreach ($elements as $node) {
    $textElements[] = $node->nodeValue;
}
If you are loading a larger chunk of HTML, you can query the DOMDocument with DOMXPath to get only the elements you need.
$xPathObj = new DOMXPath($dom);
$elements = $xPathObj->query("//div[@class='classit']/a");
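Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes the sample markup from the question; the variable names `$html`, `$xpath`, and `$texts` are illustrative, and the `libxml_use_internal_errors()` call is an addition that only keeps parser warnings (for example about the unescaped `&` in the links) from being printed.

```php
<?php
// Sample markup from the question (ampersands left unescaped, as in the original).
$html = <<<HTML
<div class='classit'><a href='site.php?site=1&filename=aname4'>EDITOR'S PREFACE</a></div>
<div class='classit'><a href='site.php?site=4&filename=aname3'>MORE CONTENT</a></div>
<div class='classit'><a href='site.php?site=3&filename=aname4'>LAST LINE</a></div>
HTML;

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select only the <a> elements that are direct children of div.classit.
$texts = array();
foreach ($xpath->query("//div[@class='classit']/a") as $node) {
    $texts[] = trim($node->nodeValue);
}

print_r($texts);
```

Note that `@class='classit'` matches only when the attribute is exactly `classit`; a div carrying several classes would need a different expression.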
Edit
DOMNodeList supports foreach, so I changed for ($i = 0; $i < $elements->length; $i++) {$elements->item($i)->nodeValue;}
to foreach ($elements as $node) {$node->nodeValue;}
Answer 2 (score: 0)
You can use strip_tags:
$s = "<div class='classit'><a href='site.php?site=1&fn=aname4'>EDITOR'S PREFACE</a></div>
<div class='classit'><a href='site.php?site=4&filename=aname3'>MORE CONTENT</a></div>
<div class='classit'><a href='site.php?site=3&filename=aname4'>LAST LINE</a></div> ";
$new = array();
foreach (explode("\n", $s) as $val) {
    // trim() drops the trailing space left on the last line
    $new[] = trim(strip_tags($val));
}
var_dump($new);
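This approach depends on each div sitting on its own line. A slightly more defensive sketch under that same assumption is shown below: it splits on any line ending, trims each result, and drops empty entries. The mixed line endings and the blank line in the sample string are made up for illustration, and `$lines` and `$texts` are hypothetical names.

```php
<?php
// Sample input with mixed line endings and a blank line (illustrative only).
$s = "<div class='classit'><a href='site.php?site=1&filename=aname4'>EDITOR'S PREFACE</a></div>\r\n"
   . "<div class='classit'><a href='site.php?site=4&filename=aname3'>MORE CONTENT</a></div>\n"
   . "\n"
   . "<div class='classit'><a href='site.php?site=3&filename=aname4'>LAST LINE</a></div> ";

// \R matches any line-break sequence (\n, \r\n, \r).
$lines = preg_split('/\R/', $s);

// Strip the tags from each line and trim surrounding whitespace.
$texts = array_map(function ($line) {
    return trim(strip_tags($line));
}, $lines);

// Drop lines that contained no text and reindex the array from 0.
$texts = array_values(array_filter($texts, function ($text) {
    return $text !== '';
}));

var_dump($texts);
```

If the markup is not reliably one element per line, a DOM-based approach is likely the safer choice.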
Answer 3 (score: 0)
You can use preg_match_all:
<?php
$html = <<<HTML
<div class='classit'><a href='site.php?site=1&filename=aname4'>EDITOR'S PREFACE</a></div>
<div class='classit'><a href='site.php?site=4&filename=aname3'>MORE CONTENT</a></div>
<div class='classit'><a href='site.php?site=3&filename=aname4'>LAST LINE</a></div>
HTML;
$result = array();
if (preg_match_all('/>([^><]+)(?=<\/a>)/', $html, $matches))
{
$result = $matches[1];
}
print_r($result);