HTML snippet from the page (www.foo.com/index.html):
...
<th class="name" align="left" scope="col">
<a class="foo" href="foo.html">foo</a>
</th>
...
<th class="name" align="left" scope="col">
<a class="bar" href="bar.html">bar</a>
</th>
...
<th class="name" align="left" scope="col">
<a class="ba" href="baz.html">baz</a>
</th>
......
I want to use PHP to get all the text inside the elements with class .name
and convert it to JSON, so the final result looks like this:
{"names":["foo","bar","baz"]}
This is what I tried:
function linkExtractor($html){
    $nameArr = array();
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $names = //how do i get the elements?
    foreach($names as $name) {
        array_push($nameArr, $name);
    }
    return $nameArr;
}
echo json_encode(array("names" => linkExtractor($html)));
Answer 0 (score: 2)
Try this...
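// loadHTML() expects markup, not a URL, so fetch the page first
$html = file_get_contents("http://www.foo.com/index.html");
function linkExtractor($html, $classname){
    $nameArr = array();
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    // XPath queries go through DOMXPath; DOMDocument has no xpath() method
    $xpath = new DOMXPath($doc);
    $names = $xpath->query("//*[@class='" . $classname . "']");
    foreach($names as $name) {
        // collect the element's text rather than the DOMElement itself
        array_push($nameArr, trim($name->textContent));
    }
    return $nameArr;
}
// pass the bare class name; ".name" is CSS selector syntax, not an attribute value
echo json_encode(array("names" => linkExtractor($html, "name")));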
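One caveat with comparing `@class` by exact string equality: it only matches elements whose class attribute is exactly that string, so an element with several classes is skipped. The sketch below illustrates the difference on made-up markup (the extra `wide` class is an assumption for illustration, not something from the page in the question):

```php
<?php
// Hypothetical markup: the second header carries an extra class ("wide").
$html = '<table><tr>'
      . '<th class="name"><a href="foo.html">foo</a></th>'
      . '<th class="name wide"><a href="bar.html">bar</a></th>'
      . '</tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Exact string comparison: matches only class="name" with nothing else in it.
$exact = $xpath->query("//*[@class='name']");

// Token match: pads the normalized class list with spaces and looks for " name ".
$token = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' name ')]"
);

echo $exact->length, "\n"; // 1
echo $token->length, "\n"; // 2
```

For the snippet in the question, where every header has the single class `name`, both expressions select the same elements.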
Answer 1 (score: 0)
So this is what I ended up with:
$names = function($html) {
    $doc = new DOMDocument();
    $last = libxml_use_internal_errors(TRUE);
    $doc->loadHTML($html);
    libxml_use_internal_errors($last);
    $xp = new DOMXPath($doc);
    $result = array();
    foreach ($xp->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' name ')]") as $node) {
        $result[trim($node->textContent)] = 1;
    }
    return array_keys($result);
};
echo json_encode(array("names" => $names($html)));
Output:
{"names":["foo","bar","baz"]}
Required PHP version: 5.3+
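For anyone who wants to try this without fetching a real page, here is a self-contained sketch built on the markup from the question. The function name `extractNames` and its `$class` parameter are hypothetical, and it assumes `$class` contains no quote characters, since it is concatenated into the XPath expression. Unlike the closure above, it keeps duplicates and does not use the names as array keys:

```php
<?php
// Hypothetical helper: returns the trimmed text of every element whose
// class list contains $class, in document order.
// Assumes $class contains no quote characters.
function extractNames($html, $class)
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' " . $class . " ')]";

    $names = array();
    foreach ($xpath->query($query) as $node) {
        $names[] = trim($node->textContent);
    }
    return $names;
}

// Sample markup modelled on the snippet in the question. A real page would be
// fetched first, e.g. with file_get_contents() (needs allow_url_fopen enabled).
$html = '<table><tr>'
      . '<th class="name" align="left" scope="col"><a class="foo" href="foo.html">foo</a></th>'
      . '<th class="name" align="left" scope="col"><a class="bar" href="bar.html">bar</a></th>'
      . '<th class="name" align="left" scope="col"><a class="ba" href="baz.html">baz</a></th>'
      . '</tr></table>';

$json = json_encode(array('names' => extractNames($html, 'name')));
echo $json, "\n"; // {"names":["foo","bar","baz"]}
```

Appending to a list instead of keying by name is a deliberate choice here: PHP casts array keys that look like decimal integers (for example "123") to integers, so the `array_keys()` approach would emit such names as JSON numbers rather than strings. If deduplication is wanted, `array_values(array_unique($names))` is one way to get it while keeping the values as strings.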