I'm using the example from the post "How to get content from another page", but I need to get "SUPERMAN" from a site whose markup looks like this:
<td headers="superHero">SUPERMAN</td>
<td headers="country">USA</td>
Code:
$url = "http://www.otherweb.com";
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$output = curl_exec($curl);
curl_close($curl);
$DOM = new DOMDocument;
$DOM->loadHTML( $output);
//get all td
//$items = $DOM->getElementsByTagName('td');
$items = $DOM->getElementsByID('superHero');
//display all text
for ($i = 0; $i < $items->length; $i++)
echo $items->item($i)->nodeValue . "<br/>";
Thanks!
Answer 0 (score: 1)
First, you can skip the cURL part. DOMDocument can load even remote HTML files with its loadHTMLFile() method. Just use:
$DOM = new DOMDocument();
$DOM->loadHTMLFile($url);
// Alternatively, if the remote page might not be valid HTML,
// you can suppress the parser warnings with the "silence operator": @
@$DOM->loadHTMLFile($url);
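As a hedged alternative to the @ operator, parser warnings can be collected through libxml's internal error handling instead of being silenced. The sketch below is a minimal illustration and not part of the original answer: the helper name loadHtmlDocument is hypothetical, and it loads a temporary local file so that it runs without network access. Loading a URL the same way assumes allow_url_fopen is enabled in the PHP configuration.

```php
<?php
// Hypothetical helper (not from the original answer): load an HTML document
// from a local path or URL, collecting libxml warnings instead of using @.
function loadHtmlDocument(string $source): ?DOMDocument
{
    // Buffer parser warnings internally; remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $ok = $dom->loadHTMLFile($source);

    // Discard the buffered warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $dom : null;
}

// Demo with a temporary local file standing in for the remote page.
$path = tempnam(sys_get_temp_dir(), 'dom');
file_put_contents(
    $path,
    '<table><tr><td headers="superHero">SUPERMAN</td></tr></table>'
);

$dom = loadHtmlDocument($path);
$firstCell = $dom->getElementsByTagName('td')->item(0)->nodeValue;
echo $firstCell, PHP_EOL;

unlink($path);
```

Restoring the previous libxml setting keeps the helper from changing error handling for the rest of the script.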
If you want to select elements by their attribute values, use XPath:
$selector = new DOMXPath($DOM);
$element = $selector->query('//td[@headers="superHero"]')->item(0);
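Putting the XPath query to work on the markup from the question, a minimal self-contained sketch might look like the following. The helper name findCellByHeaders and the inline $html string are assumptions made for illustration; the string stands in for the page that would otherwise be fetched from the remote site.

```php
<?php
// Hypothetical helper (not from the original answer): return the text of the
// first <td> whose headers attribute equals $header, or null if none matches.
function findCellByHeaders(string $html, string $header): ?string
{
    // Buffer parser warnings so imperfect markup does not print notices.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumption: $header contains no double quotes, so it can be
    // embedded directly in the XPath expression.
    $nodes = $xpath->query('//td[@headers="' . $header . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->nodeValue);
}

// Sample markup copied from the question.
$html = '<table><tr>'
      . '<td headers="superHero">SUPERMAN</td>'
      . '<td headers="country">USA</td>'
      . '</tr></table>';

echo findCellByHeaders($html, 'superHero'), PHP_EOL; // prints "SUPERMAN"
```

Checking the result of query() before calling item(0) avoids an error when no cell carries the requested headers value.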