Hello, I wrote the following code to extract names and prices from a table using XPath and cURL.
<?php
include_once ("xpath.php");
header('Content-type: text/html; charset=UTF-8');
$ch = curl_init ("http://emalls.ir/%D9%84%DB%8C%D8%B3%D8%AA-%D9%82%DB%8C%D9%85%D8%AA~Category~39~Search~Nokia");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
//$page = curl_exec($ch);
$page = utf8_decode(curl_exec($ch));
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
$data = array();
// get all table rows and rows which are not headers
$produstname = $xpath->query('//table/tbody/tr/td/a/text()');
$produstprice = $xpath->query('//table/tbody/tr/td[8]/text()');
$data = array();
for ($x=0; $x<=1; $x++){
$data[$x]['title'] = $produstname->item($x)->nodeValue;
$data[$x]['price'] = $produstprice->item($x)->nodeValue;
}
?>
The following two XPath expressions work in Chrome to get the names and prices.
name: $x("//table/tbody/tr/td/a/text()")
price: $x("//table/tbody/tr/td[5]/text()")
But when I use them in the PHP code above, I get this error:
: Trying to get property of non-object in
Answer 0 (score: 1)
I've looked at the site, and I'd humbly suggest targeting the table's id="" attribute instead. You can also use foreach. For example:
$ch = curl_init ("http://emalls.ir/%D9%84%DB%8C%D8%B3%D8%AA-%D9%82%DB%8C%D9%85%D8%AA~Category~39~Search~Nokia");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$page = curl_exec($ch); // fetch the page once
$page = utf8_decode($page);
$dom = new DOMDocument('1.0', 'utf-8');
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
$data = array();
$table_rows = $xpath->query('//table[@id="grdprice"]/tr'); // target the rows directly: the browser adds <tbody> when rendering, but the raw HTML doesn't have one
if($table_rows->length <= 0) { // exit if not found
echo 'no table rows found';
exit;
}
foreach($table_rows as $tr) { // foreach row
$row = $tr->childNodes;
if($row->item(0)->tagName != 'th') { // avoid headers
$data[] = array(
'name' => trim($row->item(0)->nodeValue),
'price' => trim($row->item(7)->nodeValue),
);
}
}
echo '<pre>';
print_r($data);
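
To see why the path copied from Chrome fails in PHP, here is a minimal sketch that runs without any network access. The inline $html string, its two sample rows, and the two-column layout are made up for illustration; only the grdprice id comes from the code above. It also selects the td cells with a relative XPath query rather than childNodes, so whitespace text nodes between cells cannot shift the indexes.

```php
<?php
// Hypothetical stand-in for the raw page source. Note that it has no <tbody>,
// just like the HTML that curl receives.
$html = '<table id="grdprice">'
      . '<tr><th>Name</th><th>Price</th></tr>'
      . '<tr><td><a href="#">Nokia 105</a></td><td>100</td></tr>'
      . '<tr><td><a href="#">Nokia 3310</a></td><td>250</td></tr>'
      . '</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// The path copied from the browser matches nothing in the raw markup,
// so item(0) on this list would be null.
$viaTbody = $xpath->query('//table/tbody/tr/td/a/text()');

$data = array();
// tr[td] keeps only rows that contain <td> cells, which skips the header row.
foreach ($xpath->query('//table[@id="grdprice"]/tr[td]') as $tr) {
    // Query the cells relative to the current row.
    $cells = $xpath->query('./td', $tr);
    $data[] = array(
        'name'  => trim($cells->item(0)->textContent),
        'price' => trim($cells->item(1)->textContent), // column index is specific to this sample
    );
}

echo $viaTbody->length, "\n";
print_r($data);
```

Because the tbody query returns an empty list, item($x) is null, and reading ->nodeValue from it is what raises "Trying to get property of non-object". On the real page the price column index will differ from this sample, so adjust item(1) accordingly.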