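This is my code for extracting data from a table.
However, I want to remove the links.
I would also like to know how to group the titles and prices into an array.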
<?php
$ch = curl_init("http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);
preg_match('#<table[^>]*>(.+?)</table>#is', $page, $matches);
foreach ($matches as &$match) {
    $match = $match;
}
echo '<table>';
echo $match;
echo '</table>';
?>
Answer 0 (score: 3)
I'd suggest using an HTML parser instead. Use DOMDocument + DOMXPath; there is nothing to install, since both are built into PHP. For example:
$ch = curl_init("http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($ch);

$dom = new DOMDocument();
// suppress warnings caused by malformed real-world HTML
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$data = array();
// select the rows of the product table whose class is not "rowH" (the header rows)
$table_rows = $xpath->query('//table[@id="tbl-all-product-view"]/tr[@class!="rowH"]');
foreach ($table_rows as $row => $tr) {
    foreach ($tr->childNodes as $td) {
        // nodeValue is text only, so link tags are dropped; also strip line breaks
        $data[$row][] = preg_replace('~[\r\n]+~', '', trim($td->nodeValue));
    }
    // drop empty values (e.g. whitespace-only text nodes between cells) and reindex
    $data[$row] = array_values(array_filter($data[$row]));
}

echo '<pre>';
print_r($data);
$data should contain:
Array
(
[0] => Array
(
[0] => AMDA4-3400
[1] => 1,200,000
[2] => 1,200,000
)
[1] => Array
(
[0] => AMDSempron 145
[1] => 860,000
[2] => 910,000
)
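To group each row under named keys rather than numeric indexes, the same approach can be extended slightly. The following is only a sketch under stated assumptions: the inline $page markup is a hypothetical stand-in for the fetched page (so it runs without a network call), the key names 'title' and 'prices' are made up for illustration, and it assumes the first cell of each row is the title and the remaining cells are prices.

```php
<?php
// Hypothetical sample markup standing in for the page fetched with cURL.
$page = '<table id="tbl-all-product-view">
<tr class="rowH"><th>Title</th><th>Price A</th><th>Price B</th></tr>
<tr class="row"><td><a href="/p/1">AMDA4-3400</a></td><td>1,200,000</td><td>1,200,000</td></tr>
<tr class="row"><td><a href="/p/2">AMDSempron 145</a></td><td>860,000</td><td>910,000</td></tr>
</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$products = array();
// "//tr" matches rows whether or not the parser wraps them in <tbody>.
$rows = $xpath->query('//table[@id="tbl-all-product-view"]//tr[@class!="rowH"]');
foreach ($rows as $tr) {
    $cells = array();
    foreach ($xpath->query('./td', $tr) as $td) {
        // textContent keeps only the text, so the <a> tags disappear.
        $cells[] = trim($td->textContent);
    }
    if (count($cells) < 2) {
        continue; // skip rows without a title plus at least one price
    }
    $products[] = array(
        'title'  => $cells[0],               // assumed: first cell is the title
        'prices' => array_slice($cells, 1),  // assumed: remaining cells are prices
    );
}

print_r($products);
```

Each entry can then be read as $products[0]['title'] and $products[0]['prices'][0]. Selecting only td elements also avoids the whitespace text nodes that the childNodes loop has to filter out afterwards.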
Answer 1 (score: 0)
If you want to parse a web resource, you can use PHP Simple HTML DOM Parser.
To get a table and all the links inside it:
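// assumes simple_html_dom.php from the library is on the include path
include_once 'simple_html_dom.php';
$html = file_get_html('http://www.digionline.ir/Allprovince/CategoryProducts/cat=10301');
// find() without an index returns an array; index 0 returns the first table element
$table = $html->find('table', 0);
$links = $table->find('a');
echo $table;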
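Once the links are located, they still have to be removed from the table before it is printed. The sketch below swaps in PHP's built-in DOMDocument instead of Simple HTML DOM so that it runs without a third-party library; the inline $page markup is a hypothetical stand-in for the fetched page, and the variable names are only illustrative.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$page = '<table><tr><td><a href="/p/1">AMDA4-3400</a></td><td>1,200,000</td></tr></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();

$table = $dom->getElementsByTagName('table')->item(0);

// Copy the live node list into an array first: replacing nodes while
// iterating the live list would cause entries to be skipped.
$links = iterator_to_array($table->getElementsByTagName('a'));
foreach ($links as $a) {
    // Replace each <a> element with a plain text node holding its text.
    $a->parentNode->replaceChild($dom->createTextNode($a->textContent), $a);
}

// Serialize only the table, now without any <a> tags.
$clean = $dom->saveHTML($table);
echo $clean;
```

This keeps the table markup and the link text but drops the anchor tags themselves, which is one reading of "remove the links" in the question.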