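I'm relatively new to the whole idea of HTML parsing/scraping. I'm hoping I can get the help I need here!
Basically, what I want to do (I think) is specify the URL of the page I want to pull data from. In this case: http://www.epgpweb.com/guild/us/Caelestrasz/Crimson/
From there, I want to grab the table with class=listing inside the div with id=snapshot_table.
I then want to embed that table in my own page and have it update whenever the original content is updated.
I've read a few other posts on Google and Stack Overflow, and I also had a look at a tutorial on Nettuts+, but it seemed a bit too much for me.
Hopefully someone can help me and keep it as simple as possible :)
Cheers,
Matt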
- EDIT -
Current code as of 11:22 AM (GMT+10):
<?php
# don't forget the library
include('simple_html_dom.php');
?>
<html>
<head>
</head>
<body>
<?php
$html = file_get_html('http://www.epgpweb.com/guild/us/Caelestrasz/Crimson/');
$table = $html->find('#snapshot_table table.listing');
print_r($table);
?>
</body>
</html>
Answer 0 (score: 3)
I think I've got it, and I learned a lot along the way! :)
<?php
//Get the current timestamp
$url = 'http://www.epgpweb.com/api/snapshot/us/Caelestrasz/Crimson';
$url = file_get_contents($url);
$url = substr($url,-12,10);
//Get the member data based on the timestamp
$url = 'http://www.epgpweb.com/api/snapshot/us/Caelestrasz/Crimson/'.$url;
$url = file_get_contents($url);
//Convert the \uXXXX escape sequences to UTF-8 characters, as I found here: http://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-char
function replace_unicode_escape_sequence($match) {
return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$url = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $url);
//erase/replace the insignificant parts, to put the data into an array
function erase($a){
global $url;
$url = explode($a,$url);
$url = implode("",$url);
}
function replace($a,$b){
global $url;
$url = explode($a,$url);
$url = implode($b,$url);
}
replace("[[",";");
replace("]]",";");
replace("],",";");
erase('[');
erase('"');
replace(":",",");
$url = explode(";", $url);
//lose the front and end bits, and maintain the member data
array_shift($url);
array_pop($url);
//put the data into an array
foreach($url as $k=>$v){
$v = explode(",",$v);
foreach($v as $k2=>$v2){
$data[$k][$k2] = $v2;
}
$gp = intval($data[$k][2]);
//avoid a division by zero when a member has 0 GP
$pr = $gp !== 0 ? round(intval($data[$k][1]) / $gp, 3) : 0;
$pr = str_pad($pr,5,"0",STR_PAD_RIGHT);
$pr = substr($pr, 0, 5);
$data[$k][3] = $pr;
}
//sort the array by PR number
function compare($x, $y)
{
if ( $x[3] == $y[3] )
return 0;
else if ( $x[3] > $y[3] )
return -1;
else
return 1;
}
usort($data, 'compare');
//output the data into a table
echo "<table><tbody><tr><th>Member</th><th>EP</th><th>GP</th><th>PR</th></tr>";
foreach($data as $k=>$v){
echo "<tr>";
foreach($v as $v2){
//escape the value, since it comes from an external site
echo "<td>".htmlspecialchars($v2)."</td>";
}
echo "</tr>";
}
echo "</tbody></table>";
?>
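For anyone landing here later: the `\uXXXX` escapes and the nested brackets suggest the API response is JSON, in which case `json_decode` could replace most of the manual string surgery above. This is only a sketch under that assumption. The sample payload and the `roster` key are made up for illustration, so check the real response before relying on either.

```php
<?php
// Hypothetical sample payload; the real API response shape is an assumption.
$json = '{"roster": [["Al\u00e9x", 1200, 400], ["Bram", 900, 150], ["Cleo", 500, 0]], "timestamp": 1300000000}';

/**
 * Turn a decoded roster into rows of [name, EP, GP, PR], sorted by PR descending.
 */
function build_standings(array $roster): array
{
    $rows = [];
    foreach ($roster as $member) {
        [$name, $ep, $gp] = $member;
        // Guard against division by zero when GP is 0.
        $pr = $gp > 0 ? round($ep / $gp, 3) : 0.0;
        $rows[] = [$name, (int) $ep, (int) $gp, $pr];
    }
    // Highest PR first.
    usort($rows, function (array $a, array $b): int {
        return $b[3] <=> $a[3];
    });
    return $rows;
}

// json_decode also resolves the \uXXXX escapes into UTF-8.
$decoded = json_decode($json, true);
$standings = build_standings($decoded['roster']);

echo "<table><tbody><tr><th>Member</th><th>EP</th><th>GP</th><th>PR</th></tr>";
foreach ($standings as $row) {
    echo "<tr>";
    foreach ($row as $cell) {
        echo "<td>" . htmlspecialchars((string) $cell) . "</td>";
    }
    echo "</tr>";
}
echo "</tbody></table>";
```

With real data, `$json` would come from `file_get_contents` on the snapshot URL rather than from a hard-coded string.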
Answer 1 (score: 1)
Take a look at the PHP simple_html_dom class.
After that, this should do the trick:
$html = file_get_html('http://www.epgpweb.com/guild/us/Caelestrasz/Crimson/');
$table = $html->find('#snapshot_table table.listing');
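If you'd rather not pull in an extra library, roughly the same lookup can be sketched with PHP's built-in DOMDocument and DOMXPath. The `$page` string below is a made-up stand-in for the fetched HTML and `extract_listing_table` is just an illustrative helper name. Note that if the site fills the table in with JavaScript after the page loads, the fetched HTML may not contain the rows at all.

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup is an assumption.
$page = '<html><body><div id="snapshot_table">'
      . '<table class="listing"><tr><th>Member</th></tr><tr><td>Bram</td></tr></table>'
      . '</div></body></html>';

/**
 * Return the outer HTML of the first table.listing inside #snapshot_table,
 * or null if the page contains no such table.
 */
function extract_listing_table(string $pageHtml): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($pageHtml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector "#snapshot_table table.listing".
    $query = '//*[@id="snapshot_table"]'
           . '//table[contains(concat(" ", normalize-space(@class), " "), " listing ")]';
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $doc->saveHTML($nodes->item(0));
}

$table = extract_listing_table($page);
echo $table ?? 'Table not found';
```

To use it on the live page, `$page` would be the result of `file_get_contents` on the guild URL, and running the script on each request keeps the embedded table in step with the source.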