Trouble parsing tabular content from a website

Date: 2018-09-22 19:56:48

Tags: php web-scraping list-comprehension

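I've written a script in PHP to fetch tabular data from a webpage. When I run it, all the values come out in a single column. However, I would like to parse them into lists, one list per table row, just as they appear on that webpage.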

Website link

To be clearer:

My current output looks like this:

978
EMU
EUR
1
118.2078
36
Australija
AUD
1
73.1439

My expected output looks like this:

['978', 'EMU', 'EUR', '1', '118.2078']
['36', 'Australija', 'AUD', '1', '73.1439']
['124', 'Kanada', 'CAD', '1', '77.7325']
['156', 'Kina', 'CNY', '1', '14.6565']
['191', 'Hrvatska', 'HRK', '1', '15.9097']

This is my attempt so far:

<?php
$url = "http://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=lat";
$dom = new DomDocument;
$dom->loadHtmlFile($url);
$xpath = new DomXPath($dom);

$rowData = array();
foreach ($xpath->query('//tbody[@id="index:srednjiKursList:tbody_element"]//tr') as $node) {
    foreach ($xpath->query('td', $node) as $cell) {
        $rowData[] = $cell->nodeValue;
    }
}
foreach($rowData as $rows){
    echo $rows . "<br/>";
}
?>

2 answers:

Answer 0 (score: 2)

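You are adding each cell to the output array individually. You probably want to build one row at a time, append that row to the output array, and then print each row: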

$rowData = array();
foreach ($xpath->query('//tbody[@id="index:srednjiKursList:tbody_element"]//tr') as $node) {
    $row = array();
    foreach ($xpath->query('td', $node) as $cell) {
        $row[] = $cell->nodeValue;
    }
    $rowData[] = $row;
}
foreach($rowData as $rows){
    print_r($rows);    // Format the data as needed
}
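
To get from `print_r` to the bracketed lines shown in the question, each row can be quoted and joined by hand. The following is only a minimal sketch: `formatRow()` is a hypothetical helper name, and the sample rows are copied from the question's expected output rather than fetched from the live page.

```php
<?php
// Sample rows copied from the question's expected output; in the real script
// these would come from the $rowData array built in the loop above.
$rowData = [
    ['978', 'EMU', 'EUR', '1', '118.2078'],
    ['36', 'Australija', 'AUD', '1', '73.1439'],
];

// formatRow() is a hypothetical helper, not part of the original answer.
// It trims each cell, wraps it in single quotes, and joins the cells
// with ", " inside square brackets.
function formatRow(array $row): string
{
    $quoted = array_map(function ($cell) {
        return "'" . trim($cell) . "'";
    }, $row);

    return '[' . implode(', ', $quoted) . ']';
}

foreach ($rowData as $row) {
    echo formatRow($row), PHP_EOL;
}
```

When the output goes to a browser rather than a terminal, `"<br/>"` can be used in place of `PHP_EOL`, as in the question's script.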

Answer 1 (score: 1)

Try this.

    $htmlContent = file_get_contents("http://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=lat");

    $DOM = new DOMDocument();
    $DOM->loadHTML($htmlContent);

    $Header = $DOM->getElementsByTagName('th');
    $Detail = $DOM->getElementsByTagName('td');

    // Collect the text of each <th> cell as the table's header names
    $aDataTableHeaderHTML = array();
    foreach($Header as $NodeHeader)
    {
        $aDataTableHeaderHTML[] = trim($NodeHeader->textContent);
    }

    // Group the <td> values into rows (no header keys yet): start a new row
    // after every count($aDataTableHeaderHTML) cells
    $aDataTableDetailHTML = array();
    $i = 0;
    $j = 0;
    foreach($Detail as $sNodeDetail)
    {
        $aDataTableDetailHTML[$j][] = trim($sNodeDetail->textContent);
        $i = $i + 1;
        $j = $i % count($aDataTableHeaderHTML) == 0 ? $j + 1 : $j;
    }
    //print_r($aDataTableDetailHTML);

    // Rebuild each row with the header names as keys; the outer array index is the row number
    for($i = 0; $i < count($aDataTableDetailHTML); $i++)
    {
        for($j = 0; $j < count($aDataTableHeaderHTML); $j++)
        {
            @$aTempData[$i][$aDataTableHeaderHTML[$j]] = $aDataTableDetailHTML[$i][$j];
        }
    }
    $aDataTableDetailHTML = $aTempData; unset($aTempData);
    print_r($aDataTableDetailHTML);
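
The counter arithmetic above can also be expressed with PHP's built-in `array_chunk` and `array_combine`. This is a hedged sketch rather than a drop-in replacement: the `$cells` values are sample data taken from the question, and the `$headers` names are hypothetical, since the real `<th>` text on the page may differ.

```php
<?php
// Flat cell values in document order, as the question's first script produces them.
$cells = ['978', 'EMU', 'EUR', '1', '118.2078',
          '36', 'Australija', 'AUD', '1', '73.1439'];

// Hypothetical column names; the real page's <th> text may differ.
$headers = ['code', 'country', 'currency', 'unit', 'rate'];

// Split the flat list into rows of count($headers) cells each.
$rows = array_chunk($cells, count($headers));

// Key each complete row by header name; skip a trailing partial row,
// since array_combine() requires both arrays to have the same length.
$keyed = [];
foreach ($rows as $row) {
    if (count($row) === count($headers)) {
        $keyed[] = array_combine($headers, $row);
    }
}

print_r($keyed);
```

Like the loop above, this assumes every cell collected belongs to the one table of interest. If the page contains other `td` elements, the rows will be misaligned, so restricting the selection to the target table first, as the XPath query in the other answer does, is the safer choice.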