Pulling third-party table data into an array / JSON

Date: 2017-09-25 23:03:54

Tags: php arrays json curl xpath

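I'm trying to use PHP to convert this external table's data into an array / JSON. I was able to do it by using XPath and counting the td elements, but the data changes every week and that breaks everything... Is there a good way to extract this information and use conditional statements to display the appropriate value names for each player? Here is the link to the table: See Here

I'd like to get something like this: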
Player name:
    GAMES:   
    MPR:
    PPR:
Player name:
    GAMES:   
    MPR:
    PPR:
etc...

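If anyone can help me or point me in the right direction, I'd really appreciate it! This is driving me crazy, and I'd even pay if necessary.

Thanks!

This is my current code: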

$urll = 'http://www.leagueleader.net/sharedreport.php?operatorid=98&code=1928e435-8dbe-450f-8bca-74f603f892f0';

$options = array (
    CURLOPT_RETURNTRANSFER => true,     // return web page
    CURLOPT_HEADER         => false,    // don't return headers
    CURLOPT_FOLLOWLOCATION => true,     // follow redirects
    CURLOPT_ENCODING       => "",       // handle all encodings
    CURLOPT_USERAGENT      => "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0", // something like Firefox 
    CURLOPT_AUTOREFERER    => true,     // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
    CURLOPT_TIMEOUT        => 120,      // timeout on response
    CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
);

$curl = curl_init($urll);
curl_setopt_array( $curl, $options );
$content = curl_exec($curl);
curl_close($curl);
$dom = new DOMDocument();
@$dom->loadHTML($content);
$xpath = new DOMXPath($dom); 

$tabless = $dom->getElementsByTagName('tbody'); 
$rows = $tabless->item(0)->getElementsByTagName('tr');

$pstats = []; // collected stats, one entry per table row
$j = 0;       // index of the current row in $pstats

foreach ($rows as $roww) {
    $colss = $roww->getElementsByTagName('td');

    //$player = $colss->item(0)->nodeValue; $pstats[$j]['player'] = trim($player);
    //$percentage = $colss->item(1)->nodeValue; $pstats[$j]['gamesplayed'] = trim($percentage);
    $cricket = $colss->item(2)->nodeValue; $pstats[$j]['cricket'] = trim($cricket);
    $o1 = $colss->item(3)->nodeValue; $pstats[$j]['01'] = trim($o1);

    $j++;
}

1 answer:

Answer 0: (score: 0)

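You don't say what actually changes in the DOM, so it's hard to give an "always works" solution.

Here is a solution that parses the result in two stages. The first stage collects the non-empty cell text from every table row. The second stage keeps only rows with exactly 4 elements (name, games, MPR, PPD) whose last three values are numeric, and skips everything else. If the page changes again, it should be easy to debug.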

<?php
// Collect libxml warnings about malformed HTML instead of printing them
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML(file_get_contents('...'));
libxml_clear_errors();

$pre = [];
foreach ($doc->getElementsByTagName('table') as $table) {
    foreach ($table->getElementsByTagName('tr') as $i => $tr) {
        $y = 0;
        foreach ($tr->childNodes as $td) {
            $text = trim($td->nodeValue);

            if ($y > 7) {
                unset($pre[$i]);
                continue;
            }

            // skip empty nodes; empty() would also drop a literal "0"
            if ($text === '') {
                continue;
            }

            $pre[$i][] = $text;
            $y++;
        }
    }
}

// normalise
$result = [];
foreach ($pre as $row) {
    if (count($row) != 4 || $row[0] == 'Team Totals:') {
        continue;
    }

    if (!is_numeric($row[1]) || !is_numeric($row[2]) || !is_numeric($row[3])) {
        // looks broke again, send email to developer ;p
        continue;
    }

    $result[$row[0]] = [
        'name' => $row[0],
        'games' => $row[1],
        'mpr' => $row[2],
        'ppd' => $row[3]
    ];
}

echo '<pre>'.print_r($result, true).'</pre>';

/*
Array
(
    [Lawrence Cherone] => Array
        (
            [name] => Lawrence Cherone
            [games] => 51
            [mpr] => 5.00
            [ppd] => 67.48
        )

    [Scott Sandberg] => Array
        (
            [name] => Scott Sandberg
            [games] => 51
            [mpr] => 4.02
            [ppd] => 33.18
        )

*/
?>
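Since the question also asks for JSON, the array can be handed to json_encode. The following is only a sketch: the $result literal is a stand-in with the same shape as the output above, and casting the values to numbers is an assumption about what the consumer of the JSON wants.

```php
<?php
// Stand-in for the $result array built by the parsing step (same shape).
$result = [
    'Lawrence Cherone' => ['name' => 'Lawrence Cherone', 'games' => '51', 'mpr' => '5.00', 'ppd' => '67.48'],
    'Scott Sandberg'   => ['name' => 'Scott Sandberg',   'games' => '51', 'mpr' => '4.02', 'ppd' => '33.18'],
];

// Cast the scraped strings to numbers so the JSON carries numeric values.
$players = [];
foreach ($result as $row) {
    $players[] = [
        'name'  => $row['name'],
        'games' => (int) $row['games'],
        'mpr'   => (float) $row['mpr'],
        'ppd'   => (float) $row['ppd'],
    ];
}

// A sequentially indexed PHP array encodes as a JSON array of objects.
$json = json_encode($players, JSON_PRETTY_PRINT);
echo $json;
```

Encoding $result directly would also work; it then produces a JSON object keyed by player name, with every stat left as a string.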

Build a table from the result:

<table>
    <thead>
        <tr>
            <?php foreach (array_values($result)[0] as $key => $value): ?>
            <th><?= htmlspecialchars(ucfirst($key)) ?></th>
            <?php endforeach ?>
        </tr>
    </thead>
    <tbody>
        <?php foreach ($result as $row): ?>
        <tr>
            <?php foreach ($row as $value): ?>
            <td><?= htmlspecialchars($value) ?></td>
            <?php endforeach ?>
        </tr>
        <?php endforeach ?>
    </tbody>
</table>

Or access an individual player's stats:

<?= $result['Scott Sandberg']['games'] ?>
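If the weekly breakage comes from columns moving around, one alternative to counting cells is to look columns up by their header text. This is a different technique from the two-stage parse above, and it is only a sketch: the sample HTML, the th header cells, and the labels "Player", "Games", "MPR" and "PPD" are all assumptions, since the real page's markup isn't shown here.

```php
<?php
// Hypothetical HTML standing in for the fetched page; the real markup may differ.
$html = '<table>
  <tr><th>Player</th><th>Games</th><th>MPR</th><th>PPD</th></tr>
  <tr><td>Lawrence Cherone</td><td>51</td><td>5.00</td><td>67.48</td></tr>
  <tr><td>Scott Sandberg</td><td>51</td><td>4.02</td><td>33.18</td></tr>
  <tr><td>Team Totals:</td><td></td><td></td><td></td></tr>
</table>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Assumed header labels mapped to the output keys we want.
$wanted = ['Player' => 'name', 'Games' => 'games', 'MPR' => 'mpr', 'PPD' => 'ppd'];

$players = [];
foreach ($xpath->query('//table') as $table) {
    // Find the column index of each wanted header in this table.
    $columns = [];
    foreach ($xpath->query('.//tr[1]/th', $table) as $index => $th) {
        $label = trim($th->textContent);
        if (isset($wanted[$label])) {
            $columns[$wanted[$label]] = $index;
        }
    }
    if (count($columns) !== count($wanted)) {
        continue; // not a stats table, or the headers changed
    }

    // Read every data row through the header-derived column indexes.
    foreach ($xpath->query('.//tr[td]', $table) as $tr) {
        $cells = $xpath->query('./td', $tr);
        $row = [];
        foreach ($columns as $key => $index) {
            $cell = $cells->item($index);
            $row[$key] = $cell ? trim($cell->textContent) : '';
        }
        // Skip totals and any row whose stats are not numeric.
        if (!is_numeric($row['games']) || !is_numeric($row['mpr']) || !is_numeric($row['ppd'])) {
            continue;
        }
        $players[$row['name']] = $row;
    }
}

print_r($players);
```

With this approach a reordered or newly added column no longer shifts the values, and a renamed header makes the table get skipped entirely rather than silently producing wrong numbers.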

Hope it helps.