How to parse an HTML table into an array with Symfony DomCrawler

Time: 2016-06-28 01:25:40

Tags: php arrays symfony domcrawler

I have an HTML table and I want to build an array from it.

$html = '<table>
<tr>
    <td>satu</td>
    <td>dua</td>
</tr>
<tr>
    <td>tiga</td>
    <td>empat</td>
</tr>
</table>';

My array must look like this:

array(
   array(
      "satu",
      "dua",
   ),
   array(
     "tiga",
     "empat",
   )
)

I have tried the code below, but I cannot get the array I need:

$crawler = new Crawler();
$crawler->addHTMLContent($html);
$row = array();
$tr_elements = $crawler->filterXPath('//table/tr');
foreach ($tr_elements as $tr) {
 // ???????
}

2 Answers:

Answer 0: (score: 11)

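// Assumes $crawler already holds the table HTML, as in the question.
// filter() takes CSS selectors and needs the symfony/css-selector package.
$table = $crawler->filter('table')->filter('tr')->each(function ($tr, $i) {
    // One inner array per row, holding the trimmed text of each cell.
    return $tr->filter('td')->each(function ($td, $j) {
        return trim($td->text());
    });
});

print_r($table);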

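The example above gives you a multidimensional array in which the first level holds the table rows (tr) and the second level holds the table cells (td).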

Edit

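If you have nested tables, this code flattens them into a one-dimensional array. Each entry is keyed by the cell's node path and holds the cell's inner HTML; cells that only wrap a nested table are skipped.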

$html = 'MY HTML HERE';
$crawler = new Crawler($html);

$flat = function(string $selector) use ($crawler) {
    $result = [];
    $crawler->filter($selector)->each(function ($table, $i) use (&$result) {
        $table->filter('tr')->each(function ($tr, $i) use (&$result) {
            $tr->filter('td')->each(function ($td, $i) use (&$result) {
                $html = trim($td->html());
                if (strpos($html, '<table') !== FALSE) return;

                $iterator = $td->getIterator()->getArrayCopy()[0];
                $address = $iterator->getNodePath();

                if (!empty($html)) $result[$address] = $html;
            });
        });
    });
    return $result;
};

// The selector must point to the outermost table.
print_r($flat('#Prod fieldset div table'));

Answer 1: (score: 7)

$html = '<table>
            <tr>
                <td>satu</td>
                <td>dua</td>
            </tr>
            <tr>
                <td>tiga</td>
                <td>empat</td>
            </tr>
            </table>';

$crawler = new Crawler();
$crawler->addHtmlContent($html);
$rows = array();
$tr_elements = $crawler->filterXPath('//table/tr');
// Iterate over the matched rows; each $tr is a DOMElement.
foreach ($tr_elements as $tr) {
    $tds = array();
    // Wrap the row in its own crawler so it can be filtered again.
    $rowCrawler = new Crawler($tr);
    // Iterate over the cells of this row.
    foreach ($rowCrawler->filter('td') as $td) {
        // Extract the cell text.
        $tds[] = $td->nodeValue;
    }
    $rows[] = $tds;
}
var_dump($rows);

This will display:

array 
  0 => 
    array 
      0 => string 'satu' 
      1 => string 'dua' 
  1 => 
    array 
      0 => string 'tiga' 
      1 => string 'empat'
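
For comparison, the loop in this answer works on plain DOMElement nodes, so the same rows-of-cells array can probably be built with PHP's bundled DOM extension alone. The following is only a sketch under that assumption and is not part of either answer; the function name tableToArray is made up for illustration, and it uses DOMDocument and DOMXPath instead of the Symfony crawler.

```php
<?php
// Hypothetical helper (not from the answers): turn <tr>/<td> markup into
// nested arrays using only PHP's built-in DOM extension.
function tableToArray(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    // "//table//tr" matches rows whether or not a <tbody> is present.
    foreach ($xpath->query('//table//tr') as $tr) {
        $cells = [];
        // "./td" selects only the direct cell children of this row.
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

$html = '<table>
<tr><td>satu</td><td>dua</td></tr>
<tr><td>tiga</td><td>empat</td></tr>
</table>';

print_r(tableToArray($html));
```

Because the query matches every tr below a table, rows of nested tables would be returned as well; the flattening approach in the first answer is the better fit for that case.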