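Simple HTML DOM parser - merging two rows into one

Date: 2015-01-14 22:08:12

Tags: php html-table html-parsing simple-html-dom domparser

I am trying to insert a table into a database, and I want to combine every two rows that share a class into a single array. Can anyone help me?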

<table>
<tr class="pair"><td>1</td><td>2</td></tr>
<tr class="pair"><td>3</td><td>4</td></tr>
<tr class="unpair"><td>1</td><td>2</td></tr>
<tr class="unpair"><td>3</td><td>4</td></tr>
</table>

<?php
require('simple_html_dom.php');
// $table is assumed to hold the <table> element already loaded with simple_html_dom
foreach($table->find('tr[class=pair]') as $rowpair) {
    $rowData = array();
    foreach($rowpair->find('td') as $cell) {
        $rowData[] = $cell->innertext;
    }
}
foreach($table->find('tr[class=unpair]') as $rowunpair) {
    $rowData = array();
    foreach($rowunpair->find('td') as $cell) {
        $rowData[] = $cell->innertext;
    }
}
?>

The result I want to get:

<table>
<tr class="pair"><td>1</td><td>2</td><td>3</td><td>4</td></tr>
<tr class="unpair"><td>1</td><td>2</td><td>3</td><td>4</td></tr>
</table>

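1 answer:

Answer 0 (score: 0)

This should group all table rows by class.

The basic logic is to iterate over every row in the table and check whether that row's class has been seen before. If it has not, a reference to the row is stored as the "canonical" row for that class. If the class has been seen before, all of the row's children are moved over to the canonical row and the now-redundant row is removed.

This approach works for any number of tables in the input HTML and for any class names.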

<?php

    $str = '<table><tr class="pair"><td>1</td><td>2</td></tr><tr class="pair"><td>3</td><td>4</td></tr><tr class="unpair"><td>1</td><td>2</td></tr><tr class="unpair"><td>3</td><td>4</td></tr>
    </table>';


    $doc = new DOMDocument();
    $doc->loadHTML($str);


    $tables = $doc->getElementsByTagName('table');
    foreach ($tables as $table) {

        #For each TR in the table, group into rows
        $table_classes = array();
        $rows = $table->getElementsByTagName('tr');


        $row_list = array();
        foreach ($rows as $row) {
            array_push($row_list, $row);
        }

        for($i=0; $i<count($row_list); $i++){

            $row = $row_list[$i];
            $row_class = $row->getAttribute('class');

            if(!array_key_exists($row_class, $table_classes)){

                #if this is the first occurrence of that class, store this row as the original_row
                $table_classes[$row_class] = $row;

            }else{

                $original_row = $table_classes[$row_class];

                #Move children over to original row
                foreach ($row->childNodes as $child) {

                    $clone = $child->cloneNode(true);
                    $original_row->appendChild($clone);
                }

                #Now delete the duplicate row, since its cells have been copied to the original row
                $row->parentNode->removeChild($row);


            }
        }

    }


    echo htmlspecialchars($doc->saveXML());

?>

Returns:

<table>
    <tr class="pair">
        <td>1</td>
        <td>2</td>
        <td>3</td>
        <td>4</td>
    </tr>
    <tr class="unpair">
        <td>1</td>
        <td>2</td>
        <td>3</td>
        <td>4</td>
    </tr>
</table>
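
Since the question asks for one array per class rather than a rewritten table, the same group-by-class idea can also be sketched without modifying the DOM at all. The following is only a minimal sketch: the function name `groupCellsByRowClass` is hypothetical (it is not part of the original answer or of any library), and it assumes that every `<tr>` carries a `class` attribute and that only `<td>` cells matter.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// collects the text of every <td> into one array per row class,
// in document order.
function groupCellsByRowClass(string $html): array
{
    $doc = new DOMDocument();
    // Suppress warnings that libxml may emit for HTML fragments.
    @$doc->loadHTML($html);

    $grouped = array();
    foreach ($doc->getElementsByTagName('tr') as $row) {
        // Rows without a class attribute end up under the empty-string key.
        $class = $row->getAttribute('class');
        foreach ($row->getElementsByTagName('td') as $cell) {
            $grouped[$class][] = trim($cell->textContent);
        }
    }
    return $grouped;
}

$html = '<table>'
    . '<tr class="pair"><td>1</td><td>2</td></tr>'
    . '<tr class="pair"><td>3</td><td>4</td></tr>'
    . '<tr class="unpair"><td>1</td><td>2</td></tr>'
    . '<tr class="unpair"><td>3</td><td>4</td></tr>'
    . '</table>';

$rows = groupCellsByRowClass($html);
print_r($rows);
```

For the sample table, `$rows['pair']` and `$rows['unpair']` should each hold the four cell values as strings, which could then be passed to whatever database layer is in use.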