Question

I want to write a program that converts the body of an HTML document like this one:

<html>
    <head>
        <title>Test</title>
    </head>
    <body>
        <h1>TestA</h1>
        <table class="board">
            <tr>
                <td class="other" id="field20">TestB</td>
                <td class="testclass" id="field21">TestC</td>
            </tr>
            <tr>
                <td class="testclass">TestD</td>
                <td class="testclass">TestE</td>
            </tr>
        </table>
    </body>
</html>

into something like this:

h1 with: 'TestA'.
table class: 'board'; with: [
    tr with: [
        td class: 'other'; id: 'field20'; with: 'TestB'.
        td class: 'testclass'; id: 'field21'; with: 'TestC'.
    ].
    tr with: [
        td class: 'testclass'; with: 'TestD'.
        td class: 'testclass'; with: 'TestE'.
    ].
].

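So I obviously have to iterate over all the HTML tags, with their content, attributes, and attribute values, while respecting each tag's nesting level. To do this I think I need an HTML DOM parser, but I haven't found a way to actually iterate over the tags the way I need to. If anyone can help me (there may be a completely different way to handle this), that would be great!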
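The traversal described above is a depth-first walk: emit each element's tag and attributes, then either its text or a nested block for its children. The following is only a minimal sketch of that idea, written in Python with the standard-library `html.parser` module rather than PHP or Simple HTML DOM Parser. The names `Node`, `TreeBuilder`, `find`, `render`, and `convert` are hypothetical helpers invented for illustration, and the sketch assumes every element has an explicit closing tag and contains either text or child elements, not both.

```python
from html.parser import HTMLParser


class Node:
    """One HTML element: tag name, attribute pairs, child elements, text."""

    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = attrs      # list of (name, value) tuples
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a simple element tree; assumes explicitly closed tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("root", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open element.
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data


def find(node, tag):
    """Depth-first search for the first element with the given tag."""
    if node.tag == tag:
        return node
    for child in node.children:
        found = find(child, tag)
        if found is not None:
            return found
    return None


def quote(value):
    # Smalltalk string literals escape a single quote by doubling it.
    return "'" + (value or "").replace("'", "''") + "'"


def render(node, depth=0):
    """Return the output lines for one element, indented by nesting level."""
    pad = "    " * depth
    messages = [f"{name}: {quote(value)}" for name, value in node.attrs]
    if node.children:
        # Mixed content (text next to child elements) is ignored here.
        lines = [pad + node.tag + " " + "; ".join(messages + ["with: ["])]
        for child in node.children:
            lines.extend(render(child, depth + 1))
        lines.append(pad + "].")
        return lines
    messages.append("with: " + quote(node.text.strip()))
    return [pad + node.tag + " " + "; ".join(messages) + "."]


def convert(html):
    """Convert the children of <body> (or of the whole input) to text."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    body = find(builder.root, "body") or builder.root
    return "\n".join(line for child in body.children for line in render(child))
```

Calling `print(convert(html))` on the sample document above should reproduce the desired output. The same recursive walk could presumably be ported to PHP on top of whichever DOM parser is in use, since it only needs each element's tag name, attributes, children, and text.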

Restructure HTML with PHP (Simple HTML DOM Parser)

0 answers: