Question

I have code that searches for tags in an .html file, but when I run the script I get an "undefined index" notice.

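In my previous question I asked about searching for tags by id, and I am using what I got there as a reference. I extended the code and it executes, but it shows an error. The script looks up every id attribute it finds in the .html file, and that is where the error comes from.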

Code:

<?php
function getElementById($matches)
{
    global $data;
    return $matches[1].$matches[3].$matches[4].$data[$matches[3]].$matches[6];
}

$data['test'] = 'A';

$filename = 'test.html';

$html = file_exists($filename) ? file_get_contents($filename) : die('can\'t open the file');

$_HTML = preg_replace_callback('#(<([a-zA-Z]+)[^>]*id=")(.*?)("[^>]*>)([^<]*?)(</\\2>)#ism', 'getElementById', $html);

echo $_HTML;
?>

HTML:

<html>
    <head>
        <title>TEST</title>
    </head>
    <body>
        <div id="test"></div>
        <div id="test2"></div>
    </body>
</html>

Output: PRINTSCREEN

Answer 1

Here is how you can implement a default value:

$data3 = isset($data[$matches[3]]) ? $data[$matches[3]] : 'default';
return $matches[1].$matches[3].$matches[4].$data3.$matches[6];

Answer 2

Disclaimer: you shouldn't be processing HTML with regular expressions, and so on and so forth...

But if you insist:

function getElementById($matches)
{
    global $data;
    // The ternary is wrapped in parentheses because "." binds tighter than "?:".
    return $matches[1]
        .$matches[3]
        .$matches[4]
        .(isset($data[$matches[3]]) ? $data[$matches[3]] : 'DEFAULT_VALUE')
        .$matches[6];
}
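
A note on operator precedence, as a small sketch rather than part of the original code: in PHP the `.` operator binds more tightly than `?:`, so a conditional in the middle of a concatenation chain needs its own parentheses. The `$key` variable and the `<` / `>` fragments below are made up for illustration.

```php
<?php
// Illustrative data, mirroring $data['test'] = 'A' from the question.
$data = ['test' => 'A'];
$key  = 'test'; // hypothetical stand-in for $matches[3]

// Without parentheses this parses as
//   ('<' . isset($data[$key])) ? $data[$key] : ('DEFAULT' . '>')
// so the surrounding fragments are lost from the result.
$unparenthesized = '<' . isset($data[$key]) ? $data[$key] : 'DEFAULT' . '>';

// With parentheses only the ternary is conditional.
$parenthesized = '<' . (isset($data[$key]) ? $data[$key] : 'DEFAULT') . '>';

var_dump($unparenthesized, $parenthesized);
```

For an id that is missing from `$data`, the unparenthesized form still reads `$data[$key]`, because the concatenated string used as the condition is non-empty and therefore truthy. That brings back the same undefined index notice.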

Why not use regular expressions?

https://stackoverflow.com/a/1732454/156811

Using regular expressions to parse HTML: why not?

http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

If you do a quick search, I'm sure you can find more.

Some alternatives:

http://us1.php.net/dom

http://simplehtmldom.sourceforge.net/

etc.
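
For what it's worth, here is a minimal sketch of the DOM route, assuming PHP's bundled DOM extension is enabled. The inline `$html` string stands in for the contents of test.html, and the `'default'` fallback is an assumption borrowed from the other answer. Unlike the regex, this replaces the entire content of every element that has an id, so narrow the XPath query if that is too broad.

```php
<?php
// Lookup table as in the question.
$data = ['test' => 'A'];

// Stand-in for file_get_contents('test.html').
$html = '<html><head><title>TEST</title></head><body>'
      . '<div id="test"></div><div id="test2"></div></body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Select every element that carries an id attribute.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//*[@id]') as $element) {
    $id = $element->getAttribute('id');
    // Fall back to a default when the id has no entry in $data.
    $element->nodeValue = isset($data[$id]) ? $data[$id] : 'default';
}

$result = $doc->saveHTML();
echo $result;
```

The `isset()` check plays the same role as in the regex versions: ids such as test2 that have no entry in `$data` get the fallback text and no notice is raised.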