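Here is my HTML string in the $data variable in PHP. The string contains some text like
<140/90 mmHg OR <130/80 mmHg
that is not displayed when I run this code with PHP DOMDocument, because the less-than and greater-than signs cause a problem.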
<?php
$data = 'THE CORRECT ANSWER IS C.
<p>Choice A Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s</p>
<p></p>
<p>Choice B Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s</p>
<p>Choice D Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s</p>
<p></p>
<p>Choice E simply dummy text of the printing and typesetting industry.</p>
<p></p>
<p><br>THIS IS MY MAIN TITLE IN CAPS<br>This my sub title.</p>
<p><br>TEST ABC: Lorem Ipsum is simply dummy text of the printing and typesetting industry.</p>
<p>1) It is a long established fact <140/90 mmHg OR <130/80 mmHg making it look like readable English will uncover many web sites still in their infancy.
<br><br>2) There are many variations of passages of Lorem Ipsum available. </p>
<p><br>TEST XYZ: Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.</p>
<p><br>TES T TEST: It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.</p>
<p><br>TESTXXX: It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>';
echo boldFormatExplanation($data);
?>
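I have also created the PHP function below, which puts the title in bold and makes some words bold using PHP DOMDocument.
- Title in bold: "THIS IS MY MAIN TITLE IN CAPS" (the title is not always the same)
- Words in bold: TEST ABC:, TEST XYZ:, TES T TEST:, TESTXXX: (these words are always the same)
The two points above work well; only my line mentioned in the first block is missing.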
<?php
function boldFormatExplanation($data){
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->encoding = 'utf-8';
$dom->substituteEntities = false;
$dom->preserveWhiteSpace = true;
$internalErrors = libxml_use_internal_errors(true);// Set error level
@$dom->loadHTML($data, LIBXML_HTML_NODEFDTD);// Load html
libxml_use_internal_errors($internalErrors);// Restore error level
$xpath = new DOMXPath($dom);// Dom xpath
$title_flag = true;
foreach($xpath->query('//text()') as $node) {
$txt = trim($node->nodeValue);
$p = $node->parentNode;
if (preg_match("/^\s*(TEST ABC:|TEST XYZ:|TES T TEST:|TESTXXX)(.*)$/s", $node->nodeValue, $matches)) {
// Put Choice in bold:
$p->insertBefore($dom->createElement('b', $matches[1]), $node);
$node->nodeValue = " " . trim($matches[2]);
} else
if (strtoupper($txt) === $txt && $txt !== '') {
// Put header in bold
if($title_flag == true){
$p->insertBefore($dom->createElement('b', $txt), $node);
$node->nodeValue = "";
$title_flag = false;
}
}
}
$domData = $dom->saveHTML();
$data = htmlspecialchars_decode($domData);
return $data;
} ?>
You can run this code here; notice that the output skips the line <140/90 mmHg OR <130/80 mmHg
Answer 0 (score: 1)
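You have no choice here: you need to process the string before loading it with DOMDocument::loadHTML. But you can't do it like a barbarian with a blind replacement, because then a < between script or style tags would be replaced too. You need to use the libxml errors to find only the problematic opening angle brackets. You can do it like this (it isn't very fast, since the DOM tree has to be rebuilt until the errors disappear, but it is correct):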
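// libxml error code 68 (XML_ERR_NAME_REQUIRED): a "<" that is not followed by a valid tag name
define('LIBXML_ERR_NAME_REQUIRED', 68);

// Wrap the fragment in a skeleton that declares the charset and gives the body an id
$skeleton = '<html><head><meta charset="UTF-8"/></head><body id="root">%s</body></html>';
$htmlDoc = sprintf($skeleton, $data);
$dom = new DOMDocument;

// Reload the document until libxml no longer reports a stray "<"
do {
    libxml_use_internal_errors(true);
    $hasError = false;
    $dom->loadHTML($htmlDoc);
    $errors = libxml_get_errors();

    foreach ($errors as $error) {
        if ($error->code == LIBXML_ERR_NAME_REQUIRED) {
            $hasError = true;
            // Escape the "<" found at the reported line and column as &lt;
            $htmlDoc = preg_replace('~\A(?:.*\R){' . ($error->line - 1) . '}.{' . ($error->column - 2) . '}\K<~u', '&lt;', $htmlDoc);
        }
    }

    libxml_clear_errors();
} while ($hasError);

boldFormatExplanation($dom);

// Print only the children of the body, not the skeleton itself
foreach($dom->getElementById('root')->childNodes as $childNode) {
    echo $dom->saveHTML($childNode);
}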
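To see what the loop above reacts to, it can help to dump the collected libxml errors for a small input. The following is only a sketch: the sample paragraph is made up for illustration, and which error codes, lines and columns are reported depends on the libxml2 version PHP is built against, so no particular output is claimed here.

```php
<?php
// Hypothetical sample input: one paragraph containing unescaped "<" characters.
$html = '<html><head><meta charset="UTF-8"/></head><body>'
      . '<p>Target is <140/90 mmHg OR <130/80 mmHg</p>'
      . '</body></html>';

// Collect parse errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous); // restore the previous setting

foreach ($errors as $error) {
    // Each entry is a LibXMLError object with public code, line, column and message properties.
    printf(
        "code=%d line=%d column=%d %s\n",
        $error->code,
        $error->line,
        $error->column,
        trim($error->message)
    );
}
```

If the installed libxml2 reports code 68 for the stray brackets, the line and column values printed here are the ones the regular expression in the loop uses to locate each "<".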
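By the way, setting the DOMDocument encoding property is useless when you call DOMDocument::loadHTML afterwards, because the encoding is determined by the document content. That is the main reason I wrap $data in an HTML skeleton containing <meta charset="UTF-8"/>.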
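As a minimal sketch of the skeleton idea, the snippet below wraps a small fragment containing a non-ASCII character and reads it back. The fragment and variable names are illustrative only, and the sketch assumes a libxml2 build that honours the meta charset declaration.

```php
<?php
// Hypothetical fragment with a non-ASCII character ("\u{e9}" is "é" encoded as UTF-8).
$fragment = "<p>caf\u{e9}</p>";

// Same kind of skeleton as above: the meta tag tells the parser the input is UTF-8.
$skeleton = '<html><head><meta charset="UTF-8"/></head><body id="root">%s</body></html>';

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML(sprintf($skeleton, $fragment));
libxml_clear_errors();
libxml_use_internal_errors($previous);

// DOM string properties are always returned as UTF-8.
$text = $dom->getElementsByTagName('p')->item(0)->textContent;
echo $text, "\n";
```

Without a charset declaration in the content, the parser has to guess the encoding, and the same fragment may come back garbled.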
As for your bold function, you can write it like this:
function boldFormatExplanation(&$dom) {
$xpath = new DOMXPath($dom);
$title_flag = true;
foreach($xpath->query('//text()') as $node) {
$txt = trim($node->nodeValue);
if (empty($txt)) continue;
$p = $node->parentNode;
if (preg_match("/^(TEST ABC:|TEST XYZ:|TES T TEST:|TESTXXX)\s*(.*)/s", $txt, $matches)) {
// Put Choice in bold:
$p->insertBefore($dom->createElement('b', $matches[1]), $node);
$node->nodeValue = " " . $matches[2];
} elseif ($title_flag && strtoupper($txt) === $txt) {
// Put header in bold
$p->replaceChild($dom->createElement('b', $txt), $node);
$title_flag = false;
}
}
}
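To check how the two conditions in this function classify a text node, the same pattern and the same uppercase test can be tried on plain strings. This is only an illustrative sketch: the sample strings are taken from or modelled on the question's data, the $results array is a helper introduced here, and the once-only behaviour of $title_flag is left out.

```php
<?php
// Same pattern as in the function above, applied to plain strings.
$pattern = "/^(TEST ABC:|TEST XYZ:|TES T TEST:|TESTXXX)\s*(.*)/s";

// Sample text nodes, already trimmed as in the function.
$samples = [
    'TEST ABC: Lorem Ipsum is simply dummy text.',
    'TESTXXX: It was popularised in the 1960s.',
    'THIS IS MY MAIN TITLE IN CAPS',
    'This my sub title.',
];

$results = [];
foreach ($samples as $txt) {
    if (preg_match($pattern, $txt, $matches)) {
        // Label case: group 1 would go into <b>, group 2 stays as plain text.
        $results[] = ['kind' => 'label', 'bold' => $matches[1], 'rest' => $matches[2]];
    } elseif (strtoupper($txt) === $txt) {
        // Title case: the whole all-caps node would go into <b>.
        $results[] = ['kind' => 'title', 'bold' => $txt, 'rest' => ''];
    } else {
        // Anything else is left untouched.
        $results[] = ['kind' => 'plain', 'bold' => '', 'rest' => $txt];
    }
}

foreach ($results as $r) {
    printf("%-5s bold=[%s] rest=[%s]\n", $r['kind'], $r['bold'], $r['rest']);
}
```

Note that TESTXXX appears in the pattern without its colon, so for that label the colon ends up in the non-bold remainder.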