Question

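I would like to know a reliable way to find the phrases/words that are part of an HTML document. For example, if I have the following document: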

<a href="#">This is a test</a><b>Another test</b>

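My goal is to find "This is a test" and "Another test" and replace them with something else. Note that these are only sample phrases; the real ones may contain numbers or symbols.

Any help would be great.

Thanks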

Answer 1

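Treat your HTML as XML and use the DOM (PHP 5) or DOM XML (PHP 4) extension (or any other XML extension included in PHP).

For each node, you can get the inner text with something like DomNode.GetValue (the exact name depends on the library you use; in PHP 5's DOM extension it is the nodeValue property of DOMNode).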
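As a rough illustration of the DOM approach, here is a minimal sketch using PHP 5+'s DOMDocument and DOMXPath. The function name replaceTextInHtml and the phrase map are hypothetical, chosen only for this example. It assumes the input is an ASCII or otherwise correctly decoded HTML fragment; non-ASCII input would need an encoding hint before loadHTML().

```php
<?php
// Hypothetical helper: replace phrases inside text nodes only, leaving tags
// and attributes untouched. $replacements maps "old phrase" => "new phrase".
function replaceTextInHtml(string $html, array $replacements): string
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Visit every text node below <body> and swap the phrases in it.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//body//text()') as $textNode) {
        $textNode->nodeValue = strtr($textNode->nodeValue, $replacements);
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo replaceTextInHtml(
    '<a href="#">This is a test</a><b>Another test</b>',
    ['This is a test' => 'New anchor text', 'Another test' => 'New bold text']
);
```

Because only text nodes are modified, a phrase that also happens to appear in an attribute value or tag name is left alone, which is the main advantage over a plain string replacement on the raw HTML.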

Answer 2

You can use PHP's strip_tags($string, $allowedTags). Note that the optional second argument lists the tags to keep, not the tags to remove.

$justText = strip_tags('<a href="#">This is a test</a><b>Another test</b>');

Then you will have just the text, so you can use str_replace("old text", "new text", $justText);

You may have to use the second parameter of strip_tags() to break things up, so that the tags you care about stay separate.

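$html = '<a href="#">This is a test</a><b>Another test</b>';
// The second argument lists the tags to keep; every other tag is stripped.
$keepAnchors = strip_tags($html, '<a>'); // <a href="#">This is a test</a>Another test
$keepBold = strip_tags($html, '<b>');    // This is a test<b>Another test</b>
// str_replace() takes (search, replace, subject).
$html = str_replace('This is a test', 'new anchor text', $html);
$html = str_replace('Another test', 'new bold text', $html);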

Answer 3

I would look into something like str_replace().

Answer 4

Here is an explanation of how to remove all HTML content (HTML tags, scripts, CSS); after that you can use str_replace to replace anything you want.
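The linked explanation is not reproduced here, so the following is only a sketch of one way to do it, not necessarily the method it describes. The function name htmlToPlainText is hypothetical. The extra step matters because strip_tags() removes the script and style tags themselves but keeps the code between them.

```php
<?php
// Hypothetical helper: reduce an HTML string to its visible text.
function htmlToPlainText(string $html): string
{
    // Drop <script> and <style> blocks together with their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', '', $html);

    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace left behind by the removed markup.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$page = '<style>b { color: red; }</style>'
      . '<a href="#">This is a test</a> <b>Another test</b>'
      . '<script>alert(1);</script>';

$text = htmlToPlainText($page);
echo str_replace('Another test', 'Something else', $text);
// prints: This is a test Something else
```

Keep in mind that this yields plain text only; the markup is gone, so it suits cases where the replaced text does not need to go back into the original HTML.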

Answer 5

If doing this client-side is an option, I would suggest jQuery's replaceWith().

Answer 6

The key here is to use regular expressions to, in a sense, parse the HTML...

So you would use:

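<?php

$str = "<a href =\"\">Hello</a>"; // The string to search

// Find every run of text between a '>' and the next '<'; the captured text goes into $match[1]
preg_match_all('/>([^<>]+)</', $str, $match);

echo $match[1][0]; // Echo the first captured text (Hello)

?>

This searches the input string (which you would set to the page's HTML) and returns every piece of text found between tags as a value in an array. The text inside the first tag is stored in $match[1][0], the second in $match[1][1], and so on.

The pattern looks for content that is preceded by the end of one HTML tag and followed by the start of another, but the capture group selects neither tag, leaving only the content.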
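To go from finding the text to replacing it, as the question asks, one option is preg_replace_callback(). The sketch below is an illustration rather than part of the original answer: the function name replaceBetweenTags and the phrase map are hypothetical, and it assumes that attribute values contain no ">" and that the text to replace sits between two tags.

```php
<?php
// Hypothetical helper: swap whole text runs that sit between two tags.
// $replacements maps "old text" => "new text"; unknown text is kept as-is.
function replaceBetweenTags(string $html, array $replacements): string
{
    return preg_replace_callback(
        '/>([^<>]+)</',                       // text between '>' and the next '<'
        function (array $m) use ($replacements): string {
            $text = $m[1];
            // Put the delimiters back around the (possibly replaced) text.
            return '>' . ($replacements[$text] ?? $text) . '<';
        },
        $html
    );
}

echo replaceBetweenTags(
    '<a href="#">This is a test</a><b>Another test</b>',
    ['This is a test' => 'First replacement', 'Another test' => 'Second replacement']
);
// prints: <a href="#">First replacement</a><b>Second replacement</b>
```

The tags and attributes are never touched because the pattern only captures what lies between a ">" and the following "<". Text before the first tag or after the last one is not matched by this pattern.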

Hope this helps!

Braeden

Finding phrases/words between HTML tags using PHP

6 answers: