Question

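I'm trying to use match with a RegExp (specifically the JavaScript function) to find occurrences of a sentence, and of a word within that sentence, in the HTML body. Here is some of my pseudo-code: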

<!DOCTYPE html>
<html>
<body id="hello">

<p id="demo">Click the button to display the matches.</p>

<div> <input type="button" value="search" onclick="myFunction('<p id=&quot;demo&quot;>Click the button to display the matches', 'button')" />Try it </div>

<script>
function myFunction(sentence, word)
{
//var str="The rain in SPAIN stays mainly in the plain"; 
//var toMatch = "The rain in SPAIN stays mainly in the plain";
var r = new RegExp(word, 'g');
var oldHTML = document.getElementById("hello").innerHTML;
var n=oldHTML.match(r);
alert("no. of matches = " + n.length);
document.getElementById("demo").innerHTML=n;
}
</script>

</body>
</html>

In the HTML above, the sentence and the word "button" each occur only once, but the number of matches reported is 4 and n = button,button,button,button.

My questions:
1. Why does the RegExp produce 4 matches?
2. How can I search the HTML body so that I get the correct answer?

Answer 1

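As others have already said, you get 4 occurrences because you are searching the whole HTML markup, not just the text visible to the user.
Use the innerText property instead of innerHTML to get better results.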
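
To make the difference concrete, here is a minimal sketch, assuming a reduced copy of the question's markup and a hand-written copy of its visible text. The helper name countMatches is hypothetical and not part of the original code. In the markup, "button" appears once in the paragraph text, once in type="button", and twice inside the onclick attribute, which accounts for the 4.

```javascript
// Count non-overlapping occurrences of a literal word in a string.
function countMatches(text, word) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var found = text.match(new RegExp(escaped, "g"));
  // match() returns null (not an empty array) when nothing matches.
  return found ? found.length : 0;
}

// Reduced copy of the question's markup (assumption: only the parts containing the word).
var markup =
  '<p id="demo">Click the button to display the matches.</p>' +
  '<input type="button" value="search" ' +
  'onclick="myFunction(\'Click the button to display the matches\', \'button\')" />';

// Hand-written approximation of what the user actually sees on the page.
var visibleText = "Click the button to display the matches. Try it";

var inMarkup = countMatches(markup, "button");    // counts attribute values too
var inText = countMatches(visibleText, "button"); // counts visible text only

// In a browser, read the rendered text of the body instead of its markup.
if (typeof document !== "undefined") {
  var body = document.getElementById("hello");
  if (body) {
    console.log(countMatches(body.innerText, "button"));
  }
}
```

The null check matters because the original alert would throw on n.length whenever the word is not found at all.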

Answer 2

You can use jQuery's text function to get the text of the body element and search within that.

e.g.
var bodyElement = $("body");
var bodyText = bodyElement.text();
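
Once the text has been extracted, the search for the sentence and the word could look roughly like the sketch below. The helpers findAll and searchText are hypothetical names introduced for illustration, and plain indexOf is used here instead of a RegExp so that the sentence needs no escaping. One caveat: jQuery's .text() is based on textContent, so the source of any script element inside the body is included in the searched text.

```javascript
// Return the start index of every non-overlapping occurrence of a literal phrase.
function findAll(text, phrase) {
  var positions = [];
  if (phrase === "") return positions; // an empty phrase would never advance
  var from = text.indexOf(phrase);
  while (from !== -1) {
    positions.push(from);
    from = text.indexOf(phrase, from + phrase.length);
  }
  return positions;
}

// Hypothetical helper: where the sentence occurs and how often the word occurs.
function searchText(text, sentence, word) {
  return {
    sentenceAt: findAll(text, sentence),
    wordCount: findAll(text, word).length
  };
}

// Stand-in for the extracted body text (assumption: roughly what the page shows).
var sample = "Click the button to display the matches. Try it";
var result = searchText(sample, "Click the button to display the matches", "button");

// With jQuery loaded in a browser, run the same search on the real body text.
if (typeof jQuery !== "undefined") {
  var liveText = jQuery("body").text();
  console.log(searchText(liveText, "Click the button to display the matches", "button"));
}
```

Passing the plain sentence, rather than the HTML string used in the question's onclick handler, is deliberate: the extracted text contains no tags, so a sentence that includes `<p id="demo">` would never match.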

match() on body returns odd results
