Question

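I'm traversing a page with jQuery to wrap text that matches certain phone-number regex patterns in <a href="tel:"></a>. I have successfully traversed the page using $("body:first").html(), but I would like to exclude some tags, such as <img> and <script></script>, because they tend to break the page when they match specific patterns in my regex.

I tried $("body:first").not("script").not("img").html(); without success. I still get content that I'm not interested in. Am I missing something? I logged the output.

Is there any way to chain multiple .not() calls for this?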

https://jsfiddle.net/xpvt214o/552428/

Answer 1

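Running a regex replacement over the entire HTML content is not a good idea, and traversing every potential candidate node with jQuery can be very expensive. As I understand your question, you only want to replace phone numbers in DOM text nodes. For that, all modern browsers provide the native, performant TreeWalker, which you can configure to traverse only text nodes and fine-tune further with a filter method.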

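In your case, that means getting all text nodes that are not inside script, style, or svg tags. In addition, a tags should be ignored, because nested anchors are invalid. We first have to collect all matching text nodes, and then apply the regex replacement to them inside their parentNode.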
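A text node can sit several levels below an excluded tag, for example inside <a><strong>...</strong></a>. A minimal sketch of a check that walks up the whole ancestor chain is shown here; the helper name hasExcludedAncestor is hypothetical and not part of the answer's code.

```javascript
// Hypothetical helper: returns true when any ancestor of `node`
// has a tag name listed in `excludedNames` (upper-case names expected).
function hasExcludedAncestor(node, excludedNames) {
  for (let el = node.parentNode; el; el = el.parentNode) {
    // nodeName is lower-case for some elements (e.g. svg), so normalise it
    if (excludedNames.indexOf(String(el.nodeName).toUpperCase()) > -1) {
      return true;
    }
  }
  return false;
}
```

Inside a TreeWalker filter, such a check could stand in for a comparison against node.parentNode alone, at the cost of a short walk up the tree for every text node.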

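In my example I used .innerHTML (unsafe) and .replaceWith on the matching text nodes. To do it properly, we should change the regex so that it can be used with while(regex.exec(text)) to iterate over the matches, and append text nodes and anchor tags to the parentNode (after resetting its content to null).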

var phoneRegex = /(\b)((\+?[ ]?1?\(?)([\s ]?[\s-]?[\s.]?[\s ]?)(\(?[2-9]\d{2}\)?)([\s ]?[\s-]?[\s.]?[\s ]?)([2-9]\d{2}?)([\s ]?[\s-]?[\s.]?[\s ]?)(\d{4}\)?))(\b)/,
    // These node types can have TextNode children but we filter them out
    excludeNodes = ['SCRIPT', 'STYLE', 'SVG', 'A'],
    // Create a native treeWalker instance, seeking only text nodes
    // but ignore node types within excludeNodes and content which 
    // is not matching your regex
    treeWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, {
      acceptNode: function(node) {
        if(excludeNodes.indexOf(node.parentNode.nodeName.toUpperCase()) > -1
          || !phoneRegex.test(node.data)) return NodeFilter.FILTER_REJECT
        return NodeFilter.FILTER_ACCEPT;
      }
    }),
    textNodes = [];

// Collect all matching text nodes (We can't do a live replacement of
// the nodes, otherwise the treewalker will break)
while(treeWalker.nextNode()) textNodes.push(treeWalker.currentNode);

// Replace all matching text nodes with a span if they match the phone regex
// and substitute the phone numbers with a surrounding <a> tag
//
// Note that the usage of .innerHTML is potentially unsafe, you could
// make this more secure by constructing textnodes and anchors and
// appending them to their parent
textNodes.forEach(function(textNode) {
  var newNode = document.createElement('span');
  newNode.innerHTML = textNode.textContent.replace(phoneRegex, "<a href=\"tel:$&\">$&</a>");
  textNode.replaceWith(newNode);
});

<p>1-888-452-1505</p>

<p>1(408)5625504</p>

<p><a href="mailto:some@bo.dy">1(408)5625504</a></p>

<p>1.408.562.5504</p>

<p>1-613-3568772</p>

<p>(1)9543615599</p>

<p>1.954.361.5599</p>

<p>+1.954.361.5599</p>

<p>954.361.5599</p>

<p>+1 954 361-5599</p>

<p>(954) 361-5599</p>

<p>(954)361-5599</p>

<p>9543615599</p>

<p>(954)3615599</p>

<p>+19543615599</p>
 
<p>1-954-361-5599</p>

<p>+1-954-361-5599</p>

<p>954 361-5599</p>

<p>Prefix Text +1-954-361-5599</p>

<p>+1-954-361-5599 Post Text</p>

<div>
  Nested items
  <ul>
    <li>954 361-5599</li>
  </ul>
  <p>(954)3615599</p>
</div>
<script>console.log("ignore me please (954)3615599")</script>
<svg>Some svg data, ignore phone numbers like +1-954-361-5599 etc.</svg>
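The safer variant that this answer mentions (looping with regex.exec and building text nodes and anchors instead of assigning .innerHTML) might look roughly like the sketch below. The pattern is a simplified stand-in chosen for illustration, not the answer's regex, and the names phoneRegexGlobal, splitByPhone, and linkifyTextNode are hypothetical.

```javascript
// Simplified illustrative pattern (an assumption, not the answer's regex).
// The "g" flag is required so that exec() advances through the string.
const phoneRegexGlobal = /(?:\+?1[\s.-]?)?\(?[2-9]\d{2}\)?[\s.-]?[2-9]\d{2}[\s.-]?\d{4}/g;

// Split a string into plain-text and phone-number segments.
function splitByPhone(text, regex) {
  const segments = [];
  let lastIndex = 0;
  let match;
  regex.lastIndex = 0; // global regexes keep state between calls
  while ((match = regex.exec(text)) !== null) {
    if (match.index > lastIndex) {
      segments.push({ phone: false, value: text.slice(lastIndex, match.index) });
    }
    segments.push({ phone: true, value: match[0] });
    lastIndex = match.index + match[0].length;
  }
  if (lastIndex < text.length) {
    segments.push({ phone: false, value: text.slice(lastIndex) });
  }
  return segments;
}

// Replace one text node with text nodes and <a href="tel:..."> elements.
// Browser-only: relies on the global `document`.
function linkifyTextNode(textNode, regex) {
  const fragment = document.createDocumentFragment();
  splitByPhone(textNode.data, regex).forEach(function (segment) {
    if (segment.phone) {
      const anchor = document.createElement('a');
      anchor.href = 'tel:' + segment.value;
      anchor.textContent = segment.value;
      fragment.appendChild(anchor);
    } else {
      fragment.appendChild(document.createTextNode(segment.value));
    }
  });
  textNode.parentNode.replaceChild(fragment, textNode);
}
```

Under these assumptions, the final loop would call linkifyTextNode(textNode, phoneRegexGlobal) for each collected node. Because no HTML string is ever parsed, characters such as < or & in the page text stay plain text, and every number in a node is linked rather than only the first.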

Answer 2

Your code should work, although you don't want to grab just the body, but rather all of the elements inside the body.

Try the following:

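// .get() with no argument returns every matched element as an array;
// .get(0) would return only the first one
var elements = $('body *').not('script').not('span').get();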

console.log(elements);

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<span></span>
<div></div>

Answer 3

I forked your jsfiddle and got it working with your regex, replacing every applicable text node.

const phoneRegex = /555-555-5555/
const phoneMarkup = "<a href=\"tel:$&\">$&</a>"
const hyperlink = text => text.replace(phoneRegex, phoneMarkup)

$("body :not(:empty):not(script):not(img):not(svg)")
  .contents()
  .each(function() {
    if (3 !== this.nodeType) return
    const text = this.textContent
    if (!text) return
    const markup = hyperlink(text)
    if (markup === text) return
    $(this).replaceWith(markup)
  })

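Select by chaining CSS3+ :not, or by passing a comma-separated list to jQuery .not:
- $("body :not(:empty):not(script):not(img):not(svg)")
- $("body :not(:empty)").not("script,img,svg")
Iterate over .contents():
1. Skip non-text nodes
2. Read the text via Node.textContent
3. Skip empty nodes
4. Replace the text with hyperlinks using the regex
5. Avoid excessive DOM manipulation
6. Replace the text node via .replaceWith()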
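Steps 4 and 5 can be isolated in a small pure function, sketched here under a few assumptions: the pattern is the placeholder from this answer with a "g" flag added so that every occurrence is linked, and escapeHtml and toMarkup are hypothetical names. Escaping first matters because .replaceWith() parses a string containing markup as HTML.

```javascript
// Placeholder pattern from the answer, with "g" added (an assumption).
const phonePattern = /555-555-5555/g;

// Escape characters that would otherwise be parsed as markup.
const escapeHtml = text =>
  text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");

// Returns the linked markup, or null when the text contains no match,
// so the caller can skip the DOM write entirely.
function toMarkup(text) {
  const escaped = escapeHtml(text);
  const markup = escaped.replace(phonePattern, '<a href="tel:$&">$&</a>');
  return markup === escaped ? null : markup;
}
```

Inside the .each() callback this could be used as: const markup = toMarkup(this.textContent); if (markup !== null) $(this).replaceWith(markup).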

Answer 4

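A simpler approach is to store the tags you want to ignore in an array and use toString to get its contents as comma-separated values. Since jQuery.fn.not accepts multiple comma-separated selectors, this works as needed.

Also, make sure to use the body * selector to select all descendants of the body, because the selector you are currently using selects only the body element.

Once you have all the wanted elements, you can iterate over them, run the regex against each element's textContent, and set the element's innerHTML to the result after replacing the matching parts.

Example: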

/* ------ JavaScript ----- */
const regex = /(\b)((\+?[ ]?1?\(?)([\s ]?[\s-]?[\s.]?[\s ]?)(\(?[2-9]\d{2}\)?)([\s ]?[\s-]?[\s.]?[\s ]?)([2-9]\d{2}?)([\s ]?[\s-]?[\s.]?[\s ]?)(\d{4}\)?))(\b)/;
const ignoredTags = ["script", "svg", "img"];
const wantedElements = $("body *").not(ignoredTags.toString());

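// Caveat: assigning innerHTML from textContent discards any child elements,
// so this suits flat markup like the sample below, not nested containers.
wantedElements.each(function (index, element) {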
  element.innerHTML = element.textContent.replace(regex, "<a href=\"tel:$&\">$&</a>");
});

<!----- HTML ----->
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<p>1-888-452-1505</p>
<p>1(408)5625504</p>
<p>1.408.562.5504</p>
<p>1-613-3568772</p>
<p>Prefix Text +1-954-361-5599</p>
<p>+1-954-361-5599 Post Text</p>
<script>console.log("ignore me please")</script>
<svg></svg>
<img src="http://via.placeholder.com/350x150">
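The selector-building step described at the top of this answer can be checked on its own without jQuery. This small sketch reuses only the tag list from the answer; the names notArgument and chainedSelector are hypothetical.

```javascript
const ignoredTagList = ["script", "svg", "img"];

// Array.prototype.toString joins the items with commas, like join(",").
const notArgument = ignoredTagList.toString();

// The equivalent chained :not() form as a single selector string.
const chainedSelector =
  "body " + ignoredTagList.map(tag => ":not(" + tag + ")").join("");

console.log(notArgument);     // script,svg,img
console.log(chainedSelector); // body :not(script):not(svg):not(img)
```

With jQuery loaded, $("body *").not(notArgument) and $(chainedSelector) should select the same elements, which also answers the original question about chaining several .not() calls.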

Traverse $("body") with jQuery, excluding certain tags

4 Answers: