Question

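I'm using a regular expression to find all phone numbers in a block of text. Now I want to ignore any numbers that are already wrapped in an <a> tag.

This is what I have so far: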

(\+[1-9][0-9]*(\([0-9]*\)|-[0-9]*-))?[0]?[1-9][0-9\- ]*\d

It works quite well, but I can't figure out how to ignore the wrapped numbers.

Any hints or pointers?

See it in action: https://regex101.com/r/ACyBON/3

Answer 1

I managed to solve my problem by following BurnsBA's suggestion to traverse the tree and ignore a node.

// First I create a variable with the desired regex
// While this won't catch all number formats, 
// it works decently for my use case.
var telRegex = /(\+[1-9][0-9]*(\([0-9]*\)|-[0-9]*-))?[0]?[1-9][0-9\- ]*\d/g;

// We get all <a> tags with a tel:-href
var $telLinks = $('a[href^="tel:"]');
var linkArray = [];
var content;

// Next we clone all <a href=tel:> tags to an array
$telLinks.each(function(index, link) {

    linkArray.push($(link).clone());

});

// Then we replace these <a> tags with placeholders.
$telLinks.replaceWith('<a href="#" class="tmp-tel"></a>');

// Now all manually placed <a href=tel:> tags won't 
// bother us when we run our regex
content = $('body').html();

// We wrap all identified phone numbers with an <a href=tel:> tag
// and replace the content
content = content.replace(telRegex, '<a href="tel:$&">$&</a>');
$('body').html(content);

// Then we find all of our placeholders and replace them with
// our saved elements
$('body').find('a.tmp-tel').each(function(index, value) {

    value.outerHTML = linkArray[index][0].outerHTML;

});

There are most certainly cleaner ways to do this, but this solution solved my problem.
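As one possible variation on the same idea, existing links can be shielded without jQuery or placeholders by splitting the HTML string on complete <a> elements and running the regex only on the pieces in between. This is only a rough sketch: `linkifyOutsideAnchors` is a made-up helper name, it skips every <a> element (not just `tel:` links), and the splitting pattern assumes anchors are not nested and contain no `>` inside attribute values. Like the code above, it can still match digits that appear inside attributes of other tags.

```javascript
// Same pattern as in the answer, written as a regex literal with the global flag.
const telRegex = /(\+[1-9][0-9]*(\([0-9]*\)|-[0-9]*-))?[0]?[1-9][0-9\- ]*\d/g;

// Hypothetical helper, not part of the original answer.
function linkifyOutsideAnchors(html, regex) {
  // The capturing group makes split() keep each <a>...</a> element in the
  // result, so the array alternates: text, anchor, text, anchor, ...
  const parts = html.split(/(<a\b[^>]*>[\s\S]*?<\/a>)/i);
  return parts
    .map((part, i) =>
      // Odd indexes hold existing anchors and are left untouched.
      i % 2 === 1 ? part : part.replace(regex, '<a href="tel:$&">$&</a>')
    )
    .join('');
}

// Example call with an assumed snippet of markup.
const sample = 'Call 0123 456 789 or <a href="tel:0123456789">0123 456 789</a>.';
console.log(linkifyOutsideAnchors(sample, telRegex));
```

Only the first number in the sample should get wrapped; the one already inside the link is passed through as-is.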

The full example of my solution can be viewed here:

https://jsfiddle.net/krabban/y8r7z17a/
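
For comparison, here is a minimal sketch of what walking the tree and skipping <a> nodes could look like when done directly on the DOM, so that `$('body').html(...)` never has to be rewritten as a string. Both function names are hypothetical and not taken from the fiddle. The sketch assumes a global regex (such as `telRegex` above) and it does not skip <script> or <style> elements, which a real page would probably want.

```javascript
// Hypothetical helper: collects text nodes that are not inside an <a> element.
// Works on any DOM-like node that exposes nodeType, nodeName and childNodes.
function collectTextNodesOutsideLinks(node, found = []) {
  const ELEMENT_NODE = 1;
  const TEXT_NODE = 3;
  if (node.nodeType === TEXT_NODE) {
    found.push(node);
  } else if (node.nodeType === ELEMENT_NODE && node.nodeName.toUpperCase() !== 'A') {
    // Descend into every element except <a>, whose whole subtree is ignored.
    for (const child of Array.from(node.childNodes)) {
      collectTextNodesOutsideLinks(child, found);
    }
  }
  return found;
}

// Hypothetical helper: replaces one text node with a mix of text and tel: links.
// `regex` must have the global flag (matchAll requires it); `doc` is the document.
function linkifyTextNode(textNode, regex, doc) {
  const text = textNode.nodeValue;
  const fragment = doc.createDocumentFragment();
  let last = 0;
  let matched = false;
  for (const match of text.matchAll(regex)) {
    matched = true;
    fragment.appendChild(doc.createTextNode(text.slice(last, match.index)));
    const link = doc.createElement('a');
    link.setAttribute('href', 'tel:' + match[0]);
    link.textContent = match[0];
    fragment.appendChild(link);
    last = match.index + match[0].length;
  }
  if (!matched) return; // nothing to wrap, leave the node alone
  fragment.appendChild(doc.createTextNode(text.slice(last)));
  textNode.parentNode.replaceChild(fragment, textNode);
}
```

In a browser this might be used as `collectTextNodesOutsideLinks(document.body).forEach(function (n) { linkifyTextNode(n, telRegex, document); });`. Because the text nodes are collected before any of them is replaced, the newly created links are not visited again.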
