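I want to change the phone numbers in my web page without breaking the page's HTML. What is the right way to do that?
I have read this answer: RegEx match open tags except XHTML self-contained tags
However, there is a Skype plugin that somehow replaces numbers in web pages. How does it do that?
Here is my code: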
var formats = '(xxx) xxx-xxxx|(xxx)xxx-xxxx|xxx-xxx-xxxx|xxx.xxx.xxxx|xxx xxx xxxx';
var str = '('+formats.replace(/([\(\)\+\-])/g, '\\$1').replace(/x/g,'\\d') + ')';
var r = RegExp(str,'g');
document.body.innerHTML=document.body.innerHTML.replace(r,'<a style="color:#07C !important; font-size:100% !important;" href="https://call.com/number=$1">$1</a>');
The problem I am facing is that it also touches tag attributes inside the body. For example:
<a href="https://stackoverflow.com/a/4338544/1269037">validate phone numbers properly</a>
is replaced with this broken HTML:
<a href="https://stackoverflow.com/a/<a style=" color:#07c="" !important;="" font-size:100%="" !important;"="">4338544/1269</a>
and the markup around it gets messed up as well.
I think the RegEx pattern is poorly defined.
Answer 0 (score: 1)
Using regular expressions to parse and manipulate HTML code is a nearly impossible task. There will always be some edge cases that get missed.
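To make the failure concrete, here is a minimal sketch that runs the question's pattern over raw HTML strings. The `loose` expression is built exactly as in the question; `strict` is a hypothetical variant that also escapes the dot, which is only a guess at the intended behavior. The sample strings come from the question and from the test markup at the end of this answer.

```javascript
const formats = '(xxx) xxx-xxxx|(xxx)xxx-xxxx|xxx-xxx-xxxx|xxx.xxx.xxxx|xxx xxx xxxx';

// Same construction as in the question: the dot is NOT escaped,
// so "xxx.xxx.xxxx" accepts any character between the digit groups.
const loose = new RegExp(
  '(' + formats.replace(/([\(\)\+\-])/g, '\\$1').replace(/x/g, '\\d') + ')', 'g');

// Hypothetical variant (an assumption, not the asker's code): escape the dot too.
const strict = new RegExp(
  '(' + formats.replace(/([().+-])/g, '\\$1').replace(/x/g, '\\d') + ')', 'g');

const link = '<a href="https://stackoverflow.com/a/4338544/1269037">validate phone numbers properly</a>';
const input = '<input id="inp" type="text" value="123-321-1231">';

const looseOnLink = link.match(loose);    // matches inside the href value
const strictOnLink = link.match(strict);  // no match once the dot is escaped
const strictOnInput = input.match(strict); // still matches inside an attribute

console.log(looseOnLink, strictOnLink, strictOnInput);
```

Tightening the pattern removes this particular false match, but a string-level replace still cannot tell an attribute value from visible text.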
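A more reasonable approach is to use the Document Object Model and walk through all text nodes, processing the text of each one individually. Where there is a match, use the DOM again to add the link element.
Here is a working snippet that uses a TreeWalker: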
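The per-node text processing can be tried on a plain string first, without any DOM. The sketch below relies on the fact that `String.prototype.split` keeps the text captured by a capturing group, so text and phone numbers alternate in the result. The `phonePattern` expression and the `segment` helper are simplified, hypothetical stand-ins and not part of the original code.

```javascript
// Simplified stand-in pattern (an assumption): three digit groups
// separated by a hyphen, dot or space, delimited by word boundaries.
const phonePattern = /(\b\d{3}[-. ]\d{3}[-. ]\d{4}\b)/;

// Hypothetical helper: splits text into alternating text / phone segments.
function segment(text) {
  // With a capturing group, split() returns
  // [text, phone, text, phone, ..., text]: odd indexes are the matches.
  return text.split(phonePattern)
    .map((value, i) => i % 2 === 1
      ? { type: 'phone', value, digits: value.replace(/\D/g, '') }
      : { type: 'text', value })
    .filter(seg => seg.value.length > 0); // drop empty text between matches
}

const segments = segment('Call 473-299-8154 or 678.269-1514 today');
console.log(segments);
```

Each `phone` segment would become a link element and each `text` segment a text node, which is the step the DOM version performs with `insertBefore`.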
// Prepare search expression:
var formats = ['(xxx) xxx-xxxx',
               '(xxx)xxx-xxxx',
               'xxx-xxx-xxxx'];
var str = formats.join('|')              // join patterns with the OR operator
    .replace(/[()+]/g, '\\$&')           // escape special characters
    .replace(/-/g, '[-. ]')              // hyphen can be space or dot as well
    .replace(/(^|[|])x/g, '$1\\bx')      // require first digit to be start of a word
    .replace(/x($|[|])/g, 'x\\b$1')      // require last digit to be end of a word
    .replace(/x/g, '\\d')                // set digit placeholders
    ;
var r = RegExp('(' + str + ')', '');
var node;
// create a walker for visiting all text nodes in the document
var walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT,
                                       null, false);
while (node = walker.nextNode()) {
    // Do not process SCRIPT, OPTION and some other tag contents
    // You might need to extend this black-list:
    if (node.parentNode.tagName.search(
            /SCRIPT|SELECT|OPTION|BUTTON|TEXTAREA/) === -1) {
        // split text of node into parts <non-phone><phone><non-phone>...
        var parts = node.nodeValue.split(r);
        while (parts.length > 1) {
            var txt = parts.shift();
            if (txt.length) {
                // insert a text node for the non-phone text:
                node.parentNode.insertBefore(document.createTextNode(txt), node);
            }
            // get phone number, create a link for it
            var phone = parts.shift();
            var a = document.createElement('a');
            // set hyperlink, and pass digits only as URL argument:
            a.setAttribute('href',
                'https://call.com/number=' + phone.replace(/[^\d]/g, ''));
            a.setAttribute('style',
                'color:#07C !important; font-size:100% !important;');
            a.textContent = phone;
            // insert link into the document
            node.parentNode.insertBefore(a, node);
        }
        // reduce the original node to the ending non-phone part
        node.nodeValue = parts[0];
    }
}
This is a test.
Following are valid:<br/>
<ul>
<li>Please dial:473-299-8154</li>
<li>or 678.269-1514, during weekends</li>
<li>Private (732 939 8549)</li>
<li>Back-up =(673) 137.4892</li>
</ul>
Do not match any of these:<br/>
<ul>
<li>a473-299-8154 because of a</li>
<li>473-299-81549 because of last 9</li>
<li>473/299.8154 because of slash</li>
</ul>
Some elements whose content should not be parsed:
<form id="myform">
<select id="sel">
<option value="phone">123.456.7890</option>
</select>
<input id="inp" type="text" value="123-321-1231">
<button>123-321-1231</button><br/>
<textarea>Links are not allowed in textareas:
123-321-1231</textarea>
</form>
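The search expression can also be checked on its own, outside the browser, against the sample lines from the test markup above. In this sketch, `buildPhoneRegExp` and `findPhone` are hypothetical helper names; the chain of replacements is copied from the snippet.

```javascript
// Hypothetical helper wrapping the replacement chain from the snippet above.
function buildPhoneRegExp(formats) {
  const source = formats.join('|')     // join patterns with the OR operator
    .replace(/[()+]/g, '\\$&')         // escape special characters
    .replace(/-/g, '[-. ]')            // hyphen can be space or dot as well
    .replace(/(^|[|])x/g, '$1\\bx')    // first digit must start a word
    .replace(/x($|[|])/g, 'x\\b$1')    // last digit must end a word
    .replace(/x/g, '\\d');             // set digit placeholders
  return new RegExp('(' + source + ')');
}

const r = buildPhoneRegExp(['(xxx) xxx-xxxx', '(xxx)xxx-xxxx', 'xxx-xxx-xxxx']);

// Hypothetical helper: returns the first phone number in the text, or null.
function findPhone(text) {
  const m = r.exec(text);
  return m ? m[1] : null;
}

const samples = [
  'Please dial:473-299-8154',
  'or 678.269-1514, during weekends',
  'Private (732 939 8549)',
  'Back-up =(673) 137.4892',
  'a473-299-8154 because of a',
  '473-299-81549 because of last 9',
  '473/299.8154 because of slash'
];
const results = samples.map(findPhone);
console.log(results);
```

The first four samples yield a number and the last three yield `null`, matching the "valid" and "do not match" lists in the markup.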