Regex: match everything except the contents of HTML tags

Time: 2016-07-14 05:04:46

Tags: javascript regex replace brackets

String:

This <b>is</b> <i>a random</i> <u>String</u> which contains <a href="www.example.com">stuff</a>.

Result (replaced with 0):

00000<b>00</b>0<i>00000000</i>0<u>000000</u>000000000000000<a href="www.example.com">stuff</a>0

So I want everything replaced except the brackets (and their contents) and whatever is enclosed by a bracket pair that starts with a.

The furthest I have got only does half the job.

2 answers:

Answer 0: (score: 1)

You can iterate over the child nodes of the <code> element with a for loop and, for every node whose .tagName is not "A", replace its .textContent using .replace() with the RegExp /./g, which swaps each character for 0.

var div = document.querySelector("div");
var nodes = div.querySelector("code").childNodes;
for (var i = 0; i < nodes.length; i++) {
  // Text nodes have no tagName, so they are replaced as well; only <a> is skipped.
  if (nodes[i].tagName !== "A") {
    nodes[i].textContent = nodes[i].textContent.replace(/./g, "0");
  }
}

<div>
  <code>This <b>is</b> <i>a random</i> <u>String</u> which contains <a href="www.example.com">stuff</a>.</code>
</div>


Answer 1: (score: 0)

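So the task is to match the content outside the tags, with one exception, and then replace that content with the same number of a single character.

To do this with a regex, we need to pass a function to replace: a function that builds the new string from the capture groups the regex found.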

var str = 'This <b>is</b> <i>a random</i> <u>String</u> which contains <a href="www.example.com">stuff</a>.';

// Three capture groups plus one negative lookahead that skips the <a> tags.
// If there were no text before the first tag and after the last one,
// a single capture group would be enough.
// As it is, the start and end of the string also have to be captured, in $1 and $3.

var re = /(^|[>])(?![^<]*?<\/a)(.*?)([<]|$)/gi; 

str = str.replace(re, function (m,g1,g2,g3) {
  return g1+"0".repeat(g2.length)+g3;
});

alert(str);
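
To see what each capture group actually holds, a small sketch like the following may help. It runs the same regex over a shorter string and records the groups instead of replacing anything. The names `sample` and `captured` are hypothetical and only used for illustration.

```javascript
// Shorter, made-up input with the same shape as the string above.
var sample = 'x <b>y</b> <a href="#">z</a>.';
var re = /(^|[>])(?![^<]*?<\/a)(.*?)([<]|$)/gi;

// Collect [g1, g2, g3] for every match; returning m leaves the text unchanged.
var captured = [];
sample.replace(re, function (m, g1, g2, g3) {
  captured.push([g1, g2, g3]);
  return m;
});

console.log(JSON.stringify(captured));
// prints [["","x ","<"],[">","y","<"],[">"," ","<"],[">",".",""]]
```

The text "z" never shows up in g2, because the negative lookahead rejects the ">" that is directly followed by text ending in "</a".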

Note:

repeat is an ECMAScript 6 method, but most browsers should support it by now.
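
If the code has to run somewhere without repeat, one possible fallback is to build the run of zeros with Array and join. This is only a sketch: `repeatChar` and `maskOutsideTags` are hypothetical helper names, not part of the answer above.

```javascript
// Hypothetical helper: `count` copies of `ch` without String.prototype.repeat.
// Joining an array of count + 1 empty slots yields exactly `count` separators.
function repeatChar(ch, count) {
  return new Array(count + 1).join(ch);
}

// Hypothetical wrapper around the same regex and replacer as above.
function maskOutsideTags(input) {
  var re = /(^|[>])(?![^<]*?<\/a)(.*?)([<]|$)/gi;
  return input.replace(re, function (m, g1, g2, g3) {
    return g1 + repeatChar("0", g2.length) + g3;
  });
}

console.log(maskOutsideTags('a<b>cd</b>'));            // prints 0<b>00</b>
console.log(maskOutsideTags('<a href="x">keep</a>!')); // prints <a href="x">keep</a>0
```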