Regex to escape HTML tags

Time: 2015-04-01 08:22:44

Tags: javascript html regex

I want to extract only the names from a string of HTML markup. This is the result I want:

  • Farm 1
  • STAGING
  • STAGING_SYSTEM_10

What kind of regular expression can I use?

<div class='singleNode'><i class='fa fa-cogs'></i><span>Farm 1<span class='badge badge-primary'></span><span></div>

 <div class='singleNode'><i class='fa fa-cubes'></i><span>STAGING<span class='badge badge-primary'></span><span></div>

<div class='singleNode'><i class='fa fa-desktop'></i><span>STAGING_SYSTEM_10<span class='badge badge-primary'></span><span></div>

1 answer:

Answer 0 (score: 5)

If you must use a regular expression, here is sample code:

var re = /<div[^>]*?>(?:<(\S+)[^>]*?>[^<]*?<\/\1>)+<span[^>]*?>([^<]*?)(?=<span )/g;
var str = '<div class=\'singleNode\'><i class=\'fa fa-cogs\'></i><span>Farm 1<span class=\'badge badge-primary\'></span><span></div>\n\n <div class=\'singleNode\'><i class=\'fa fa-cubes\'></i><span>STAGING<span class=\'badge badge-primary\'></span><span></div>\n\n<div class=\'singleNode\'><i class=\'fa fa-desktop\'></i><span>STAGING_SYSTEM_10<span class=\'badge badge-primary\'></span><span></div>';
var m;
 
while ((m = re.exec(str)) !== null) {
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    // m[1] is the tag name of the icon element that was skipped (e.g. "i").
    // m[2] is the 2nd capture group: the name text that follows the first <span> in the div.
    alert(m[2])
}
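
If the goal is to collect the names rather than alert them one at a time, the loop above can be wrapped in a function. The following is a minimal sketch, assuming the markup always has the shape shown in the question (a div, an empty icon element, then a span holding the name). The names extractNames and sample are made up for this illustration, and the pattern is a simplified variant of the one above, not a general HTML parser.

```javascript
// Hypothetical helper: return the name of every div in an HTML string.
function extractNames(html) {
  // Match the opening <div ...>, skip any empty <tag ...></tag> pairs
  // (the icon), then capture the text right after the first <span ...>.
  var re = /<div[^>]*>(?:<(\w+)[^>]*><\/\1>)*<span[^>]*>([^<]*)/g;
  var names = [];
  var m;
  while ((m = re.exec(html)) !== null) {
    names.push(m[2].trim());
  }
  return names;
}

// Sample input copied from the question.
var sample =
  "<div class='singleNode'><i class='fa fa-cogs'></i><span>Farm 1<span class='badge badge-primary'></span><span></div>\n\n" +
  " <div class='singleNode'><i class='fa fa-cubes'></i><span>STAGING<span class='badge badge-primary'></span><span></div>\n\n" +
  "<div class='singleNode'><i class='fa fa-desktop'></i><span>STAGING_SYSTEM_10<span class='badge badge-primary'></span><span></div>";

console.log(extractNames(sample));
// → [ 'Farm 1', 'STAGING', 'STAGING_SYSTEM_10' ]
```

Every match consumes at least the opening div tag, so the loop does not need the zero-length-match guard on re.lastIndex.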

If you can parse it with the DOM, use the following code:

var input = document.getElementsByTagName("div");
for (var i = 0; i < input.length; i++)
{
   // The icon and badge elements are empty, so textContent is just the name.
   alert(input[i].textContent);
}
<body>
<div class='singleNode'><i class='fa fa-cogs'></i><span>Farm 1<span class='badge badge-primary'></span><span></div>

 <div class='singleNode'><i class='fa fa-cubes'></i><span>STAGING<span class='badge badge-primary'></span><span></div>

<div class='singleNode'><i class='fa fa-desktop'></i><span>STAGING_SYSTEM_10<span class='badge badge-primary'></span><span></div>
  </body>
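
The DOM loop above reads every div on the page. A slightly narrower variant is sketched here, assuming the nodes of interest all carry the singleNode class as in the markup above. The helper name collectNames is hypothetical, and root is assumed to be document or any element that provides getElementsByClassName.

```javascript
// Hypothetical helper: gather the trimmed text of each .singleNode under root.
function collectNames(root) {
  var nodes = root.getElementsByClassName("singleNode");
  var names = [];
  for (var i = 0; i < nodes.length; i++) {
    // textContent concatenates the text of all descendants; the icon and
    // badge elements are empty, so only the name remains.
    names.push(nodes[i].textContent.trim());
  }
  return names;
}

// In a browser, pass the page's document; elsewhere this block is skipped.
if (typeof document !== "undefined") {
  console.log(collectNames(document));
}
```

Filtering by class keeps unrelated div elements elsewhere on the page out of the result.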