RegEx replace only when not between <tag> and </tag>

Date: 2014-11-14 10:51:52

Tags: javascript regex replace

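We use RegExp replace to search a text for terms and wrap each hit in <dfn>. This works like a charm until one term consists of several words and gets wrapped, and another term consists of just one of those words.

Here is an example of such a case, with the following terms:
"Human Design System", "Design".

Our code first finds "Human Design System" and wraps it in <dfn> tags, then finds "Design" inside it and wraps that in <dfn> tags as well.

The result becomes: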

<dfn>Human <dfn>Design</dfn> System</dfn>

whereas the result we want is:

<dfn>Human Design System</dfn>

So what we need is a way to check whether a term is already wrapped in <dfn></dfn> and simply skip the replacement in those cases.

This is the code we use now:
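The nesting described above can be reproduced with a minimal sketch. It uses plain string splitting instead of the RegExp code from the question, and the variable names are only illustrative:

```javascript
// Terms are applied one after another, longest first, as in the question.
var sampleTerms = ['Human Design System', 'Design'];
var sampleText = 'Human Design System';

// Each pass wraps every occurrence of the term, including occurrences
// that sit inside a <dfn> produced by an earlier pass.
var nested = sampleTerms.reduce(function (acc, term) {
    return acc.split(term).join('<dfn>' + term + '</dfn>');
}, sampleText);

console.log(nested); // <dfn>Human <dfn>Design</dfn> System</dfn>
```

The second pass has no knowledge of the first one, which is why the inner "Design" gets wrapped again.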

//Definition of variables, please note that ~open~ is replaced by <dfn> and ~close~ is replaced by </dfn> later
var TPL_TAG_OPEN = '~open~',
    TPL_TAG_CLOSE = '~close~',
    ESCAPERS = '[\\s!:\.\;,%\"\'\\(\\)\\{\\}]';

//This is the RegExp that prepares the content
//term is the term that we are looking for and line is the text we are searching in

var re = new RegExp("^("+term+")(" + ESCAPERS + ")", modifier);
line = line.replace(re, TPL_TAG_OPEN + "$1" + TPL_TAG_CLOSE + "$2");

re = new RegExp("(" + ESCAPERS + ")("+term+")$", modifier);
line = line.replace(re, "$1" + TPL_TAG_OPEN + "$2" + TPL_TAG_CLOSE);

re = new RegExp("(" + ESCAPERS + ")("+term+")(" + ESCAPERS + ")", modifier);
line = line.replace(re, "$1" + TPL_TAG_OPEN +"$2" + TPL_TAG_CLOSE + "$3");

Input:

<dfn>Human Design System</dfn> Human Design Design Human Testar test Human Design Test 
Human Test Design Test Test Design <dfn>Human Design System</dfn> Test Human Design

Current result:

<dfn>Human <dfn>Design</dfn> System</dfn> Human <dfn>Design</dfn> <dfn>Design</dfn>
Human Testar test Human <dfn>Design</dfn> Test Human Test <dfn>Design</dfn> Test Test
<dfn>Design</dfn> <dfn>Human <dfn>Design</dfn> System</dfn> Test Human <dfn>Design</dfn>

Desired result:

<dfn>Human Design System</dfn> Human <dfn>Design</dfn> <dfn>Design</dfn> 
Human Testar test Human <dfn>Design</dfn> Test Human Test <dfn>Design</dfn> 
Test Test <dfn>Design</dfn> <dfn>Human Design System</dfn> Test Human <dfn>Design</dfn>

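Note:

We have managed to check whether the term is already wrapped in the tags, but only by using the RegExp .test function. Doing it that way stops processing of the line altogether, so the rest of the text is never checked. The code for that is: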

var pattern = RegExp("^("+TPL_TAG_OPEN+").*((?!"+TPL_TAG_CLOSE+").).*("+term+")*$");

if (pattern.test(line))
     return false;
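One way around the all-or-nothing check is to split the line into wrapped and unwrapped segments and only touch the unwrapped ones. The following is a minimal sketch, not the code from the question: the function name is hypothetical, and the term is matched as a plain substring without the ESCAPERS boundary checks.

```javascript
var TPL_TAG_OPEN = '~open~',
    TPL_TAG_CLOSE = '~close~';

// Hypothetical helper: wraps `term` only in the parts of `line`
// that are not already between ~open~ and ~close~.
function wrapOutsideMarkers(line, term) {
    // The capture group makes split() keep the wrapped segments;
    // they end up at the odd indexes of the resulting array.
    var splitter = new RegExp('(' + TPL_TAG_OPEN + '.*?' + TPL_TAG_CLOSE + ')');
    return line.split(splitter).map(function (part, index) {
        if (index % 2 === 1) {
            return part; // already wrapped, pass through unchanged
        }
        // Plain substring replacement (assumption: no word-boundary check).
        return part.split(term).join(TPL_TAG_OPEN + term + TPL_TAG_CLOSE);
    }).join('');
}

var checked = wrapOutsideMarkers(
    '~open~Human Design System~close~ Human Design', 'Design');
console.log(checked);
```

Unlike the .test approach, an existing wrapped segment only protects itself; the rest of the line is still processed.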

Final solution:

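var ESCAPERS = '[\\s!:\.\;,%\"\'\\(\\)\\{\\}]';
var terms = ['Design','Human Design System','This and That...'];
terms = terms.join('|');
var re = new RegExp("(" + ESCAPERS + "|^)(" + terms + ")(" + ESCAPERS + "|$)",'gi');
// filter() had no argument in the post; keeping only text nodes (nodeType 3)
// is an assumption based on the use of nodeValue below.
nodes.contents().filter(function(){
          return this.nodeType === 3;
     })
     .each(function(){
          $(this).replaceWith(this.nodeValue.replace(re, '$1<dfn class=\"thesaurus\">$2</dfn>$3'));
     });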
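One thing to watch with a pattern shaped like this: the trailing (ESCAPERS|$) group consumes the separator, so in "Design Design" the second hit has no separator left in front of it and is skipped. A possible variation, sketched below without jQuery, turns the trailing group into a lookahead. The helper names are hypothetical, and escaping the terms and sorting them longest-first are additions that are not part of the original solution.

```javascript
// Same separators as in the post, written as a character class.
var SEPARATORS = '[\\s!:.;,%"\'(){}]';

// Hypothetical helper: escape characters that are special in a RegExp.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every listed term found in `text`.
function wrapTerms(text, termList) {
    var alternation = termList.slice()
        .sort(function (a, b) { return b.length - a.length; }) // longest first
        .map(escapeRegExp)
        .join('|');
    // The lookahead (?=...) checks the trailing separator without consuming it,
    // so it can still serve as the leading separator of the next match.
    var pattern = new RegExp(
        '(' + SEPARATORS + '|^)(' + alternation + ')(?=' + SEPARATORS + '|$)', 'gi');
    return text.replace(pattern, '$1<dfn class="thesaurus">$2</dfn>');
}

var wrappedText = wrapTerms('Human Design Design Human Design System.',
                            ['Design', 'Human Design System']);
console.log(wrappedText);
```

This sketch works on a plain string; in the jQuery version it would be applied to each text node's nodeValue.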

3 answers:

Answer 0 (score: 1)

Do it all in one go:

var s = 'Human Design System Human Design Design Human Testar test ' +
        'Human Design Test Human Test Design Test Test Design Human ' +
        'Design System Test Human Design';

// Alternative matches are tried in sequence.
var t = s.replace(/Human Design System|Design/g, '<dfn>$&</dfn>');

console.log(t);
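Because the alternatives are tried in sequence, their order matters whenever one term is a prefix of another. The sketch below illustrates this with "Design System", a hypothetical term that is not in the question:

```javascript
var phrase = 'Design System';

// Shorter alternative first: it matches immediately and the longer one is never tried.
var shortFirst = phrase.replace(/Design|Design System/g, '<dfn>$&</dfn>');

// Longer alternative first: the whole phrase is wrapped.
var longFirst = phrase.replace(/Design System|Design/g, '<dfn>$&</dfn>');

console.log(shortFirst); // <dfn>Design</dfn> System
console.log(longFirst);  // <dfn>Design System</dfn>
```

When the pattern is built from a list of terms, sorting the list by length in descending order before joining it with | is one way to keep this behaviour predictable.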

Or, do it incrementally:

var s = 'Human Design System Human Design Design Human Testar test ' +
        'Human Design Test Human Test Design Test Test Design Human ' +
        'Design System Test Human Design';

var adddfn = function(s, term){
    return s.replace(/(.*?)(<dfn>.*?<\/dfn>|$)/g, function(all, one, two){
        return one.replace(RegExp(term, 'g'), '<dfn>$&</dfn>') + two;
    });
};

var terms = ['Human Design System', 'Design'];

var t = terms.reduce(function(result, term){
    return adddfn(result, term);
}, s);

console.log(t);

Answer 1 (score: 0)

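Here is another way, although it cannot be done with a single regular expression.

Find Design inside an existing dfn with this pattern: (<dfn>(?:(?!</?dfn>).)* )design( (?:(?!</?dfn>).)*</dfn>)

Replace it with $1&tmp;$2

Then find the remaining occurrences of Design and replace them with <dfn>$&</dfn>

Now match &tmp; inside the dfn: (<dfn>(?:(?!</?dfn>).)* )&tmp;( (?:(?!</?dfn>).)*</dfn>)

Change it to $1Design$2

Now the problem is solved.

You can use this in your code above if you want.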
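The three steps can be sketched in JavaScript roughly as follows. This is only an illustration under a few assumptions: the function name is hypothetical, the placeholder &tmp; is assumed not to occur in the text, and, as in the patterns above, the protected word must have a space on both sides inside the dfn.

```javascript
// Hypothetical helper implementing the hide / wrap / restore steps for "Design".
function wrapDesignOutsideDfn(text) {
    // Step 1: hide "Design" where it sits inside an existing <dfn>...</dfn>.
    var hidden = text.replace(
        /(<dfn>(?:(?!<\/?dfn>).)* )Design( (?:(?!<\/?dfn>).)*<\/dfn>)/g,
        '$1&tmp;$2');

    // Step 2: every "Design" that is still visible is outside a dfn, so wrap it.
    var wrapped = hidden.replace(/Design/g, '<dfn>$&</dfn>');

    // Step 3: put the hidden occurrences back. Placeholders only exist
    // inside dfn elements, so a global replace is enough here.
    return wrapped.replace(/&tmp;/g, 'Design');
}

var restored = wrapDesignOutsideDfn(
    '<dfn>Human Design System</dfn> Design <dfn>Human Design System</dfn>');
console.log(restored);
```

If a dfn could contain the word more than once, step 1 would have to be repeated until no further match is found, since each pass hides only one occurrence per element.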

Answer 2 (score: 0)

I would simply match the tags that already exist and pass them through unchanged:

str = "<dfn>Human Design System</dfn> Human Design Design Human Testar test Human Design Test Human Test Design Test Test Design <dfn>Human Design System</dfn> Test Human Design";

str = str.replace(/(<dfn>.+?<\/dfn>)|(Human Design System|Design)/g, function(_, $1, $2) {
    return $1 || "<dfn>" + $2 + "</dfn>";
});

alert(str)