Regex to replace a repeated word with "and"

Time: 2016-05-30 22:00:08

Tags: javascript regex

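In the following example sentence:

Green shirt green hat

Is it possible to use a regular expression to detect the 2 identical words and replace the second one with "and", so that it becomes:

Green shirt and hat

A harder example string. Here the first of the identical words needs to be replaced:

You are an artistically gifted musically gifted individual

which should become:

You are an artistically and musically gifted individual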

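5 answers:

Answer 0 (score: 5)

Description

First of all, a regular expression is not the ideal solution here, but I trust you have your reasons for using one.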

((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$

Replace with: \1and\4

Summary

This regular expression finds two identical words in a string and replaces the second one with "and".

Example

Sample text

Green shirt green hat
Green shirt greenish hat
You are an artistically gifted musically gifted individual

After replacement

Green shirt and hat
Green shirt greenish hat
You are an artistically gifted musically and individual

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      \b                       the boundary between a word char (\w)
                               and something that is not a word char
----------------------------------------------------------------------
      [a-z]{1,}                any character of: 'a' to 'z' (at least
                               1 times (matching the most amount
                               possible))
----------------------------------------------------------------------
      \b                       the boundary between a word char (\w)
                               and something that is not a word char
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
----------------------------------------------------------------------
    \2                       what was matched by capture \2
----------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  (                        group and capture to \4:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \4
----------------------------------------------------------------------
  $                        before an optional \n, and the end of a
                           "line"
----------------------------------------------------------------------
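For readers who want to try this expression in JavaScript, here is a minimal sketch, not the answerer's own code. It assumes the i flag (so that "Green" and "green" count as the same word), it writes the group references in JavaScript's $1/$4 replacement syntax instead of \1/\4, and the names repeatedWord and replaceSecondOccurrence are made up for illustration.

```javascript
// Pattern from the answer above; the "i" flag is an assumption, added so that
// "Green" and "green" are treated as the same word.
const repeatedWord = /((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$/i;

// Hypothetical helper: replaces the second occurrence of the first repeated word.
function replaceSecondOccurrence(text) {
  // JavaScript writes group references in the replacement string as $1 and $4.
  return text.replace(repeatedWord, "$1and$4");
}

console.log(replaceSecondOccurrence("Green shirt green hat"));
// Green shirt and hat
console.log(replaceSecondOccurrence("Green shirt greenish hat"));
// Green shirt greenish hat (unchanged, "greenish" is a different word)
console.log(replaceSecondOccurrence("You are an artistically gifted musically gifted individual"));
// You are an artistically gifted musically and individual
```

The second sample stays untouched because the \b on both sides of \2 requires a whole-word repetition.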

Extra credit

Although the OP does not address it, if the words in question use characters outside [a-z], you can replace [a-z] with [a-z]|[^\x00-\x7F] so that non-English characters are matched as well. However, we then need to change \b\2\b to (?<=\s|^)\2(?=\s|$) to make sure the match is still correct.

((\b(?:[a-z]|[^\x00-\x7F]){1,}\b).*?)((?<=\s|^)\2(?=\s|$))(.*)$

Live demo: https://regex101.com/r/yG3yM6/2
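A hedged JavaScript sketch of the extended pattern follows. The lookbehind (?<=...) needs an engine with ES2018 support, and the i flag and the helper name replaceSecondWord are assumptions. Because \b in JavaScript is defined in terms of ASCII word characters, a word that begins or ends with a non-ASCII character may still fail to match, so the sample uses a word with the non-ASCII letter in the middle.

```javascript
// Extended pattern from the answer; requires lookbehind support (ES2018 or later).
// The "i" flag is an assumption.
const repeatedWordExtended =
  /((\b(?:[a-z]|[^\x00-\x7F]){1,}\b).*?)((?<=\s|^)\2(?=\s|$))(.*)$/i;

// Hypothetical helper name, used only for this illustration.
function replaceSecondWord(text) {
  return text.replace(repeatedWordExtended, "$1and$4");
}

// "na\u00efve" is "naïve": the non-ASCII letter sits inside the word.
console.log(replaceSecondWord("na\u00efve shirt na\u00efve hat"));
// naïve shirt and hat
console.log(replaceSecondWord("Green shirt green hat"));
// Green shirt and hat
```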

Answer 1 (score: 2)

By modifying this answer, you can do this:

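console.log( myFunc("Green shirt green hat") );            // Green shirt and hat
console.log( myFunc("Big red eyed rabbits red Ferrari") ); // Big red eyed rabbits and Ferrari

function myFunc(str) {
    // $1 is the first word and $2 the text in between;
    // the repeated word (group 3) is replaced with "and".
    return str.replace(/\b(\w+)(.+)(\1)\b/gi, "$1$2and");
}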
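One observation, offered as a sketch rather than a correction: the pattern above has no \b in front of the backreference and uses a greedy .+, so it can also match the tail of a longer word. The comparison below uses a hypothetical stricter variant; the names original and strict and the sample string are made up for illustration.

```javascript
// Pattern from the answer.
const original = /\b(\w+)(.+)(\1)\b/gi;
// Hypothetical stricter variant: word boundaries on both sides of each word,
// and a lazy .+? so that the nearest repetition is used.
const strict = /\b(\w+)\b(.+?)\b\1\b/gi;

const sample = "Green shirt evergreen hat";
console.log(sample.replace(original, "$1$2and")); // Green shirt everand hat
console.log(sample.replace(strict, "$1$2and"));   // Green shirt evergreen hat

console.log("Green shirt green hat".replace(strict, "$1$2and")); // Green shirt and hat
```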

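Answer 2 (score: 1)

You can use the RegExp /(\bgreen\b)/ig, where green is the word to match, together with String.prototype.replace(), and check p2 inside the replacement function

p1, p2, ... The nth parenthesized submatch string, provided the first argument to replace() was a RegExp object. (Corresponds to $1, $2, etc. above.) For example, if /(\a+)(\b+)/ was given, p1 is the match for \a+, and p2 for \b+.

to replace green with and.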

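var str = "Green shirt green hat green";
// The regex has a single capture group, so replace() calls the function with
// (match, p1, offset, string). The parameter named p2 therefore receives the
// match offset, which is 0 (falsy) only for a match at the start of the string.
var re = function(m, p1, p2, index) {
  return p2 ? "and" : m;
};
str = str.replace(/(\bgreen\b)/ig, re);
console.log(str); // Green shirt and hat and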
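Since the callback above keeps a match only when its offset is 0, the first occurrence is preserved only if it starts the string. A minimal alternative sketch, not from the answer, counts matches instead; the helper name replaceLaterOccurrences is hypothetical, and it assumes the word contains no regex metacharacters.

```javascript
// Hypothetical helper, not from the answer: keep the first occurrence of `word`
// and replace every later whole-word occurrence with "and".
function replaceLaterOccurrences(text, word) {
  // Assumption: `word` contains no regex metacharacters, so it is not escaped.
  const pattern = new RegExp("\\b" + word + "\\b", "gi");
  let seen = 0; // number of matches encountered so far
  return text.replace(pattern, (match) => (seen++ === 0 ? match : "and"));
}

console.log(replaceLaterOccurrences("Green shirt green hat green", "green"));
// Green shirt and hat and
console.log(replaceLaterOccurrences("A green shirt green hat", "green"));
// A green shirt and hat
```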

Answer 3 (score: 0)

You can use the following:

/(\b([^\s]+)\b.*?)\b\2\b/gi

Test cases:

var regex = /(\b([^\s]+)\b.*?)\b\2\b/gi;
'Green shirt green hat with blue shoes blue glasses'.replace(regex, '$1and')
  === 'Green shirt and hat with blue shoes and glasses';
'Orange colored oranges orange belts'.replace(regex, '$1and')
  === 'Orange colored oranges and belts';


Answer 4 (score: 0)

The answer to your first example, which I read as replacing the second occurrence of the first repeated word with 'and', is:

var str = 'Green shirt green hat';

str = str.replace(/(\b\S+\b)(.+?)(\b\1\b)/i, '$1$2and');

console.log(str);

The answer to your second example, which I read as replacing the first occurrence of the repeated word with 'and', is:

var str = 'You are an artistically gifted musically gifted individual';

str = str.replace(/(\b\S+\b)(.+?)(\b\1\b)/i, 'and$2$1');

console.log(str);