How to find a word within range of another word using regex?

Time: 2014-10-30 02:39:07

Tags: java regex

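Suppose I have the string "word3 word2 word3 word4 word5 word3 word7 word8 word9 word10".

I want to find every "word3" that is within 3 words of "word5", so the second and third occurrences of "word3" should match.

What regex or logic should I use? I can think of two ways to approach it, but both look very inefficient to me.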

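1 answer:

Answer 0: (score: 1)

You didn't define what a word is, so I'll treat it as a sequence of word characters. Here is an approach that doesn't rely exclusively on regex: split the String and iterate over the pieces.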

String str = "word3 word2 word3 word4 word5 word3 word7 word8 word9 word10";
String[] words = str.split("\\W+");
for (int i = 0; i < words.length; i++) {
    // Iterate in an inner loop for nearby elements if "word5" is found.
    if (words[i].equals("word5"))
        for (int j = Math.max(0, i - 3); j <= Math.min(words.length - 1, i + 3); j++)
            if (words[j].equals("word3")) {
                // Do something with words[j] to show that you know it exists.
                // Or use it right here instead of assigning this debug value.
                words[j] = "foo";
            }
}
// Prints the result.
for (final String word : words)
    System.out.println(word);
  

Code Demo STDOUT:

word3
word2
foo
word4
word5
foo
word7
word8
word9
word10
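
If you would rather collect the matches than overwrite them, the same idea can be wrapped in a small helper. This is only a sketch: the class name `NearbyWords`, the method `findNear` and its parameters are names I made up for illustration, and it assumes the same `\W+` definition of a word separator as above.

```java
import java.util.ArrayList;
import java.util.List;

public class NearbyWords {
    /**
     * Returns the indices (into the split word array) of every word equal to
     * target that has a word equal to anchor at most maxDistance words away.
     * Hypothetical helper, not part of the original answer.
     */
    public static List<Integer> findNear(String text, String target, String anchor, int maxDistance) {
        String[] words = text.split("\\W+");
        List<Integer> hits = new ArrayList<>();
        for (int j = 0; j < words.length; j++) {
            if (!words[j].equals(target)) {
                continue;
            }
            // Inclusive window of maxDistance words on each side of words[j].
            int from = Math.max(0, j - maxDistance);
            int to = Math.min(words.length - 1, j + maxDistance);
            for (int i = from; i <= to; i++) {
                if (words[i].equals(anchor)) {
                    hits.add(j);
                    break; // one nearby anchor is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String str = "word3 word2 word3 word4 word5 word3 word7 word8 word9 word10";
        System.out.println(findNear(str, "word3", "word5", 3)); // prints [2, 5]
    }
}
```

Indices 2 and 5 are the second and third occurrences of word3. The first one (index 0) is four words away from word5, so it is left out. Because nothing is modified while scanning, earlier matches cannot change which later words count as being in range.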

Otherwise, here is a regex replacement:

Pattern pattern = Pattern.compile("word3(?=(?:\\W*\\w++){0,2}?\\W*+word5)|(word5(?:\\W*\\w++){0,2}?\\W*+)word3");
Matcher matcher;
String str = "word3 word2 word3 word4 word5 word3 word7 word8 word9 word10";
while ((matcher = pattern.matcher(str)).find())
    // Do something with matcher.group(1) to show that you know it exists.
    // Or use it right here instead of replacing with this empty value.
    str = matcher.replaceFirst(matcher.group(1) == null ? "" : matcher.group(1));
System.out.println(str);

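However, although this regex works, removing the third word (a word3) leaves only two words between the first word (also a word3) and word5. That puts the first word3 in range, so it gets removed as well. This is why regex is not the right tool for this.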

Code Demo STDOUT:

 word2  word4 word5  word7 word8 word9 word10

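A small modification that makes this work is to replace each match with a placeholder instead of removing it, so the word count stays the same: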

str = matcher.replaceFirst((matcher.group(1) == null ? "" : matcher.group(1)) + "baz");
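
Put together, the modified loop might look like the following. Treat it as a sketch: the class name `PlaceholderReplace` and the method `markNearby` are names I made up, and wrapping the replacement in `Matcher.quoteReplacement` is my own addition so that a `$` or `\` in the text is not read as a group reference. It assumes the placeholder does not itself contain word3, otherwise the loop would never finish.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderReplace {
    // Same pattern as above: word3 followed by word5 with at most two words
    // in between, or word5 (captured with what follows) followed by word3.
    private static final Pattern NEAR = Pattern.compile(
        "word3(?=(?:\\W*\\w++){0,2}?\\W*+word5)|(word5(?:\\W*\\w++){0,2}?\\W*+)word3");

    /** Replaces every word3 within 3 words of word5 with the placeholder. */
    public static String markNearby(String str, String placeholder) {
        Matcher matcher;
        while ((matcher = NEAR.matcher(str)).find()) {
            // Group 1 is null when the first alternative matched.
            String prefix = matcher.group(1) == null ? "" : matcher.group(1);
            str = matcher.replaceFirst(Matcher.quoteReplacement(prefix + placeholder));
        }
        return str;
    }

    public static void main(String[] args) {
        String str = "word3 word2 word3 word4 word5 word3 word7 word8 word9 word10";
        // prints: word3 word2 baz word4 word5 baz word7 word8 word9 word10
        System.out.println(markNearby(str, "baz"));
    }
}
```

Because each matched word3 is swapped for another word rather than deleted, the distance from the first word3 to word5 stays at four words and it is left alone.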