Java pattern to match any character sequence except those in a given list

Time: 2009-03-23 07:03:30

Tags: java regex

How do I write a pattern (in Java) that matches any sequence of characters except those in a given list of words?

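I need to find out whether a given piece of markup contains any text enclosed in tags other than the words in a given list. For example, I want to check whether any word other than "one" and "two" is wrapped in tags.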

"This is the first tag <span>one</span> and this is the third <span>three</span>"

The pattern should match the string above, because the word "three" is enclosed in tags and is not in the given word list ("one", "two").

3 answers:

Answer 0 (score: 7)

A negative lookahead can do this:

\b(?!your|given|list|of|exclusions)\w+\b

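This matches:

  • a word boundary (start of a word)
  • not followed by any of "your", "given", "list", "of", "exclusions"
  • followed by one or more word characters
  • followed by a word boundary (end of the word)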

In effect, this matches any word that does not begin with one of the excluded words.
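As a rough illustration of the lookahead approach, here is a minimal sketch that collects every word not on an exclusion list. The class name, method name, and the exclusion list ("one", "two") are assumptions made for the example. Note one deliberate change from the pattern above: a `\b` is added inside the lookahead so that only whole words are excluded. Without it, a word such as "oneself" would also be rejected because it merely starts with "one".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExcludedWordsDemo {
    // Hypothetical exclusion list ("one", "two"), chosen only for illustration.
    // The \b inside the lookahead restricts the exclusion to whole words.
    private static final Pattern NOT_EXCLUDED =
            Pattern.compile("\\b(?!(?:one|two)\\b)\\w+\\b");

    // Returns every word in the text that is not exactly an excluded word.
    static List<String> wordsNotExcluded(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = NOT_EXCLUDED.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [three, oneself]
        System.out.println(wordsNotExcluded("one two three oneself"));
    }
}
```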

Answer 1 (score: 4)

This should get you started.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.*;

// >(?!one<|two<)(\w+)</
// 
// Match the character “>” literally «>»
// Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!one<|two<)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «one<»
//       Match the characters “one<” literally «one<»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «two<»
//       Match the characters “two<” literally «two<»
// Match the regular expression below and capture its match into backreference number 1 «(\w+)»
//    Match a single character that is a “word character” (letters, digits, etc.) «\w+»
//       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
// Match the characters “</” literally «</»
// subjectString is the text to search; it is not defined in this snippet
List<String> matchList = new ArrayList<String>();
try {
    Pattern regex = Pattern.compile(">(?!one<|two<)(\\w+)</");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        matchList.add(regexMatcher.group(1));
    } 
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
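To show how this could be applied to the string from the question, here is a self-contained sketch. It assumes the pattern is meant to end in `</` (as the explanatory comments suggest); the class and method names are hypothetical. A non-empty result means some tag encloses a word other than "one" or "two".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagContentCheck {
    // The "<" after each excluded word ties the exclusion to the whole tag content,
    // so a tag containing "ones" would still be reported.
    private static final Pattern OTHER_TAGGED_WORD =
            Pattern.compile(">(?!one<|two<)(\\w+)</");

    // Returns the tag-enclosed words that are not "one" or "two".
    static List<String> unexpectedTaggedWords(String markup) {
        List<String> found = new ArrayList<>();
        Matcher m = OTHER_TAGGED_WORD.matcher(markup);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "This is the first tag <span>one</span> "
                + "and this is the third <span>three</span>";
        // prints [three]
        System.out.println(unexpectedTaggedWords(input));
    }
}
```

This only handles tag content that is a single run of word characters. Content with spaces or nested tags would need a different pattern or an HTML parser.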

Answer 2 (score: 2)

Use this:

if (!Pattern.matches(".*(word1|word2|word3).*", "word1")) {
    System.out.println("We're good.");
}

You are checking that the pattern does not match the string.
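`Pattern.matches` must match the entire input, which is why the pattern above is wrapped in `.*` on both sides. A possible variation, sketched below with hypothetical class and method names, uses `Matcher.find()` instead. It searches anywhere in the text, so the `.*` wrappers are unnecessary, and it also works across line breaks, which `.` does not match by default.

```java
import java.util.regex.Pattern;

class ForbiddenWordCheck {
    // Placeholder words taken from the answer; substitute the real list.
    private static final Pattern FORBIDDEN = Pattern.compile("word1|word2|word3");

    // True when none of the forbidden words occurs anywhere in the text.
    static boolean isFreeOfForbiddenWords(String text) {
        return !FORBIDDEN.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isFreeOfForbiddenWords("word1"));       // prints false
        System.out.println(isFreeOfForbiddenWords("hello world")); // prints true
    }
}
```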