Question

I am trying to write a regular expression for the following cases:

badword%
%badword
%badword%

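The % signs behave differently depending on where they appear. A leading % needs a lookbehind that matches the letters before the word badword until it reaches a non-letter. Likewise, any % that is not at the front needs a lookahead that matches the letters after badword until it hits a non-letter.

Here is what I am trying to achieve. Suppose I have the following: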

Just a regular superbadwording sentence.

badword   # should match "badword", easy enough
badword%  # should match "badwording"
%badword% # should match "superbadwording"

Likewise, if I have a similar sentence:

Here is another verybadword example.

badword   # should match "badword", easy enough
badword%  # should also match "badword"
%badword% # should match "verybadword"

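I don't want to use whitespace as the boundary for the assertion's capture group. Assume I want to capture \w.

This is what I have written in Java so far: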

String badword  = "%badword%";
String _badword = badword.replace("%", "");
badword = badword.replaceAll("^(?!%)%", "(?=\w)"); // match a % NOT at the beginning of a string, replace with look ahead that captures \w, not working
badword = badword.replaceAll("^%", "(?!=\w)"); // match a % at the beginning of a string, replace it with a look behind that captures \w, not working
System.out.println(badword); // ????

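So, how do I do this?

PS: Please don't assume that a % is always present at both the start and the end of the pattern. If % is the first character, it needs a lookbehind; any and all other % signs are lookaheads.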

Answer 1

From your question, it doesn't seem necessary to use lookarounds at all, so you can simply replace every % with \w*.

Snippet:

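import java.util.regex.Matcher;
import java.util.regex.Pattern;

String tested = "Just a regular superbadwording sentence.";
String bad = "%badword%";
// "\\\\w*" is the replacement string \\w*, which replaceAll inserts as the regex \w*
bad = bad.replaceAll("%", "\\\\w*");
Pattern p = Pattern.compile(bad);
Matcher m = p.matcher(tested);
while (m.find()) {
    String found = m.group();
    System.out.println(found); // prints "superbadwording"
}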
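
A slightly fuller sketch of the same idea, run against the cases from the question. The class name WildcardDemo and the helpers toRegex and firstMatch are made up for illustration, the second test sentence is an assumed stand-in that contains "verybadword", and the sketch assumes the bad word itself contains no regex metacharacters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WildcardDemo {
    // Hypothetical helper: expands each '%' into \w*.
    // String.replace is literal, so "\\w*" is inserted as the regex \w*.
    // Assumes the bad word contains no regex metacharacters.
    public static String toRegex(String wildcard) {
        return wildcard.replace("%", "\\w*");
    }

    // Returns the first match of the expanded pattern in text, or null if there is none.
    public static String firstMatch(String wildcard, String text) {
        Matcher m = Pattern.compile(toRegex(wildcard)).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] sentences = {
            "Just a regular superbadwording sentence.",
            "Here is another verybadword example." // assumed example sentence
        };
        String[] wildcards = {"badword", "badword%", "%badword", "%badword%"};
        for (String sentence : sentences) {
            for (String wildcard : wildcards) {
                System.out.println(wildcard + " -> " + firstMatch(wildcard, sentence));
            }
        }
    }
}
```

Because \w* may match zero characters, badword% still matches a plain "badword" in the second sentence, which is the behaviour the question asks for.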

\w does not match characters such as # or -, so I think \S is the better choice here.
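
To make the difference concrete, here is a small sketch comparing the two fillers. The class FillerClassDemo, its firstMatch helper, and the sample strings are all invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FillerClassDemo {
    // Hypothetical helper: expands each '%' into the given filler
    // (for example "\\w*" or "\\S*") and returns the first match, or null.
    public static String firstMatch(String wildcard, String filler, String text) {
        Matcher m = Pattern.compile(wildcard.replace("%", filler)).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "a super-badword#ing case"; // made-up sample input

        // \w* stops at '-' and '#', so only the bare word is matched.
        System.out.println(firstMatch("%badword%", "\\w*", text)); // badword

        // \S* runs to the surrounding whitespace.
        System.out.println(firstMatch("%badword%", "\\S*", text)); // super-badword#ing

        // Trade-off: \S* also swallows trailing punctuation.
        System.out.println(firstMatch("%badword%", "\\S*", "a verybadword.")); // verybadword.
    }
}
```

Which filler fits depends on whether punctuation attached to the word should count as part of the match.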

Answer 2

badword = badword.replaceAll("^%", "(?!=\w)"); 
// match a % at the beginning of a string, replace it with a look behind 
//that captures \w, not working

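(?!=\w) is a negative lookahead for =\w, but it looks like you want a positive lookbehind. Also, lookaheads and lookbehinds are atomic and therefore do not capture anything by themselves, so if I understand you correctly, you want:

"(?<=(\\w+))". You need the extra () to capture. For your first replacement it would be "(?=(\\w+))", and its first argument should be "(?<!^)%".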

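PS: \\w needs two backslashes, and you seem to want to match more than one character, don't you? If so, you need \\w+. Also, if you don't want to do this every time, I suggest using String.format() instead of replaceAll().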
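
A rough sketch of the lookahead half of this suggestion, under a few assumptions. The class LookaheadCapture and its helpers are hypothetical names. It uses \w* rather than \w+ so that a bare "badword" followed by a space still matches, as the question requires. In a replaceAll replacement string a backslash is itself an escape character, so the lookahead is passed through Matcher.quoteReplacement. The leading % case is deliberately left out, because a lookbehind containing an unbounded quantifier such as \w+ may be rejected by some Java versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadCapture {
    // Lookahead with an inner group: the trailing word characters land in
    // group 1 without becoming part of the match itself.
    static final String LOOKAHEAD = "(?=(\\w*))";

    // Variant 1: replace every '%' that is not the first character.
    // A leading '%' is left untouched by this sketch.
    public static String viaReplaceAll(String wildcard) {
        return wildcard.replaceAll("(?<!^)%", Matcher.quoteReplacement(LOOKAHEAD));
    }

    // Variant 2: build the same pattern from the bare word with String.format.
    public static String viaFormat(String word) {
        return String.format("%s%s", word, LOOKAHEAD);
    }

    // Returns {match, captured suffix} for the first hit, or null if there is none.
    public static String[] matchAndSuffix(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? new String[] {m.group(), m.group(1)} : null;
    }

    public static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // (?!=\w) means "not followed by '=' and a word character".
        System.out.println(found("x(?!=\\w)", "x=1")); // false
        System.out.println(found("x(?!=\\w)", "xy"));  // true

        String regex = viaReplaceAll("badword%");
        System.out.println(regex);                              // badword(?=(\w*))
        System.out.println(regex.equals(viaFormat("badword"))); // true

        String[] r = matchAndSuffix(regex, "Just a regular superbadwording sentence.");
        System.out.println(r[0] + " + " + r[1]);                // badword + ing
    }
}
```

Since a lookahead is zero-width, m.group() stays "badword" and the trailing letters appear only in group 1; if the whole word is wanted as a single match, consuming it with \w* outside any lookaround is the simpler route.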

Regex capturing with lookbehind and lookahead
