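Java regex: recursively get the 11 words before and after a specific word

Time: 2019-09-09 07:35:01

Tags: java regex

I want to get the 11 words before and after a specific word in a String.

For example: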

and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release for use on 64-bit and 32-bit Windows Servers. Client application discovers and validates qualifying Windows Server product

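Now, the challenge here is to match a word like 32 that is joined by a hyphen to the word bit. If I change the text to 32 bit instead of 32-bit, the regex matches it and returns the 11 words before and after it.

My regex looks like this: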

Pattern pattern = Pattern.compile("(?<!-)\\b(?<!&)(" + "\\b" + word + "\\b" + ")(?!&)\\b(?!-)(?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,11}");

Any help with this would be appreciated.

P.S. Note: I cannot match words that are joined to another word by a hyphen.

Solution, thanks to @Wiktor:

\\b(?<!&)\\b" + word + "\\b(?!&)(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}

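Thanks.

2 answers:

Answer 0 (score: 1)

You can "take out" the hyphen from the regex, that is, drop the (?<!-) and (?!-) lookarounds and remove - from the negated character class: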

"\\b(?<!&)" + word + "\\b(?!&)(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}"

Or, if the word may start or end with a special character:

"(?<![&\\w])" + Pattern.quote(word) + "(?![&\\w])(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}"

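See the regex demo.

Details

  • \b(?<!&) - a word boundary that is not preceded by &
  • word - the variable word (note that you may need to escape it with Pattern.quote(word) if it can contain special characters, or even replace "\\b(?<!&)" + word + "\\b(?!&)" with "(?<![&\\w])" + Pattern.quote(word) + "(?![&\\w])" if the word can start or end with a special character)
  • \b(?!&) - a word boundary that is not followed by &
  • (?:[^a-zA-Z']+[a-zA-Z'-]+){0,11} - 0 to 11 repetitions of the sequence:
    • [^a-zA-Z']+ - 1 or more characters other than ASCII letters or '
    • [a-zA-Z'-]+ - 1 or more ASCII letters, ' or -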
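To see what removing the hyphen changes, the sketch below runs the question's original pattern and the first pattern above against the sample sentence. It assumes the searched word is 32, as the question suggests; the class and method names (HyphenLookaheadDemo, original, fixed, firstMatch) are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HyphenLookaheadDemo {
    static final String TEXT = "and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release "
            + "for use on 64-bit and 32-bit Windows Servers. Client application discovers "
            + "and validates qualifying Windows Server product";

    // Pattern from the question: (?!-) rejects a word that is followed by a hyphen,
    // and the hyphen inside [^a-zA-Z'-] keeps "-" from being consumed as a separator.
    static Pattern original(String word) {
        return Pattern.compile("(?<!-)\\b(?<!&)(" + "\\b" + word + "\\b" + ")(?!&)\\b(?!-)"
                + "(?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,11}");
    }

    // Pattern from the answer: no hyphen lookarounds, and "-" may act as a separator.
    static Pattern fixed(String word) {
        return Pattern.compile("\\b(?<!&)" + word + "\\b(?!&)(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}");
    }

    // Returns the first match in TEXT, or null when the pattern does not match.
    static String firstMatch(Pattern p) {
        Matcher m = p.matcher(TEXT);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(original("32"))); // no match, prints null
        System.out.println(firstMatch(fixed("32")));    // match starts at "32-bit Windows ..."
    }
}
```

With the fixed pattern, "-" is consumed as a separator and "bit" counts as the first of the 11 following words, so the match stops one word earlier than it would when searching for 32-bit as a whole.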
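The pieces listed above can be put together in a small helper. This is only a sketch: the class name FollowingWords, the method wordWithFollowing, and the maxWords parameter are hypothetical and not part of the original answer. It uses the Pattern.quote variant so that words containing regex metacharacters are matched literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FollowingWords {
    // Returns the word plus up to maxWords following words, or null if the word is absent.
    // Assumes maxWords >= 0; a negative value would produce an invalid quantifier.
    static String wordWithFollowing(String text, String word, int maxWords) {
        Pattern p = Pattern.compile(
                "(?<![&\\w])" + Pattern.quote(word) + "(?![&\\w])"  // literal word, not glued to & or a word char
                + "(?:[^a-zA-Z']+[a-zA-Z'-]+){0," + maxWords + "}"); // separator run + word run, repeated
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "use on 64-bit and 32-bit Windows Servers. Client application";
        System.out.println(wordWithFollowing(text, "32-bit", 2));
        // Without Pattern.quote, "C++" would be parsed as a possessive quantifier.
        System.out.println(wordWithFollowing("I use C++ and Java daily", "C++", 2));
    }
}
```

Because the word class contains only ASCII letters, ' and -, digits are treated as separators: a token such as 2.0 is skipped rather than counted as one of the following words.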

Answer 1 (score: 0)

If you don't have to use a regex, then:

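String message = "and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release for use on 64-bit and 32-bit Windows Servers. Client application discovers and validates qualifying Windows Server product";

String target = "32-bit";
int index = message.indexOf(target);
int length = target.length();

// Note: this takes 11 characters (not 11 words) on each side of the target.
// The bounds are clamped so substring() cannot run past either end of the message;
// this still assumes the target is present (index >= 0).
String before = message.substring(Math.max(0, index - 11), index);
String after = message.substring(index + length, Math.min(message.length(), index + length + 11));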

Log.i("tag", "index: " + index);
Log.i("tag", "before: " + before);
Log.i("tag", "after: " + after);
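The snippet above slices 11 characters on each side, whereas the question asks for 11 words. A word-based variant without regex matching could look like the following sketch. It assumes that words are whitespace-delimited tokens (so trailing punctuation stays attached, as in "Servers.") and that the target appears as a whole token; the names WordContext and around are hypothetical.

```java
import java.util.Arrays;

final class WordContext {
    // Returns {before, after}, each holding up to n whitespace-delimited tokens,
    // or null when the target is not present as a whole token.
    static String[] around(String text, String target, int n) {
        String[] tokens = text.trim().split("\\s+");
        int i = Arrays.asList(tokens).indexOf(target);
        if (i < 0) {
            return null;
        }
        int from = Math.max(0, i - n);                 // clamp at the start of the text
        int to = Math.min(tokens.length, i + 1 + n);   // clamp at the end of the text
        String before = String.join(" ", Arrays.copyOfRange(tokens, from, i));
        String after = String.join(" ", Arrays.copyOfRange(tokens, i + 1, to));
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String message = "and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release "
                + "for use on 64-bit and 32-bit Windows Servers. Client application discovers "
                + "and validates qualifying Windows Server product";
        String[] ctx = around(message, "32-bit", 11);
        System.out.println("before: " + ctx[0]);
        System.out.println("after: " + ctx[1]);
    }
}
```

Unlike the regex answers, this counts tokens such as 2012/2013, and 64-bit as words, so the two approaches can return different windows for the same input.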