I want to get the 11 words before and after a specific word in a String.
For example:
and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release for use on 64-bit and 32-bit Windows Servers. Client application discovers and validates qualifying Windows Server product
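The challenge here is to recognize a word like 32 that is joined to the word bit by a hyphen. If I change the word to "32 bit" instead of "32-bit", the regex recognizes it and returns the 11 words before and after it in the sentence.
My regex looks like this: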
Pattern pattern = Pattern.compile("(?<!-)\\b(?<!&)(" + "\\b" + word + "\\b" + ")(?!&)\\b(?!-)(?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,11}");
Any help with this would be appreciated.
PS: Note that I am unable to match words that contain a hyphen.
@Solution, thanks to @Wiktor:
\\b(?<!&)\\b" + word + "\\b(?!&)(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}
Thanks.
Answer 0 (score: 1)
You can "take out" the hyphen from the negated character class in the regex:
"\\b(?<!&)" + word + "\\b(?!&)(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}"
Or, if the word may start or end with a special character:
"(?<![&\\w])" + Pattern.quote(word) + "(?![&\\w])(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}"
See the regex demo.
Details
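As a quick check, here is a minimal sketch that applies the first pattern above to the sample sentence from the question. The class name ContextAfter and the helper wordAndFollowing are made up for illustration, and Pattern.quote is added as a precaution. Note that the pattern only collects the words that follow the target.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ContextAfter {
    /**
     * Returns the target word plus up to maxWords following words,
     * or null if the target is not found. (Hypothetical helper.)
     */
    static String wordAndFollowing(String text, String word, int maxWords) {
        // Pattern.quote keeps any regex metacharacters in the word literal.
        String regex = "\\b(?<!&)" + Pattern.quote(word) + "\\b(?!&)"
                + "(?:[^a-zA-Z']+[a-zA-Z'-]+){0," + maxWords + "}";
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String message = "and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release "
                + "for use on 64-bit and 32-bit Windows Servers. Client application "
                + "discovers and validates qualifying Windows Server product";
        // Prints: 32-bit Windows Servers. Client application discovers and
        // validates qualifying Windows Server product (on one line)
        System.out.println(wordAndFollowing(message, "32-bit", 11));
    }
}
```

Because the separator class [^a-zA-Z']+ also consumes digits, purely numeric tokens are skipped as separators rather than counted as words.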
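\b(?<!&) - a word boundary that is not preceded by &
word - the variable word (note that you may need to escape it with Pattern.quote(word), or even replace "\\b(?<!&)" + word + "\\b(?!&)" with "(?<![&\\w])" + word + "(?![&\\w])" if the word may start or end with a special character)
\b(?!&) - a word boundary that is not followed by &
(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11} - 0 to 11 sequences of:
    [^a-zA-Z']+ - 1 or more characters other than ASCII letters or '
    [a-zA-Z'-]+ - 1 or more ASCII letters, ' or -.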
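To illustrate the note about special characters, the sketch below compares the two boundary styles on a word that ends with non-word characters. The sample text, the word "C++", and the class and method names are all invented for this example and do not come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpecialCharWord {
    // Same "following words" tail as in the patterns above.
    static final String TAIL = "(?:[^a-zA-Z']+[a-zA-Z'-]+){0,11}";

    /** Variant that relies on \b around the word. */
    static String withWordBoundaries(String text, String word) {
        return firstMatch("\\b(?<!&)" + Pattern.quote(word) + "\\b(?!&)" + TAIL, text);
    }

    /** Variant that relies on lookarounds instead of \b. */
    static String withLookarounds(String text, String word) {
        return firstMatch("(?<![&\\w])" + Pattern.quote(word) + "(?![&\\w])" + TAIL, text);
    }

    private static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "We ship C++ and Java bindings for Windows";
        // No word boundary exists between '+' and ' ', so this prints null.
        System.out.println(withWordBoundaries(text, "C++"));
        // Prints: C++ and Java bindings for Windows
        System.out.println(withLookarounds(text, "C++"));
    }
}
```

The lookaround form only requires that the characters next to the word are neither & nor word characters, which is why it still matches when the word itself ends in punctuation.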
Answer 1 (score: 0)
If you don't have to use a regex, then:
String message = "and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release for use on 64-bit and 32-bit Windows Servers. Client application discovers and validates qualifying Windows Server product";
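String target = "32-bit";
int index = message.indexOf(target); // -1 if the target is not present
int length = target.length();
// Note: these take 11 characters, not 11 words, on each side of the target.
// Math.max/Math.min keep the ranges inside the string.
String before = message.substring(Math.max(0, index - 11), index);
String after = message.substring(index + length, Math.min(message.length(), index + length + 11));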
Log.i("tag", "index: " + index);
Log.i("tag", "before: " + before);
Log.i("tag", "after: " + after);
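The snippet above slices by character offset. If whole words are wanted, one possible regex-free sketch is to split on whitespace and take sublists around the target token. The class WordWindow and its methods are hypothetical names. This assumes the target appears as a token of its own: punctuation stays attached to tokens, so "32-bit," with a trailing comma would not be found.

```java
import java.util.Arrays;
import java.util.List;

final class WordWindow {
    /** Up to n whitespace-separated tokens before the first token equal to target. */
    static String before(String text, String target, int n) {
        List<String> tokens = Arrays.asList(text.trim().split("\\s+"));
        int i = tokens.indexOf(target);
        if (i < 0) return ""; // target not found
        return String.join(" ", tokens.subList(Math.max(0, i - n), i));
    }

    /** Up to n whitespace-separated tokens after the first token equal to target. */
    static String after(String text, String target, int n) {
        List<String> tokens = Arrays.asList(text.trim().split("\\s+"));
        int i = tokens.indexOf(target);
        if (i < 0) return ""; // target not found
        return String.join(" ", tokens.subList(i + 1, Math.min(tokens.size(), i + 1 + n)));
    }

    public static void main(String[] args) {
        String message = "and WINSOCK 2.0 in Visual Studio 2012/2013, compiled as Release "
                + "for use on 64-bit and 32-bit Windows Servers. Client application "
                + "discovers and validates qualifying Windows Server product";
        // Prints: Visual Studio 2012/2013, compiled as Release for use on 64-bit and
        System.out.println(before(message, "32-bit", 11));
        // Prints the 11 tokens from "Windows" through "product" on one line.
        System.out.println(after(message, "32-bit", 11));
    }
}
```

Unlike the regex answers, this counts tokens such as 2012/2013, as words, so the two approaches can return different windows on text that contains numbers.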