I want to extract a word from a file: the word that comes right after a reference word and a colon. For example:
XXXXXXXX............REFERENCE : word_to_extract .....XXXXXXX
How can I extract word_to_extract?
Answer 0 (score: 0)
Here is the pattern: (?<=reference_word\s?:\s?)([a-zA-Z_0-9]+)
Explanation:
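(?<= # positive lookbehind: the following must appear just before the match
reference_word # the literal text reference_word (replace with your own reference word)
\s? # zero or one whitespace character
: # a literal colon
\s?) # zero or one whitespace character, then the lookbehind ends
([a-zA-Z_0-9]+) # the word to extract (letters, digits, underscores)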
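The annotated breakdown above can also be kept in the source itself. A minimal sketch, assuming you are fine with compiling the pattern with Pattern.COMMENTS (in that mode whitespace in the pattern is ignored and # starts a comment that runs to the end of the line). The class name CommentedPattern and the method findAll are made up for this illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentedPattern {
    // Same pattern as in the answer, spread over several lines with comments.
    // Each line ends with \n so that the # comment stops there.
    static final Pattern ANNOTATED = Pattern.compile(
            "(?<=              # positive lookbehind\n"
          + "  reference_word  # the literal reference word\n"
          + "  \\s?            # zero or one whitespace character\n"
          + "  :               # a literal colon\n"
          + "  \\s?            # zero or one whitespace character\n"
          + ")\n"
          + "([a-zA-Z_0-9]+)   # the word to extract\n",
            Pattern.COMMENTS);

    // Hypothetical helper: returns every word matched by ANNOTATED in the input.
    static List<String> findAll(String input) {
        List<String> words = new ArrayList<>();
        Matcher m = ANNOTATED.matcher(input);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(findAll("XXX...reference_word : word_to_extract ...XXXX"));
    }
}
```

This should behave like the one-line pattern; the only difference is that the explanation lives next to the regex.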
Code:
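import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Collect every word that follows "reference_word" and a colon.
List<String> allMatches = new ArrayList<>();
String pattern = "(?<=reference_word\\s?:\\s?)([a-zA-Z_0-9]+)";
String testString = "XXX...reference_word : word_to_extract ...XXXX ...reference_word: word_to_extract2 ...reference_word :word_to_extract3 scz";
Matcher m = Pattern.compile(pattern).matcher(testString);
while (m.find()) {
    allMatches.add(m.group()); // the lookbehind text is not part of the match
}
System.out.println(allMatches);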
Output:
[word_to_extract, word_to_extract2, word_to_extract3]
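Since the question is about reading from a file, and the lookbehind above allows at most one whitespace character on each side of the colon, here is a rough sketch of a variant that may be easier to adapt. It assumes a capturing group is acceptable instead of a lookbehind, so \s* can be used around the colon, and it assumes the file is UTF-8 and small enough to read into memory. The names ReferenceWordExtractor, patternFor, extract and extractFromFile are hypothetical, not part of any library:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReferenceWordExtractor {
    // Builds "<referenceWord> \s* : \s* (\w+)". Pattern.quote treats the
    // reference word as literal text; group 1 is the word to extract.
    static Pattern patternFor(String referenceWord) {
        return Pattern.compile(Pattern.quote(referenceWord) + "\\s*:\\s*(\\w+)");
    }

    // Returns every word that follows the reference word and a colon in text.
    static List<String> extract(String text, String referenceWord) {
        List<String> words = new ArrayList<>();
        Matcher m = patternFor(referenceWord).matcher(text);
        while (m.find()) {
            words.add(m.group(1)); // group 1 only, not the reference word itself
        }
        return words;
    }

    // Assumption: the whole file fits in memory and is encoded as UTF-8.
    static List<String> extractFromFile(Path file, String referenceWord) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return extract(content, referenceWord);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // Usage: java ReferenceWordExtractor <file> <referenceWord>
            System.out.println(extractFromFile(Paths.get(args[0]), args[1]));
        } else {
            // Demo on the example line from the question.
            String line = "XXXXXXXX............REFERENCE : word_to_extract .....XXXXXXX";
            System.out.println(extract(line, "REFERENCE"));
        }
    }
}
```

Note that \s also matches line breaks, so with the whole file read as one string the word may sit on the line after the colon. If that is not wanted, apply extract to each line separately.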