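I am trying to capture every word in a text file that contains one of several 4-letter prefixes. The prefixes are stored in an ArrayList named "keywordList". As the code stands, I do capture every word I need (those containing any prefix from "keywordList"), but I also get duplicates, and for some reason it sometimes prints blank line(s) after printing a detected word. In other words, the printed output follows no consistent pattern.
Console output: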
Result:
Result:APXP5558899
Result:
Result:
Result:IGC088838383833
Result:
Result:CDAV
Result:
Result:ASHGJHSGDSAGD
Result:
Result:MOE1477347384
Result:
Result:GHTS348939438
Result:ASHGJHSGDSAGD
Result:
Result:MOE1477347384
Result:
Result:GHTS348939438
Result:EGLVxxxxxxxxxxxxx
Result:
Result:ESLVililillililil
Result:
Result:HYSC999xxx
Result:
I would like the output to look like this:
Result:APXP5558899
Result:IGC088838383833
Result:CDAV
Result:ASHGJHSGDSAGD
Result:MOE1477347384
Result:GHTS348939438
Result:ASHGJHSGDSAGD
Result:EGLVxxxxxxxxxxxxx
Result:ESLVililillililil
Result:HYSC999xxx
Text file contents:
jkjfkjkjfkjkf jkjkfiiiiidijdjd
ddffdf
ddjjdkkii
jjjjd
sdhfjhdsfhjdsh APXP5558899 fdfsdsfsfsfgsfsdg
asjhdjsahjdhjsahd IGC088838383833 lllllllllpppppssss
JIJSIJSIJSJISJS
CDAV 337990099
kkkkkksslslsls
ASHGJHSGDSAGD MOE1477347384 GHTS348939438
EGLVxxxxxxxxxxxxx ESLVililillililil jdjdjdjdjdjdjddjdj
HYSC999xxx 6969696969696
My current code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class searchPdftext
{
    public static ArrayList<String> keywordList;
    public static String [] TestArray;
    public static String Possibilities;

    public static void main(String args[]) throws Exception
    {
        //Possibilities = keyword : TestArray;// i took out an =
        int tokencount;
        FileReader fr = new FileReader("EvergreenDME.txt");
        BufferedReader br = new BufferedReader(fr);
        String s = "";
        int linecount = 0;
        keywordList = new ArrayList<String>(Arrays.asList("APXP", "IGC0", "CDAV", "COSB",
                "ESLV", "2ISU", "SUDU", "5BUT", "HYSC",
                "BNGF", "45HG", "NBCH", "MOE1", "RFGD",
                "GHTS"));
        String line;
        while ((s = br.readLine()) != null) {
            String[] lineWordList = s.split(" ");
            for (String word : lineWordList) {
                for (String keyword : keywordList) {
                    if (word.contains(keyword)) {
                        //System.out.println(s);
                        test(s);
                        break;
                    }
                }
            }
        }
    }

    private static void test(String text) {
        Matcher m = Pattern.compile("\\b"+keywordList+".*?\\b").matcher(text);//"\\bABC123.*?\\b"____Word boundary // (?<=^|\s)ABC123\S*__For White spaces
        if (m.find()) {
            System.out.println("Result:" + m.group()) ;
            while (m.find()) {
                System.out.println("Result:" + m.group()) ;//System.out.println("Result:" + m.group() +" ");
            }
        } else {
            System.out.println("Not found: " + text);
        }
    }
}
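
A likely explanation for the output, followed by a minimal sketch of one possible fix. Concatenating keywordList into the pattern inserts its toString() text, "[APXP, IGC0, ...]", which the regex engine reads as a character class (any single one of those characters, including the space and the comma) rather than as a list of alternatives. That would account for the "Result:" lines that look blank (each is a single matched space) and for words such as ASHGJHSGDSAGD being reported. The repeats appear to come from test(s) being called on the whole line once for every matching word in that line. The sketch below drops the regex and prints each word that was already found by the split. The class name PrefixWordSearch and the method matchingWords are hypothetical names chosen for illustration, and using startsWith instead of contains is an assumption based on the description of the keywords as prefixes.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class PrefixWordSearch {

    // Same prefixes as keywordList in the code above.
    static final List<String> KEYWORDS = Arrays.asList(
            "APXP", "IGC0", "CDAV", "COSB", "ESLV", "2ISU", "SUDU", "5BUT",
            "HYSC", "BNGF", "45HG", "NBCH", "MOE1", "RFGD", "GHTS");

    // Returns, in order, every whitespace-separated word of the line
    // that starts with one of the given prefixes.
    static List<String> matchingWords(String line, List<String> prefixes) {
        List<String> found = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            for (String prefix : prefixes) {
                // Assumption: "prefix" means the word begins with the keyword.
                if (word.startsWith(prefix)) {
                    found.add(word);
                    break; // stop after the first prefix so each word is added once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(new FileReader("EvergreenDME.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String word : matchingWords(line, KEYWORDS)) {
                    System.out.println("Result:" + word);
                }
            }
        }
    }
}
```

With the text file shown above, this sketch would print only the words that begin with a listed prefix. ASHGJHSGDSAGD and EGLVxxxxxxxxxxxxx would not appear, since no entry in keywordList is a prefix of either; if they are wanted, the corresponding prefixes would need to be added to the list.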