Java Regex Matcher error

Time: 2018-01-23 09:09:35

Tags: java regex

My string:
2017.11.22 SAMPLE NEWS - I am some text here and there 2018.12.30 NEWS SAMPLE 2 - I am some text here and there

countLines():

public static int countLines(String filename) throws IOException {
    // try-with-resources closes the stream even if read() throws
    try (InputStream is = new BufferedInputStream(new FileInputStream(filename))) {
        byte[] c = new byte[1024];
        int count = 0;
        int readChars;
        boolean endsWithoutNewLine = false;
        while ((readChars = is.read(c)) != -1) {
            for (int i = 0; i < readChars; ++i) {
                if (c[i] == '\n') {
                    ++count;
                }
            }
            // Remember whether the chunk just read ended in the middle of a line
            endsWithoutNewLine = (c[readChars - 1] != '\n');
        }
        // A final line without a trailing newline still counts as a line
        if (endsWithoutNewLine) {
            ++count;
        }
        return count;
    }
}

My code that matches the regular expressions against the string - loadTextFromFile():

public static String loadTextFromFile(String filename, int type) throws FileNotFoundException {
        String match = "";
        File file = new File(filename);
        Scanner scanner = new Scanner(file);
        if (type == 0) { // Match date
            while (scanner.hasNext()) {
                String line = scanner.nextLine();
                Matcher m = Pattern.compile("(\\d{4}[\\.]\\d{2}[\\.]\\d{2})").matcher(line);
                while (m.find()) {
                    match = m.group(1).trim();
                    System.out.println("date: " + match);
                }
            }
        } else { 
            while (scanner.hasNext()) {
                String line = scanner.nextLine();
                Matcher m = Pattern.compile("((?!.*[\\d+\\.?\\d+]).*$)").matcher(line);
                while (m.find()) {
                    match = m.group(1).trim();
                    System.out.println("text: " + match);
                }
            }
        }
        return match;
    }

main():

String date, string;
        for (int i = 0; i < countLines(FILE_NAME); i++) {
            date = loadTextFromFile(FILE_NAME, 0);
            string = loadTextFromFile(FILE_NAME, 1);
            System.out.println("date:" + i + " " + date);
            System.out.println("string:" + i + " " + string);
        }

Output:

date:0 2018.12.30  
string:0   
date:1 2018.12.30   
string:1   
count: 2  

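I am sure this is a problem with the regular expression, but I cannot figure out where. I debugged the application and checked: it enters while (m.find()) twice and ends up with string = "", which means it finds more than one match for the regex.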
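
To see why the loop body runs twice, it may help to collect every match that find() reports for a single line. The sketch below reuses the text pattern from loadTextFromFile() unchanged. The class name EmptyMatchDemo, the helper allMatches, and the one-line input are assumptions made for illustration only. Note that [\\d+\\.?\\d+] is a character class, so it matches one character that is a digit, '+', '.' or '?', not a whole number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // The text pattern from loadTextFromFile(), copied unchanged.
    static final Pattern TEXT = Pattern.compile("((?!.*[\\d+\\.?\\d+]).*$)");

    // Collects every match that find() reports for one line (hypothetical helper).
    static List<String> allMatches(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = TEXT.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed input: the first record of the sample string as its own line.
        String line = "2017.11.22 SAMPLE NEWS - I am some text here and there";
        for (String s : allMatches(line)) {
            System.out.println("[" + s + "]");
        }
        // prints "[ SAMPLE NEWS - I am some text here and there]" and then "[]"
    }
}
```

The first match is the tail of the line after the last digit or dot. The second is an empty match at the end of the line, because .*$ can also match zero characters there. Since the loop assigns match on every iteration, the empty match overwrites the useful one, which would explain string = "".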

But how do I fix this? I only want to extract SAMPLE NEWS - I am some text here and there with the regex.
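
One possible approach, sketched under the assumption that every record sits on its own line and starts with a yyyy.MM.dd date: use a single anchored pattern with two capture groups and call matches() once per line, so no second, empty match can overwrite the result. The class name NewsLineParser, the method parse, and the two input lines are hypothetical, and this is not necessarily how the original program should be structured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NewsLineParser {
    // Group 1: the date (four digits, dot, two digits, dot, two digits).
    // Group 2: everything after the date on the same line.
    static final Pattern LINE =
            Pattern.compile("^(\\d{4}\\.\\d{2}\\.\\d{2})\\s*(.*)$");

    // Returns {date, text}, or null when the line does not start with a date.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2).trim() };
    }

    public static void main(String[] args) {
        // Assumed input: the sample string split into one record per line.
        String[] lines = {
            "2017.11.22 SAMPLE NEWS - I am some text here and there",
            "2018.12.30 NEWS SAMPLE 2 - I am some text here and there"
        };
        for (int i = 0; i < lines.length; i++) {
            String[] parts = parse(lines[i]);
            if (parts != null) {
                System.out.println("date:" + i + " " + parts[0]);
                System.out.println("string:" + i + " " + parts[1]);
            }
        }
    }
}
```

Parsing line by line also avoids a second effect visible in the output above: each call to loadTextFromFile() rescans the whole file and returns only the last match, so every loop iteration in main() reports the date from the final line.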

0 answers:

No answers