Reading input from a formatted file in Java

Date: 2010-11-30 20:19:23

Tags: java parsing

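I'm trying to read input from a file whose lines do not all have the same number of fields.

For example, the format is: book title, author:borrower's last name first name:borrow state

Here are some sample lines.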

The Lord of the Rings, JRR Tolkien:McInnes Elizabeth:13 11 10

Crime And Punishment, Fyodor Dostoyevsky

The Clan Of The Cave Bear, Jean M Auel

The God Of Small Things, Arundhati Roy:Robins Joshua:20 11 10

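So I tried calling useDelimiter after setting up the Scanner, but because some lines are shorter than others, I don't know what to do.

5 answers:

Answer 0: (score: 4)

Here is a regex-based solution: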

import java.io.*;
import java.util.regex.*;

public class Test {
    public static void main(String[] args) throws IOException {

        BufferedReader br = new BufferedReader(new FileReader("data.txt"));

        Pattern p = Pattern.compile("(.+?),(.+?)(?::(.+?):(\\d+ \\d+ \\d+))?");

        String line;
        while (null != (line = br.readLine())) {
            Matcher m = p.matcher(line);
            if (m.matches()) {
                String title = m.group(1);
                String author = m.group(2);
                String borrower = m.group(3);
                String data = m.group(4);

                System.out.println("Title:  " + title);
                System.out.println("Author: " + author);
                if (borrower != null) {
                    System.out.println("    Borrower: " + borrower);
                    System.out.println("    Data:     " + data);
                }
            }
            System.out.println();
        }

        br.close();
    }
}

For your sample input, it prints:

Title:  The Lord of the Rings
Author:  JRR Tolkien
    Borrower: McInnes Elizabeth
    Data:     13 11 10

Title:  Crime And Punishment
Author:  Fyodor Dostoyevsky

Title:  The Clan Of The Cave Bear
Author:  Jean M Auel

Title:  The God Of Small Things
Author:  Arundhati Roy
    Borrower: Robins Joshua
    Data:     20 11 10

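Answer 1: (score: 0)

Read the file line by line and use the regex [^,:]+ to match the data in each line (successive find calls will yield the title, the author and, if present, the borrower and the state).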
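A minimal sketch of that idea, assuming fields never contain a comma or a colon themselves. The class name FindFields and the method fields are made up for illustration, and trimming each match is an extra step not mentioned above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFields {
    // One or more characters that are neither a comma nor a colon
    private static final Pattern FIELD = Pattern.compile("[^,:]+");

    // Returns the fields of one line in order:
    // title, author, then borrower and state if the line has them
    static List<String> fields(String line) {
        List<String> result = new ArrayList<String>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            // Each find() call advances to the next run between separators
            result.add(m.group().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fields("The Lord of the Rings, JRR Tolkien:McInnes Elizabeth:13 11 10"));
        System.out.println(fields("Crime And Punishment, Fyodor Dostoyevsky"));
    }
}
```

The first sample line yields four fields and the second yields two, so the size of the returned list tells you whether the book is currently borrowed.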

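Answer 2: (score: 0)

I would split each line with String.split, passing a colon as the delimiter. Then call lastIndexOf(',') on the first element that split returns to separate the author from the book title: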

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReadCrappyInput {

    public static List<String> testData() {
        List<String> lines = new ArrayList<String>();
        lines.add("The Lord of the Rings, JRR Tolkien:McInnes Elizabeth:13 11 10");
        lines.add("Crime And Punishment, Fyodor Dostoyevsky");
        lines.add("The Clan Of The Cave Bear, Jean M Auel");
        lines.add("The God Of Small Things, Arundhati Roy:Robins Joshua:20 11 10");
        return lines;
    }

    public Map<String, String> readLine(String line) {
        String[] parts = line.split(":");
        int endOfTitleIndex = parts[0].lastIndexOf(',');
        Map<String, String> map = new HashMap<String, String>();
        map.put("title", parts[0].substring(0, endOfTitleIndex));
        map.put("author", parts[0].substring(endOfTitleIndex + 1).trim());    
        if (parts.length > 1) {
            map.put("borrower", parts[1]);
        }
        if (parts.length > 2) {
            map.put("data", parts[2]);
        }
        return map;
    }

    public static void main(String[] args) {
        ReadCrappyInput r = new ReadCrappyInput();
        for (String s : testData()) {
            System.out.println(r.readLine(s));
        }
    }
}

Answer 3: (score: 0)

You can use .split()

// Requires: import java.io.*;
try {
    BufferedReader br = new BufferedReader(new FileReader("input.txt"));

    String line;
    while ((line = br.readLine()) != null) {
        // Split on the first comma only: the title comes before it
        String[] titleAndRest = line.split(",", 2);
        if (titleAndRest.length < 2) {
            continue; // skip blank or malformed lines
        }
        String title = titleAndRest[0].trim();

        // The remainder is either "author" or "author:borrower:date"
        String[] fields = titleAndRest[1].split(":");
        String author = fields[0].trim();
        String borrower = fields.length > 1 ? fields[1] : null;
        String date = fields.length > 2 ? fields[2] : null;

        /*** Do what you need to do with the values here ***/
    }

    br.close();

} catch (IOException e) {
    e.printStackTrace();
}

Answer 4: (score: 0)

Why are oversized tools like regular expressions always proposed for such trivial tasks? Why not simply use the good old line.indexOf() and line.lastIndexOf() methods?
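A rough sketch of what that could look like, assuming the first comma ends the title and that the author, borrower and date contain no colons. The class name IndexOfParser and the method parse are hypothetical names chosen for this example:

```java
class IndexOfParser {
    // Returns {title, author, borrower, date};
    // borrower and date are null when the line does not have them
    static String[] parse(String line) {
        int comma = line.indexOf(',');              // first comma ends the title
        if (comma < 0) {
            throw new IllegalArgumentException("No comma in line: " + line);
        }
        int firstColon = line.indexOf(':', comma);  // start of the optional borrower
        int lastColon = line.lastIndexOf(':');      // start of the optional date

        String title = line.substring(0, comma).trim();
        if (firstColon < 0) {
            // Short line: only title and author
            return new String[] { title, line.substring(comma + 1).trim(), null, null };
        }
        String author = line.substring(comma + 1, firstColon).trim();
        if (lastColon == firstColon) {
            // Only one colon: a borrower but no date
            return new String[] { title, author, line.substring(firstColon + 1).trim(), null };
        }
        String borrower = line.substring(firstColon + 1, lastColon).trim();
        String date = line.substring(lastColon + 1).trim();
        return new String[] { title, author, borrower, date };
    }

    public static void main(String[] args) {
        String[] full = parse("The God Of Small Things, Arundhati Roy:Robins Joshua:20 11 10");
        String[] shortLine = parse("Crime And Punishment, Fyodor Dostoyevsky");
        System.out.println(full[0] + " | " + full[1] + " | " + full[2] + " | " + full[3]);
        System.out.println(shortLine[0] + " | " + shortLine[1]);
    }
}
```

Searching for the colon from the comma onward means a colon inside the title is ignored, while a comma inside the title would still break the split.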