Question

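How can I skip lines in an input file with Apache Commons CSV? In my file, the first few lines are junk, although they carry useful meta-information such as the date. I can't find any option for this.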

private void parse() throws Exception {
    Iterable<CSVRecord> records = CSVFormat.EXCEL
            .withQuote('"').withDelimiter(';').parse(new FileReader("example.csv"));
    for (CSVRecord csvRecord : records) {
        //do something            
    }
}

Answer 1

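Call BufferedReader.readLine() before starting the for loop. (FileReader itself has no readLine() method, so wrap it in a BufferedReader.)

Your example:

private void parse() throws Exception {
  // FileReader has no readLine(); BufferedReader provides it.
  BufferedReader reader = new BufferedReader(new FileReader("example.csv"));
  reader.readLine(); // Read and discard the first line.

  Iterable<CSVRecord> records = CSVFormat.EXCEL.withQuote('"').withDelimiter(';').parse(reader);
  for (CSVRecord csvRecord : records) {
    // do something
  }
}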
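
If more than one leading line has to go, the same idea extends to a small helper. What follows is only a minimal sketch using the JDK alone; the class name LineSkipper and the method skipLines are made up for illustration and are not part of Commons CSV.

```java
import java.io.BufferedReader;
import java.io.IOException;

final class LineSkipper {
    /**
     * Reads and discards up to n lines from the reader.
     * Returns the number of lines actually skipped, which is smaller
     * than n if the input ends early.
     */
    static int skipLines(BufferedReader reader, int n) throws IOException {
        int skipped = 0;
        while (skipped < n && reader.readLine() != null) {
            skipped++;
        }
        return skipped;
    }
}
```

After LineSkipper.skipLines(reader, 3), the same reader can be handed to CSVFormat...parse(reader) as in the example above. Note that readLine() works on physical lines, so this assumes none of the skipped lines is part of a quoted multi-line field.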

Answer 2

There is no built-in facility for skipping an unknown number of lines.

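If you only want to skip the first line (the header line), you can call withSkipHeaderRecord() when building the parser. As far as I can tell, this only takes effect when a header is also configured, for example via withHeader(...).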

A more general solution is to call next() on the iterator:

Iterable<CSVRecord> parser = CSVFormat.DEFAULT.parse(new FileReader("example.csv"));
Iterator<CSVRecord> iterator = parser.iterator();

for (int i = 0; i < amountToSkip; i++) {
    if (iterator.hasNext()) {
        iterator.next();
    }
}

while (iterator.hasNext()) {
    CSVRecord record = iterator.next();
    System.out.println(record);
}
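
When the number of leading lines is not known in advance, one possible workaround is to discard lines until the first one that looks like real data, before the parser ever sees the reader. The sketch below uses only the JDK; PreambleSkipper, skipUntil and the 64 KiB line limit are assumptions made for this illustration, not Commons CSV API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.function.Predicate;

final class PreambleSkipper {
    // Longest line (in chars) that can be "un-read"; an arbitrary limit for this sketch.
    private static final int MAX_LINE_LENGTH = 64 * 1024;

    /**
     * Discards lines until one satisfies isFirstDataLine, then rewinds so that
     * the matching line is still the next thing the reader returns.
     * Returns the number of lines discarded.
     */
    static int skipUntil(BufferedReader reader, Predicate<String> isFirstDataLine)
            throws IOException {
        int skipped = 0;
        while (true) {
            reader.mark(MAX_LINE_LENGTH);   // remember the start of this line
            String line = reader.readLine();
            if (line == null) {
                return skipped;             // end of input, nothing matched
            }
            if (isFirstDataLine.test(line)) {
                reader.reset();             // put the matching line back
                return skipped;
            }
            skipped++;
        }
    }
}
```

For a file whose header row starts with, say, "id;" (a hypothetical marker), PreambleSkipper.skipUntil(reader, line -> line.startsWith("id;")) leaves the reader positioned at the header, and the reader can then be passed to CSVFormat.DEFAULT.parse(reader).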

Answer 3

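So, CSVParser.iterator() absolutely should not throw an exception from iterator.hasNext(), because that makes it nearly impossible to recover from an error condition.

But where there's a will there's a way, and I came up with an idea that sort of works (fixHeaders is a helper that is not shown here):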

    public void runOnFile(Path file) {
        try {
            BufferedReader in = fixHeaders(file);
            CSVParser parsed = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(in);
            Map<String, Integer> headerMap = parsed.getHeaderMap();

            String line;
            while ((line = in.readLine()) != null) {
                try {
                    CSVRecord record = CSVFormat.DEFAULT.withHeader(headerMap.keySet().toArray(new String[headerMap.keySet().size()]))
                            .parse(new StringReader(line)).getRecords().get(0);
                    // do something with your record
                } catch (Exception e) {
                    System.out.println("ignoring line:" + line);
                }
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
