Question

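I have an input file whose delimiter is a combination of characters, such as #$#. However, Apache Commons CSVParser accepts only a single character as the delimiter, not a multi-character string. Here is the input file: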

Rajeev Kumar Singh ♥#$#rajeevs@example.com#$#+91-9999999999#$#India
Sachin Tendulkar#$#sachin@example.com#$#+91-9999999998#$#India
Barak Obama#$#barak.obama@example.com#$#+1-1111111111#$#United States
Donald Trump#$#donald.trump@example.com#$#+1-2222222222#$#United States

Code snippet:

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CSVReader {
    private static final String SAMPLE_CSV_FILE_PATH = "./users.csv";

    public static void main(String[] args) throws IOException {

        try (

                Reader reader = Files.newBufferedReader(Paths.get(SAMPLE_CSV_FILE_PATH));
                CSVParser csvParser = new CSVParser(reader,  CSVFormat.EXCEL.withDelimiter('#'))
        ) {
            long recordCount;
            List<CSVRecord> csvRecords = csvParser.getRecords();
        }
    }

}

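In the example above, the delimiter can only be a single character, namely '#'. Please help me use a multi-character delimiter: I need to set the delimiter to "#$#".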

Answer 1

I'm not sure why you need CSVParser in your case. I tested this locally with your data and came up with the following parsing demo:

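import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.stream.Stream;

// Place this method inside a class; csv.txt is loaded from the classpath.
public static void main(String... args) {
    try (Stream<String> lines = Files.lines(Paths.get(Thread.currentThread().getContextClassLoader().getResource("csv.txt").toURI()))) {
        lines.forEach(line -> {
            // '$' is a regex metacharacter, so it must be escaped in the split pattern.
            String[] words = line.split("#\\$#");
            System.out.println(Arrays.toString(words));
        });
    } catch (URISyntaxException | IOException e) {
        e.printStackTrace();
    }
}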

The output will be:

[Rajeev Kumar Singh ♥, rajeevs@example.com, +91-9999999999, India]
[Sachin Tendulkar, sachin@example.com, +91-9999999998, India]
[Barak Obama, barak.obama@example.com, +1-1111111111, United States]
[Donald Trump, donald.trump@example.com, +1-2222222222, United States]

By the way, csv.txt is in the resources directory.
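One detail worth checking with the split approach: `String.split` drops trailing empty fields by default, so a row whose last columns are empty comes back shorter than the others. Below is a minimal sketch, using only the JDK, that quotes the delimiter with `Pattern.quote` and passes a negative limit to keep those fields. The class name `DelimiterSplitDemo` and the helper `splitLine` are made up for illustration, and the second sample row with empty columns is invented test input, not part of the question's file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class DelimiterSplitDemo {

    // Splits one line on a literal (non-regex) delimiter.
    // Pattern.quote makes characters such as '$' match literally,
    // so no manual escaping is needed.
    static List<String> splitLine(String line, String delimiter) {
        // A negative limit keeps trailing empty fields instead of discarding them.
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        String delimiter = "#$#";
        // A row taken from the question's sample data.
        System.out.println(splitLine(
                "Sachin Tendulkar#$#sachin@example.com#$#+91-9999999998#$#India", delimiter));
        // Hypothetical row with three empty trailing columns.
        System.out.println(splitLine("Barak Obama#$##$##$#", delimiter).size());
    }
}
```

Note that this still does not handle quoted fields or embedded line breaks the way a real CSV parser does, so it only fits files as simple as the sample above.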

Answer 2

Resource

Buffering now happens twice; you could use a different Reader, a FileInputStream, etc. instead.

Answer 3

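// Replaces the multi-character delimiter with a single placeholder character,
// then lets CSVParser split on that character.
// Note: this assumes '§' never occurs in the data itself.
public List<CSVRecord> getCSVRecords(String path, String delimiter) throws IOException {
    List<String> replaced;
    // Close the stream returned by Files.lines once the lines are collected.
    try (Stream<String> lines = Files.lines(Paths.get(path))) {
        // String.replace treats the delimiter literally, so no regex quoting is needed.
        replaced = lines.map(line -> line.replace(delimiter, "§"))
                .collect(Collectors.toList());
    }
    try (
            BufferedReader buffer =
                    new BufferedReader(new StringReader(String.join("\n", replaced)));
            CSVParser csvParser = new CSVParser(buffer, CSVFormat.EXCEL.withDelimiter('§'))
    ) {
        return csvParser.getRecords();
    }
}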
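The placeholder approach above silently breaks if the chosen character (here '§') already appears somewhere in the data. A small guard can pick a stand-in that is verifiably unused before doing the replacement. The following is only a sketch using the JDK: the class `PlaceholderChooser`, the method `choose`, and the candidate list are all hypothetical names and choices, not part of Commons CSV.

```java
import java.util.Arrays;
import java.util.List;

final class PlaceholderChooser {

    // Illustrative candidates, tried in order: two ASCII separator control
    // characters first, then a few printable fallbacks.
    private static final char[] CANDIDATES = {'\u001F', '\u001E', '§', '\t', '|'};

    // Returns the first candidate that occurs in none of the given lines.
    static char choose(List<String> lines) {
        for (char candidate : CANDIDATES) {
            boolean unused = lines.stream().noneMatch(line -> line.indexOf(candidate) >= 0);
            if (unused) {
                return candidate;
            }
        }
        throw new IllegalArgumentException("No unused placeholder character found");
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Sachin Tendulkar#$#sachin@example.com#$#+91-9999999998#$#India",
                "Donald Trump#$#donald.trump@example.com#$#+1-2222222222#$#United States");
        char placeholder = choose(lines);
        // Print the code point, since the chosen character may not be printable.
        System.out.println("Placeholder code point: " + (int) placeholder);
    }
}
```

The returned character could then be used both in the `replace` call and in `withDelimiter(...)`, so the two always agree.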

How to use a multi-character delimiter with Apache Commons CSV

3 answers: