How to avoid a backslash before commas in CSVFormat

Time: 2021-07-05 14:34:37

Tags: java csv apache-commons apache-commons-csv

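I am creating a CSV file with CSVFormat in Java. The problem I am facing, in both the headers and the values, is that whenever a string is long and contains a comma, the API always inserts a \ before the comma. As a result, the header is not formed correctly, and values spill over into the next cell of the .csv file. Here is the code I have: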

   try (CSVPrinter csvPrinter = new CSVPrinter(out,
            CSVFormat.DEFAULT.withHeader("\""+SampleEnum.MY_NAME.getHeader()+"\"", "\""+SampleEnum.MY_TITLE.getHeader()+"\"",
                    "\""+SampleEnum.MY_ID.getHeader()+"\"", "\""+SampleEnum.MY_NUMBER.getHeader()+"\"", "\""+SampleEnum.MY_EXTERNAL_KEY.getHeader()+"\"",
                    "\""+SampleEnum.DATE.getHeader()+"\"","\""+SampleEnum.MY_ACTION.getHeader()+"\"",
                    "\"\"\""+SampleEnum.MY__DEFI.getHeader()+"\"\"\"", SampleEnum.MY_ACTION.getHeader(),
                    SampleEnum.CCHK.getHeader(), SampleEnum.DISTANCE_FROM_LOCATION.getHeader(),
                    SampleEnum.TCOE.getHeader(), SampleEnum.HGTR.getHeader(),SampleEnum._BLANK.getHeader(),
                    SampleEnum.LOCATION_MAP.getHeader(), SampleEnum.SUBMISSION_ID.getHeader())                      
                    .withDelimiter(',').withEscape('\\').withQuote('"').withTrim().withQuoteMode(QuoteMode.MINIMAL)
    )) {
        sampleModel.forEach(sf -> {
            try {
                csvPrinter.printRecord(sf.getMyName(),
                        sf.getMyTitle(),
                        sf.getMyID(),
                        sf.getMyNo(),

So now the problem is that I get output like this:

"\"Name:\"","\"Title\"","\"ID #:\"","\"Store #:\"","\"Store #: External Key\"","\"Date:\"","\"\"\"It's performance  issue in detail to include dates,times, circumstances, etc.\"\"\""

I get a \ before every comma, and when a comma occurs inside a value, the rest of the text moves to the next cell.

The output I need is:

"Name:","Title:","Employee ID #:","Store #:","Store #: CurrierKey","Date:","Stage of Disciplinary Action:","""Describe your view about the company, times, circumstances, etc.""",

I have been working through https://commons.apache.org/proper/commons-csv/jacoco/org.apache.commons.csv/CSVFormat.java.html but I cannot work out the fix from it. Please help.

1 answer:

Answer 0: (score: 1)

This happens because you are using QuoteMode.NONE, which has the following Javadoc:

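"Never quotes fields. When the delimiter occurs in data, the printer prefixes it with the escape character. If the escape character is not set, format validation throws an exception."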

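You can use QuoteMode.MINIMAL instead, which quotes only those fields that contain special characters (such as the field delimiter, the quote character, or any of the characters in the line separator string).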
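
To make the difference between the two modes concrete, here is a small hand-written sketch of both behaviours using only the JDK. It is not Commons CSV code: the class QuoteModeSketch and its methods are hypothetical names, and the rules are simplified (the real library handles more cases, for example line breaks in the escape-only mode).

```java
// Simplified illustration of "escape only" vs. "quote when needed".
// Hypothetical helper, not part of Apache Commons CSV.
final class QuoteModeSketch {

    // "Never quote" style: prefix every delimiter (and every escape
    // character) in the field with the escape character.
    static String escapeOnly(String field, char delimiter, char escape) {
        StringBuilder sb = new StringBuilder();
        for (char c : field.toCharArray()) {
            if (c == delimiter || c == escape) {
                sb.append(escape);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // "Minimal" style: wrap the field in quotes only if it contains the
    // delimiter, the quote character, CR or LF; embedded quotes are doubled.
    static String quoteMinimal(String field, char delimiter, char quote) {
        boolean needsQuotes = field.indexOf(delimiter) >= 0
                || field.indexOf(quote) >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        String q = String.valueOf(quote);
        return q + field.replace(q, q + q) + q;
    }

    public static void main(String[] args) {
        String field = "dates,times";
        System.out.println(escapeOnly(field, ',', '\\'));  // dates\,times
        System.out.println(quoteMinimal(field, ',', '"')); // "dates,times"
    }
}
```

The first line is the shape you are seeing now (a backslash before the comma); the second is what most spreadsheet tools expect, because the comma is protected by the surrounding quotes rather than by an escape character.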


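If you cannot use one of the other predefined formats, I suggest starting from CSVFormat.DEFAULT and then configuring everything yourself. Check whether the backslash (\) is really the right escape character for your use case; usually it is a double quote ("). Also, you probably want to remove all the double quotes from your header definition, because they are added automatically (where necessary) according to your configuration.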

StringBuilder out = new StringBuilder();
try (CSVPrinter csvPrinter = new CSVPrinter(out,
        CSVFormat.DEFAULT
                .withHeader("AAAA", "BB\"BB", "CC,CC", "DD'DD")
                .withDelimiter(',')
                .withEscape('\\') // <- maybe you want '"' instead
                .withQuote('"').withRecordSeparator('\n').withTrim()
                .withQuoteMode(QuoteMode.MINIMAL)
)) {
    csvPrinter.printRecord("WWWW", "XX\"XX", "YY,YY", "ZZ'ZZ");
}
System.out.println(out);

Output:

AAAA,"BB\"BB","CC,CC",DD'DD
WWWW,"XX\"XX","YY,YY",ZZ'ZZ

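After your edit, it looks like you want every field quoted, with the double quote as the escape character. In that case you can use QuoteMode.ALL together with withEscape('"'), like this: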

StringBuilder out = new StringBuilder();
try (CSVPrinter csvPrinter = new CSVPrinter(out,
        CSVFormat.DEFAULT
                .withHeader("AAAA", "BB\"BB", "CC,CC", "\"DD\"", "1")
                .withDelimiter(',')
                .withEscape('"')
                .withQuote('"').withRecordSeparator('\n').withTrim()
                .withQuoteMode(QuoteMode.ALL)
)) {
    csvPrinter.printRecord("WWWW", "XX\"XX", "YY,YY", "\"DD\"", "2");
}
System.out.println(out);

Output:

"AAAA","BB""BB","CC,CC","""DD""","1"
"WWWW","XX""XX","YY,YY","""DD""","2"

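In your comment you state that you want double quotes only where they are needed, and triple quotes for just one field. In that case you can use QuoteMode.MINIMAL together with withEscape('"'), as hinted at in the first example. The triple quotes are produced when you surround that field's input with double quotes: the first quote is there because the field contains a special character and therefore has to be quoted, the second is the explicit " you added, and the third escapes your explicit quote.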

StringBuilder out = new StringBuilder();
try (CSVPrinter csvPrinter = new CSVPrinter(out,
        CSVFormat.DEFAULT
                .withHeader("AAAA", "BB\"BB", "CC,CC", "\"DD\"", "1")
                .withDelimiter(',')
                .withEscape('"')
                .withQuote('"').withRecordSeparator('\n').withTrim()
                .withQuoteMode(QuoteMode.MINIMAL)
)) {
    csvPrinter.printRecord("WWWW", "XX\"XX", "YY,YY", "\"DD\"", "2");
}
System.out.println(out);

Output:

AAAA,"BB""BB","CC,CC","""DD""",1
WWWW,"XX""XX","YY,YY","""DD""",2
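
If it helps to see where the three quotes come from, the two steps can be traced by hand with plain string operations. This is only an illustration of the quoting rule, assuming the double quote is both the quote and the escape character; TripleQuoteSketch and its methods are hypothetical names, not Commons CSV API.

```java
// Traces how the input "DD" (with its explicit quotes) becomes """DD""".
// Hypothetical helper, not part of Apache Commons CSV.
final class TripleQuoteSketch {

    // Step 1: escape each embedded quote by doubling it.
    static String doubleQuotes(String field) {
        return field.replace("\"", "\"\"");
    }

    // Step 2: wrap the escaped field in the enclosing quotes.
    static String wrap(String field) {
        return "\"" + field + "\"";
    }

    public static void main(String[] args) {
        String input = "\"DD\"";               // "DD"
        String escaped = doubleQuotes(input);  // ""DD""
        String output = wrap(escaped);         // """DD"""
        System.out.println(input + " -> " + escaped + " -> " + output);
    }
}
```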

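Based on the chat, you want full control over which headers are quoted and which are not. No combination of QuoteMode and escape character gives that result, so I suggest building the header line manually: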

StringBuilder out = new StringBuilder();
try (CSVPrinter csvPrinter = new CSVPrinter(out,
        CSVFormat.DEFAULT
                .withDelimiter(',').withEscape('"')
                .withQuote('"').withRecordSeparator('\n').withTrim()
                .withQuoteMode(QuoteMode.MINIMAL))
) {
    out.append(String.join(",", "\"AAAA\"", "\"BBBB\"", "\"CC,CC\"", "\"\"\"DD\"\"\"", "1"));
    out.append("\n");
    csvPrinter.printRecord("WWWW", "XX\"XX", "YY,YY", "\"DD\"", "2");
}
System.out.println(out);

Output:

"AAAA","BBBB","CC,CC","""DD""",1
WWWW,"XX""XX","YY,YY","""DD""",2
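
Hand-writing strings such as "\"\"\"DD\"\"\"" is easy to get wrong, so a tiny helper for the manual header may be worth having. The following is a sketch under my own assumptions: HeaderLineSketch, cell and the forceQuote flag are hypothetical names, the delimiter is fixed to a comma, and the quote and escape character are both the double quote.

```java
import java.util.StringJoiner;

// Builds a header line with per-column control over quoting.
// Hypothetical helper, not part of Apache Commons CSV.
final class HeaderLineSketch {

    // Quotes the cell if the caller forces it or if the text contains a
    // comma, a quote, CR or LF; embedded quotes are doubled.
    static String cell(String text, boolean forceQuote) {
        boolean mustQuote = text.indexOf(',') >= 0
                || text.indexOf('"') >= 0
                || text.indexOf('\n') >= 0
                || text.indexOf('\r') >= 0;
        if (!forceQuote && !mustQuote) {
            return text;
        }
        return "\"" + text.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        StringJoiner header = new StringJoiner(",");
        header.add(cell("AAAA", true));
        header.add(cell("BBBB", true));
        header.add(cell("CC,CC", true));
        header.add(cell("\"DD\"", true)); // explicit quotes become triple quotes
        header.add(cell("1", false));     // left unquoted
        System.out.println(header);
        // prints: "AAAA","BBBB","CC,CC","""DD""",1
    }
}
```

The resulting string can be appended to out in place of the String.join call, followed by the record separator, before printing the records with csvPrinter.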