Replacing LF in a CSV while preserving CRLF

Time: 2013-09-02 22:33:36

Tags: java unix csv

I have a CSV file like the following (control characters are written out as CRLF and LF):

"ID","NAME","CLASS"CRLF
"1","JOHN X","A"CRLF
"2","DOELF
Y","B"CRLF
"3","OTHER S", "D"CRLF

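Note that line 3 ends with LF rather than CRLF. When I read this CSV file in Java, I get 5 lines instead of 4 (the header row plus 3 data rows). Is there a way to replace the LF with a space while keeping the CRLF, either by preprocessing the input file or by changing the Java code? I have done a lot of googling, and every solution I found replaces both LF and CRLF.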
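The symptom can be reproduced with a short sketch. It assumes the file is read with `BufferedReader`, which the question does not state; the class name `LineCountDemo` and the in-memory `SAMPLE` string are hypothetical stand-ins for the real file.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

public class LineCountDemo {
    // Sample data from the question: CRLF written as \r\n, the stray LF as \n.
    static final String SAMPLE =
            "\"ID\",\"NAME\",\"CLASS\"\r\n"
          + "\"1\",\"JOHN X\",\"A\"\r\n"
          + "\"2\",\"DOE\nY\",\"B\"\r\n"
          + "\"3\",\"OTHER S\", \"D\"\r\n";

    // Splits the text the same way BufferedReader.readLine() does:
    // LF, CR and CRLF all count as line terminators.
    static List<String> readLines(String text) {
        return new BufferedReader(new StringReader(text))
                .lines()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = readLines(SAMPLE);
        System.out.println(lines.size() + " lines read");
        for (String line : lines) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Because `readLine()` treats a lone LF as a line terminator, the `"2","DOE` record is split in two, which accounts for the extra line.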

Thanks

3 answers:

Answer 0 (score: 1)

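You can use a Scanner whose delimiter is \r\n. Using jlordo's technique to get rid of the LF, you can write the content to an OutputStream one line at a time. That way you never hold the whole 2GB+ file in memory.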

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.util.Scanner;

public static void main(String[] args) throws Exception {
    File file = new File("C:\\Users\\Soto\\Downloads\\person.xml");
    String crlf = "\r\n"; // the file's record terminator; line.separator would only match on Windows
    ByteArrayOutputStream out = new ByteArrayOutputStream(); // would be a real OutputStream, like FileOutputStream
    char LF = 0xA;

    try (Scanner scanner = new Scanner(new FileInputStream(file))) {
        scanner.useDelimiter(crlf); // split on CRLF only, so a lone LF stays inside the record
        while (scanner.hasNext()) { // looks up to the next delimiter
            String line = scanner.next();
            line = line.replace(LF, ' '); // replace the stray LF with a space, as the question asks
            out.write(line.getBytes());
            out.write(crlf.getBytes());
        }
    }

    // the OutputStream now contains the content with CRLF line endings but no lone LF
}

LF is hexadecimal 0xA.
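As a variation on the same idea (not what this answer does), the stream can also be filtered one character at a time while remembering only the previous character, so memory use stays constant even if a record is very long. The following is a minimal sketch under that assumption; `LoneLfFilter`, `copyReplacingLoneLf` and `filter` are hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class LoneLfFilter {
    // Copies 'in' to 'out', turning every LF that does not directly
    // follow a CR into a space. CRLF pairs pass through unchanged.
    static void copyReplacingLoneLf(Reader in, Writer out) throws IOException {
        int previous = -1; // no previous character yet
        int current;
        while ((current = in.read()) != -1) {
            if (current == '\n' && previous != '\r') {
                out.write(' ');
            } else {
                out.write(current);
            }
            previous = current;
        }
    }

    // Convenience wrapper for trying the filter on an in-memory string.
    static String filter(String text) {
        StringWriter out = new StringWriter();
        try {
            copyReplacingLoneLf(new StringReader(text), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String record = "\"2\",\"DOE\nY\",\"B\"\r\n";
        System.out.print(filter(record));
    }
}
```

For a real file, the `Reader` and `Writer` would typically be a `BufferedReader` and `BufferedWriter` opened with the file's actual character encoding.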

Answer 1 (score: 1)

This should work:

char LF = 0x0A;
char CR = 0x0D;
String content = ... // your line(s)
content = content.replaceAll("(?<!" + CR + ")" + LF, " ");

The regular expression is constructed so that an LF is replaced with a space only when it is not preceded by a CR.
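To see the negative lookbehind at work, here is a small sketch that applies the same pattern to the sample data from the question. The class name `LookbehindDemo` and the method `replaceLoneLf` are hypothetical, and the content is held in a string, so this form suits inputs that fit in memory.

```java
public class LookbehindDemo {
    // Replaces every LF that is not immediately preceded by a CR with a space.
    // "(?<!\r)" is a negative lookbehind: it matches only where no CR comes before.
    static String replaceLoneLf(String content) {
        return content.replaceAll("(?<!\r)\n", " ");
    }

    public static void main(String[] args) {
        String content =
                "\"ID\",\"NAME\",\"CLASS\"\r\n"
              + "\"1\",\"JOHN X\",\"A\"\r\n"
              + "\"2\",\"DOE\nY\",\"B\"\r\n"
              + "\"3\",\"OTHER S\", \"D\"\r\n";

        String fixed = replaceLoneLf(content);

        // Splitting on CRLF now yields the header plus three data rows.
        String[] rows = fixed.split("\r\n");
        System.out.println(rows.length + " rows");
        for (String row : rows) {
            System.out.println(row);
        }
    }
}
```

The CRLF terminators are left untouched, and the broken record becomes `"2","DOE Y","B"`.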

Answer 2 (score: -1)

You have to set the correct system property (line.separator) as described here: http://docs.oracle.com/javase/tutorial/essential/environment/sysprop.html

Hope it solves the problem. Cheers