I have a CSV file like the following (control characters are shown in angle brackets):
"ID","NAME","CLASS"<CRLF>
"1","JOHN X","A"<CRLF>
"2","DOE<LF> Y","B"<CRLF>
"3","OTHER S", "D"<CRLF>
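Note that line 3 contains an LF rather than a CRLF. When I read this CSV file in Java, I get 5 lines instead of 4 (header + 3 data rows). Is there a way to replace the LF with a space while preserving the CRLFs, either by preprocessing the input file or by changing the Java code? I have googled a lot, and every solution I found replaces both LF and CRLF.
Thanks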
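The behaviour described above can be reproduced with a minimal sketch. It assumes the file is read with `BufferedReader.readLine()`, which treats a lone LF, a lone CR, and CRLF all as line terminators. The class name `LineCountRepro` and the `SAMPLE` string are hypothetical reconstructions of the data in the question, not the original file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineCountRepro {
    // Reconstruction of the sample: the third record has a lone LF inside the NAME field.
    static final String SAMPLE =
            "\"ID\",\"NAME\",\"CLASS\"\r\n"
          + "\"1\",\"JOHN X\",\"A\"\r\n"
          + "\"2\",\"DOE\nY\",\"B\"\r\n"
          + "\"3\",\"OTHER S\", \"D\"\r\n";

    // Counts lines the way BufferedReader.readLine() sees them.
    static int countLines(String text) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // The lone LF splits the third record in two, so this prints 5 rather than 4.
        System.out.println(countLines(SAMPLE));
    }
}
```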
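Answer 0: (score: 1)
You can use a Scanner whose delimiter is the line separator (CRLF, i.e. \r\n, for this file). Using jlordo's technique to get rid of the LF, you can then write the content to an OutputStream one line at a time. That way you never hold the entire 2GB+ file in memory.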
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.util.Scanner;

public static void main(String[] args) throws Exception {
    File file = new File("C:\\Users\\Soto\\Downloads\\person.xml");
    // The records end with CRLF, so use it explicitly rather than
    // System.getProperty("line.separator"), which is "\n" outside Windows.
    String lineSeparator = "\r\n";
    ByteArrayOutputStream out = new ByteArrayOutputStream(); // would be a real output stream, like FileOutputStream
    char LF = 0xA;
    try (Scanner scanner = new Scanner(new FileInputStream(file))) {
        scanner.useDelimiter(lineSeparator);
        while (scanner.hasNext()) { // looks up to the next delimiter
            String line = scanner.next();
            // Any LF still inside the token is a stray one; pass " " instead of "" to substitute a space.
            line = line.replace(String.valueOf(LF), "");
            out.write(line.getBytes());
            out.write(lineSeparator.getBytes());
        }
    }
    // the OutputStream now contains the content with CRLF line endings but no stray LF
}
LF is hexadecimal A (0x0A); see here.
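As an alternative sketch of the same streaming idea, the input can be copied character by character, remembering only the previous character, so that memory use stays constant regardless of file size. This variant writes a space for each lone LF, as the question asks. The class `LoneLfFilter` and its method names are hypothetical, introduced only for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class LoneLfFilter {
    // Copies in to out, turning every LF that is not preceded by CR into a space.
    static void copyReplacingLoneLf(Reader in, Writer out) throws IOException {
        int previous = -1;
        int current;
        while ((current = in.read()) != -1) {
            if (current == '\n' && previous != '\r') {
                out.write(' ');
            } else {
                out.write(current);
            }
            previous = current;
        }
    }

    // Convenience wrapper for in-memory strings, mainly useful for trying the filter out.
    static String clean(String text) {
        StringWriter out = new StringWriter();
        try {
            copyReplacingLoneLf(new StringReader(text), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String cleaned = clean("\"2\",\"DOE\nY\",\"B\"\r\n");
        System.out.println(cleaned.equals("\"2\",\"DOE Y\",\"B\"\r\n")); // true
    }
}
```

For a real file, the reader and writer would typically be buffered, for example ones obtained from `Files.newBufferedReader` and `Files.newBufferedWriter` with an explicit charset, since unbuffered single-character reads are slow.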
Answer 1: (score: 1)
This should work:
char LF = 0x0A;
char CR = 0x0D;
String content = ... // your line(s)
content = content.replaceAll("(?<!" + CR + ")" + LF, " ");
The regular expression uses a negative lookbehind, so an LF is replaced with a space only when it is not preceded by a CR.
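The lookbehind approach can be tried on a small in-memory sample with the sketch below. It writes the same pattern using the regex escapes `\r` and `\n` instead of concatenating the raw characters. The class `LookbehindDemo` and the sample string are hypothetical.

```java
final class LookbehindDemo {
    // Replaces each LF that is not preceded by CR with a single space.
    static String replaceLoneLf(String content) {
        return content.replaceAll("(?<!\\r)\\n", " ");
    }

    public static void main(String[] args) {
        // Two CRLF-terminated records; the second has a lone LF inside the NAME field.
        String content = "\"1\",\"JOHN X\",\"A\"\r\n\"2\",\"DOE\nY\",\"B\"\r\n";
        String[] records = replaceLoneLf(content).split("\r\n");
        System.out.println(records.length); // 2
        System.out.println(records[1]);     // "2","DOE Y","B"
    }
}
```

Note that `replaceAll` operates on a whole `String`, so this variant needs the content in memory; for very large files a streaming approach is likely a better fit.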
Answer 2: (score: -1)
You have to set the correct system property (line.separator) as described here: http://docs.oracle.com/javase/tutorial/essential/environment/sysprop.html
Hope that solves the problem. Cheers