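I'm trying to find a regular expression that works for CSV files (with double quotes around the values), where a value can contain any character. The expression I'm currently using is (in Java, hence the escaped backslashes):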
",(?=(([^\"\\\\]|\\\\.)*\"([^\"\\\\]|\\\\.)*\")*([^\"\\\\]|\\\\.)*$)"
The problem I'm having is with entries such as "random_value"" or "random_value\".
Additional info:
"000000000000000","","","","email@yahoo.com","random_value""
"000000000000000","","","","email2@yahoo.com","random_value\"
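Answer 0: (score: 0)
Assuming we clean up your source text so that it includes the proper closing quotes, this expression will allow escaped quotes such as \" and "":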
(?:^|,)"((?<=")(?:[^"]*|\\"|"")*?)"(?=[,\r\n]|\Z)
Live demo: http://www.rubular.com/r/NSSxdHWcDM
Sample text
"1000000000000000","","","","email1@yahoo.com","1random_value"""
"2000000000000000","","","","email2@yahoo.com","2random_value\""
Capture groups
[0][0] = "1000000000000000"
[0][1] = 1000000000000000
[1][0] = ,""
[1][1] =
[2][0] = ,""
[2][1] =
[3][0] = ,""
[3][1] =
[4][0] = ,"email1@yahoo.com"
[4][1] = email1@yahoo.com
[5][0] = ,"1random_value"""
[5][1] = 1random_value""
[6][0] = "2000000000000000"
[6][1] = 2000000000000000
[7][0] = ,""
[7][1] =
[8][0] = ,""
[8][1] =
[9][0] = ,""
[9][1] =
[10][0] = ,"email2@yahoo.com"
[10][1] = email2@yahoo.com
[11][0] = ,"2random_value\""
[11][1] = 2random_value\"
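The demo above runs on Rubular, which uses Ruby. As a rough sketch of how the same expression might be applied in Java, the following collects capture group 1 for every match. The class and method names (QuotedFieldMatcher, fields) are hypothetical. One assumption worth noting: Pattern.MULTILINE is added so that ^ matches at the start of each line, which Ruby does by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedFieldMatcher {
    // The expression from this answer, with Java string escaping.
    // MULTILINE makes ^ match at the start of every line, as on Rubular.
    static final Pattern FIELD = Pattern.compile(
        "(?:^|,)\"((?<=\")(?:[^\"]*|\\\\\"|\"\")*?)\"(?=[,\\r\\n]|\\Z)",
        Pattern.MULTILINE);

    // Hypothetical helper: return the content of every quoted field (group 1).
    static List<String> fields(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            out.add(m.group(1)); // field content without the surrounding quotes
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "\"1000000000000000\",\"\",\"\",\"\",\"email1@yahoo.com\",\"1random_value\"\"\"";
        for (String f : fields(line)) {
            System.out.println("[" + f + "]");
        }
    }
}
```

Note that the captured values still contain the escape sequences (for example 1random_value""), exactly as in the capture list above; unescaping them would be a separate step.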
Answer 1: (score: 0)
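// Assumption: CsvReader is the JavaCSV class com.csvreader.CsvReader (the answer does not name the library).
// readRecord() may throw IOException, so the enclosing method has to declare or handle it.
String str = "\"000000000000000\",\"\",\"\",\"\",\"email2@yahoo.com\",\"random_value\\\"\"";
CsvReader reader = CsvReader.parse(str);
reader.readRecord(); // advance to the first (and only) record
for (int i = 0; i < reader.getColumnCount(); i++) {
    System.out.printf("Scol[%d]: [%s]%n", i, reader.get(i));
}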
Output:
Scol[0]: [000000000000000]
Scol[1]: []
Scol[2]: []
Scol[3]: []
Scol[4]: [email2@yahoo.com]
Scol[5]: [random_value\"]
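If adding a library is not an option, the same parse-instead-of-regex idea can be sketched by hand. The following is a minimal single-line parser written for illustration only, not the library's implementation; the names QuotedCsvLine and parse are hypothetical. It assumes both \" and "" denote a literal quote inside a quoted field, and unlike the output above it unescapes them.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvLine {
    // Hypothetical helper: split one CSV line into unescaped field values.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean nextIsQuote = i + 1 < line.length() && line.charAt(i + 1) == '"';
            if (inQuotes) {
                if ((c == '\\' || c == '"') && nextIsQuote) {
                    cur.append('"'); // \" or "" inside a field is a literal quote
                    i++;             // skip the second character of the escape
                } else if (c == '"') {
                    inQuotes = false; // closing quote of the field
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;      // opening quote of a field
            } else if (c == ',') {
                fields.add(cur.toString()); // separator outside quotes ends the field
                cur.setLength(0);
            } else {
                cur.append(c);        // unquoted content is kept as-is
            }
        }
        fields.add(cur.toString());   // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String str = "\"000000000000000\",\"\",\"\",\"\",\"email2@yahoo.com\",\"random_value\\\"\"";
        System.out.println(parse(str));
    }
}
```

This sketch does not handle fields that span several lines or malformed input such as a missing closing quote; a maintained CSV library is the safer choice for real data.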