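I want to load the contents of a CSV file into MySQL. In my CSV file, some columns contain text with commas.
I am using the code below to load the contents: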
`
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Date;
import org.apache.commons.lang.StringUtils;
import au.com.bytecode.opencsv.CSVReader;

public class CSVLoader {
    static int count;
    private static final String SQL_INSERT = "INSERT INTO ${table}(${keys}) VALUES(${values})";
    private static final String TABLE_REGEX = "\\$\\{table\\}";
    private static final String KEYS_REGEX = "\\$\\{keys\\}";
    private static final String VALUES_REGEX = "\\$\\{values\\}";
    private Connection connection;
    private char seprator;

    /**
     * Public constructor to build CSVLoader object with
     * Connection details. The connection is closed on success
     * or failure.
     * @param connection
     */
    public CSVLoader(Connection connection) {
        this.connection = connection;
        //Set default separator
        this.seprator = ',';
    }

    /**
     * Parse CSV file using OpenCSV library and load in
     * given database table.
     * @param csvFile Input CSV file
     * @param tableName Database table name to import data
     * @param truncateBeforeLoad Truncate the table before inserting
     * new records.
     * @throws Exception
     */
    public void loadCSV(String csvFile, String tableName,
            boolean truncateBeforeLoad) throws Exception {
        CSVReader csvReader = null;
        if (null == this.connection) {
            throw new Exception("Not a valid connection.");
        }
        try {
            csvReader = new CSVReader(new FileReader(csvFile), this.seprator);
        } catch (Exception e) {
            e.printStackTrace();
            throw new Exception("Error occured while executing file. "
                    + e.getMessage());
        }
        //String[] headerRow = csvReader.readNext();
        String[] headerRow = csvReader.readNext();
        count++;
        if (null == headerRow) {
            throw new FileNotFoundException(
                    "No columns defined in given CSV file." +
                    "Please check the CSV file format.");
        }
        String questionmarks = StringUtils.repeat("?,", headerRow.length);
        System.out.println(headerRow.length);
        questionmarks = (String) questionmarks.subSequence(0, questionmarks
                .length() - 1);
        String query = SQL_INSERT.replaceFirst(TABLE_REGEX, tableName);
        query = query
                .replaceFirst(KEYS_REGEX, StringUtils.join(headerRow, ","));
        query = query.replaceFirst(VALUES_REGEX, questionmarks);
        System.out.println("Query: " + query);
        String[] nextLine;
        Connection con = null;
        PreparedStatement ps = null;
        try {
            con = this.connection;
            con.setAutoCommit(false);
            ps = con.prepareStatement(query);
            if (truncateBeforeLoad) {
                //delete data from table before loading csv
                con.createStatement().execute("DELETE FROM " + tableName);
            }
            final int batchSize = 1000;
            int count = 0;
            Date date = null;
            while ((nextLine = csvReader.readNext()) != null) {
                if (null != nextLine) {
                    int index = 1;
                    for (String string : nextLine) {
                        date = DateUtil.convertToDate(string);
                        if (null != date) {
                            ps.setDate(index++, new java.sql.Date(date
                                    .getTime()));
                        } else {
                            ps.setString(index++, string);
                        }
                    }
                    System.out.println(count);
                    ps.addBatch();
                    System.out.println(count);
                }
                if (++count % batchSize == 0) {
                    System.out.println(count);
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // insert remaining records
            con.commit();
        } catch (Exception e) {
            con.rollback();
            e.printStackTrace();
            throw new Exception(
                    "Error occured while loading data from file to database."
                    + e.getMessage());
        } finally {
            if (null != ps)
                ps.close();
            if (null != con)
                con.close();
            csvReader.close();
        }
    }

    public char getSeprator() {
        return seprator;
    }

    public void setSeprator(char seprator) {
        this.seprator = seprator;
    }
}
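`
When I run it, I get the error "No value specified for parameter 23". My database table has 22 columns, and the CSV file also has 22 columns. So my guess is that some text in the very first row contains a comma that is not parsed correctly, so the loader assumes 23 columns instead of 22. Can anyone help me pin down the problem and suggest a solution?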
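Answer 0 (score: 0)
I think the immediate problem is that you are not escaping the column names when you insert them into the SQL statement. What you are building is a statement of this form: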
INSERT INTO sometable(key1,key2,key3) VALUES(?,?,?)
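Now, if there is a comma in your header row (say one key is "ke,y3"), then even if your CSV library reads it correctly, you end up creating something like this: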
INSERT INTO sometable(key1,key2,ke,y3) VALUES(?,?,?)
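Now the number of values and the number of columns no longer match. Note that the same can happen with some other characters: perhaps one of your keys contains a question mark that is being interpreted as a parameter placeholder?
Solution: to save yourself some headaches, avoid these characters in keys if at all possible. I am not sure whether MySQL handles them properly, but if it does, you need to at least escape the column names before inserting them. I am not sure how you would do that properly and safely (to prevent SQL injection), but since this is apparently a one-off tool, wrapping the column names in backticks like this should be enough: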
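To make the mismatch concrete, here is a small sketch that counts how many comma-separated names end up in the column list when the header fields are joined without any quoting. The class and method names (`HeaderMismatchDemo`, `columnsSeenBySql`) are made up for illustration, and `String.join` stands in for the `StringUtils.join` call in the question.

```java
class HeaderMismatchDemo {
    // Hypothetical helper: joins the header the way the loader does and
    // counts the comma-separated names that appear in the resulting list.
    static int columnsSeenBySql(String[] headerRow) {
        String keys = String.join(",", headerRow);
        return keys.split(",", -1).length;
    }

    public static void main(String[] args) {
        String[] headerRow = {"key1", "key2", "ke,y3"};
        // One placeholder is generated per header field...
        System.out.println("placeholders: " + headerRow.length);
        // ...but the unquoted column list contains one more name.
        System.out.println("names in column list: " + columnsSeenBySql(headerRow));
    }
}
```

Comparing the two numbers before calling `prepareStatement` is a cheap way to check whether the header row is the culprit.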
INSERT INTO sometable(`key1`,`key2`,`ke,y3`) VALUES(?,?,?)
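A statement like the one above could be built along these lines. This is only a sketch: `InsertBuilder`, `quoteIdentifier`, and `buildInsert` are hypothetical names, not part of the original loader. It assumes MySQL's default behaviour, where a backtick inside a quoted identifier is escaped by doubling it. It is not a complete defence against hostile input; for untrusted files it would be safer to check the header against the table's known column names.

```java
import java.util.StringJoiner;

class InsertBuilder {
    // Wraps a name in backticks; an embedded backtick is doubled,
    // which is how MySQL escapes it inside a quoted identifier.
    static String quoteIdentifier(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Builds "INSERT INTO `t`(`a`,`b`) VALUES(?,?)" with one placeholder
    // per header field, so the two counts always agree.
    static String buildInsert(String table, String[] headerRow) {
        StringJoiner keys = new StringJoiner(",");
        StringJoiner marks = new StringJoiner(",");
        for (String column : headerRow) {
            keys.add(quoteIdentifier(column));
            marks.add("?");
        }
        return "INSERT INTO " + quoteIdentifier(table)
                + "(" + keys + ") VALUES(" + marks + ")";
    }
}
```

Building the string by concatenation also sidesteps `replaceFirst`, which treats `$` and `\` in the replacement text specially.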
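Answer 1 (score: -1)
There are two kinds of commas in a CSV file. One kind separates fields; the other is part of the text and always appears between quotes. You need to parse commas outside quotes differently from commas inside quotes. Your code does not appear to do that. Perhaps something like this: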
repeat
    c <- read next character
    if (c == '"')
        parse quoted field      // May include commas.
    else
        parse non-quoted field  // Will not include commas.
    endif
until the whole file has been read.
Using different methods to parse quoted and non-quoted fields makes it easy to handle both kinds of commas correctly.
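The pseudocode above can be sketched in Java as a single pass that tracks whether the current character is inside quotes. `QuotedCsvSplitter` and `splitLine` are illustrative names. The sketch assumes one record per line and that a doubled quote (`""`) inside a quoted field stands for a literal quote; it does not handle fields that span several lines. The OpenCSV `CSVReader` used in the question already treats `"` as its default quote character, so this is only meant to show the idea.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvSplitter {
    // Splits one CSV line into fields, treating commas inside
    // double quotes as part of the field text.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote: literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas land here unchanged
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // field separator
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field on the line
        return fields;
    }
}
```

For example, `splitLine("a,\"b,c\",d")` yields three fields, the second of which is `b,c`.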