Remove commas from a .csv column field and import into a DB

Time: 2013-08-23 10:38:06

Tags: java sql jdbc db2

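The class below imports a .csv file into a database table. It works fine, but when it encounters a numeric value such as 2,345, it fails with an error.

My .csv file has the 3 columns shown below. In the DB2 table "COMPUTER", the data types of these columns are COL_A (VARCHAR 50), COL_B (DOUBLE), COL_C (VARCHAR 50).

COL_A | COL_B | COL_C
KKGG56 | 7,567 | JUNE2013
GGHHK2 | 259,024 | MAY2012

So how can I remove these commas from that specific column while importing into the DB table, and where in the program should that code go? Please help.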

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.Date;

// Third-party imports. The exact packages depend on the library versions in use
// (assumed here: opencsv 2.x and Apache Commons Lang 2.x).
import au.com.bytecode.opencsv.CSVReader;
import org.apache.commons.lang.StringUtils;

public class CSVLoader {

    // Template for the INSERT statement; the ${...} placeholders are filled in loadCSV().
    private static final String SQL_INSERT = "INSERT INTO OPM.${table}(${keys}) VALUES(${values})";
    private static final String TABLE_REGEX = "\\$\\{table\\}";
    private static final String KEYS_REGEX = "\\$\\{keys\\}";
    private static final String VALUES_REGEX = "\\$\\{values\\}";

    private Connection connection;
    private char seprator;

    public CSVLoader(Connection connection) {
        this.connection = connection;
        // Set default separator
        this.seprator = ',';
    }

    public void loadCSV(String csvFile, String tableName) throws Exception {
        CSVReader csvReader = null;
        if (null == this.connection) {
            throw new Exception("Not a valid connection.");
        }
        try {
            csvReader = new CSVReader(new FileReader(csvFile), this.seprator);
        } catch (Exception e) {
            e.printStackTrace();
            throw new Exception("Error occurred while executing file. " + e.getMessage());
        }

        // The first row of the file supplies the column names.
        String[] headerRow = csvReader.readNext();
        if (null == headerRow) {
            throw new FileNotFoundException(
                    "No columns defined in given CSV file. "
                    + "Please check the CSV file format.");
        }

        // Build "?,?,...,?" with one parameter marker per column.
        String questionmarks = StringUtils.repeat("?,", headerRow.length);
        questionmarks = questionmarks.substring(0, questionmarks.length() - 1);

        String query = SQL_INSERT.replaceFirst(TABLE_REGEX, tableName);
        query = query.replaceFirst(KEYS_REGEX, StringUtils.join(headerRow, ","));
        query = query.replaceFirst(VALUES_REGEX, questionmarks);
        System.out.println("Query: " + query);

        String[] nextLine;
        Connection con = null;
        PreparedStatement ps = null;
        try {
            con = this.connection;
            con.setAutoCommit(false);
            ps = con.prepareStatement(query);

            final int batchSize = 1000;
            int count = 0;
            Date date = null;
            while ((nextLine = csvReader.readNext()) != null) {
                System.out.println("inside while");
                int index = 1;
                for (String string : nextLine) {
                    // DateUtil is the asker's own helper (not shown).
                    date = DateUtil.convertToDate(string);
                    if (null != date) {
                        ps.setDate(index++, new java.sql.Date(date.getTime()));
                    } else {
                        // Every non-date value is bound as a string, including "7,567".
                        ps.setString(index++, string);
                        System.out.println("string" + string);
                    }
                }
                ps.addBatch();
                if (++count % batchSize == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // insert remaining records
            con.commit();
        } catch (Exception e) {
            con.rollback();
            e.printStackTrace();
            throw new Exception(
                    "Error occurred while loading data from file to database. "
                    + e.getMessage());
        } finally {
            if (null != ps)
                ps.close();
            if (null != con)
                con.close();
            System.out.println("csvReader will be closed");
            csvReader.close();
        }
    }

    public char getSeprator() {
        return seprator;
    }

    public void setSeprator(char seprator) {
        this.seprator = seprator;
    }
}

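2 answers:

Answer 0 (score: 1)

To answer your question:
You have to parse the CSV literal with Double.parseDouble("2,345".replaceAll(",","")), and you also have to call ps.setDouble() rather than ps.setString() to store the double in the database.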

for (String string : nextLine) {
  date = DateUtil.convertToDate(string);

  if (null != date) {
    ps.setDate(index++, new java.sql.Date(date.getTime()));
  }
  else {
    try {
      // Strip the grouping commas, then bind the value as a double.
      final double doubleValue = Double.parseDouble(string.replaceAll(",", ""));

      ps.setDouble(index++, doubleValue);
    }
    catch (NumberFormatException e) {
      // Not a valid double: bind it as a plain string.
      ps.setString(index++, string);
    }
  }
}
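The parsing step on its own can be tried without a database. Below is a minimal sketch; the class name CommaNumbers and the method parseAmount are made up for illustration, and it uses String.replace (literal replacement) instead of replaceAll, which gives the same result for a plain comma.

```java
final class CommaNumbers {
    private CommaNumbers() {}

    /**
     * Hypothetical helper: removes grouping commas and parses the rest as a double.
     * Throws NumberFormatException if the cleaned text is not a number.
     */
    static double parseAmount(String raw) {
        return Double.parseDouble(raw.trim().replace(",", ""));
    }

    public static void main(String[] args) {
        System.out.println(parseAmount("7,567"));
        System.out.println(parseAmount("259,024"));
        try {
            parseAmount("JUNE2013");
        } catch (NumberFormatException e) {
            System.out.println("not a number: JUNE2013");
        }
    }
}
```

Note that Double.parseDouble also accepts inputs such as "NaN" or "1e5", so a text column that happens to contain such a value would be treated as numeric by the try/catch approach.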

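This code is not very robust: if column 3 contains a date or a number, you will run into trouble! Have a look at https://stackoverflow.com/questions/18067934/parsing-csv-file-with-java/18068238#18068238 for a solution based on a predefined mapping, where you know the data structure in advance.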
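One way to read "predefined mapping" is to declare the type of each column up front and convert by position instead of guessing from the value. The sketch below is not taken from the linked answer; the names TypedCsvRow, ColumnType, convertRow and bindRow are hypothetical, and it assumes the CSV reader already delivers "7,567" as a single field (for example because the value is quoted in the file).

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TypedCsvRow {
    private TypedCsvRow() {}

    /** Hypothetical column kinds; extend as needed (e.g. DATE). */
    enum ColumnType { TEXT, NUMBER }

    /** Converts each cell according to the declared type of its column. */
    static Object[] convertRow(String[] row, ColumnType[] types) {
        if (row.length != types.length) {
            throw new IllegalArgumentException(
                    "expected " + types.length + " columns but got " + row.length);
        }
        Object[] values = new Object[row.length];
        for (int i = 0; i < row.length; i++) {
            String cell = row[i].trim();
            if (types[i] == ColumnType.NUMBER) {
                // Only numeric columns have their grouping commas removed.
                values[i] = Double.valueOf(cell.replace(",", ""));
            } else {
                values[i] = cell;
            }
        }
        return values;
    }

    /** Binds converted values to a prepared statement (JDBC parameters are 1-based). */
    static void bindRow(PreparedStatement ps, Object[] values) throws SQLException {
        for (int i = 0; i < values.length; i++) {
            if (values[i] instanceof Double) {
                ps.setDouble(i + 1, (Double) values[i]);
            } else {
                ps.setString(i + 1, (String) values[i]);
            }
        }
    }
}
```

For the COMPUTER table from the question, the mapping would be { TEXT, NUMBER, TEXT }, so a value in COL_A or COL_C that merely looks numeric is still stored as text.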

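Answer 1 (score: 0)

Assuming that only one of the columns contains commas, extract the first N column values from the string (working forwards). Then extract the last value (working backwards). Whatever is left is the middle column.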
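The idea above can be sketched as follows, assuming the raw line is comma-separated and unquoted, so the commas inside the number cannot be told apart from separators. The class MiddleColumnSplitter and its split method are illustrative names, not part of any library.

```java
final class MiddleColumnSplitter {
    private MiddleColumnSplitter() {}

    /**
     * Splits a line into leading + 1 + trailing fields. The first `leading`
     * fields are read from the front, the last `trailing` fields from the back,
     * and everything in between (commas included) becomes the middle field.
     */
    static String[] split(String line, int leading, int trailing) {
        String[] out = new String[leading + trailing + 1];

        int start = 0;
        for (int i = 0; i < leading; i++) {
            int comma = line.indexOf(',', start);
            if (comma < 0) {
                throw new IllegalArgumentException("too few columns: " + line);
            }
            out[i] = line.substring(start, comma);
            start = comma + 1;
        }

        int end = line.length();
        for (int i = 0; i < trailing; i++) {
            int comma = line.lastIndexOf(',', end - 1);
            if (comma < start) {
                throw new IllegalArgumentException("too few columns: " + line);
            }
            out[out.length - 1 - i] = line.substring(comma + 1, end);
            end = comma;
        }

        // Whatever remains between the two cursors is the middle column.
        out[leading] = line.substring(start, end);
        return out;
    }
}
```

For the layout in the question, split(line, 1, 1) turns "GGHHK2,259,024,MAY2012" into the three fields GGHHK2, 259,024 and MAY2012; the commas in the middle field still have to be removed before parsing it as a double.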