Java JDBC Postgres copyIn does not recognize end of row and loads the enclosing double quotes

Date: 2018-08-12 04:49:40

Tags: java postgresql jdbc greenplum bulk-load

I am trying to load data from Oracle into Greenplum using Java. I write the result set as comma-separated values into a ByteArrayInputStream and then load it with copyIn.

import java.sql.*; 
import au.com.bytecode.opencsv.CSVWriter;
import java.io.*;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public class ORtoGP {   
        public static void main(String[] args) throws SQLException {
            try {
                String dbURL = "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxxxx)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = xxxxxx) (SRVR = DEDICATED)))";
                String strUserID = "xxxxxx";
                String strPassword = "xxxxxx";
                Connection myConnection=DriverManager.getConnection(dbURL,strUserID,strPassword);
                Statement sqlStatement = myConnection.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
                String readRecordSQL = "select id,name from table where rownum <= 10 ";
                ResultSet rs = sqlStatement.executeQuery(readRecordSQL); 

                StringWriter stringWriter = new StringWriter();
                CSVWriter csvWriter = new CSVWriter(stringWriter);

                rs.first(); 
                csvWriter.writeAll(rs, true);
                String orresult = stringWriter.toString();
                System.out.println(orresult);

                byte[] bytes = orresult.getBytes();
                ByteArrayInputStream orinput = new ByteArrayInputStream(bytes); 


                String dbURL1 = "jdbc:postgresql://xxxxx:5432/xxxxx";
                String user = "xxxx";
                String pass = "xxxx";
                Connection conn2 = DriverManager.getConnection(dbURL1, user, pass);

                CopyManager copyManager = new CopyManager((BaseConnection) conn2);
                copyManager.copyIn("copy java_test from stdin with DELIMITER ','",orinput);

                rs.close();
                myConnection.close();
                csvWriter.close();

            } catch (Exception e) {
                System.out.println(e);
            }       
        }
    }

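However, I am running into two problems:

  1. During the bulk load, the process does not recognize the end of each row, so it fails with this error: "ERROR: extra data after last expected column".
  2. It also tries to load the data with the double quotes that surround each value.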

1 answer:

Answer 0: (score: 0)

According to the documentation, the default COPY format is text, and that format does not interpret quotes.
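To make this concrete, here is a rough illustration of what happens when a quoted CSV line is treated as plain delimited text. It is only a sketch: `TextFormatDemo` and `splitLikeTextFormat` are hypothetical names, the sample row is made up, and the method mimics delimiter splitting only (it ignores the backslash escapes that the real text format also handles). CSVWriter quotes every field by default, so a value that contains a comma ends up split into more fields than the table has columns, which is one plausible source of the "extra data after last expected column" error.

```java
import java.util.regex.Pattern;

// Illustration only: imitates splitting on the delimiter without any quote handling.
// This is NOT the server's actual parser.
final class TextFormatDemo {
    private TextFormatDemo() {}

    // Splits a line on the delimiter and keeps every character, quotes included.
    static String[] splitLikeTextFormat(String line, char delimiter) {
        return line.split(Pattern.quote(String.valueOf(delimiter)), -1);
    }

    public static void main(String[] args) {
        // Made-up row, quoted the way CSVWriter quotes fields by default.
        String csvLine = "\"1\",\"Smith, John\"";
        String[] fields = splitLikeTextFormat(csvLine, ',');
        System.out.println("field count: " + fields.length);
        for (String field : fields) {
            System.out.println("[" + field + "]");
        }
    }
}
```

The two-column row is seen as three fields, and the quote characters stay part of the values.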

You need to specify FORMAT csv in the COPY command.
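A minimal sketch of how the command string could be built follows. `CsvCopyCommand` and `build` are hypothetical helpers, not part of the PostgreSQL JDBC driver. The parenthesized `WITH (FORMAT csv)` form is the PostgreSQL 9.0+ spelling; older servers, which may include the Greenplum release in use, accept the earlier `WITH CSV` spelling instead, so the sketch can emit either. The `HEADER` option is included as an assumption about the input: the question's code calls `writeAll(rs, true)`, which writes the column names as the first line, and `HEADER` tells COPY to skip that line.

```java
// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class CsvCopyCommand {
    private CsvCopyCommand() {}

    /**
     * Builds a COPY ... FROM STDIN command that asks the server to parse CSV.
     *
     * @param table        target table; must be a plain (optionally schema-qualified) identifier
     * @param header       true if the first input line holds column names to skip
     * @param legacySyntax true for the pre-9.0 spelling (WITH CSV HEADER),
     *                     false for the parenthesized form (WITH (FORMAT csv, ...))
     */
    static String build(String table, boolean header, boolean legacySyntax) {
        // The table name is concatenated into SQL, so reject anything unusual.
        if (!table.matches("[A-Za-z_][A-Za-z0-9_.]*")) {
            throw new IllegalArgumentException("unexpected table name: " + table);
        }
        StringBuilder sql = new StringBuilder("COPY ").append(table).append(" FROM STDIN WITH ");
        if (legacySyntax) {
            sql.append("CSV");
            if (header) {
                sql.append(" HEADER");
            }
        } else {
            sql.append("(FORMAT csv");
            if (header) {
                sql.append(", HEADER true");
            }
            sql.append(")");
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        String sql = build("java_test", true, false);
        System.out.println(sql);
        // With the connection and stream from the question, the load would then be:
        //   new CopyManager((BaseConnection) conn2).copyIn(sql, orinput);
    }
}
```

In CSV format the comma is already the default delimiter, so the explicit `DELIMITER ','` from the question is no longer needed.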