Spark DataFrame write to JDBC - Can't get JDBC type for struct<date:int,day:int ...>?

Asked: 2018-09-02 13:21:38

Tags: java apache-spark exception jdbc apache-spark-sql

I am new to Spark and am trying to write a DataFrame to a DB2 table. The error I get is:

Exception in thread "main" java.lang.IllegalArgumentException: Can't get JDBC type for struct <date:int, day:int, hours:int, minutes:int, month:int, seconds:int, time:bigint, timeZoneOffset:int, year:int>

My database schema is:

localId <-- Integer type
effectiveDate <-- Timestamp
activityDate <-- Timestamp
inDate <-- Timestamp
outDate <-- Timestamp

I created a POJO class for the database table, like this:

import java.util.Date;

public class StowageTable {
    private long localId;
    private Date effectiveDate;
    private Date activityDate;
    private Date inDate;
    private Date outDate;
    //setters and getters
}

Then I basically read a CSV that has the same schema as the DB table, like this:

JavaRDD<String> dataFromCSV = javaSparkContext.textFile(fileURL);
//Then I create a JavaRDD of the POJO type
JavaRDD<StowageTable> dataToPOJO = dataFromCSV.map(line -> {
    String[] fields = line.split(",");
    StowageTable st = createNewStowageTable(fields);
    return st;
});
//converting the RDD to DataFrame
DataFrame stowageTableDF = sqlContext.createDataFrame(dataToPOJO, StowageTable.class);
//call jdbc persister
persistToTable(stowageTableDF);
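The `createNewStowageTable` helper is elided above; here is a hypothetical sketch of what it might look like, assuming the CSV columns arrive in schema order (localId, effectiveDate, activityDate, inDate, outDate) and the timestamp columns use the `yyyy-MM-dd HH:mm:ss` layout that `java.sql.Timestamp.valueOf` parses:

```java
import java.sql.Timestamp;
import java.util.Date;

// Minimal POJO matching the question's class, reproduced so the sketch compiles.
class StowageTable {
    private long localId;
    private Date effectiveDate;
    private Date activityDate;
    private Date inDate;
    private Date outDate;

    public long getLocalId() { return localId; }
    public void setLocalId(long v) { localId = v; }
    public Date getEffectiveDate() { return effectiveDate; }
    public void setEffectiveDate(Date v) { effectiveDate = v; }
    public Date getActivityDate() { return activityDate; }
    public void setActivityDate(Date v) { activityDate = v; }
    public Date getInDate() { return inDate; }
    public void setInDate(Date v) { inDate = v; }
    public Date getOutDate() { return outDate; }
    public void setOutDate(Date v) { outDate = v; }
}

// Hypothetical sketch of the createNewStowageTable helper elided in the
// question; the column order and timestamp format are assumptions.
class CsvRowParser {
    static StowageTable createNewStowageTable(String[] fields) {
        StowageTable st = new StowageTable();
        st.setLocalId(Long.parseLong(fields[0].trim()));
        // Timestamp extends java.util.Date, so these assignments compile,
        // but the POJO fields themselves are still java.util.Date, which is
        // what triggers the struct error when writing over JDBC.
        st.setEffectiveDate(Timestamp.valueOf(fields[1].trim()));
        st.setActivityDate(Timestamp.valueOf(fields[2].trim()));
        st.setInDate(Timestamp.valueOf(fields[3].trim()));
        st.setOutDate(Timestamp.valueOf(fields[4].trim()));
        return st;
    }
}
```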

My persistToTable(DataFrame df) method looks like this:

private void persistToTable(DataFrame df) throws ClassNotFoundException {
    Class.forName(""); //driver class name here
    //skipping a few lines for brevity
    df.write().mode(SaveMode.Append).jdbc(url, table, connectionProperties);
}
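The `connectionProperties` object is also elided above; a minimal sketch of how it might be assembled, assuming the IBM DB2 JCC driver (`com.ibm.db2.jcc.DB2Driver`) is on the classpath and that the user/password values are placeholders:

```java
import java.util.Properties;

// Sketch of the connection properties passed to DataFrame.write().jdbc().
// The "user" and "password" keys are forwarded to the JDBC driver; the
// "driver" key tells Spark which JDBC driver class to load, which also
// replaces the manual Class.forName(...) call.
class Db2ConnectionProps {
    static Properties build(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("driver", "com.ibm.db2.jcc.DB2Driver");
        return props;
    }
}
```

With this in place, the write call becomes `df.write().mode(SaveMode.Append).jdbc(url, table, Db2ConnectionProps.build(user, password))`.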

I found some related questions here: Spark DataFrame write to JDBC - Can't get JDBC type for array<array<int>> and java.lang.IllegalArgumentException: Can't get JDBC type for array<string>, but nothing that addresses the date/time data type problem. Please suggest a solution. I am on Spark 1.6.3.

1 Answer:

Answer 0 (score: 0)

Since I have not found any answer and in the meantime worked out a solution myself, here is the basic idea: if the database column type is Timestamp, you must use java.sql.Timestamp (not java.util.Date) for the field in the POJO. Spark then maps that field to its native timestamp type, instead of reflecting java.util.Date into a struct of its bean properties, which the JDBC writer cannot translate.
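Concretely, the fix amounts to changing the POJO's field types. Here is a sketch of the reworked class from the question, with java.sql.Timestamp in place of java.util.Date:

```java
import java.sql.Timestamp;

// Reworked POJO: java.sql.Timestamp has a direct Spark SQL mapping, so
// createDataFrame no longer turns the date fields into a struct of int
// components (date, day, hours, ...) that has no JDBC type.
class StowageTable {
    private long localId;
    private Timestamp effectiveDate;
    private Timestamp activityDate;
    private Timestamp inDate;
    private Timestamp outDate;

    public long getLocalId() { return localId; }
    public void setLocalId(long v) { localId = v; }
    public Timestamp getEffectiveDate() { return effectiveDate; }
    public void setEffectiveDate(Timestamp v) { effectiveDate = v; }
    public Timestamp getActivityDate() { return activityDate; }
    public void setActivityDate(Timestamp v) { activityDate = v; }
    public Timestamp getInDate() { return inDate; }
    public void setInDate(Timestamp v) { inDate = v; }
    public Timestamp getOutDate() { return outDate; }
    public void setOutDate(Timestamp v) { outDate = v; }
}
```

No other change to the CSV-reading or write code should be needed, since Timestamp extends java.util.Date and any existing parsing code that produces Timestamp values continues to compile.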