I am writing a DataFrame to Redshift using the Redshift JDBC driver with the code below:
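import org.apache.spark.sql.{DataFrame, SaveMode}

// `property` is assumed to be a java.util.Properties (JDBC user, password, driver) defined elsewhere.
def writeUsingJDBC(dataFrame: DataFrame, url: String, tableName: String): Unit = {
  try {
    // Reduce to at most 100 partitions, then append using JDBC batches of 100,000 rows.
    dataFrame.coalesce(100).write.mode(SaveMode.Append)
      .option("batchsize", 100000)
      .jdbc(url, tableName, property)
  } catch {
    case ex: Exception =>
      println("Error in writing data into redshift using JDBC driver: " + ex.getMessage)
  }
}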
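However, the code above does not write the data properly: writing 100,000 records takes 138 seconds. My Redshift cluster is M3XLarge, with 1 master and 3 core nodes.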
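For reference, 100,000 records in 138 seconds works out to roughly 725 rows per second. Below is a minimal sketch of how that figure could be computed and how a write could be timed. The object `WriteThroughput` and its methods `rowsPerSecond` and `timed` are hypothetical names introduced for illustration; they are not part of Spark or of the code above.

```scala
// Hypothetical helper for illustration only; not part of the original code.
object WriteThroughput {

  /** Rows written per second for a given row count and elapsed time. */
  def rowsPerSecond(rows: Long, elapsedSeconds: Double): Double = {
    require(elapsedSeconds > 0, "elapsedSeconds must be positive")
    rows / elapsedSeconds
  }

  /** Runs `block` and returns its result together with the elapsed wall-clock seconds. */
  def timed[A](block: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = block
    (result, (System.nanoTime() - start) / 1e9)
  }

  def main(args: Array[String]): Unit = {
    // Figures reported above: 100,000 records in 138 seconds.
    val rate = rowsPerSecond(100000L, 138.0)
    println(f"$rate%.1f rows/s")
  }
}
```

In a Spark job, the write itself could be wrapped as `WriteThroughput.timed { writeUsingJDBC(df, url, tableName) }`. Because `writeUsingJDBC` catches and logs exceptions, a failed write would still return a timing, so the log output should be checked alongside the measured duration.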