I created a DataFrame in Spark from a PostgreSQL table that contains geometries, like this:
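val df = sparkSession.read.format("jdbc")
  .option("url", pgInfo)
  // "dbtable" takes a table name or a parenthesized subquery with an alias, not a bare SELECT
  .option("dbtable", "(SELECT * FROM tableName) AS t")
  .option("user", "user")
  .option("password", "password")
  // "header" is a CSV reader option; the JDBC source does not define it
  .option("header", "true")
  .option("driver", "org.postgresql.Driver")
  .load()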
Then the DataFrame is written back to a new table:
df.filter(row => ~~ ).write.mode(SaveMode.Overwrite).jdbc("pg_info1", "new_table", "pg_info2")
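For reference, `DataFrameWriter.jdbc` takes the JDBC URL, the table name, and a `java.util.Properties` object as its third argument. A minimal sketch of building that object, assuming the same placeholder credentials as in the read step; `WriteProps` and `pgWriteProps` are hypothetical names used only for illustration:

```scala
import java.util.Properties

object WriteProps {
  // Hypothetical helper: collects the connection settings that are
  // passed as the third argument of DataFrameWriter.jdbc(url, table, props).
  def pgWriteProps(user: String, password: String): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props.setProperty("driver", "org.postgresql.Driver")
    props
  }
}

// Usage sketch, assuming an existing DataFrame `df` and a JDBC URL string `pgInfo`:
// df.write.mode(SaveMode.Overwrite)
//   .jdbc(pgInfo, "new_table", WriteProps.pgWriteProps("user", "password"))
```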
When I apply this to a table containing polygons, it works fine. However, when I apply it to a table containing linestring, the following error occurs:
Caused by: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO linestring_table ("geom","val") VALUES ('0102000020110F00000C00000086584F5529496A410F80812D274D4541E641D63829496A416AED7E75194D4541AE49ED1F29496A41178BA0CD0A4D45410936C4C329496A41869567EF064D454104A6EEA92F496A410A642E67E84C4541F0375FE330496A41DE709EC8DB4C4541232DB89F30496A412D1C8D53CF4C45417707220E31496A4108D93FCDCC4C45412DF9BD7634496A412068C137C24C4541F9DC327F37496A4184EA9324B04C4541E107F7BD3C496A4199B6C256944C4541ACC0E3133E496A4100F6692C8D4C4541','1234') was aborted: ERROR: column "geom" is of type geometry but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 73 Call getNextException to see other errors in the batch.
at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:154)
at org.postgresql.core.ResultHandlerDelegate.handleError(ResultHandlerDelegate.java:50)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2269)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:511)
at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:851)
at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:874)
at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1569)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:659)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:821)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:821)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: column "geom" is of type geometry but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 73
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2533)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2268)
... 17 more
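The message suggests that the `geom` value is sent to PostgreSQL as a `character varying` parameter, which the server will not cast implicitly to `geometry`. One workaround that may apply here, untested against this setup, is the PgJDBC connection parameter `stringtype=unspecified`, which makes the driver send string parameters untyped so the server can infer the column type. A minimal sketch of appending it to the write URL; `PgUrl` and `withUnspecifiedStringType` are hypothetical names, and the URL in the usage note is a placeholder:

```scala
object PgUrl {
  // Hypothetical helper: appends PgJDBC's `stringtype=unspecified` parameter
  // to a JDBC URL, unless a `stringtype` setting is already present.
  def withUnspecifiedStringType(url: String): String = {
    if (url.contains("stringtype=")) url
    else {
      // Use "&" when the URL already carries query parameters, "?" otherwise.
      val sep = if (url.contains("?")) "&" else "?"
      s"$url${sep}stringtype=unspecified"
    }
  }
}

// Usage sketch, assuming an existing DataFrame `df` and a java.util.Properties `props`
// holding user, password and driver (host and database are placeholders):
// val url = PgUrl.withUnspecifiedStringType("jdbc:postgresql://host:5432/db")
// df.write.mode(SaveMode.Append).jdbc(url, "linestring_table", props)
```

Whether this is sufficient depends on how the target table is created; if Spark recreates the table on overwrite, the `geom` column may end up as a text column rather than `geometry`.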