Apache Spark - Dataset write method does not work against a PostgreSQL database

Asked: 2018-02-09 12:31:54

Tags: java postgresql apache-spark apache-spark-sql

I need to write my Spark Dataset to an existing PostgreSQL table. I am using the Dataset write method with Append mode, but I still get an exception saying the table already exists. That is strange, since I explicitly specified Append mode. When I point the same code at SQL Server / Oracle instead, it works without any exception.

Spark version: tried on both 2.1.0 and 2.2.1. PostgreSQL: 9.5.6.
JDBC driver: tried both an old driver (9.4-1201-jdbc41) and the latest (2.0.0).

Database properties:

destinationProps.put("driver", "org.postgresql.Driver");
destinationProps.put("url", "jdbc:postgresql://127.0.0.1:30001/dbmig");
destinationProps.put("user", "dbmig");
destinationProps.put("password", "dbmig");

Dataset write code:

valueAnalysisDataset.write()
    .mode(SaveMode.Append)
    .jdbc(destinationDbMap.get("url"), "dqvalue", destinationdbProperties);

Exception:

Exception in thread "main" org.postgresql.util.PSQLException: ERROR: relation "dqvalue" already exists
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301)
    at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264)
    at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:806)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:95)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:469)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:460)
    at com.ads.dqam.action.impl.PostgresValueAnalysis.persistValueAnalysis(PostgresValueAnalysis.java:25)
    at com.ads.dqam.action.AbstractValueAnalysis.persistAnalysis(AbstractValueAnalysis.java:81)
    at com.ads.dqam.Analysis.doAnalysis(Analysis.java:32)
    at com.ads.dqam.Client.main(Client.java:71)

(Screenshot: console output for the exception and the PostgreSQL query execution)
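The stack trace shows the write failing inside `JdbcUtils.createTable`, i.e. Spark decided the table does not exist and tried to create it, which then collided with the existing `dqvalue` relation. A small standalone probe can show what Spark's existence check sees through this connection. This is a diagnostic sketch, not a fix: the class name `AppendProbe` is made up, and the probe SQL mimics the style of Spark's JDBC existence check (the exact query depends on the Spark version and JDBC dialect).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {
    public static void main(String[] args) {
        String table = "dqvalue";
        // A probe in the style of Spark's JDBC existence check. If this query
        // fails on the target connection, Spark concludes the table is absent
        // and issues CREATE TABLE, producing "relation ... already exists".
        String probe = "SELECT 1 FROM " + table + " LIMIT 1";
        System.out.println("probe SQL: " + probe);

        // Connection details taken from the question; requires the PostgreSQL
        // JDBC driver on the classpath and a reachable database.
        String url = "jdbc:postgresql://127.0.0.1:30001/dbmig";
        try (Connection conn = DriverManager.getConnection(url, "dbmig", "dbmig");
             Statement st = conn.createStatement()) {
            st.executeQuery(probe);
            System.out.println("probe succeeded: Spark should append");
        } catch (SQLException e) {
            // If this fails while the table clearly exists (e.g. due to a
            // proxy/pooler on port 30001 or a search_path mismatch), that
            // would explain the CREATE TABLE attempt.
            System.out.println("probe failed: " + e.getMessage());
        }
    }
}
```

If the probe fails even though `\dt` in psql shows the table, the existence check (not the append itself) is the thing to investigate.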

0 Answers