Connecting to AWS Postgres from Databricks

Time: 2019-05-02 15:35:00

Tags: postgresql amazon-web-services pyspark azure-databricks

I am having trouble connecting to AWS Postgres from Azure Databricks (I am new to Azure). Below is the code I use to connect to Postgres, but for some reason it raises: Error: org.postgresql.util.PSQLException: The connection attempt failed.

Code:

# username and password are assumed to be defined earlier in the notebook.
# Note the closing quote must come before .format(), otherwise the
# placeholders are never substituted:
jdbc_url = "jdbc:postgresql://postgreshost:5432/db?user={}&password={}&ssl=true".format(username, password)

# pushdown_query is defined here but never used below; spark.read.jdbc reads the whole "test" table
pushdown_query = "(select * from test limit 10) emp_alias"
df = spark.read.jdbc(url=jdbc_url, table="test")
display(df)
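As an aside, spark.read.jdbc also accepts a properties dict, which keeps credentials out of the URL and lets the subquery alias above be passed as the table argument. A minimal sketch, assuming the same placeholder host, database, and notebook variables as above:

connection_properties = {
    "user": username,      # assumed notebook variables, as above
    "password": password,
    "ssl": "true",
}
jdbc_url = "jdbc:postgresql://postgreshost:5432/db"

# Passing the "(select ...) alias" string as the table pushes the LIMIT down to Postgres
df = spark.read.jdbc(url=jdbc_url, table=pushdown_query, properties=connection_properties)
display(df)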

Second approach:

df = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:postgresql://postgreshost:5432/db?user=user&password=password") \
    .option("dbtable", "test") \
    .load()
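Note that this second form drops the ssl=true flag the first attempt carried. If the AWS instance enforces SSL, it may need to be re-added; a sketch under that assumption, keeping the question's placeholder credentials:

df = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:postgresql://postgreshost:5432/db") \
    .option("dbtable", "test") \
    .option("user", "user") \
    .option("password", "password") \
    .option("ssl", "true") \
    .load()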

Am I missing something? Or are there steps that should be performed before running this?

Log (Scala):

org.postgresql.util.PSQLException: The connection attempt failed.
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:275)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:194)
    at org.postgresql.Driver.makeConnection(Driver.java:450)
    at org.postgresql.Driver.connect(Driver.java:252)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:64)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:55)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:210)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:346)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:298)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:279)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:202)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3334328075204474:8)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3334328075204474:51)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw$$iw.<init>(command-3334328075204474:53)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw.<init>(command-3334328075204474:55)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw.<init>(command-3334328075204474:57)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw.<init>(command-3334328075204474:59)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read.<init>(command-3334328075204474:61)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$.<init>(command-3334328075204474:65)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$.<clinit>(command-3334328075204474)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$eval$.$print$lzycompute(<notebook>:7)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$eval$.$print(<notebook>:6)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
    at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:199)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply$mcV$sp(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:587)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:542)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$7.apply(DriverLocal.scala:324)
    at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$7.apply(DriverLocal.scala:304)
    at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:235)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:230)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:45)
    at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:268)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:45)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:304)
    at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:589)
    at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:589)
    at scala.util.Try$.apply(Try.scala:192)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:584)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:475)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:542)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:381)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:328)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:215)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.postgresql.core.PGStream.<init>(PGStream.java:68)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:144)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:194)
    at org.postgresql.Driver.makeConnection(Driver.java:450)
    at org.postgresql.Driver.connect(Driver.java:252)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:64)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:55)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:210)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:346)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:298)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:279)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:202)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-3334328075204474:8)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-3334328075204474:51)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw$$iw.<init>(command-3334328075204474:53)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw$$iw.<init>(command-3334328075204474:55)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw$$iw.<init>(command-3334328075204474:57)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$$iw.<init>(command-3334328075204474:59)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read.<init>(command-3334328075204474:61)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$.<init>(command-3334328075204474:65)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$read$.<clinit>(command-3334328075204474)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$eval$.$print$lzycompute(<notebook>:7)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$eval$.$print(<notebook>:6)
    at lined9bdaa60f31e4f44a370d2ec7ae9793627.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
    at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:199)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply$mcV$sp(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal$$anonfun$repl$1.apply(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:587)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:542)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:189)
    at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$7.apply(DriverLocal.scala:324)
    at com.databricks.backend.daemon.driver.DriverLocal$$anonfun$execute$7.apply(DriverLocal.scala:304)
    at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:235)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:230)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:45)
    at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:268)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:45)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:304)
    at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:589)
    at com.databricks.backend.daemon.driver.DriverWrapper$$anonfun$tryExecutingCommand$2.apply(DriverWrapper.scala:589)
    at scala.util.Try$.apply(Try.scala:192)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:584)
    at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:475)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:542)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:381)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:328)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:215)
    at java.lang.Thread.run(Thread.java:748)

2 Answers:

Answer 0 (score: 0)

I have never supplied the username and password in the connection URL, so I am not sure whether that works. They are usually passed as separate options. Check the Spark docs; it is specified like this (in Scala):

val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .load()

Reference: https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html
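For the PySpark reader used in the question, a hypothetical mirror of that Scala snippet, keeping the docs' placeholder names:

jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:postgresql:dbserver") \
    .option("dbtable", "schema.tablename") \
    .option("user", "username") \
    .option("password", "password") \
    .load()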

Answer 1 (score: 0)

This turned out to be an internal network issue on the company side; it was unrelated to the code.
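That matches the underlying "Caused by: java.net.SocketTimeoutException: connect timed out" in the log, which points at network reachability (security groups, VPC peering, firewalls) rather than the Spark code. A quick way to confirm from a notebook cell on the failing cluster is a plain TCP check; a minimal sketch using the hypothetical host and port from the question:

import socket

host, port = "postgreshost", 5432  # placeholders from the question
try:
    # If this also times out, the problem is network routing/firewalling, not JDBC
    with socket.create_connection((host, port), timeout=5):
        print("TCP connection to {}:{} succeeded".format(host, port))
except OSError as exc:
    print("TCP connection to {}:{} failed: {}".format(host, port, exc))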