I'm new to Redshift, so I need some help.
I'm trying to write a PySpark DataFrame to the database using the JDBC library (not the Databricks one, since that library doesn't work with Scala 2.12), but I'm getting a permission error.
Code:
df.write.format('jdbc').options(
    url='jdbc:redshift://server:5439/db',
    driver='com.amazon.redshift.jdbc42.Driver',
    dbtable=new_table,
    user='user',
    password='pass').mode('append').save()
Error:
21/02/24 08:42:42 ERROR TaskSetManager: Task 0 in stage 1.0 failed 1 times; aborting job
Traceback (most recent call last):
File "redshift_spark.py", line 77, in <module>
.mode('append').save()
File "\venv\lib\site-packages\pyspark\sql\readwriter.py", line 825, in save
self._jwrite.save()
File "C:\apps\spark-3.0.1-bin-hadoop3.2\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1305, in __call__
File "\venv\lib\site-packages\pyspark\sql\utils.py", line 128, in deco
return f(*a, **kw)
File "C:\apps\spark-3.0.1-bin-hadoop3.2\python\lib\py4j-0.10.9-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o51.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, ip-192-168-1-132.eu-west-1.compute.internal, executor driver): java.sql.SQLException: [Amazon](500310) Invalid operation: The session is read-only;
at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(Unknown Source)
It seems as though I don't have write permission, but where do I need to have that permission? I tried accessing the database with the Postgres library psycopg2
and the Postgres JDBC driver org.postgresql.Driver,
and I had no problems.
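For reference, here is a minimal sketch of how the same append would look through the PostgreSQL JDBC driver mentioned above. The host, database, table, and credentials are placeholders copied from the question, not real values; the write itself is shown as a comment since it needs a live SparkSession and database.

```python
# Hypothetical sketch: the options for the same append, expressed against
# org.postgresql.Driver (the driver the question says works without issues).
# All values below are placeholders from the question, not real credentials.
jdbc_options = {
    'url': 'jdbc:postgresql://server:5439/db',   # Redshift listens on 5439
    'driver': 'org.postgresql.Driver',
    'dbtable': 'new_table',
    'user': 'user',
    'password': 'pass',
}

# With a real SparkSession and DataFrame `df`, the write would be:
# df.write.format('jdbc').options(**jdbc_options).mode('append').save()
```

Only the `driver` and the URL scheme (`jdbc:postgresql://` vs `jdbc:redshift://`) change relative to the failing code; everything else is identical.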