I am trying to use the PyGreSQL package in AWS Glue from a Python job.
I uploaded the wheel file from here to an S3 bucket:
https://pypi.org/project/PyGreSQL/#files
(the Python 3.6 build for x64)
Then, in my job, I use:
import pg
With this configuration, I get the following error when running the job:
WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
2020-08-08T20:22:47.845+02:00
Traceback (most recent call last):
  File "/tmp/runscript.py", line 123, in <module>
    runpy.run_path(temp_file_path, run_name='__main__')
  File "/usr/local/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/local/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/glue-python-scripts-vbox2q05/postloading3.py", line 7, in <module>
  File "/glue/lib/installation/pg.py", line 1436, in <module>
    set_query_helpers(_dictiter, _namediter, _namednext, _scalariter)
NameError: name 'set_query_helpers' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/runscript.py", line 142, in <module>
    raise e_type(e_value).with_traceback(new_stack)
  File "/tmp/glue-python-scripts-vbox2q05/postloading3.py", line 7, in <module>
  File "/glue/lib/installation/pg.py", line 1436, in <module>
    set_query_helpers(_dictiter, _namediter, _namednext, _scalariter)
NameError: name 'set_query_helpers' is not defined
Do you know whether I am missing a dependency that also needs to be uploaded? According to AWS, PyGreSQL is compatible with Glue.
Answer 0 (score: 0)
It worked after adding the following code:
select top 33 percent
*
from Sends_And_Opens_BySubscriber
order by
Open_Percentage desc, Last_Open_Date desc, SignUp_Date desc
Then:
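import pg

def get_connection(host):
    # Build a libpq-style connection string; 5439 is the default Redshift port.
    rs_conn_string = "host=%s port=%s dbname=%s user=%s password=%s" % (host, 5439, "dev", "awsuser", "sfg.")
    # PyGreSQL accepts a full connection string through the dbname argument.
    rs_conn = pg.connect(dbname=rs_conn_string)
    # The timeout is given in milliseconds (20 minutes here).
    rs_conn.query("set statement_timeout = 1200000")
    return rs_conn

############################MAIN###################################################
con1 = get_connection("aredshift-c1....")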
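The connection string above is assembled with plain %s formatting, which can break if a value (typically the password) contains a space or a quote. Below is a minimal sketch of a safer way to build it, following the libpq rule that values may be wrapped in single quotes with quotes and backslashes escaped by a backslash. The helper names quote_conninfo_value and build_conn_string, and the sample values in the usage note, are hypothetical and not part of PyGreSQL.

```python
def quote_conninfo_value(value):
    """Quote one value for a libpq keyword=value connection string."""
    text = str(value)
    # Escape backslashes first, then single quotes, as libpq expects.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_conn_string(**params):
    """Join keyword=value pairs into one space-separated connection string."""
    return " ".join(
        "%s=%s" % (key, quote_conninfo_value(val)) for key, val in params.items()
    )
```

The result can be passed the same way as in the answer, for example pg.connect(dbname=build_conn_string(host="example-host", port=5439, dbname="dev", user="awsuser", password="...")), where every value shown is a placeholder.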
Consulting the AWS Glue PDF guide helped me find a simple way to make it work.
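For anyone else hitting the same NameError, here is a small diagnostic sketch that might help narrow it down. It rests on an assumption that this thread does not confirm: that pg.py and the compiled _pg extension are being loaded from two different installs, so pg.py expects a function that _pg does not provide. The helper names module_origin and has_attribute are hypothetical.

```python
import importlib
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return None if spec is None else spec.origin


def has_attribute(module_name, attr):
    """Return True if the module imports cleanly and exposes attr."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Different parent directories for pg and _pg would suggest mixed installs.
    for name in ("pg", "_pg"):
        print(name, "->", module_origin(name))
    print("_pg.set_query_helpers present:",
          has_attribute("_pg", "set_query_helpers"))
```

If the two paths point to different directories, or the last line prints False, that would be consistent with the uploaded wheel conflicting with a copy already present in the job environment, though other causes are possible.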