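I am trying to build a Python script that runs a COPY command over a database connection while accepting parameters.
Database: Amazon Redshift, connected with the psycopg2 package.
The COPY command pulls data from Amazon S3.
If I hard-code every value the command works fine, but as soon as I add a parameter the query fails.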
access_key = 'my_amazon_acccess_key'
secret_key = 'my_amazon_secret_key'
bucketname = 'my_amazon_s3_bucket_name'
filename = 'my_gzipped_file.gz'
The code I am trying to parameterize:
cur.execute("""
COPY Schema.tablename FROM 's3://%s/%s' credentials 'aws_access_key_id=%s;aws_secret_access_key=%s' NULL 'NULL' gzip delimiter =',';""",
(bucketname, filename, access_key, secret_key))
ProgrammingError: syntax error at or near "my_amazon_s3_bucket_name"
LINE 2: COPY Schema.tablename FROM 's3://'my_amazon_s3_bucket_name'/'my_gzipped_file.gz'...
cur.execute("""
COPY Schema.tablename FROM 's3://?/?' credentials 'aws_access_key_id=?;aws_secret_access_key=?' NULL 'NULL' gzip delimiter =',';""",
(bucketname, filename, access_key, secret_key))
cur.execute("""
COPY Schema.tablename FROM 's3://$1/$2' credentials 'aws_access_key_id=$3;aws_secret_access_key=$4' NULL 'NULL' gzip delimiter =',';""",
(bucketname, filename, access_key, secret_key))
InternalError Traceback (most recent call last)
<> in <module>()
1 cur.execute("""
2 COPY Schema.tablename FROM 's3://?/?' credentials ' aws_access_key_id=?;aws_secret_access_key=?' NULL 'NULL' gzip delimiter ',';""",
----> 3 (bucketname, filename, access_key, secret_key))
InternalError: Invalid credentials. Must be of the format: credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>[;token=<temporary-session-token>]'
DETAIL:
-----------------------------------------------
error: Invalid credentials. Must be of the format: credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>[;token=<temporary-session-token>]'
code: 8001
context:
query: 95221
location: aws_credentials_parser.cpp:86
process: padbmaster [pid=326]
-----------------------------------------------
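I don't want to hard-code these parameters, but I can't find the right way to handle this. Can it be done?
Answer 0: (score: 0)
You cannot embed placeholders inside (SQL) string literals; you need to use string concatenation in the database engine (the SQL string concatenation operator is ||), as opposed to what you are doing now, which assumes that expansion happens before the content is passed to the database engine for parsing.
That is, your query string should contain:
's3://' || %s || '/' || %s
...that is, the prefix s3://, then the string from a parameter, then /, then another string from a parameter, and so on. When you put %s inside a SQL string literal, it is treated as literal text, not as a placeholder.
In context (and using a different, arguably more readable, available placeholder form), this might look like: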
cur.execute("""
COPY Schema.tablename FROM 's3://' || %(bucketname)s || '/' || %(filename)s
credentials 'aws_access_key_id=' || %(access_key)s ||
';aws_secret_access_key=' || %(secret_key)s
NULL 'NULL' gzip delimiter =',';""",
{'bucketname': bucketname, 'filename': filename, 'access_key': access_key, 'secret_key': secret_key})
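To see what each form of the query turns into, here is a small sketch that mimics client-side substitution without needing a database. The helpers `quote_literal` and `render` are hypothetical, simplified stand-ins for what psycopg2 does when it adapts a `str` parameter; they are not part of its API. With psycopg2 the `%s` inside the quotes is still replaced, but by an already-quoted literal, which is consistent with the first error in the question. Whether Redshift's COPY accepts a `||` expression after FROM is not confirmed by this thread, so treat the second form as something to test.

```python
def quote_literal(value):
    """Simplified stand-in (assumption) for rendering a str as a SQL literal."""
    return "'" + value.replace("'", "''") + "'"


def render(template, params):
    """Mimic client-side substitution of positional %s placeholders."""
    return template % tuple(quote_literal(p) for p in params)


params = ("my_amazon_s3_bucket_name", "my_gzipped_file.gz")

# Placeholder inside the SQL literal: the quotes end up nested.
inside = render("FROM 's3://%s/%s'", params)

# Placeholder outside the literal, joined with the SQL || operator.
outside = render("FROM 's3://' || %s || '/' || %s", params)

print(inside)
print(outside)
```

The first line printed has the same shape as the `LINE 2:` excerpt in the ProgrammingError, while the second is a syntactically balanced expression.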
Answer 1: (score: 0)
You have to use AsIs:
from psycopg2.extensions import AsIs
access_key = 'my_amazon_acccess_key'
secret_key = 'my_amazon_secret_key'
bucketname = 'my_amazon_s3_bucket_name'
filename = 'my_gzipped_file.gz'
# mogrify returns bytes on Python 3, so decode before printing
print(cur.mogrify('''
COPY Schema.tablename
FROM 's3://%s/%s'
credentials 'aws_access_key_id=%s;aws_secret_access_key=%s'
NULL 'NULL' gzip delimiter =','
;''',
(AsIs(bucketname), AsIs(filename), AsIs(access_key), AsIs(secret_key))
).decode())
Output:
COPY Schema.tablename
FROM 's3://my_amazon_s3_bucket_name/my_gzipped_file.gz'
credentials 'aws_access_key_id=my_amazon_acccess_key;aws_secret_access_key=my_amazon_secret_key'
NULL 'NULL' gzip delimiter =','
;
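One thing to keep in mind: AsIs pastes the value into the statement without any quoting or escaping, so a value containing a single quote would break (or alter) the query. A minimal sketch of checking the values before they are pasted in follows. The `checked` and `build_copy` helpers and the allow-list pattern are assumptions made for illustration; adjust the pattern to whatever your bucket names, keys and file names can actually contain.

```python
import re

# Hypothetical allow-list: letters, digits and a few punctuation characters.
_SAFE = re.compile(r"[A-Za-z0-9._/+=-]+")


def checked(value):
    """Return value unchanged if it only contains allow-listed characters."""
    if not _SAFE.fullmatch(value):
        raise ValueError("unsafe value for SQL literal: %r" % (value,))
    return value


def build_copy(bucketname, filename, access_key, secret_key):
    """Assemble the COPY statement on the client from validated pieces."""
    return (
        "COPY Schema.tablename "
        "FROM 's3://{}/{}' "
        "credentials 'aws_access_key_id={};aws_secret_access_key={}' "
        "NULL 'NULL' gzip delimiter ','"
    ).format(*map(checked, (bucketname, filename, access_key, secret_key)))
```

The resulting string can then be passed to `cur.execute(...)` with no parameters, or the checked values can be wrapped in AsIs as above.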
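Now the problem is that COPY is a server-side command. It will be run by the user running the server, usually postgres, and that user needs read permission on the file. To use client-side permissions, use either psql \copy or psycopg2 copy_from or copy_expert.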
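For the client-side route, a rough sketch with copy_expert might look like the following. It assumes a plain PostgreSQL server that accepts `COPY ... FROM STDIN`; the thread does not confirm that this applies to Redshift, where the data is read from S3. The function name `copy_rows_from_client` is hypothetical, and `table` is formatted into the SQL unescaped, so it must be a trusted identifier.

```python
import csv
import io


def copy_rows_from_client(cur, table, rows):
    """Stream rows from the client to the server with copy_expert.

    cur   -- an open psycopg2 cursor (assumed)
    table -- a trusted table name; it is formatted into the SQL unescaped
    rows  -- an iterable of sequences of values
    """
    buf = io.StringIO()
    # Let the csv module handle quoting of commas and quotes in the data.
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    cur.copy_expert("COPY {} FROM STDIN WITH CSV".format(table), buf)
```

The file is read by the client process here, so the server user does not need any permission on it.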