How do I connect to a cluster in Amazon Redshift using SQLAlchemy?

Date: 2016-01-26 00:17:34

Tags: python postgresql amazon-web-services sqlalchemy amazon-redshift

In Amazon Redshift's Getting Started Guide, it's mentioned that you can utilize SQL client tools that are compatible with PostgreSQL to connect to your Amazon Redshift cluster.

In the tutorial they use the SQL Workbench/J client, but I'd like to use Python (specifically SQLAlchemy). I found a related question, but the issue is that it doesn't go into the details or the Python script of connecting to a Redshift cluster.

I've already been able to connect to the cluster via SQL Workbench/J, since I have the JDBC URL as well as my username and password, but I'm not sure how to connect with SQLAlchemy.

Based on this documentation, I tried the following:

from sqlalchemy import create_engine
engine = create_engine('jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')

Error:

Could not parse rfc1738 URL from string 'jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy'

5 Answers:

Answer 0 (score: 2)

I don't think SQLAlchemy "natively" knows about Redshift. You need to change the JDBC "URL" string to use the postgres scheme.

Alternatively, you may want to try sqlalchemy-redshift, following the instructions they provide.
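A minimal sketch of that second approach, assuming sqlalchemy-redshift and psycopg2 are installed; the host and database are the ones from the question, and the credentials are placeholders:

from sqlalchemy import create_engine

# Sketch only: the redshift+psycopg2 dialect is registered by sqlalchemy-redshift
# (pip install sqlalchemy-redshift psycopg2). USER and PASSWORD are placeholders.
engine = create_engine(
    'redshift+psycopg2://USER:PASSWORD@shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy'
)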

Answer 1 (score: 2)

I ran into exactly the same problem, and then I remembered to include my Redshift credentials:

eng = create_engine('postgres://[LOGIN]:[PWORD]@shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')
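As a quick sanity check of the engine above (the query is just an illustration):

# Hypothetical connectivity test for the engine above
conn = eng.connect()
print(conn.execute('select 1').scalar())
conn.close()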

Answer 2 (score: 0)

sqlalchemy-redshift worked for me, but only after a few days of research. Packages (python3.4):

SQLAlchemy==1.0.14
sqlalchemy-redshift==0.5.0
psycopg2==2.6.2

First of all, I checked that my query worked in Workbench (http://www.sql-workbench.net), and then I made it work in sqlalchemy (this https://stackoverflow.com/a/33438115/2837890 helps to know that auto_commit or session.commit() must be set):

from sqlalchemy import create_engine, text

# Build the connection URL from a config dict (config is assumed to be
# loaded elsewhere, e.g. from an .ini file).
db_credentials = (
    'redshift+psycopg2://{p[redshift_user]}:{p[redshift_password]}'
    '@{p[redshift_host]}:{p[redshift_port]}/{p[redshift_database]}'
    .format(p=config['Amazon_Redshift_parameters']))
engine = create_engine(db_credentials, connect_args={'sslmode': 'prefer'})
connection = engine.connect()

# COPY must run with autocommit, otherwise the load is rolled back.
result = connection.execute(text(
    "COPY assets FROM 's3://xx/xx/hello.csv' WITH CREDENTIALS "
    "'aws_access_key_id=xxx_id;aws_secret_access_key=xxx'"
    " FORMAT csv DELIMITER ',' IGNOREHEADER 1 ENCODING UTF8;"
).execution_options(autocommit=True))

result = connection.execute("select * from assets;")
print(result, type(result))
print(result.rowcount)
connection.close()

After that, I also got sqlalchemy_redshift's CopyCommand to work. It may be a bad way; it looks a bit tricky:

import sqlalchemy as sa
# In sqlalchemy-redshift 0.5.0, CopyCommand and RedshiftDialect live in the
# dialect module (newer versions move CopyCommand to sqlalchemy_redshift.commands).
from sqlalchemy_redshift import dialect as dialect_rs

# Table object for the target table (name assumed to be 'assets'; adjust to yours).
tbl = sa.Table('assets', sa.MetaData())
copy = dialect_rs.CopyCommand(
    tbl,
    data_location='s3://xx/xx/hello.csv',
    access_key_id=access_key_id,
    secret_access_key=secret_access_key,
    truncate_columns=True,
    delimiter=',',
    format='CSV',
    ignore_header=1,
    # empty_as_null=True,
    # blanks_as_null=True,
)

# Inspect the COPY statement that CopyCommand generates.
print(str(copy.compile(dialect=dialect_rs.RedshiftDialect(),
                       compile_kwargs={'literal_binds': True})))
print(dir(copy))
connection = engine.connect()  # engine from the previous snippet
connection.execute(copy.execution_options(autocommit=True))
connection.close()

So we build and execute the query with sqlalchemy, except that CopyCommand compiles the query itself. I don't see much benefit :(.

Answer 3 (score: 0)

The following works for me with Databricks, for all kinds of SQL:

import sqlalchemy as SA
import psycopg2  # driver used by the redshift+psycopg2 dialect

host = 'your_host_url'
username = 'your_user'
password = 'your_passw'
db = 'your_db_name'
port = 5439

# Assemble the redshift+psycopg2 connection URL from the parts above.
url = "{d}+{driver}://{u}:{p}@{h}:{port}/{db}".\
        format(d="redshift",
               driver='psycopg2',
               u=username,
               p=password,
               h=host,
               port=port,
               db=db)
engine = SA.create_engine(url)
cnn = engine.connect()

strSQL = "your_SQL ..."
try:
    cnn.execute(strSQL)
except:
    raise
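A small usage example for reading results back through the connection above (the table name is a placeholder):

# Hypothetical read-back; 'your_table' is a placeholder
rows = cnn.execute("select * from your_table limit 5").fetchall()
for row in rows:
    print(row)
cnn.close()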

Answer 4 (score: 0)

import sqlalchemy as db
engine = db.create_engine('postgres://username:password@url:5439/db_name')

This worked for me.
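Note that on SQLAlchemy 1.4 and later the postgres:// scheme name was removed, so the equivalent URL there would be:

import sqlalchemy as db
# Same connection, but with the postgresql:// scheme required by SQLAlchemy 1.4+
engine = db.create_engine('postgresql://username:password@url:5439/db_name')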