Using the Python Connector, I can query Snowflake:
import snowflake.connector
# Connect to Snowflake, authenticating through Okta
ctx = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    authenticator='https://XXXX.okta.com',
)
ctx.cursor().execute('USE warehouse MY_WH')
ctx.cursor().execute('USE MYDB.MYSCHEMA')
query = '''
select * from MYDB.MYSCHEMA.MYTABLE
LIMIT 10;
'''
cur = ctx.cursor().execute(query)
The result is a snowflake.connector.cursor.SnowflakeCursor. How can I convert it to a Pandas DataFrame?
Answer 0 (score: 3)
You can use DataFrame.from_records() on the connector's cursor, or pandas.read_sql() together with snowflake-sqlalchemy. The snowflake-sqlalchemy option has the simpler API.
pd.DataFrame.from_records(iter(cur), columns=[x[0] for x in cur.description])
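This returns a DataFrame with the correct column names taken from the SQL result. iter(cur) turns the cursor into an iterator over its rows, and cur.description gives the name and type of each column, so x[0] picks out the column name.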
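To see what those two pieces hand over without needing a Snowflake account, here is a minimal sketch that uses the standard library's sqlite3 as a stand-in DB-API cursor. That substitution is an assumption for illustration only: the table and its contents are made up, and a Snowflake cursor's description entries carry more metadata, but the first element of each entry is the column name in both cases.

```python
import sqlite3

# Stand-in database (assumption: sqlite3 replaces Snowflake for illustration)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, "a"), (2, "b")])

cur = conn.execute("SELECT id, name FROM mytable ORDER BY id")

# The first element of each description entry is the column name
columns = [x[0] for x in cur.description]

# Iterating over the cursor yields one tuple per row
rows = list(iter(cur))

# The same pairing of names and rows that DataFrame.from_records performs
records = [dict(zip(columns, row)) for row in rows]
print(columns)
print(records)
```

Passing rows and columns to pd.DataFrame.from_records(rows, columns=columns) should then produce the same frame as the one-liner above.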
So the complete code would be:
import snowflake.connector
import pandas as pd
# Connect to Snowflake, authenticating through Okta
ctx = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    authenticator='https://XXXX.okta.com',
)
ctx.cursor().execute('USE warehouse MY_WH')
ctx.cursor().execute('USE MYDB.MYSCHEMA')
query = '''
select * from MYDB.MYSCHEMA.MYTABLE
LIMIT 10;
'''
cur = ctx.cursor().execute(query)
df = pd.DataFrame.from_records(iter(cur), columns=[x[0] for x in cur.description])
If you prefer to use pandas.read_sql, you can do this:
import pandas as pd
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
url = URL(
    account='xxxx',
    user='xxxx',
    password='xxxx',
    database='xxx',
    schema='xxxx',
    warehouse='xxx',
    role='xxxxx',
    authenticator='https://xxxxx.okta.com',
)
engine = create_engine(url)
connection = engine.connect()
query = '''
select * from MYDB.MYSCHEMA.MYTABLE
LIMIT 10;
'''
df = pd.read_sql(query, connection)
Answer 1 (score: 3)
There is now a method, .fetch_pandas_all(), for this, so SQLAlchemy is no longer needed.
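Note that you need to install the Snowflake connector with its pandas extra:
pip install "snowflake-connector-python[pandas]"
The full documentation is in the Snowflake Connector for Python docs.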
import pandas as pd
import snowflake.connector
conn = snowflake.connector.connect(
    user="xxx",
    password="xxx",
    account="xxx",
    warehouse="xxx",
    database="MYDB",
    schema="MYSCHEMA",
)
cur = conn.cursor()
# Execute a statement that will generate a result set.
sql = "select * from MYTABLE limit 10"
cur.execute(sql)
# Fetch the result set from the cursor and deliver it as the Pandas DataFrame.
df = cur.fetch_pandas_all()
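If the result set is much larger than the ten rows here, the same connector also offers cur.fetch_pandas_batches(), which yields the result as a series of DataFrames instead of one. A minimal sketch, assuming the pandas extra is installed; process_in_batches and _StandInCursor are hypothetical names of my own, and the stand-in exists only so the snippet runs without a Snowflake account:

```python
def process_in_batches(cur, handle):
    """Pass each batch from cur.fetch_pandas_batches() to handle(); return the row count."""
    total_rows = 0
    for batch in cur.fetch_pandas_batches():
        handle(batch)
        total_rows += len(batch)
    return total_rows


class _StandInCursor:
    """Hypothetical stand-in for a Snowflake cursor; yields lists, not DataFrames."""

    def fetch_pandas_batches(self):
        yield [1, 2, 3]
        yield [4, 5]


seen = []
total = process_in_batches(_StandInCursor(), seen.append)
print(total)
```

With a real cursor you would call cur.execute(sql) first and pass cur in place of the stand-in; each batch is then a DataFrame, so handle can be anything that accepts one.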