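Column names come back as 0, 1, 2, 3 when executing a SQL query against Snowflake from Python with the Snowflake connector

Time: 2020-09-03 17:08:13

Tags: sql python-3.x snowflake-cloud-data-platform

I am executing a SQL query from a Python script to retrieve data from Snowflake on Windows 10, but the result is missing the column names; they are replaced by 0, 1, 2, 3 and so on. When I run the same query in the Snowflake UI and download the CSV, the column names are present in the file. I pass the column names as aliases in the query.

The code is below: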

import pandas as pd

def _CONSUMPTION(con):

    data2 = con.cursor().execute("""select sd.sales_force_lvl_1_code "Plan-To Code",sd.sales_force_lvl_1_desc "Plan-To Description",pd.matl_code "Product Code",pd.matl_desc "Product Description",pd.ean_upc_code "UPC",dd.fiscal_week_desc "Fiscal Week Description",f.unit_sales_qty "Sales Units",f.incr_units_qty "Incremental Units"
                                    from DW.consumption_fact1 f, DW.market_dim md, DW.matl_dim pd, DW.fiscal_week_dim dd, (select sales_force_lvl_1_code,max(sales_force_lvl_1_desc) sales_force_lvl_1_desc from DW.mv_us_sales_force_dim group by sales_force_lvl_1_code) sd
                                    where dd.fiscal_week_key = f.fiscal_week_key
                                    and pd.matl_key = f.matl_key
                                    and md.market_key = f.market_key
                                    and sd.sales_force_lvl_1_code = md.curr_sales_force_lvl_1_code
                                    and dd.fiscal_week_key between (select curr_fy_week_key-6 from DW.curr_date_lkp) and (select curr_fy_week_key-1 from DW.curr_date_lkp)
                                    and f.company_key = 6006
                                    and (f.unit_sales_qty <> 0 and f.sales_amt <> 0)
                                    and md.curr_sales_force_lvl_1_code is not null
                                    UNION
                                    select '5000016240' "Plan-To Code", 'AWG TOTAL' "Plan-To Description",pd.matl_code "Product Code",pd.matl_desc "Product Description",pd.ean_upc_code "UPC",dd.fiscal_week_desc "Fiscal Week Description",f.unit_sales_qty "Sales Units",f.incr_units_qty "Incremental Units"
                                    from DW.consumption_fact1 f, DW.market_dim md, DW.matl_dim pd, DW.fiscal_week_dim dd
                                    where dd.fiscal_week_key = f.fiscal_week_key
                                    and pd.matl_key = f.matl_key
                                    and md.market_key = f.market_key
                                    and dd.fiscal_week_key between (select curr_fy_week_key-6 from DW.curr_date_lkp) and (select curr_fy_week_key-1 from DW.curr_date_lkp)
                                    and f.company_key = 6006
                                    and (f.unit_sales_qty <> 0 and f.sales_amt <> 0)
                                    and md.market_code = '20267'""").fetchall()

    df = pd.DataFrame(data2)
    df.head(5)
    df.to_csv('CONSUMPTION.csv', index=False)

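3 answers:

Answer 0: (score: 1)

It looks like your code never specifies column names when it builds the DataFrame.

My suggestion is to set the column names first, for example through df.columns.

For details, see the Snowflake documentation page:

https://docs.snowflake.com/en/user-guide/python-connector-pandas.html

Try this: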

import pandas as pd

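def fetch_pandas_old(cur, sql):
    # Fetch the result in batches and count the rows
    cur.execute(sql)
    # The first item of each cur.description entry is the column name
    col_names = [col[0] for col in cur.description]
    rows = 0
    while True:
        dat = cur.fetchmany(50000)
        if not dat:
            break
        df = pd.DataFrame(dat, columns=col_names)
        rows += df.shape[0]
    print(rows)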
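
The batch loop above counts rows but discards each batch. Since the goal in the question is a CSV file with a header row, here is a minimal sketch of the same fetchmany pattern that writes the header from cur.description and then appends each batch. The function name stream_query_to_csv, the sales table, and the use of the standard library's sqlite3 as a stand-in for a Snowflake connection are assumptions made so the example runs anywhere; it relies only on the generic DB-API cursor interface.

```python
import csv
import io
import sqlite3


def stream_query_to_csv(cur, sql, out, batch_size=50000):
    """Run sql on a DB-API cursor and write a header plus all rows to out."""
    cur.execute(sql)
    writer = csv.writer(out)
    # The first item of each cur.description entry is the column name.
    writer.writerow([col[0] for col in cur.description])
    total = 0
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        writer.writerows(batch)
        total += len(batch)
    return total


# Stand-in database (assumption) so the sketch runs without a Snowflake account.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (code TEXT, units INTEGER)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A1", 3), ("B2", 5), ("C3", 8)],
)

buffer = io.StringIO()
row_count = stream_query_to_csv(
    cur,
    'SELECT code AS "Product Code", units AS "Sales Units" FROM sales ORDER BY code',
    buffer,
    batch_size=2,
)
csv_text = buffer.getvalue()
conn.close()
print(csv_text)
```

With a real Snowflake cursor, the same function could be given an open file handle (opened with newline="") instead of the in-memory buffer.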

Answer 1: (score: 1)

Looking at the documentation, the simplest approach seems to be the cursor method .fetch_pandas_all()

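Alternatively, if you want to dump the result to CSV, follow the approach in the linked question. A minimal example of fetch_pandas_all():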

query = "SELECT 1 a, 2 b, 'a' c UNION ALL SELECT 7,4,'snow'"
cur = connection.cursor()
cur.execute(query).fetch_pandas_all()
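
The example query above aliases only the first SELECT of the UNION ALL, much like the query in the question. As a quick check of where the result's column names come from, the sketch below runs the same statement against the standard library's sqlite3, used here purely as a stand-in (an assumption, so that no Snowflake account is needed). Naming rules differ between engines: Snowflake folds unquoted identifiers to uppercase, while SQLite keeps them as written.

```python
import sqlite3

# Same statement as in the answer; only the first SELECT carries aliases.
query = "SELECT 1 a, 2 b, 'a' c UNION ALL SELECT 7, 4, 'snow'"

conn = sqlite3.connect(":memory:")  # stand-in for a Snowflake connection
cur = conn.cursor()
cur.execute(query)

# Column names are exposed through the cursor, not through the rows.
names = [col[0] for col in cur.description]
rows = cur.fetchall()
conn.close()

print(names)
print(rows)
```

The rows themselves are plain tuples, which is why a DataFrame built from fetchall() alone ends up with positional column labels.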

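Answer 2: (score: 1)

A good way to pull the column headers from the cursor description and store them in a pandas DataFrame when using the Snowflake connector (this also works with psycopg2, by the way) is as follows: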


import pandas as pd
import snowflake.connector

# Create the connection and return it together with a cursor
def connect_snowflake(uname, pword, acct, role_name, whouse, dbase, schema_name):
    conn = snowflake.connector.connect(
        user=uname,
        password=pword,
        account=acct,
        role=role_name,
        warehouse=whouse,
        database=dbase,
        schema=schema_name,
    )

    cur = conn.cursor()

    return conn, cur

Then execute your query. The cur.description attribute holds a list of tuples, and the first element of each tuple is the column name :)

conn, cur = connect_snowflake(username, password, account_name, role, warehouse, database, schema)
cur.execute('select * from my_schema.my_table')
result = cur.fetchall()
# Extract the column names (first element of each description tuple)
col_names = [elt[0] for elt in cur.description]
df = pd.DataFrame(result, columns=col_names)
cur.close()
conn.close()
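
To make the shape of cur.description concrete, here is a small sketch that inspects it and pairs the names with each row. It uses the standard library's sqlite3 as a stand-in for the Snowflake connector (an assumption, chosen so the code runs without credentials); both follow the Python DB-API, where each description entry starts with the column name. The literal values and aliases are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a Snowflake connection
cur = conn.cursor()
cur.execute(
    """SELECT 10 AS "Plan-To Code", 'AWG TOTAL' AS "Plan-To Description" """
)
rows = cur.fetchall()

# Each description entry is a sequence whose first item is the column name.
first_entry = cur.description[0]
col_names = [entry[0] for entry in cur.description]

# Pair names with values; a DataFrame constructor can take col_names the same way.
records = [dict(zip(col_names, row)) for row in rows]
conn.close()

print(first_entry)
print(records)
```

Quoted aliases containing spaces or hyphens come through unchanged, so the names used in the query become the headers in the resulting frame or CSV.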