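I wrote a script that first runs a SQL query to fetch data from Redshift (via Databricks). I then want to display the result in a pandas DataFrame. The problem is that the column names are somehow dropped or not displayed. Why?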
# SQL query
query = """
SELECT * FROM table1 limit 1;
"""
# Execute the query
try:
    cursor.execute(query)
except OperationalError as msg:
    print("Command skipped: ", msg)
# Fetch all rows from the result
rows = cursor.fetchall()
# Convert into a pandas DataFrame
df = pd.DataFrame([[ij for ij in i] for i in rows])
df.head()
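As you can see, the column names have become numbers (highlighted in yellow). The goal is to display the real names: column 1 as Customer_id, column 2 as Purchase, column 3 as Product_id, and so on.
Thanks for your help!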
Answer 0 (score: 0)
As @Chris suggested, you can use pd.read_sql as follows:
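import pandas as pd
import psycopg2

query = """SELECT * FROM table1 limit 1;"""
# Replace the placeholder credentials with your own connection details
connection = psycopg2.connect(user='your_username',
                              password='password',
                              host='host_ip',
                              port=5432,
                              database='db_name')
# read_sql takes the column names from the query result
data = pd.read_sql(sql=query, con=connection)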
Now, when you print data, it will show the column names as well!
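As for why the names disappear in the original script: cursor.fetchall() returns only the row values, so pd.DataFrame has nothing to label the columns with and falls back to 0, 1, 2, and so on. If you prefer to keep the cursor-based code, DB-API cursors expose the column names through cursor.description. Below is a minimal sketch of that approach. It assumes an in-memory SQLite database as a stand-in for the Redshift connection so that it runs anywhere, and the table contents are made up for illustration; only the column names come from the question.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the Redshift connection.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Hypothetical sample data, using the column names mentioned in the question.
cursor.execute(
    "CREATE TABLE table1 (Customer_id INTEGER, Purchase REAL, Product_id INTEGER)"
)
cursor.execute("INSERT INTO table1 VALUES (1, 9.99, 42)")

cursor.execute("SELECT * FROM table1 LIMIT 1;")
rows = cursor.fetchall()

# Without explicit names, pandas labels the columns 0, 1, 2, ...
df_plain = pd.DataFrame(rows)

# cursor.description holds one entry per result column; the name is item 0.
column_names = [col[0] for col in cursor.description]
df_named = pd.DataFrame(rows, columns=column_names)

print(df_named.head())
connection.close()
```

With a psycopg2 cursor the same list comprehension over cursor.description should work, since it follows the same DB-API convention.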