Whenever I query my SQL database to create a pandas DataFrame, I always get a column of row numbers. I tried:
df.drop(df.columns[[0]], axis=1, inplace=True)
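but it ignores that column and instead drops the first column of data I requested in my query, so I'm still left with a column of row numbers starting at 0. This happens whether I use SQLAlchemy or just pyodbc.
My DataFrame looks like this: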
| action| employee_number|part_number|time_stamp
0 |Add/Sub| 001841 |F151519FGL |2015-10-01
1 |Remove | 001997 |P088001DFL |2015-10-01
2 |Add/Sub| 001243 |-F151517DDL|2015-10-01
........................................................
50 |Add/Sub| 001458 |-1A0021049 |2015-10-01
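I want to remove the 0 and all the subsequent numbers up through 50+ (especially since this is all being written to Excel, where the extra row numbers seem redundant).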
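A likely explanation for the behaviour described above is that the leading numbers are the DataFrame's index rather than a column, so `df.columns[0]` refers to `action`. The following is a minimal sketch of that, using hypothetical sample data that mirrors two rows of the frame shown above (the real data comes from the SQL query):

```python
import pandas as pd

# Hypothetical sample data mirroring two rows of the frame above.
df = pd.DataFrame({
    "action": ["Add/Sub", "Remove"],
    "employee_number": ["001841", "001997"],
    "part_number": ["F151519FGL", "P088001DFL"],
})

# The 0, 1, ... labels live in df.index, not in df.columns.
print(list(df.columns))  # ['action', 'employee_number', 'part_number']
print(list(df.index))    # [0, 1]

# So dropping df.columns[0] removes 'action', the first real data column,
# and leaves the index untouched.
dropped = df.drop(df.columns[0], axis=1)
print(list(dropped.columns))  # ['employee_number', 'part_number']
```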
For reference, my full code is below:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session,sessionmaker
from sqlalchemy import (Column, Integer, String, Boolean, ForeignKey, DateTime, Sequence, Float)
from sqlalchemy import create_engine
import pandas as pd
import openpyxl
##Needed in order to get SQL query properly formatted ***SHOULD BE CHANGED LATER****
pd.core.format.header_style = None
pd.core.format.number_format = None
#######################################################################################
##Takes sqlalchemy query and a list of columns, returns a dataframe.
def data_frame(query, columns):
    def make_row(x):
        return dict([(c, getattr(x, c)) for c in columns])
    return pd.DataFrame([make_row(x) for x in query])
############################################################################################
#####Creates Engine, Named Session(Currently Not Scoped), Declarative Base############################
engine = create_engine('mssql+pyodbc://u:pass@Server/TableName?driver=SQL Server', echo=False)
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base()
###########Class Name, __tablename__ is equal to table's name in db but we are specifying all column names & formats with at least one primary key######################
class Tranv(Base):
    __tablename__ = "Transactions"
    part_number = Column(String(20), primary_key=True)
    time_stamp = Column(String(20))
    employee_number = Column(String(6))
    action = Column(String(100))
###Creating a variable with session query for specific Class Name using filter_by (keyword equality only; filter would allow extra operators)##############
newvarv = session.query(Tranv).filter_by(employee_number='001841').filter_by(time_stamp='2015-10-01 10:49:53.230')
###Uses data_frame function with input of session query variable name, and extra within [] the specified class must be included###############
dfx = data_frame(newvarv, [c.name for c in Tranv.__table__.columns])
##dfx.drop(dfx.columns[[0]], axis=1, inplace=True)
###Where I'm writing file to
writer = pd.ExcelWriter('C:\\Users\\grice\\Desktop\\Auto_Scrap_Report\\testy.xlsx')
###Formatting of any date times
writer.date_format = None
writer.datetime_format = None
###Actually writing the data to the .xlsx and saving
dfx.to_excel(writer, sheet_name='Sheet1')
writer.save()
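Since the row numbers only matter once the frame is written out, one possible approach is to leave the DataFrame alone and ask pandas not to write the index. Both `DataFrame.to_excel` and `DataFrame.to_csv` accept an `index` parameter. The sketch below is not the original code: `write_report` is a hypothetical helper, the sample frame is made up, and the CSV text is used only because it is easy to inspect without opening a spreadsheet.

```python
import io
import pandas as pd

# Hypothetical one-row frame standing in for dfx.
df = pd.DataFrame({"action": ["Add/Sub"], "employee_number": ["001841"]})

def write_report(frame, path):
    """Hypothetical helper: write the frame to Excel without the index column."""
    frame.to_excel(path, sheet_name="Sheet1", index=False)

# Same parameter on to_csv, shown here so the result can be inspected as text.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)  # header row starts with 'action', no leading row-number column
```

In the code above this would presumably amount to passing `index=False` in the existing `dfx.to_excel(writer, sheet_name='Sheet1')` call.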