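I am writing a program that loads Excel files into a database. When I load the data with pandas it fails, because the dates are read as strings, whereas openpyxl works fine.
Code using pandas: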

    import logging
    import sys
    import traceback

    import openpyxl

    # Util and Conf are project-specific helpers that are not shown in this post.


    def loadSheets(ws):
        # Initialise both handles so the finally block is safe if getConnection fails.
        cursor = connection = None
        try:
            cursor, connection = Util.getConnection(Conf['SRC'])
            SheetData = tuple(ws.values)
            columNameList = list(SheetData[0])  # first row holds the column names
            DataListInp = list(SheetData[1:])
            # In string cells, replace '_x000D_', non-breaking spaces and newlines with a space.
            DataList = [
                tuple(
                    i.replace('_x000D_', ' ').replace('\xa0', ' ').replace('\n', ' ')
                    if isinstance(i, str) else i
                    for i in tup
                )
                for tup in DataListInp
            ]
            paramDictionary = [dict(zip(columNameList, row)) for row in DataList]
            # Build "insert into <sheet>(<cols>) values( :<col>,...)" with named bind variables.
            query = 'insert into ' + ws.title + '('
            columnstring = ",".join(str(x) for x in columNameList)
            paramString = ",".join(':' + str(x) for x in columNameList)
            finalquery = query + columnstring + ') values( ' + paramString + ')'
            cursor.prepare(finalquery)
            cursor.executemany(None, paramDictionary)
            connection.commit()
        except Exception as e:
            print(e)
            traceback.print_exc()
        finally:
            if cursor is not None:
                cursor.close()
            if connection is not None:
                connection.close()


    if __name__ == '__main__':
        if len(sys.argv) != 3:
            # logging.error is shown with the default logging configuration; debug is not.
            logging.error("Expected 2 parameters: 1 = file_name, 2 = sheets")
            sys.exit(1)
        i_runParams = sys.argv
        file_name = i_runParams[1]
        sheets = i_runParams[2]
        print(file_name)
        print(sheets)
        wb = openpyxl.load_workbook(file_name, data_only=True)
        sheetList = sheets.split(',')
        print(sheetList)
        for sheet in sheetList:
            ws = wb[sheet]
            print(ws.title, 'Sheet Loading Start')
            loadSheets(ws)
            print(ws.title, 'Sheet Loading Finished')

Code using OpenPyxl:
{{1}}
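The problem is that pandas reads the date fields in the Excel file as strings, like this:
'CREATE_DATE':'2019-04-04 00:00:00','UPDATE_DATE':'2019-04-04 00:00:00',
whereas OpenPyxl reads them as
'CREATE_DATE':datetime.datetime(2019,4,4,0,0),'UPDATE_DATE':datetime.datetime(2019,4,4,0,0)
How can we get the same result from pandas? We cannot set the column data types explicitly, because the columns are dynamic and we do not know in advance which ones will hold dates.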
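Answer 0 (score: 0)
Could you add a try statement that attempts to convert each field to a datetime at load time, and simply carries on when the conversion fails?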

    import pandas as pd

    Date = '01-02-2019'
    NotDate = 'Not Date'

    for i in [Date, NotDate]:
        try:
            # pd.to_datetime raises if the string cannot be parsed as a date
            pd.to_datetime(i)
            print(i + ' is a date')
        except (ValueError, TypeError):
            print(i + ' is not a date')

Output:
01-02-2019 is a date
Not Date is not a date
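
The same try-and-convert idea might be applied one column at a time rather than one value at a time, so that only columns in which every value parses as a date are converted. The following is a minimal sketch, not a tested solution for your file: the function name `convert_date_like_columns`, the sample DataFrame, and the fixed format `'%Y-%m-%d %H:%M:%S'` (taken from the strings shown in the question) are assumptions for illustration, and real data may need a different format.

```python
import datetime

import pandas as pd

# Assumed format, copied from the strings shown in the question.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def convert_date_like_columns(df, fmt=DATE_FORMAT):
    """Return a copy of df where text columns that fully match fmt become datetime columns."""
    out = df.copy()
    for col in out.columns:
        is_text = (pd.api.types.is_object_dtype(out[col])
                   or pd.api.types.is_string_dtype(out[col]))
        # Skip non-text columns and columns with no values at all.
        if not is_text or out[col].isna().all():
            continue
        try:
            out[col] = pd.to_datetime(out[col], format=fmt)
        except (ValueError, TypeError):
            # At least one value does not match the format: leave the column untouched.
            pass
    return out


# Hypothetical sample data standing in for a sheet read by pandas.
df = pd.DataFrame({
    "NAME": ["a", "b"],
    "CREATE_DATE": ["2019-04-04 00:00:00", "2019-04-05 00:00:00"],
})
converted = convert_date_like_columns(df)

# Turn pandas Timestamps into plain datetime.datetime objects, one dict per row,
# similar in shape to the paramDictionary built in the question.
records = [
    {k: (v.to_pydatetime() if isinstance(v, pd.Timestamp) else v) for k, v in row.items()}
    for row in converted.to_dict("records")
]
print(records)
```

Using a strict format keeps ordinary text or numeric-looking strings from being mistaken for dates; if the sheets contain dates in several layouts, the format would have to be relaxed or tried in turn.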