Pandas reads dates as strings instead of datetime

Time: 2019-06-26 17:00:11

Tags: python-3.x pandas datetime

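I am writing a program that loads an Excel file into a database. When I load the data with pandas, it raises an error because the dates are read as strings, although the same load works fine with openpyxl.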

Code using pandas:

import logging
import sys
import traceback

import openpyxl

# Util and Conf are the asker's own project helpers; they are not shown in the question.


def loadSheets(ws):
    # Pre-set both handles so the finally block is safe if getConnection() fails.
    cursor = connection = None
    try:
        cursor, connection = Util.getConnection(Conf['SRC'])
        # The first row holds the column names, the remaining rows hold the data.
        SheetData = tuple(ws.values)
        columNameList = list(SheetData[0])
        DataListInp = list(SheetData[1:])
        # Replace Excel's '_x000D_' marker, non-breaking spaces and newlines with a space.
        DataList = [
            tuple(
                i.replace('_x000D_', ' ').replace('\xa0', ' ').replace('\n', ' ')
                if isinstance(i, str) else i
                for i in tup
            )
            for tup in DataListInp
        ]
        # One dict per row, keyed by column name, for the named bind variables.
        paramDictionary = [dict(zip(columNameList, row)) for row in DataList]
        # Build: insert into <sheet>(<col>,...) values( :<col>,...)
        columnstring = ",".join(str(x) for x in columNameList)
        paramString = ",".join(':' + str(x) for x in columNameList)
        finalquery = 'insert into ' + ws.title + '(' + columnstring + ') values( ' + paramString + ')'
        cursor.prepare(finalquery)
        cursor.executemany(None, paramDictionary)
        connection.commit()
    except Exception as e:
        print(e)
        traceback.print_exc()
    finally:
        if cursor is not None:
            cursor.close()
        if connection is not None:
            connection.close()


if __name__ == '__main__':
    if len(sys.argv) != 3:
        logging.error("No of Parameter should be 2 . 1= file_name, 2= sheets")
        sys.exit(1)
    file_name = sys.argv[1]
    sheets = sys.argv[2]
    print(file_name)
    print(sheets)
    # data_only=True returns cached cell values instead of formulas.
    wb = openpyxl.load_workbook(file_name, data_only=True)
    sheetList = sheets.split(',')
    print(sheetList)
    for sheet in sheetList:
        ws = wb[sheet]
        print(ws.title, 'Sheet Loading Start')
        loadSheets(ws)
        print(ws.title, 'Sheet Loading Finished')

Code using openpyxl:

{{1}}

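The problem is that pandas reads the date fields from Excel as strings, like this:

'CREATE_DATE': '2019-04-04 00:00:00', 'UPDATE_DATE': '2019-04-04 00:00:00',

whereas openpyxl reads them as

'CREATE_DATE': datetime.datetime(2019, 4, 4, 0, 0), 'UPDATE_DATE': datetime.datetime(2019, 4, 4, 0, 0)

How can we get the same result from pandas? We cannot set the column data types in advance, because the date columns are dynamic and we do not know which columns will hold dates.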

1 answer:

Answer 0 (score: 0)

Could you add a try statement that attempts to convert each field to a datetime while loading, and simply moves on when the conversion fails?

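import pandas as pd

Date = '01-02-2019'
NotDate = 'Not Date'
for i in [Date, NotDate]:
    try:
        # to_datetime raises ValueError when the string cannot be parsed as a date.
        pd.to_datetime(i)
    except (ValueError, TypeError):
        print(i + ' is not a date')
    else:
        print(i + ' is a date')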

Output:

01-02-2019 is a date
Not Date is not a date
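
The same try-and-continue idea can be applied per column rather than per value, which fits the case where the date columns are not known in advance. The following is only a rough sketch: `convert_date_like_columns` and `to_records` are hypothetical helper names, and it assumes the sheet has already been loaded into a DataFrame (for example with `pd.read_excel`) and that the database driver expects plain `datetime.datetime` values like the ones openpyxl returns.

```python
import pandas as pd


def convert_date_like_columns(df):
    """Return a copy of df in which every column that parses fully as dates is datetime64.

    Hypothetical helper: columns are tried one by one and left untouched on failure.
    """
    out = df.copy()
    for col in out.columns:
        # Skip numeric columns (to_datetime would read numbers as epoch offsets)
        # and columns that are already datetimes.
        if pd.api.types.is_numeric_dtype(out[col]) or pd.api.types.is_datetime64_any_dtype(out[col]):
            continue
        try:
            out[col] = pd.to_datetime(out[col])
        except (ValueError, TypeError):
            # At least one value is not a date, so keep the column as it is.
            pass
    return out


def to_records(df):
    """Build one dict per row, turning pandas Timestamps into plain datetime objects."""
    records = []
    for row in df.to_dict("records"):
        records.append({
            key: value.to_pydatetime() if isinstance(value, pd.Timestamp) else value
            for key, value in row.items()
        })
    return records


# Made-up sample data using the column names from the question.
sample = pd.DataFrame({
    "NAME": ["alice", "bob"],
    "CREATE_DATE": ["2019-04-04 00:00:00", "2019-04-05 00:00:00"],
    "AMOUNT": [1, 2],
})
param_dictionary = to_records(convert_date_like_columns(sample))
print(param_dictionary[0])
```

One caveat: `pd.to_datetime` is fairly permissive, so a text column whose values all happen to look like dates (a column of month names, say) would be converted too. If that is a risk, passing an explicit `format=` to `pd.to_datetime` narrows what counts as a date.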