I have a pandas DataFrame with a Date column:
ID Amount raw-Date ZIP transaction-ID Date flag
749 145552 $100.00 1/15/2018 27614-7901 1342-P0192-F43 1/15/2018 1.0
1307 145552 $100.00 3/15/2018 27614-7901 1342-P0192-F43 3/15/2018 1.0
1672 145552 $100.00 2/15/2018 27614-7901 1342-P0192-F43 2/15/2018 1.0
3508 145552 $100.00 4/15/2018 27614-7901 1342-P0192-F43 4/15/2018 1.0
4144 145552 $250.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4145 145552 $100.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4787 145552 $100.00 5/15/2018 27614-7901 1342-P0192-F43 5/15/2018 1.0
8350 145552 $212.44 12/21/2018 27614-7901 1342-P0192-F43 12/21/2018 1.0
When I sort it by the Date column with data.sort_values('Date'), I get:
ID Amount raw-Date ZIP transaction-ID Date flag
749 145552 $100.00 1/15/2018 27614-7901 1342-P0192-F43 1/15/2018 1.0
8350 145552 $212.44 12/21/2018 27614-7901 1342-P0192-F43 12/21/2018 1.0
1672 145552 $100.00 2/15/2018 27614-7901 1342-P0192-F43 2/15/2018 1.0
1307 145552 $100.00 3/15/2018 27614-7901 1342-P0192-F43 3/15/2018 1.0
3508 145552 $100.00 4/15/2018 27614-7901 1342-P0192-F43 4/15/2018 1.0
4144 145552 $250.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4145 145552 $100.00 4/24/2018 27614-7901 1234-O8910-B32 4/24/2018 1.0
4787 145552 $100.00 5/15/2018 27614-7901 1342-P0192-F43 5/15/2018 1.0
This is clearly sorting the dates as strings. I tried pd.to_datetime(data['Date']) and got the same ordering again:
ID Amount raw-Date ZIP Appeal ID Date flag
749 145552 $100.00 1/15/2018 27614-7901 1342-P0192-F43 2018-01-15 1.0
8350 145552 $212.44 12/21/2018 27614-7901 1342-P0192-F43 2018-12-21 1.0
1672 145552 $100.00 2/15/2018 27614-7901 1342-P0192-F43 2018-02-15 1.0
1307 145552 $100.00 3/15/2018 27614-7901 1342-P0192-F43 2018-03-15 1.0
3508 145552 $100.00 4/15/2018 27614-7901 1342-P0192-F43 2018-04-15 1.0
4144 145552 $250.00 4/24/2018 27614-7901 1234-O8910-B32 2018-04-24 1.0
4145 145552 $100.00 4/24/2018 27614-7901 1234-O8910-B32 2018-04-24 1.0
4787 145552 $100.00 5/15/2018 27614-7901 1342-P0192-F43 2018-05-15 1.0
Thanks for your help.
Answer 0 (score: 2)
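Your data has a duplicated column name, Date, which is not recommended. In that case, df['Date'] returns a DataFrame with two columns rather than a Series, and pd.to_datetime(df['Date']) will fail.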
That said, you can use apply to convert each of those columns:
df['Date'] = df['Date'].apply(pd.to_datetime)
Afterwards, df.Date.dtypes gives:
Date datetime64[ns]
Date datetime64[ns]
dtype: object
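
As an alternative to converting both columns, here is a minimal sketch that drops the duplicated label before parsing and sorting. The tiny frame below is a made-up stand-in for the data in the question, and the names deduped and result are only illustrative. It also assumes the two Date columns hold the same values, so keeping the first one loses nothing.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question: two columns share the label "Date".
df = pd.DataFrame(
    [
        ["1/15/2018", "1/15/2018", 1.0],
        ["12/21/2018", "12/21/2018", 1.0],
        ["2/15/2018", "2/15/2018", 1.0],
    ],
    columns=["Date", "Date", "flag"],
)

# Selecting a duplicated label returns a DataFrame (one column per match), not a Series.
print(type(df["Date"]).__name__, df["Date"].shape)

# Keep only the first occurrence of each column label (assumes the duplicates are redundant).
deduped = df.loc[:, ~df.columns.duplicated()].copy()

# Parse the month/day/year strings and assign the result back to the column.
deduped["Date"] = pd.to_datetime(deduped["Date"], format="%m/%d/%Y")

# Sorting now compares timestamps, so December no longer lands right after January.
result = deduped.sort_values("Date")
print(result)
```

Note that pd.to_datetime returns a new object, so the result has to be assigned back to the column for the sort to see it. If the two Date columns actually differ, renaming one of them would be a safer choice than dropping it.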