Sorting day-of-week data in a pandas Series

Asked: 2018-01-29 12:56:35

Tags: python python-3.x pandas sorting

The data looks like this:

0        Thursday
1        Thursday
2        Thursday
3        Thursday
etc, etc

My code:

import pandas as pd
data_file = pd.read_csv('./data/Chicago-2016-Summary.csv')
days = data_file['day_of_week']

order = ["Monday","Tuesday","Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

sorted(days, key=lambda x: order.index(x[0]))
print(days)

This results in the error:

ValueError: 'T' is not in list

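I tried to sort and got this error, but I don't know what it means.

I just want to sort the data from Monday to Sunday so that I can do some visualization. Any suggestions?

1 Answer:

Answer 0 (score: 3)

You can use pandas' Categorical data type: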

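# Desired order of the categories, Monday first
order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
# Convert the column to an ordered categorical so it sorts by `order`, not alphabetically
data_file['day_of_week'] = pd.Categorical(data_file['day_of_week'], categories=order, ordered=True)
data_file.sort_values(by='day_of_week', inplace=True)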
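To try this without the CSV file, here is a minimal self-contained sketch. The DataFrame contents and the `trips` column are made up for illustration and are not from the question's data.

```python
import pandas as pd

# Hypothetical sample data standing in for Chicago-2016-Summary.csv
df = pd.DataFrame({
    "day_of_week": ["Thursday", "Monday", "Sunday", "Tuesday", "Monday"],
    "trips": [5, 3, 8, 2, 7],  # invented column, only here to show that rows move together
})

order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# An ordered categorical sorts by position in `order` rather than alphabetically
df["day_of_week"] = pd.Categorical(df["day_of_week"], categories=order, ordered=True)
df_sorted = df.sort_values(by="day_of_week")

sorted_days = df_sorted["day_of_week"].tolist()
print(sorted_days)
```

Categories that never occur in the data (Wednesday, Friday and Saturday here) are simply absent from the sorted result.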

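In your example, note that by assigning

days = data_file['day_of_week']

you are creating a view of that column (a Series) within the data_file frame. You may want to use days = data_file['day_of_week'].copy() instead. Alternatively, just work within the DataFrame, as shown above.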
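As for the error itself: in the question's key function, x[0] is the first character of each string, so order.index('T') is looked up and fails. In addition, sorted() returns a new list and leaves days unchanged, so its result has to be kept. A small sketch of both points, using a made-up Series in place of the CSV column:

```python
import pandas as pd

order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical stand-in for data_file['day_of_week']
days = pd.Series(["Thursday", "Monday", "Sunday", "Tuesday"])

# x[0] on a string is its first character, which is what triggered the ValueError
first_char = days.iloc[0][0]

# Look up the whole day name and keep the returned list
sorted_list = sorted(days, key=order.index)

# Series-only alternative: cast to an ordered categorical dtype, then sort
day_dtype = pd.CategoricalDtype(categories=order, ordered=True)
sorted_series = days.astype(day_dtype).sort_values()

print(first_char)
print(sorted_list)
print(sorted_series.tolist())
```

The plain sorted() call gives a Python list, while the categorical version keeps a Series with its original index labels, which is usually more convenient for plotting.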