I have a DataFrame containing sales data:
Order ID Order Date Order Priority Order Quantity Sales
928.0 1/1/2009 High 32.0 180.36
10369.0 1/2/2009 Low 43.0 4,083.19
10144.0 1/2/2009 Critical 16.0 137.63
32323.0 1/1/2009 Not Specified 9.0 872.48
48353.0 1/2/2009 Critical 3.0 124.81
51008.0 1/3/2009 Critical 15.0 85.56
26756.0 1/2/2009 Critical 43.0 614.8
18144.0 1/2/2009 Low 4.0 1,239.06
22912.0 1/2/2009 Low 32.0 4,902.38
...
I want to sort the values by date (oldest to newest) and by sales (largest to smallest). I wrote this code in PyCharm Edu 3.5.1 (Python 2.7):
df = pd.read_csv('sales.csv', header=0)
df['Order Date'] = pd.to_datetime(df['Order Date'])
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print df.head(10)
Output:
Order ID Order Date Order Priority Order Quantity Sales
32323.0 2009-01-01 Not Specified 9.0 872.48
928.0 2009-01-01 High 32.0 180.36
26756.0 2009-01-02 Critical 43.0 614.8
22912.0 2009-01-02 Low 32.0 4,902.38
10369.0 2009-01-02 Low 43.0 4,083.19
10144.0 2009-01-02 Critical 16.0 137.63
48353.0 2009-01-02 Critical 3.0 124.81
18144.0 2009-01-02 Low 4.0 1,239.06
29376.0 2009-01-03 Not Specified 4.0 896.49
...
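The 'Order Date' column is sorted correctly, but the 'Sales' column is not sorted as expected. It looks like PyCharm ignores the values that contain a thousands separator. Am I missing something here?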
Answer 0: (score: 1)
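Use read_csv with the thousands parameter to remove the , from the floats, and parse_dates to convert the column to datetime. The sort went wrong because the values in the Sales column were read as strings, which compare character by character rather than numerically.
Another solution is to keep the plain read_csv call and convert the column afterwards with replace + astype.
With thousands and parse_dates: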
df = pd.read_csv('sales.csv', thousands=',', parse_dates=['Order Date'])
print (df)
Order ID Order Date Order Priority Order Quantity Sales
0 928.0 2009-01-01 High 32.0 180.36
1 10369.0 2009-01-02 Low 43.0 4083.19
2 10144.0 2009-01-02 Critical 16.0 137.63
3 32323.0 2009-01-01 Not Specified 9.0 872.48
4 48353.0 2009-01-02 Critical 3.0 124.81
5 51008.0 2009-01-03 Critical 15.0 85.56
6 26756.0 2009-01-02 Critical 43.0 614.80
7 18144.0 2009-01-02 Low 4.0 1239.06
8 22912.0 2009-01-02 Low 32.0 4902.38
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print (df)
Order ID Order Date Order Priority Order Quantity Sales
3 32323.0 2009-01-01 Not Specified 9.0 872.48
0 928.0 2009-01-01 High 32.0 180.36
8 22912.0 2009-01-02 Low 32.0 4902.38
1 10369.0 2009-01-02 Low 43.0 4083.19
7 18144.0 2009-01-02 Low 4.0 1239.06
6 26756.0 2009-01-02 Critical 43.0 614.80
2 10144.0 2009-01-02 Critical 16.0 137.63
4 48353.0 2009-01-02 Critical 3.0 124.81
5 51008.0 2009-01-03 Critical 15.0 85.56
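The replace + astype alternative mentioned in the answer is not shown in the thread, so the following is only a minimal sketch of how it might look. It assumes an in-memory CSV built from a few rows of the question (the `csv_text` string is a hypothetical stand-in for sales.csv), that values containing commas are quoted in the file, and that dates are in month/day/year order. The `text_order` line illustrates why the original sort put 614.8 ahead of 4,902.38: as strings, '6' sorts after '4'.

```python
import io
import pandas as pd

# Hypothetical stand-in for sales.csv, using a few rows from the question.
csv_text = '''Order ID,Order Date,Sales
928.0,1/1/2009,180.36
10369.0,1/2/2009,"4,083.19"
10144.0,1/2/2009,137.63
26756.0,1/2/2009,614.8
18144.0,1/2/2009,"1,239.06"
22912.0,1/2/2009,"4,902.38"
'''

# Plain read_csv: the commas prevent numeric parsing, so Sales holds strings.
df = pd.read_csv(io.StringIO(csv_text))

# Descending order of the raw strings, compared character by character.
text_order = sorted(df['Sales'], reverse=True)
print(text_order)

# Strip the thousands separator, then convert the column to float.
df['Sales'] = df['Sales'].str.replace(',', '', regex=False).astype(float)

# Assumption: dates are month/day/year, as the answer's output suggests.
df['Order Date'] = pd.to_datetime(df['Order Date'], format='%m/%d/%Y')

df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print(df)
```

Passing thousands=',' to read_csv is usually the shorter route; the conversion above may be useful when the DataFrame has already been loaded or only some columns need cleaning.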