Converting 24-hour format to 12-hour in pandas and matplotlib

Date: 2017-09-08 08:30:53

Tags: python python-2.7 pandas matplotlib

Dataset:

                  starttime   User Type
0         7/1/2015 00:00:03  Subscriber
1         7/1/2015 00:00:06  Subscriber
2         7/1/2015 00:00:17  Subscriber
3         7/1/2015 00:00:23  Subscriber
4         7/1/2015 00:00:44  Subscriber
5         7/1/2015 00:01:00  Subscriber
6         7/1/2015 00:01:03  Subscriber
7         7/1/2015 00:01:06  Subscriber
8         7/1/2015 00:01:25    Customer
9         7/1/2015 00:01:41  Subscriber
10        7/1/2015 00:01:50    Customer
11        7/1/2015 00:01:58  Subscriber
12        7/1/2015 00:02:06  Subscriber
13        7/1/2015 00:02:07  Subscriber
14        7/1/2015 00:02:26  Subscriber
15        7/1/2015 00:02:26  Subscriber
16        7/1/2015 00:02:35  Subscriber
17        7/1/2015 00:02:43    Customer
18        7/1/2015 00:02:47    Customer
19        7/1/2015 00:02:47  Subscriber
20        7/1/2015 00:03:05  Subscriber
21        7/1/2015 00:03:16    Customer
22        7/1/2015 00:03:27  Subscriber
23        7/1/2015 00:03:34  Subscriber
24        7/1/2015 00:03:48  Subscriber
25        7/1/2015 00:03:56  Subscriber
26        7/1/2015 00:03:57  Subscriber
27        7/1/2015 00:03:58    Customer
28        7/1/2015 00:04:03  Subscriber
29        7/1/2015 00:04:17  Subscriber
...                     ...         ...
1085646  7/31/2015 23:57:25  Subscriber
1085647  7/31/2015 23:57:29    Customer
1085648  7/31/2015 23:57:32  Subscriber
1085649  7/31/2015 23:57:33  Subscriber
1085650  7/31/2015 23:57:44  Subscriber
1085651  7/31/2015 23:57:54  Subscriber
1085652  7/31/2015 23:58:03  Subscriber
1085653  7/31/2015 23:58:08  Subscriber
1085654  7/31/2015 23:58:12    Customer
1085655  7/31/2015 23:58:15  Subscriber
1085656  7/31/2015 23:58:18    Customer
1085657  7/31/2015 23:58:24  Subscriber
1085658  7/31/2015 23:58:27  Subscriber
1085659  7/31/2015 23:58:42  Subscriber
1085660  7/31/2015 23:58:43  Subscriber
1085661  7/31/2015 23:58:51    Customer
1085662  7/31/2015 23:58:53  Subscriber
1085663  7/31/2015 23:58:58  Subscriber
1085664  7/31/2015 23:59:04  Subscriber
1085665  7/31/2015 23:59:10  Subscriber
1085666  7/31/2015 23:59:24  Subscriber
1085667  7/31/2015 23:59:23    Customer
1085668  7/31/2015 23:59:24  Subscriber
1085669  7/31/2015 23:59:24  Subscriber
1085670  7/31/2015 23:59:38  Subscriber
1085671  7/31/2015 23:59:40  Subscriber
1085672  7/31/2015 23:59:41  Subscriber
1085673  7/31/2015 23:59:42    Customer
1085674  7/31/2015 23:59:56  Subscriber
1085675  7/31/2015 23:59:59  Subscriber

Question

Create a pandas DataFrame containing the number of rides per user type for each hour of the day. Use starttime to determine the hour of each ride.

Output

User Type  Customer  Subscriber
starttime
0              2464        9259
1              1377        5042
2               871        2882
3               597        1755
4               373        1691
5               444        5726
6              1098       22982
7              2094       45393
8              4159       78258
9              5973       61062
10             8330       36497
11            11396       35765
12            13039       41474
13            13510       42711
14            14497       42537
15            15944       46570
16            15644       58096
17            16203       90153
18            14812       92357
19            12309       67654
20             9237       45756
21             6414       31312
22             5503       24220
23             4074       16162

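This output is correct.

[Figure: chart of the output above, with the hours on the x-axis in 24-hour format]

Instead, I want the time to be shown in 12-hour format.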

Code

import pandas as pd

def a9(rides):
    # Parse the start times, then count rides per hour (0-23) and user type.
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True)
    df = pd.crosstab(rides.starttime.dt.hour, rides['User Type'])
    return df

How can I display the time on the x-axis in 12-hour format instead of 24-hour format?

1 answer:

Answer 0: (score: 1)

You can use strftime (http://strftime.org/), but an ordered categorical is needed so that the hours sort correctly:

rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True)
# Hour labels in chronological order; this list fixes the sort order of the index.
cats = ['12 AM', '01 AM', '02 AM', '03 AM', '04 AM', '05 AM', '06 AM', '07 AM', '08 AM', '09 AM', '10 AM', '11 AM', '12 PM', '01 PM', '02 PM', '03 PM', '04 PM', '05 PM', '06 PM', '07 PM', '08 PM', '09 PM', '10 PM', '11 PM']
dates = pd.Categorical(rides.starttime.dt.strftime('%I %p'), categories=cats, ordered=True)
df = pd.crosstab(dates, rides['User Type'])
print(df)

Output (computed from the 60 sample rows shown in the question):

User Type  Customer  Subscriber
row_0                          
12 AM             6          24
01 AM             0           0
02 AM             0           0
03 AM             0           0
04 AM             0           0
05 AM             0           0
06 AM             0           0
07 AM             0           0
08 AM             0           0
09 AM             0           0
10 AM             0           0
11 AM             0           0
12 PM             0           0
01 PM             0           0
02 PM             0           0
03 PM             0           0
04 PM             0           0
05 PM             0           0
06 PM             0           0
07 PM             0           0
08 PM             0           0
09 PM             0           0
10 PM             0           0
11 PM             6          24
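
To illustrate why the ordered categorical matters, here is a minimal sketch on a handful of made-up labels (the `labels` values are illustrative, not taken from the dataset). Plain strings sort alphabetically, so '01 PM' lands right after '01 AM', while the categorical follows the order of `cats`:

```python
import pandas as pd

# A few hypothetical hour labels in arbitrary order.
labels = pd.Series(["01 PM", "12 AM", "11 AM", "12 PM", "01 AM"])

# Plain strings sort alphabetically, which scrambles the time of day.
print(sorted(labels))
# ['01 AM', '01 PM', '11 AM', '12 AM', '12 PM']

# Same 24 labels as the hand-written cats list, built in chronological order.
cats = ["%02d %s" % (h % 12 or 12, "AM" if h < 12 else "PM") for h in range(24)]

# An ordered categorical sorts by position in cats instead.
ordered = pd.Categorical(labels, categories=cats, ordered=True)
print(list(ordered.sort_values()))
# ['12 AM', '01 AM', '11 AM', '12 PM', '01 PM']
```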

The list of categories was generated with:

rng = pd.date_range('2017-04-03 00:00:00', periods=24, freq='H')
df = pd.DataFrame({'starttime': rng})  
cats = df.starttime.dt.strftime('%I %p').tolist()
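
An alternative that is not part of the answer above: keep the integer-hour crosstab from the question and only relabel the x-axis ticks when plotting. The sketch below assumes a bar chart is wanted and that the index holds the hours 0-23; `hour_label` and `plot_with_12h_labels` are hypothetical helper names introduced here for illustration.

```python
def hour_label(hour):
    """Turn an hour of the day (0-23) into a zero-padded 12-hour label."""
    return "%02d %s" % (hour % 12 or 12, "AM" if hour < 12 else "PM")


def plot_with_12h_labels(df):
    """Bar-plot an hourly crosstab and relabel its ticks in 12-hour format.

    Assumes df has the integer hours 0-23 as its index and that a bar chart
    is wanted; pandas needs matplotlib installed for df.plot to work.
    """
    ax = df.plot(kind="bar")
    # A bar plot puts one tick per row, in index order, so the labels line up.
    ax.set_xticklabels([hour_label(h) for h in df.index], rotation=45)
    return ax


print([hour_label(h) for h in (0, 1, 11, 12, 13, 23)])
# ['12 AM', '01 AM', '11 AM', '12 PM', '01 PM', '11 PM']
```

With the function from the question, this would be called as plot_with_12h_labels(a9(rides)). Because the index stays numeric, no categorical is needed to keep the hours in order.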