Dataset
starttime User Type
0 7/1/2015 00:00:03 Subscriber
1 7/1/2015 00:00:06 Subscriber
2 7/1/2015 00:00:17 Subscriber
3 7/1/2015 00:00:23 Subscriber
4 7/1/2015 00:00:44 Subscriber
5 7/1/2015 00:01:00 Subscriber
6 7/1/2015 00:01:03 Subscriber
7 7/1/2015 00:01:06 Subscriber
8 7/1/2015 00:01:25 Customer
9 7/1/2015 00:01:41 Subscriber
10 7/1/2015 00:01:50 Customer
11 7/1/2015 00:01:58 Subscriber
12 7/1/2015 00:02:06 Subscriber
13 7/1/2015 00:02:07 Subscriber
14 7/1/2015 00:02:26 Subscriber
15 7/1/2015 00:02:26 Subscriber
16 7/1/2015 00:02:35 Subscriber
17 7/1/2015 00:02:43 Customer
18 7/1/2015 00:02:47 Customer
19 7/1/2015 00:02:47 Subscriber
20 7/1/2015 00:03:05 Subscriber
21 7/1/2015 00:03:16 Customer
22 7/1/2015 00:03:27 Subscriber
23 7/1/2015 00:03:34 Subscriber
24 7/1/2015 00:03:48 Subscriber
25 7/1/2015 00:03:56 Subscriber
26 7/1/2015 00:03:57 Subscriber
27 7/1/2015 00:03:58 Customer
28 7/1/2015 00:04:03 Subscriber
29 7/1/2015 00:04:17 Subscriber
... ... ...
1085646 7/31/2015 23:57:25 Subscriber
1085647 7/31/2015 23:57:29 Customer
1085648 7/31/2015 23:57:32 Subscriber
1085649 7/31/2015 23:57:33 Subscriber
1085650 7/31/2015 23:57:44 Subscriber
1085651 7/31/2015 23:57:54 Subscriber
1085652 7/31/2015 23:58:03 Subscriber
1085653 7/31/2015 23:58:08 Subscriber
1085654 7/31/2015 23:58:12 Customer
1085655 7/31/2015 23:58:15 Subscriber
1085656 7/31/2015 23:58:18 Customer
1085657 7/31/2015 23:58:24 Subscriber
1085658 7/31/2015 23:58:27 Subscriber
1085659 7/31/2015 23:58:42 Subscriber
1085660 7/31/2015 23:58:43 Subscriber
1085661 7/31/2015 23:58:51 Customer
1085662 7/31/2015 23:58:53 Subscriber
1085663 7/31/2015 23:58:58 Subscriber
1085664 7/31/2015 23:59:04 Subscriber
1085665 7/31/2015 23:59:10 Subscriber
1085666 7/31/2015 23:59:24 Subscriber
1085667 7/31/2015 23:59:23 Customer
1085668 7/31/2015 23:59:24 Subscriber
1085669 7/31/2015 23:59:24 Subscriber
1085670 7/31/2015 23:59:38 Subscriber
1085671 7/31/2015 23:59:40 Subscriber
1085672 7/31/2015 23:59:41 Subscriber
1085673 7/31/2015 23:59:42 Customer
1085674 7/31/2015 23:59:56 Subscriber
1085675 7/31/2015 23:59:59 Subscriber
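Question
Create a pandas DataFrame containing the number of rides by hour and user type, separately for weekdays and weekends. Use starttime to determine the time of each ride.
The output should look like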
              User Type
         Hour Customer  Subscriber
Weekday  0    124       2194
         1    120       1238
         2    53        716
         3    30        520
         .... ....      ....
Weekend  0    152       1879
         1    82        1222
         2    45        718
         3    34        431
         4    29        288
         .... ....      ....
The chart should look like this
My code
import pandas as pd

def a11(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True)
    hours_cats = ['12 AM', '01 AM', '02 AM', '03 AM', '04 AM', '05 AM', '06 AM', '07 AM', '08 AM', '09 AM', '10 AM', '11 AM', '12 PM', '01 PM', '02 PM', '03 PM', '04 PM', '05 PM', '06 PM', '07 PM', '08 PM', '09 PM', '10 PM', '11 PM']
    # Ordered categorical so the hour labels sort chronologically, not alphabetically
    dates = pd.Categorical(rides.starttime.dt.strftime('%I %p'), categories=hours_cats, ordered=True)
    df = pd.crosstab(dates, rides['User Type'])
    return df
I don't know how to split the DataFrame into weekdays and weekends as described in the question.
Answer 0 (score: 0)
For the desired DataFrame, you need to create a new column:
import numpy as np
import pandas as pd

def a11(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'])
    # dayofweek: Monday=0 ... Sunday=6, so values below 5 are weekdays
    rides['Type'] = np.where(rides['starttime'].dt.dayofweek < 5, 'Weekday', 'Weekend')
    return pd.crosstab([rides['Type'], rides['starttime'].dt.hour], rides['User Type'])

print(a11(rides))
User Type          Customer  Subscriber
Type    starttime
Weekday 2                 6          24
        4                 6          24
For the chart, the output needs to be slightly different: change dates and select by the first level of the MultiIndex with DataFrame.xs.
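As a minimal sketch of the approach described in this answer (not necessarily the answerer's exact code), the snippet below builds the weekday/weekend by hour table and then pulls out each half with DataFrame.xs. The six sample rows, the function name rides_by_hour, and the index names Type and Hour are assumptions made for illustration; only the column names starttime and User Type come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample in the same layout as the dataset in the question
# (the values are made up; 7/1/2015 is a Wednesday, 7/4 and 7/5 a weekend).
rides = pd.DataFrame({
    'starttime': ['7/1/2015 00:00:03', '7/1/2015 00:01:25', '7/1/2015 01:10:00',
                  '7/4/2015 00:05:00', '7/4/2015 00:30:00', '7/5/2015 01:15:00'],
    'User Type': ['Subscriber', 'Customer', 'Subscriber',
                  'Subscriber', 'Customer', 'Subscriber'],
})

def rides_by_hour(rides):
    """Count rides per (Weekday/Weekend, hour of day) and user type."""
    start = pd.to_datetime(rides['starttime'], format='%m/%d/%Y %H:%M:%S')
    # dayofweek: Monday=0 ... Sunday=6, so values below 5 are weekdays
    day_type = pd.Series(np.where(start.dt.dayofweek < 5, 'Weekday', 'Weekend'),
                         index=rides.index, name='Type')
    hour = start.dt.hour.rename('Hour')
    # A list of Series as the row argument yields a two-level MultiIndex
    return pd.crosstab([day_type, hour], rides['User Type'])

counts = rides_by_hour(rides)

# Select one value of the first index level; the result is indexed by Hour only
weekday = counts.xs('Weekday', level='Type')
weekend = counts.xs('Weekend', level='Type')

print(counts)
print(weekday)
```

Each of the two selected frames has one row per hour and one column per user type, so it could be passed to DataFrame.plot (which requires matplotlib) to draw one chart for weekdays and one for weekends.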