The following DataFrame contains information about program launches and has a single date column:
indate
2016-12-19 12:16:00
2016-12-19 12:21:00
2016-12-20 12:32:00
2016-12-20 12:34:00
2016-12-20 12:40:00
2016-12-21 13:47:01
2016-12-21 14:27:01
2016-12-21 14:43:00
2016-12-21 15:02:00
2016-12-22 15:16:00
2016-12-22 15:22:00
2016-12-22 15:25:00
2016-12-22 15:22:00
2016-12-22 15:25:00
........
I want to aggregate it to get the number of launches per day:
indate number of launchings
2016-12-19 2
2016-12-20 3
2016-12-21 4
2016-12-22 5
...
Then I would also like to get the week of the launch date, the day of the week, and the number of launches:
week day number of launchings
2016-12-19 - 2016-12-25 Mo 2
2016-12-19 - 2016-12-25 Tu 3
2016-12-19 - 2016-12-25 We 4
2016-12-19 - 2016-12-25 Th 5
2016-12-19 - 2016-12-25 Fr n
2016-12-19 - 2016-12-25 Sa n
2016-12-19 - 2016-12-25 Su n
2016-12-26 - 2017-01-01 Mo n
2016-12-26 - 2017-01-01 Tu n
2016-12-26 - 2017-01-01 We n
....
I have not found any dedicated method in Pandas for this.
Answer 0 (score: 2)
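First use resample by day with size, then extract the day name with strftime. Finally, for the week range, resample by week and use transform with first and last to get the first and last date of each week: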
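# count rows per calendar day
df1 = df.resample('D', on='indate').size().reset_index(name='number of launchings')
# abbreviated weekday name (Mon, Tue, ...)
df1['day'] = df1['indate'].dt.strftime('%a')
g = df1.resample('W', on='indate')['indate']
# parentheses are required so the expression can continue on the next line
df1['week'] = (g.transform('first').dt.strftime('%Y-%m-%d') + ' - ' +
               g.transform('last').dt.strftime('%Y-%m-%d'))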
Another solution is to use Grouper:
import pandas as pd

# count rows per calendar day
df1 = (df.groupby(pd.Grouper(freq='D', key='indate'))
         .size()
         .reset_index(name='number of launchings'))
# abbreviated weekday name (Mon, Tue, ...)
df1['day'] = df1['indate'].dt.strftime('%a')
# first and last date of each weekly group, joined into a range label
g = df1.groupby(pd.Grouper(freq='W', key='indate'))['indate']
df1['week'] = (g.transform('first').dt.strftime('%Y-%m-%d') + ' - ' +
               g.transform('last').dt.strftime('%Y-%m-%d'))
print(df1)
indate number of launchings day week
0 2016-12-19 2 Mon 2016-12-19 - 2016-12-25
1 2016-12-20 3 Tue 2016-12-19 - 2016-12-25
2 2016-12-21 4 Wed 2016-12-19 - 2016-12-25
3 2016-12-22 5 Thu 2016-12-19 - 2016-12-25
4 2016-12-23 1 Fri 2016-12-19 - 2016-12-25
5 2016-12-24 1 Sat 2016-12-19 - 2016-12-25
6 2016-12-25 1 Sun 2016-12-19 - 2016-12-25
7 2016-12-26 1 Mon 2016-12-26 - 2017-01-01
8 2016-12-27 1 Tue 2016-12-26 - 2017-01-01
9 2016-12-28 1 Wed 2016-12-26 - 2017-01-01
10 2016-12-29 1 Thu 2016-12-26 - 2017-01-01
11 2016-12-30 1 Fri 2016-12-26 - 2017-01-01
12 2016-12-31 1 Sat 2016-12-26 - 2017-01-01
13 2017-01-01 1 Sun 2016-12-26 - 2017-01-01
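Note that first and last return the earliest and latest dates present in each weekly group, so a week that the data covers only partly (for example the first or last one) gets a shortened range. A minimal sketch of one alternative, which is not part of the answer above: derive the label from weekly periods with `dt.to_period('W')`, assuming Monday-to-Sunday weeks. The small `df1` below is hypothetical and only stands in for the daily counts.

```python
import pandas as pd

# Hypothetical daily counts standing in for df1 (not the full sample data).
df1 = pd.DataFrame({
    "indate": pd.to_datetime(["2016-12-21", "2016-12-22", "2016-12-26"]),
    "number of launchings": [4, 5, 1],
})

# 'W' periods end on Sunday, so each one spans Monday to Sunday.
weeks = df1["indate"].dt.to_period("W")

# Use the period boundaries instead of the dates that happen to be present.
df1["week"] = (weeks.dt.start_time.dt.strftime("%Y-%m-%d") + " - " +
               weeks.dt.end_time.dt.strftime("%Y-%m-%d"))
df1["day"] = df1["indate"].dt.strftime("%a")
print(df1)
```

Here the first week is labeled 2016-12-19 - 2016-12-25 even though the hypothetical data only starts on the 21st.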
Sample data:
print(df)
indate
0 2016-12-19 12:16:00
1 2016-12-19 12:21:00
2 2016-12-20 12:32:00
3 2016-12-20 12:34:00
4 2016-12-20 12:40:00
5 2016-12-21 13:47:01
6 2016-12-21 14:27:01
7 2016-12-21 14:43:00
8 2016-12-21 15:02:00
9 2016-12-22 15:16:00
10 2016-12-22 15:22:00
11 2016-12-22 15:25:00
12 2016-12-22 15:22:00
13 2016-12-22 15:25:00
14 2016-12-23 12:16:00
15 2016-12-24 12:21:00
16 2016-12-25 12:32:00
17 2016-12-26 12:34:00
18 2016-12-27 12:40:00
19 2016-12-28 13:47:01
20 2016-12-29 14:27:01
21 2016-12-30 14:43:00
22 2016-12-31 15:02:00
23 2017-01-01 15:16:00
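To try the daily count without retyping everything, here is a minimal sketch that builds a DataFrame from a subset of the timestamps above (the first six rows only, so the count for 2016-12-21 differs from the full output) and groups it by day. The variable name `daily` is only illustrative.

```python
import pandas as pd

# First six timestamps from the sample data above.
df = pd.DataFrame({"indate": pd.to_datetime([
    "2016-12-19 12:16:00", "2016-12-19 12:21:00",
    "2016-12-20 12:32:00", "2016-12-20 12:34:00", "2016-12-20 12:40:00",
    "2016-12-21 13:47:01",
])})

# Count rows per calendar day.
daily = (df.groupby(pd.Grouper(freq="D", key="indate"))
           .size()
           .reset_index(name="number of launchings"))
print(daily)
```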