Given a DataFrame like this:
time location
1 A
1 A
2 B
4 A
9 A
12 B
12 B
12 B
18 A
I can count the number of occurrences within each time window using cut and value_counts:
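import numpy as np
import pandas as pd

d = {'time': [1, 1, 2, 4, 9, 12, 12, 12, 18], 'location': ['A', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A']}
df = pd.DataFrame(d)
time_bins = np.arange(0, 100, 10)
# assign each time to a right-closed interval such as (0, 10]
cut_frame = pd.cut(df.time, bins=time_bins)
# sort=False keeps the windows in interval order instead of ordering by count
counts = cut_frame.value_counts(sort=False)
# name the column explicitly so it reads 'time' regardless of pandas version
count_frame = counts.to_frame(name='time')
count_frame.index.name = 'time_window'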
The resulting DataFrame looks like this:
time_window time
(0, 10] 5
(10, 20] 4
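Note that np.arange(0, 100, 10) defines nine windows, so the full result should also list the empty windows from (20, 30] to (80, 90] with a count of 0; the table above shows only the non-empty ones. A minimal sketch of how to keep just those rows, assuming a pandas version that provides Series.value_counts (the name `nonzero` is introduced here purely for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'time': [1, 1, 2, 4, 9, 12, 12, 12, 18],
    'location': ['A', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A'],
})
time_bins = np.arange(0, 100, 10)

# pd.cut uses right-closed intervals by default: (0, 10], (10, 20], ...
cut_frame = pd.cut(df['time'], bins=time_bins)

# sort=False keeps the windows in interval order; empty windows get count 0
counts = cut_frame.value_counts(sort=False)

# keep only the windows that actually contain observations
nonzero = counts[counts > 0]
print(nonzero)
```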
How can I break this down further by the location column to get something like this, with a MultiIndex?
location time_window
A (0, 10] 4
(10, 20] 1
B (0, 10] 1
(10, 20] 3
Or this?
time_window location time
(0, 10] A 4
(0, 10] B 1
(10, 20] A 1
(10, 20] B 3
Answer 0 (score: 1)
You can attach cut_frame to the original df as a new column and then apply groupby:
df["time_window"] = cut_frame
df.groupby(["location", "time_window"]).count().dropna()
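In more recent pandas versions, grouping by a categorical column such as time_window can also produce rows for window and location combinations that never occur, and count() may report them as 0 rather than NaN, in which case dropna() does not remove them. A minimal sketch of one way around this, assuming a pandas version whose groupby accepts observed=True (the names `by_location` and `flat` are introduced here for illustration and are not part of the answer above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'time': [1, 1, 2, 4, 9, 12, 12, 12, 18],
    'location': ['A', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A'],
})
df['time_window'] = pd.cut(df['time'], bins=np.arange(0, 100, 10))

# observed=True keeps only the (location, time_window) pairs present in the data
by_location = df.groupby(['location', 'time_window'], observed=True)['time'].count()
print(by_location)

# flat layout: swap the grouping order and turn the index levels into columns
flat = (
    df.groupby(['time_window', 'location'], observed=True)['time']
    .count()
    .reset_index()
)
print(flat)
```

The first result corresponds to the MultiIndex layout asked for in the question, and the second to the flat layout with time_window and location as ordinary columns.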