Pandas: value_counts and cut with a groupby MultiIndex

Time: 2018-03-01 03:44:46

Tags: python pandas

Given a DataFrame like this:

time    location
1       A
1       A
2       B
4       A
9       A
12      B
12      B
12      B
18      A

I can count the number of occurrences in each time bin by using cut and value_counts as follows:

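import numpy as np
import pandas as pd

d = {'time': [1, 1, 2, 4, 9, 12, 12, 12, 18],
     'location': ['A', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A']}
df = pd.DataFrame(d)
time_bins = np.arange(0, 100, 10)  # bin edges 0, 10, ..., 90
cut_frame = pd.cut(df.time, bins=time_bins)  # assign each time to a (left, right] bin
counts = cut_frame.value_counts(sort=False)  # keep bin order instead of sorting by count
count_frame = pd.DataFrame(counts)
count_frame.index.name = 'time_window'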

The resulting DataFrame looks like this:

time_window time
(0, 10]     5
(10, 20]    4

How can I break this down further by the location column to get something like the following, with a MultiIndex?

location  time_window
A    (0, 10]    4
     (10, 20]   1
B    (0, 10]    1
     (10, 20]   3

Or this?

time_window     location    time
(0, 10]         A           4
(0, 10]         B           1
(10, 20]        A           1
(10, 20]        B           3

1 answer:

Answer 0 (score: 1)

You can attach cut_frame to the original df as a new column and then apply groupby on both location and time_window:

df["time_window"] = cut_frame
df.groupby(["location", "time_window"]).count().dropna()
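For reference, here is a self-contained sketch of the same idea that produces both layouts asked about in the question. It is one possible variant, not necessarily what the answer's author had in mind. It assumes a pandas version where groupby accepts the observed argument (0.23 or later) and uses size() rather than count(); the names counts and flat, and the column label 'count', are chosen here for illustration. Depending on the pandas version, grouping on the categorical time_window column may also emit rows for empty bins, and observed=True keeps only the combinations that actually occur.

```python
import numpy as np
import pandas as pd

d = {'time': [1, 1, 2, 4, 9, 12, 12, 12, 18],
     'location': ['A', 'A', 'B', 'A', 'A', 'B', 'B', 'B', 'A']}
df = pd.DataFrame(d)

# Bin the times into (0, 10], (10, 20], ... and store the bin on each row.
time_bins = np.arange(0, 100, 10)
df['time_window'] = pd.cut(df['time'], bins=time_bins)

# Layout 1: a Series indexed by a (location, time_window) MultiIndex.
# observed=True (assumed available) drops bins that contain no rows.
counts = df.groupby(['location', 'time_window'], observed=True).size()

# Layout 2: a flat frame ordered by time_window, then location.
flat = (counts.rename('count')
              .reset_index()
              .sort_values(['time_window', 'location'])
              [['time_window', 'location', 'count']])

print(counts)
print(flat)
```

With the sample data, counts holds 4 and 1 for location A and 1 and 3 for location B, which matches the numbers in the question's desired output.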