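I have cut my data into several bins and want to plot the size/frequency of each bin as a histogram. The y-axis should show the bin frequency and the x-axis labels should show the bin ranges.
Currently I have: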
out = pd.cut(data.hour, bins = filter_values, include_lowest = True)
counts = pd.value_counts(out)
print(counts)
which outputs:
[0, 5] 1000
(19, 23] 0
(15, 19] 0
(11, 15] 0
(8, 11] 0
(5, 8] 0
Name: hour, dtype: int64
How can I do this?
Answer 0 (score: 1)
Demo (step by step):
In [51]: filter_values = [0, 5, 11, 15, 19, 23]
Generate a sample DataFrame:
In [52]: df = pd.DataFrame({'hour':np.random.randint(0, 23, 20)})
In [53]: df
Out[53]:
hour
0 5
1 18
2 19
3 5
4 16
5 18
6 18
7 19
8 0
9 8
10 14
11 20
12 9
13 22
14 8
15 0
16 0
17 4
18 13
19 18
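Because the sample above is random, every run gives different hours. A minimal sketch of a reproducible variant follows, assuming NumPy's `default_rng` generator; the seed value 0 is an arbitrary choice made for this illustration and is not part of the original answer. Note that `np.random.randint(0, 23, 20)` excludes the upper bound, so it never produces 23; the sketch uses 24 as the upper bound so that every hour from 0 to 23 can occur.

```python
import numpy as np
import pandas as pd

# The seed is an arbitrary value chosen for illustration; any integer works.
rng = np.random.default_rng(0)

# integers() excludes the upper bound by default, so 24 yields hours 0..23.
df = pd.DataFrame({"hour": rng.integers(0, 24, size=20)})

print(df.head())
```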
Build the bins:
In [54]: out = pd.cut(df.hour, bins = filter_values, include_lowest = True)
In [55]: out
Out[55]:
0 [0, 5]
1 (15, 19]
2 (15, 19]
3 [0, 5]
4 (15, 19]
5 (15, 19]
6 (15, 19]
7 (15, 19]
8 [0, 5]
9 (5, 11]
10 (11, 15]
11 (19, 23]
12 (5, 11]
13 (19, 23]
14 (5, 11]
15 [0, 5]
16 [0, 5]
17 [0, 5]
18 (11, 15]
19 (15, 19]
Name: hour, dtype: category
Categories (5, object): [[0, 5] < (5, 11] < (11, 15] < (15, 19] < (19, 23]]
Count the values in each bin:
In [56]: counts = out.value_counts(sort=False)
In [57]: counts
Out[57]:
[0, 5] 6
(5, 11] 3
(11, 15] 2
(15, 19] 7
(19, 23] 2
Name: hour, dtype: int64
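The binning and counting steps can also be run as a standalone script. The sketch below hard-codes the hour values from the sample DataFrame shown earlier so that it does not depend on random data; the variable names `hours` and `binned` are chosen here for illustration. With `sort=False`, the counts of a categorical Series come back in bin order rather than sorted by frequency, and empty bins are still listed with a count of 0.

```python
import pandas as pd

# Hour values copied from the sample DataFrame in the demo.
hours = pd.Series(
    [5, 18, 19, 5, 16, 18, 18, 19, 0, 8, 14, 20, 9, 22, 8, 0, 0, 4, 13, 18],
    name="hour",
)
filter_values = [0, 5, 11, 15, 19, 23]

# include_lowest=True makes the first bin closed on the left, so 0 is kept.
binned = pd.cut(hours, bins=filter_values, include_lowest=True)

# sort=False keeps the category (bin) order instead of sorting by count.
counts = binned.value_counts(sort=False)
print(counts)
```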
Plot:
In [58]: counts.plot.bar(rot=0)
Out[58]: <matplotlib.axes._subplots.AxesSubplot at 0xa427b00>
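If the default interval labels are hard to read on the x-axis, one option is to pass custom labels to `pd.cut`. The following is a sketch under a few assumptions: the "lo-hi" label format, the axis titles, and the output file name `hour_bins.png` are all made up for this example, and the `Agg` backend is selected only so that the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display required
import pandas as pd

# Hour values copied from the sample DataFrame in the demo.
hours = pd.Series(
    [5, 18, 19, 5, 16, 18, 18, 19, 0, 8, 14, 20, 9, 22, 8, 0, 0, 4, 13, 18],
    name="hour",
)
filter_values = [0, 5, 11, 15, 19, 23]

# Hypothetical shorthand labels: "0-5" stands for [0, 5], "5-11" for (5, 11], etc.
labels = [f"{lo}-{hi}" for lo, hi in zip(filter_values[:-1], filter_values[1:])]

binned = pd.cut(hours, bins=filter_values, labels=labels, include_lowest=True)
counts = binned.value_counts(sort=False)  # keep bin order on the x-axis

ax = counts.plot.bar(rot=0)  # rot=0 keeps the tick labels horizontal
ax.set_xlabel("hour range")
ax.set_ylabel("frequency")

# Write the figure to the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "hour_bins.png")
ax.figure.savefig(out_path)
```

In an interactive session the backend line and the `savefig` call can be dropped, since the plot is displayed directly.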