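I have data in a pandas DataFrame and want to create an interactive boxplot that lets me select a number of days, while drawing one box for each value in the "category" column.
This is what my code and data look like so far: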
import numpy as np
import pandas as pd
categories = ('A', 'B', 'C')
data = {
    'days': np.random.randint(120, size=100),
    'category': np.random.choice(categories, 100),
    'value': 100.0 * np.random.random_sample(100)
}
df = pd.DataFrame(data)
print(df)
   category  days      value
 0        A     4  77.383981
 1        A    31  63.011934
 2        A     5   1.165061
 3        C    59  23.588979
 4        A    57  14.906734
 5        C   106  33.366634
 6        A    29  90.658570
 7        B    25  16.137490
 8        A   118  34.526302
 9        C    76   4.111797
10        A    11  30.195917
..      ...   ...        ...
90        A    64  37.529774
91        A    76   3.771360
92        C   112  93.948775
93        C    14  34.855189
94        B    64  83.106007
95        A    10  78.346319
96        B    86  66.645889
97        A    46  12.969012
98        C    29  57.925427
99        A    59  34.526146
[100 rows x 3 columns]
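I want to create one box plot per category (for a selected/specified number of days), with the different categories laid out along the X axis.
How can I do this with pandas (or matplotlib)?
Answer 0 (score: 6)
You can simply filter the DataFrame by the number of days and then draw the corresponding box plot.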
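number_of_days = 42
# Keep only the rows that match the day condition; use operators like ==, >=, <, etc.
df_filtered = df.loc[df['days'] < number_of_days]
# One box per category, drawn from the 'value' column
df_filtered[["category", "value"]].boxplot(by="category", return_type='axes')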
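Since the question also mentions matplotlib, the same filter-then-plot idea can be sketched without the pandas plotting wrapper. This is only a minimal illustration: the helper name `boxplot_by_category`, its `max_days` parameter, and the seeded random generator are assumptions made for the example, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data shaped like the question's DataFrame (seeded so it is reproducible).
rng = np.random.default_rng(0)
categories = ('A', 'B', 'C')
df = pd.DataFrame({
    'days': rng.integers(120, size=100),
    'category': rng.choice(categories, 100),
    'value': 100.0 * rng.random(100),
})

def boxplot_by_category(df, max_days):
    """Hypothetical helper: one box per category for rows with days < max_days."""
    filtered = df[df['days'] < max_days]
    # Categories that survive the filter, in sorted order along the X axis.
    labels = sorted(filtered['category'].unique())
    # One array of values per category; matplotlib draws one box per array.
    groups = [filtered.loc[filtered['category'] == c, 'value'].to_numpy()
              for c in labels]
    fig, ax = plt.subplots()
    ax.boxplot(groups)  # boxes are placed at positions 1..len(groups)
    ax.set_xticks(range(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    ax.set_xlabel('category')
    ax.set_ylabel('value')
    ax.set_title('days < {}'.format(max_days))
    return ax, labels

ax, labels = boxplot_by_category(df, 42)
```

Calling `plt.show()` afterwards displays the figure when running outside a notebook. Note that a filter that leaves no rows is not handled here.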
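To get a dropdown field, you can use the ipywidgets.interact() function, passing it a function that plots the DataFrame for a specific day.
(In the following I restrict the data to 12 days, so that a dropdown for picking one of those days actually makes sense.)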
import numpy as np
import pandas as pd
from ipywidgets import interact
%matplotlib notebook

categories = ('A', 'B', 'C')
data = {
    'days': np.random.randint(12, size=100),
    'category': np.random.choice(categories, 100),
    'value': 100.0 * np.random.random_sample(100)
}
df = pd.DataFrame(data)

def select_days(number_of_days):
    # The dropdown passes the selected day as a string, so convert it for the comparison
    df_filtered = df.loc[df['days'] == int(number_of_days)]
    ax = df_filtered[["category", "value"]].boxplot(by="category", return_type='axes')
    ax["value"].set_title("Day " + number_of_days)
    print(df_filtered)

# A list of strings makes interact() render a dropdown
days = [str(day) for day in np.arange(12)]
interact(select_days, number_of_days=days)
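If a slider over a range of days is preferred to a dropdown of single days, the callback could filter with `<=` instead of `==`. The following is only a sketch under that assumption; the function name `plot_up_to_day` and the seeded sample data are made up for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same shape of data as above, seeded for reproducibility.
rng = np.random.default_rng(0)
categories = ('A', 'B', 'C')
df = pd.DataFrame({
    'days': rng.integers(12, size=100),
    'category': rng.choice(categories, 100),
    'value': 100.0 * rng.random(100),
})

def plot_up_to_day(max_day):
    """Hypothetical callback: box plot per category for rows with days <= max_day."""
    df_filtered = df.loc[df['days'] <= max_day]
    # return_type='axes' gives a Series of Axes indexed by the plotted column name.
    axes = df_filtered[["category", "value"]].boxplot(by="category", return_type='axes')
    axes["value"].set_title("Days 0 to {}".format(max_day))
    return axes["value"]

ax = plot_up_to_day(5)
```

In a Jupyter notebook this could be wired up with `interact(plot_up_to_day, max_day=(0, 11))`, since ipywidgets turns an integer `(min, max)` tuple into a slider.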
Answer 1 (score: 0)
How to display box, distribution, and violin plots:
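import matplotlib.pyplot as plt
import seaborn as sb

# houseNumData is not defined in this answer; it is assumed to be a
# DataFrame with five numeric columns (one row of subplots per column).
f, axes = plt.subplots(5, 3, figsize=(20, 20))
colors = ["r", "g", "b", "m", "c"]
for count, i in enumerate(houseNumData):
    sb.boxplot(houseNumData[i], orient="h", color=colors[count], ax=axes[count, 0])
    # distplot is deprecated since seaborn 0.11; histplot or displot are the suggested replacements
    sb.distplot(houseNumData[i], color=colors[count], ax=axes[count, 1])
    sb.violinplot(houseNumData[i], color=colors[count], ax=axes[count, 2])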