I'm new to pandas and trying to understand this feature.
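import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()
grps = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
# Keep only the first feature (sepal length) as column "SL"
iris = pd.DataFrame(iris.data[:, 0], columns=["SL"])
# Split the sepal lengths into 10 equal-width bins labelled "1" to "10"
iris['bins'] = pd.cut(iris["SL"], 10, labels=grps)
iris['target'] = datasets.load_iris().target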
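I want to know the number of occurrences of each class in each bin. How can I do this? I can't seem to figure out a way. I would like to plot the output as a stacked histogram.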
Answer 0 (score: 3)
Try this:
In [83]: import seaborn as sns
In [84]: x = iris.groupby(['target','bins']).size().to_frame('occurences').reset_index()
In [85]: x
Out[85]:
    target bins  occurences
0        0    1           9
1        0    2          19
2        0    3          12
3        0    4           9
4        0    5           1
5        1    2           3
6        1    3           2
7        1    4          16
8        1    5          13
9        1    6           7
10       1    7           7
11       1    8           2
12       2    2           1
13       2    4           2
14       2    5           8
15       2    6          13
16       2    7          11
17       2    8           4
18       2    9           5
19       2   10           6
In [86]: sns.barplot(x='bins', y='occurences', hue='target', data=x)
Out[86]: <matplotlib.axes._subplots.AxesSubplot at 0x9b3bb38>
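Note that `sns.barplot` with `hue` draws the classes side by side rather than stacked. As one possible alternative, here is a minimal sketch that uses `pd.crosstab` to build a bins-by-class count table, which can then be plotted as stacked bars. The small data frame below is a hypothetical stand-in for the iris frame from the question, and the three-bin split is only for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the iris frame built in the question.
df = pd.DataFrame({
    "SL": [4.9, 5.1, 5.8, 6.0, 6.3, 6.9, 7.1, 7.7],
    "target": [0, 0, 1, 1, 1, 2, 2, 2],
})

# Three equal-width bins, labelled the same way as in the question.
df["bins"] = pd.cut(df["SL"], 3, labels=["1", "2", "3"])

# Rows are bins, columns are classes, cells are occurrence counts.
counts = pd.crosstab(df["bins"], df["target"])
print(counts)
```

With matplotlib installed, calling `counts.plot(kind='bar', stacked=True)` on this table should draw one stacked bar per bin, with one segment per class.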
Answer 1 (score: 1)
You can use value_counts after sorting the values into bins that correspond to the groups specified in cut.
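import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn import datasets

sns.set_style('darkgrid')
df = pd.DataFrame(datasets.load_iris().data[:, 0], columns=['SL'])
df['target'] = datasets.load_iris().target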
# Bin edges: consecutive values define the intervals (1, 2], (2, 3], ..., (9, 10]
grps = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Empty list to append later
grouped_list = []
# Iterating through grouped by target variable
for label, group in df.groupby('target'):
    # sort=False keeps the bins in interval order; rename labels each column by its class
    binned = pd.cut(group['SL'], bins=grps).value_counts(sort=False)
    grouped_list.append(binned.rename('class_{}'.format(label)))
# Concatenate column-wise and create a stacked-bar plot
pd.concat(grouped_list, axis=1).plot(kind='bar', stacked=True, rot=0,
                                     figsize=(6, 6), cmap=plt.cm.rainbow)
# Aesthetics
plt.title("Sepal Length binned counts")
plt.xlabel("Buckets")
plt.ylabel("Occurrences")
plt.show()
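To see what the `cut` plus `value_counts` step produces on its own, here is a small sketch without any plotting. The sample values and the bin edges are hypothetical and chosen only to make the result easy to check by hand.

```python
import pandas as pd

# Hypothetical sample of sepal lengths.
sl = pd.Series([4.9, 5.1, 5.8, 6.0, 6.3, 6.9])

# Explicit edges give the right-closed intervals (4, 5], (5, 6], (6, 7].
edges = [4, 5, 6, 7]
binned = pd.cut(sl, bins=edges)

# sort=False returns the counts in interval order instead of by frequency.
counts = binned.value_counts(sort=False)
print(counts)
```

Because the intervals are closed on the right, the value 6.0 is counted in (5, 6] rather than in (6, 7].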