Pandas DataFrame | groupby plot | stacked and side-by-side bar charts

Time: 2019-12-26 07:03:59

Tags: python pandas matplotlib

I come from an R ggplot2 background and find matplotlib plotting a bit confusing.

Here is my DataFrame:

import pandas as pd

languages = ['en','cs','es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us','ch','sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432,43,55,6,23,455,23]
df = pd.DataFrame({'language': languages,'county': counties, 'count' : count})

  language county  count
0       en     us     32
1       cs     ch    432
2       es     sp     43
3       pt     br     55
4       hi     in      6
5       en     fr     23
6       es     ar    455
7       es     pr     23

Now I want to plot:

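  1. A stacked bar chart with the language on the x-axis and the count on the y-axis, where the total bar height is the total count for that language and each stacked segment is one of that language's countries.
  2. A side-by-side bar chart with the same parameters, except that the countries of a language are drawn next to each other instead of stacked.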
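Both charts can be read off one wide table with a row per language and a column per country. The following is a minimal sketch of that reshaping step, assuming the `df` shown above (rebuilt here so the block runs on its own); the names `wide` and `totals` are illustrative only.

```python
import pandas as pd

languages = ['en', 'cs', 'es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us', 'ch', 'sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432, 43, 55, 6, 23, 455, 23]
df = pd.DataFrame({'language': languages, 'county': counties, 'count': count})

# One row per language, one column per county; absent pairs become 0
wide = (
    df.pivot(index='language', columns='county', values='count')
      .fillna(0)
      .astype(int)
)

# Row sums are the total bar heights of the stacked chart
totals = wide.sum(axis=1)
print(wide)
print(totals)
```

Each column of `wide` then corresponds to one stacked segment (or one bar within a group) per language.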

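Most examples plot directly from the DataFrame with pandas and matplotlib, but I would like to build the plot in a step-by-step script so that I have more control over it and can edit whatever I need, for example with a script like this: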

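import numpy as np
import matplotlib.pyplot as plt

# Wide table: one row per language, one column per county (0 where absent)
wide = df.pivot(index='language', columns='county', values='count').fillna(0)
ind = np.arange(len(wide.index))
width = 0.35
fig, ax = plt.subplots()
bottom = np.zeros(len(wide.index))
for county in wide.columns:
    # Draw this county's segment on top of the segments drawn so far
    ax.bar(ind, wide[county], width, bottom=bottom, label=county)
    bottom += wide[county].to_numpy()
ax.set_ylabel('Count')
ax.set_title('Score by language and country')
ax.set_xticks(ind)
ax.set_xticklabels(wide.index)
ax.set_yticks(np.arange(0, 601, 100))
ax.legend(title='county')
plt.show()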

By the way, here is my pandas pivot plotting code:

df.pivot(index = "language", columns = "county", values = "count").plot.bar(figsize=(15,10))
plt.xticks(rotation = 0,fontsize=18)
plt.xlabel('Language' )
plt.ylabel('Count')
plt.legend(fontsize='large', ncol=2,handleheight=1.5)
plt.show()
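For the side-by-side variant written as explicit `ax.bar` calls, one possible sketch is to give each language a slot on the x-axis and offset its countries within that slot. This is only an illustration under stated assumptions: the `df` is the one from the question (rebuilt here), `group_width` and the annotation of each bar with its country code are choices made for this example, and unlike the pivot plot above it draws only the countries that actually occur for a language.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

languages = ['en', 'cs', 'es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us', 'ch', 'sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432, 43, 55, 6, 23, 455, 23]
df = pd.DataFrame({'language': languages, 'county': counties, 'count': count})

# Rows are languages, columns are counties; absent pairs stay NaN
wide = df.pivot(index='language', columns='county', values='count')

group_width = 0.8  # share of each x slot that a group of bars may occupy
# Size the bars so that the largest group still fits into its slot
bar_width = group_width / wide.notna().sum(axis=1).max()

fig, ax = plt.subplots(figsize=(10, 5))
for i, language in enumerate(wide.index):
    # Keep only the counties that occur for this language
    present = wide.loc[language].dropna()
    # Centre the bars of this language around x = i
    offsets = (np.arange(len(present)) - (len(present) - 1) / 2) * bar_width
    bars = ax.bar(i + offsets, present.to_numpy(), bar_width)
    # Write the county code above each bar
    for rect, county in zip(bars, present.index):
        ax.annotate(county,
                    (rect.get_x() + rect.get_width() / 2, rect.get_height()),
                    ha='center', va='bottom')

ax.set_xticks(np.arange(len(wide.index)))
ax.set_xticklabels(wide.index)
ax.set_xlabel('Language')
ax.set_ylabel('Count')
```

Calling `plt.show()` (or `fig.savefig(...)`) afterwards displays the figure. Each `ax.bar` call uses one colour per language, so the country is identified by the annotation rather than by a legend.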

2 answers:

Answer 0 (score: 2)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

languages = ['en','cs','es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us','ch','sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432,43,55,6,23,455,23]
df = pd.DataFrame({'language': languages,'county': counties, 'count' : count})

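# Build one summary row per language
modified = {}
modified['language'] = np.unique(df.language)
country_count = []
total_count = []
for x in modified['language']:
    # Number of rows (countries) for this language
    country_count.append(len(df[df['language']==x]))
    # Sum of the counts over those rows
    total_count.append(df[df['language']==x]['count'].sum())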

modified['country_count'] = country_count
modified['total_count'] = total_count

mod_df = pd.DataFrame(modified)
print(mod_df)

ind = mod_df.language
width = 0.35 

p1 = plt.bar(ind,mod_df.total_count, width)
p2 = plt.bar(ind,mod_df.country_count, width,
             bottom=mod_df.total_count)

plt.ylabel("Total count")
plt.xlabel("Languages")
plt.legend((p1[0], p2[0]), ('Total Count', 'Country Count'))
plt.show()

First, the DataFrame is transformed into the summary DataFrame below.

  language  country_count  total_count
0       cs              1          432
1       en              2           55
2       es              3          521
3       hi              1            6
4       pt              1           55
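The same summary table can likely be produced without the explicit loop. The sketch below uses `groupby` with named aggregation, which assumes pandas 0.25 or later; the name `summary` is illustrative, and the `df` from the question is rebuilt so the block is self-contained.

```python
import pandas as pd

languages = ['en', 'cs', 'es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us', 'ch', 'sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432, 43, 55, 6, 23, 455, 23]
df = pd.DataFrame({'language': languages, 'county': counties, 'count': count})

summary = (
    df.groupby('language')
      .agg(
          # Number of rows per language, matching len(...) in the loop version
          country_count=('county', 'count'),
          # Sum of the 'count' column per language
          total_count=('count', 'sum'),
      )
      .reset_index()
)
print(summary)
```

`groupby` sorts the group keys by default, so the rows come out in the same alphabetical order as with `np.unique`.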

Here is the resulting plot:

[image: stacked bar chart of total count and country count per language]

Because the country counts are very small compared with the total counts, the stacked country-count segments are barely visible.
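One way to make both quantities readable is to stop stacking them and give the country count its own y-axis. This is a different technique from the stacked plot above, sketched here with `Axes.twinx`; the axis names `ax_total` and `ax_country`, the colours, and the line-with-markers style are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

languages = ['en', 'cs', 'es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us', 'ch', 'sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432, 43, 55, 6, 23, 455, 23]
df = pd.DataFrame({'language': languages, 'county': counties, 'count': count})

summary = df.groupby('language').agg(
    country_count=('county', 'count'),
    total_count=('count', 'sum'),
)
x = np.arange(len(summary))

fig, ax_total = plt.subplots()
# Left axis: total count per language as bars
ax_total.bar(x, summary['total_count'], 0.35, color='tab:blue')
ax_total.set_xticks(x)
ax_total.set_xticklabels(summary.index)
ax_total.set_xlabel('Languages')
ax_total.set_ylabel('Total count', color='tab:blue')

# Right axis: shares the x-axis but has its own scale for the small values
ax_country = ax_total.twinx()
ax_country.plot(x, summary['country_count'], 'o-', color='tab:orange')
ax_country.set_ylabel('Country count', color='tab:orange')
ax_country.set_ylim(0, summary['country_count'].max() + 1)
```

Calling `plt.show()` afterwards displays the figure. The two scales are independent, so bar heights and marker heights should not be compared with each other directly.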

Answer 1 (score: 0)

import seaborn as sns
import matplotlib.pyplot as plt

figure, axis = plt.subplots(1,1,figsize=(10,5))
sns.barplot(x="language",y="count",data=df,ci=None)#,hue='county')
axis.set_title('Score by language and country')
axis.set_ylabel('Count')
axis.set_xlabel("Language")

sns.countplot(x=df.language,data=df)
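Note that `sns.barplot` aggregates the rows of each category with the mean by default, so its bar for `es` shows the mean of the three `es` rows rather than their sum, while `sns.countplot` shows the number of rows per language. A small pandas check of the two aggregates, assuming the `df` from the question (rebuilt here so the block runs on its own; `per_language` is an illustrative name):

```python
import pandas as pd

languages = ['en', 'cs', 'es', 'pt', 'hi', 'en', 'es', 'es']
counties = ['us', 'ch', 'sp', 'br', 'in', 'fr', 'ar', 'pr']
count = [32, 432, 43, 55, 6, 23, 455, 23]
df = pd.DataFrame({'language': languages, 'county': counties, 'count': count})

# 'mean' is what barplot draws by default; 'sum' is the total per language
per_language = df.groupby('language')['count'].agg(['mean', 'sum'])
print(per_language)
```

If the totals are what is wanted, `sns.barplot` accepts an `estimator` argument, for example `estimator=sum`.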
