pandas boxplot for clustered boxes: how to set multi-level x-axis labels

Time: 2015-04-09 13:01:34

Tags: python pandas boxplot

Basically, I have the following pandas DataFrame:

Climate	Soil	Crop	irr
Temperate	Pcg	Cabbage	103.5
Temperate	Pcg	Cabbage	111.1
Temperate	Pcg	Cabbage	170.1
Temperate	Pcg	Cabbage	119.3
Temperate	Scg	Cabbage	123.8
Temperate	Scg	Cabbage	132.3
Temperate	Scg	Cabbage	191.9
Temperate	Scg	Cabbage	129.4
Temperate	Zcg	Cabbage	138
Temperate	Zcg	Cabbage	137
Temperate	Zcg	Cabbage	205.3
Temperate	Zcg	Cabbage	155.3
Continental	Pcg	Cabbage	129.6
Continental	Pcg	Cabbage	224.9
Continental	Pcg	Cabbage	259.7
Continental	Pcg	Cabbage	142.6
Continental	Scg	Cabbage	151.6
Continental	Scg	Cabbage	254.3
Continental	Scg	Cabbage	283.5
Continental	Scg	Cabbage	162.1
Continental	Zcg	Cabbage	158.1
Continental	Zcg	Cabbage	275.7
Continental	Zcg	Cabbage	290.8
Continental	Zcg	Cabbage	180.1
Subtropical	Pcg	Cabbage	441
Subtropical	Pcg	Cabbage	515.4
Subtropical	Pcg	Cabbage	554.6
Subtropical	Pcg	Cabbage	495.2
Subtropical	Scg	Cabbage	465.7
Subtropical	Scg	Cabbage	538.2
Subtropical	Scg	Cabbage	567.8
Subtropical	Scg	Cabbage	510.1

I want to plot the "irr" variable grouped by climate and soil type. With the Python code below I managed to get roughly what I want:

import matplotlib.pyplot as plt

# irrdata is the DataFrame shown above (one block of rows per crop)
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 7))
axes = axes.reshape(-1)  # flatten the 2D axes array for easy looping
for i, crp in enumerate(["Cabbage", "Green beans", "Potato", "Wheat"]):
    data = irrdata[irrdata.Crop == crp]
    # one box per (Climate, Soil) combination
    data.boxplot(column='irr', by=['Climate', 'Soil'], ax=axes[i], grid=False)

    # axes[i].get_xaxis().set_visible(False)

    axes[i].set_ylim([0, 600])
    axes[i].set_title("")  # no title for subplots

    # put the crop name inside the plot area instead
    axes[i].text(.85, .85, crp,
                 horizontalalignment='center',
                 transform=axes[i].transAxes)

# Irrigation requirements by soil type
fig.suptitle("")  # remove the super title added by boxplot
axes[0].set_ylabel("Mean irrigation requirement (mm)")  # top left
axes[2].set_ylabel("Mean irrigation requirement (mm)")  # bottom left
plt.tight_layout()

Well, almost, because the result is not very satisfactory (note that I only provide the data for cabbage, but the other crops are handled in exactly the same way):

[image: the resulting boxplot figure]

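You can see how the x-axis labels overlap. In addition, I do not want the climate type repeated for every soil type. Instead, I would like the soil types grouped under each climate category: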

Zcg Scg Pcg   Zcg Scg Pcg   Zcg Scg Pcg
 Temperate    Subtropical   Continental

I would like the climates and soils to appear in the plot in the following order:

Temperate -> Subtropical -> Continental

Zcg -> Scg -> Pcg
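
One way to encode such a non-alphabetical order in the data itself is to turn both columns into ordered categoricals. This is only a minimal sketch: it assumes a reasonably recent pandas (the `observed` argument does not exist in 0.15.2), and the tiny `irrdata` frame below is just a few rows copied from the table above for illustration.

```python
import pandas as pd

# A handful of rows copied from the table above (illustration only).
irrdata = pd.DataFrame({
    "Climate": ["Temperate", "Temperate", "Continental",
                "Continental", "Subtropical", "Subtropical"],
    "Soil": ["Pcg", "Zcg", "Pcg", "Zcg", "Pcg", "Scg"],
    "Crop": ["Cabbage"] * 6,
    "irr": [103.5, 138.0, 129.6, 158.1, 441.0, 465.7],
})

# Declare the desired display order as category order.
climate_order = ["Temperate", "Subtropical", "Continental"]
soil_order = ["Zcg", "Scg", "Pcg"]
irrdata["Climate"] = pd.Categorical(irrdata["Climate"],
                                    categories=climate_order, ordered=True)
irrdata["Soil"] = pd.Categorical(irrdata["Soil"],
                                 categories=soil_order, ordered=True)

# Grouped results are now sorted by category order, not alphabetically.
medians = irrdata.groupby(["Climate", "Soil"], observed=True)["irr"].median()
ordered_keys = medians.index.tolist()
print(ordered_keys)
```

Since `boxplot(by=...)` groups the data internally, I would expect it to follow the same order, but I have not checked that across pandas versions.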

I am using pandas version 0.15.2.

1 Answer:

Answer 0 (score: 0)

Seaborn does this, including letting you specify a non-alphabetical order:

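import pandas as pd
import seaborn as sns

# Copy the table from the question to the clipboard first, then read it in.
cbg = pd.read_clipboard()
cbg.columns = ['Climate', 'Soil', 'Crop', 'irr']
# One box per soil type, one panel per climate, both in the requested order.
sns.factorplot('Soil', x_order=['Zcg', 'Scg', 'Pcg'],
               col='Climate', col_order=['Temperate', 'Subtropical', 'Continental'],
               row='Crop', y='irr', kind='box', data=cbg)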
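
A note for newer seaborn releases: `factorplot` was later renamed `catplot`, and the `x_order` argument became `order`. A rough equivalent might look like the sketch below. It assumes a recent seaborn, and it builds a small frame from a few rows of the question's table instead of reading the clipboard, so that it runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import pandas as pd
import seaborn as sns

# A subset of the question's table (values copied from it).
rows = [
    ("Temperate", "Pcg", 103.5), ("Temperate", "Pcg", 111.1),
    ("Temperate", "Scg", 123.8), ("Temperate", "Scg", 132.3),
    ("Temperate", "Zcg", 138.0), ("Temperate", "Zcg", 137.0),
    ("Continental", "Pcg", 129.6), ("Continental", "Pcg", 224.9),
    ("Continental", "Scg", 151.6), ("Continental", "Scg", 254.3),
    ("Continental", "Zcg", 158.1), ("Continental", "Zcg", 275.7),
    ("Subtropical", "Pcg", 441.0), ("Subtropical", "Pcg", 515.4),
    ("Subtropical", "Scg", 465.7), ("Subtropical", "Scg", 538.2),
]
cbg = pd.DataFrame(rows, columns=["Climate", "Soil", "irr"])
cbg["Crop"] = "Cabbage"

# One panel per climate, soils on the x-axis, both in the requested order.
g = sns.catplot(
    data=cbg, x="Soil", y="irr", kind="box",
    order=["Zcg", "Scg", "Pcg"],
    col="Climate", col_order=["Temperate", "Subtropical", "Continental"],
    row="Crop",
)
g.set_axis_labels("Soil", "Irrigation requirement (mm)")
```

Each climate gets its own panel, so the climate name appears once in the panel title instead of being repeated in every tick label.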

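
If you would rather stay with plain matplotlib and keep everything on a single axis, one possible approach is to place the boxes at explicit positions with a gap between climates, label the ticks with the soil types, and write the climate names as a second row of text under the axis. The sketch below is only an illustration of that idea: `grouped_positions` and `plot_grouped_boxes` are hypothetical helper names of my own, the `data` argument is assumed to be a dict mapping `(climate, soil)` to a list of values, and `manage_ticks=False` assumes matplotlib 3.1 or later.

```python
def grouped_positions(n_groups, n_per_group, gap=1.0):
    """Return (positions per group, centre of each group), leaving `gap` between groups."""
    positions, centres = [], []
    start = 1.0
    for _ in range(n_groups):
        group = [start + k for k in range(n_per_group)]
        positions.append(group)
        centres.append(sum(group) / n_per_group)
        start = group[-1] + 1.0 + gap
    return positions, centres


def plot_grouped_boxes(data, climates, soils, ax=None):
    """Draw one box per (climate, soil) pair with two-level x labels.

    `data` maps (climate, soil) -> list of values; pairs without data are left empty.
    """
    # Imported here so that grouped_positions can be used without matplotlib.
    import matplotlib.pyplot as plt

    if ax is None:
        _, ax = plt.subplots()
    positions, centres = grouped_positions(len(climates), len(soils))
    for climate, group_pos in zip(climates, positions):
        pairs = [(data.get((climate, soil), []), pos)
                 for soil, pos in zip(soils, group_pos)]
        pairs = [(values, pos) for values, pos in pairs if len(values) > 0]
        if pairs:
            ax.boxplot([values for values, _ in pairs],
                       positions=[pos for _, pos in pairs],
                       widths=0.6, manage_ticks=False)
    # First label level: one tick per box, labelled with the soil type.
    flat = [pos for group in positions for pos in group]
    ax.set_xticks(flat)
    ax.set_xticklabels(list(soils) * len(climates))
    ax.set_xlim(flat[0] - 1, flat[-1] + 1)
    # Second label level: climate names centred under each group
    # (x in data coordinates, y in axes coordinates).
    for climate, centre in zip(climates, centres):
        ax.text(centre, -0.12, climate, ha="center", va="top",
                transform=ax.get_xaxis_transform())
    return ax
```

With a DataFrame like the one in the question, the dict could be built with something along the lines of `{key: grp["irr"].tolist() for key, grp in df.groupby(["Climate", "Soil"])}` and then passed in as `plot_grouped_boxes(data, ["Temperate", "Subtropical", "Continental"], ["Zcg", "Scg", "Pcg"])`. You may need `plt.subplots_adjust(bottom=...)` or `tight_layout()` so that the second label row is not cut off.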