Plotting mean lines for different "hue" data on a Seaborn FacetGrid plot

Time: 2017-07-06 23:08:20

Tags: python pandas matplotlib seaborn facet-grid

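I'm working with the Titanic passenger dataset (from Kaggle) as part of a Udacity course. I'm using a Seaborn FacetGrid to look at the distribution of passenger age by travel class and sex, with the hue set to "Survived" (1/0).

The plot works well, and I'd like to add a vertical mean line to each subplot, but with a different color (and a different annotation) for each of the two "hue" levels (1/0) within each subplot. The "vertical_mean_line_survived" function in the code below works fine on plots without multiple "hue" levels, but I can't find a way to draw a separate line for each hue.

Any ideas on whether this is possible in Seaborn?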

Current Seaborn FacetGrid plot output:

[Image: Seaborn FacetGrid plot]

Code:

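import matplotlib.pyplot as plt
import seaborn as sns

# titanic_data is the Kaggle Titanic DataFrame, with the extra columns
# 'is_child_def', 'is_male' and 'is_female' added beforehand.
sns.set()
sns.set_context('talk')
sns.set_style('darkgrid')
# Note: seaborn >= 0.9 renamed FacetGrid's `size` argument to `height`.
grid = sns.FacetGrid(titanic_data.loc[titanic_data['is_child_def'] == False], col='Sex', row='Pclass', hue='Survived', size=3.2, aspect=2)
# Note: seaborn >= 0.11 prefers `fill=True` over `shade=True` in kdeplot.
grid.map(sns.kdeplot, 'Age', shade=True)
grid.set(xlim=(14, titanic_data['Age'].max()), ylim=(0, 0.06))
grid.add_legend()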


# Add vertical lines for mean age on each plot
def vertical_mean_line_survived(x, **kwargs):
    plt.axvline(x.mean(), linestyle = '--', color = 'g')
    #plt.text(x.mean()+1, 0.052, 'mean = '+str('%.2f'%x.mean()), size=12)
    #plt.text(x.mean()+1, 0.0455, 'std = '+str('%.2f'%x.std()), size=12)

grid.map(vertical_mean_line_survived, 'Age') 

# Add text to each plot for relevant population size
# NOTE - don't need to filter on ['Age'].isnull() for children, as 'is_child'=True only possible for children with 'Age' data
# Adults with a known age; the same filters as before, applied once
adults = titanic_data.loc[(titanic_data['is_child_def'] == False) & titanic_data['Age'].notnull()]
for row in range(grid.axes.shape[0]):
    in_class = adults.loc[adults['Pclass'] == row + 1]
    survived = in_class.loc[in_class['Survived'] == 1]
    perished = in_class.loc[in_class['Survived'] == 0]
    # Column 0 uses the 'is_male' flag, column 1 the 'is_female' flag
    for col, sex_flag in enumerate(['is_male', 'is_female']):
        grid.axes[row, col].text(60.2, 0.052, 'Survived n = ' + str(survived[sex_flag].sum()), size=12)
        grid.axes[row, col].text(60.2, 0.047, 'Perished n = ' + str(perished[sex_flag].sum()), size=12)



grid.set_ylabels('Frequency density', size=12)

# Squash down a little and add title to facetgrid    
plt.subplots_adjust(top=0.9)
grid.fig.suptitle('Age distribution of adults by Pclass and Sex for Survived vs. Perished')

1 Answer:

Answer 0 (score: 4)

kwargs contains the label and color of the respective hue. Hence, using

def vertical_mean_line_survived(x, **kwargs):
    # FacetGrid.map passes the current hue level's label and color in kwargs
    ls = {"0": "-", "1": "--"}
    plt.axvline(x.mean(), linestyle=ls[kwargs.get("label", "0")],
                color=kwargs.get("color", "g"))
    # Annotate with mean and std in the same color as the line
    txkw = dict(size=12, color=kwargs.get("color", "g"), rotation=90)
    tx = "mean: {:.2f}, std: {:.2f}".format(x.mean(), x.std())
    plt.text(x.mean() + 1, 0.052, tx, **txkw)

we would get

[Image: resulting plot]
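
To make the mechanism a bit more concrete: the mapped function is called once per subplot and hue level, each time with that level's `label` and `color` in kwargs. Below is a minimal sketch of that idea without seaborn or matplotlib, so it can be run on its own. The helper name `mean_line_spec`, the `LINE_STYLES` mapping and the sample ages are all made up for illustration; it also assumes the label arrives as a string such as "0" or "1", as the dictionary keys in the function above suggest.

```python
from statistics import mean, stdev

# Hypothetical mapping from hue label to line style (labels assumed to be strings).
LINE_STYLES = {"0": "-", "1": "--"}


def mean_line_spec(values, **kwargs):
    """Return the mean, the axvline keyword arguments and the annotation text
    for one hue level. `label` and `color` stand in for what the grid passes."""
    label = str(kwargs.get("label", "0"))  # str() guards against non-string labels
    color = kwargs.get("color", "g")
    m = mean(values)
    # stdev() is the sample standard deviation, like pandas' Series.std()
    text = "mean: {:.2f}, std: {:.2f}".format(m, stdev(values))
    line_kwargs = {"linestyle": LINE_STYLES.get(label, "-"), "color": color}
    return m, line_kwargs, text


# Simulate the two calls made for a single subplot, one per hue level.
ages_perished = [22.0, 35.0, 54.0]  # made-up sample values
ages_survived = [26.0, 38.0, 27.0]  # made-up sample values
spec_0 = mean_line_spec(ages_perished, label="0", color="b")
spec_1 = mean_line_spec(ages_survived, label="1", color="g")
```

In the real callback, the returned values would feed `plt.axvline(m, **line_kwargs)` and `plt.text(m + 1, 0.052, text, color=line_kwargs["color"])`. Using `LINE_STYLES.get(label, "-")` instead of direct indexing is a small design choice so that an unexpected label falls back to a solid line rather than raising a KeyError.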