Question

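I have the code below, which overlays a density curve on a histogram. It does this for the 'Fresh' field (a continuous field) in my data. I want to create similar plots, split by the unique values in the 'Channel' field. For example, in pandas, something similar to what I am trying to accomplish would be: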

data_df.hist(column='Fresh', by='Channel')

Can anyone suggest how to do something similar with the seaborn code below?

Code:

import seaborn as sns

sns.distplot(data_df['Fresh'], hist=True, kde=True,
             bins=int(data_df.shape[0]/5), color='darkblue',
             hist_kws={'edgecolor': 'black'},
             kde_kws={'linewidth': 4})

Data

  Channel  Fresh
0        2  12669
1        2   7057
2        2   6353
3        1  13265
4        2  22615
5        2   9413
6        2  12126
7        2   7579
8        1   5963
9        2   6006

Answer 1

I think the seaborn way to do this is to create a FacetGrid and then map an axes-level plotting function onto it. In your case:

g = sns.FacetGrid(data_df, col='Channel', margin_titles=True)
g.map(sns.distplot, 
      'Fresh',
      bins=int(data_df.shape[0]/5),
      color='darkblue', 
      hist_kws={'edgecolor': 'black'},
      kde_kws={'linewidth': 4});

See the documentation for more information: https://seaborn.pydata.org/tutorial/axis_grids.html

Answer 2

Alternatively, you can group the DataFrame by Channel and then plot each group in its own subplot:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data_df = pd.DataFrame({'Channel': [2, 2, 2, 1, 2, 2, 2, 2, 1, 2],
                        'Fresh': [12669, 7057, 6353, 13265, 22615,
                                  9413, 12126, 7579, 5963, 6006]})
# Group the rows by channel; each group gets its own subplot
grouped = data_df.groupby('Channel')

fig, axes = plt.subplots(nrows=1, ncols=len(grouped), figsize=(10, 3))

# Iterating over .groups yields the group keys (the channel values)
for ax, channel in zip(axes.flatten(), grouped.groups):
    sns.distplot(grouped.get_group(channel)['Fresh'], hist=True, kde=True,
                 bins=int(data_df.shape[0]/5), color='darkblue',
                 hist_kws={'edgecolor': 'black'},
                 kde_kws={'linewidth': 4}, ax=ax)

plt.tight_layout()

Create a density plot of a continuous field by a categorical field
