我有一个如下所示的数据框:
df:
RY MAJ_CAT Value
2016 Cause Unknown 0.00227
2016 Vegetation 0.04217
2016 Vegetation 0.04393
2016 Vegetation 0.07878
2016 Defective Equip 0.00137
2018 Cause Unknown 0.00484
2018 Defective Equip 0.01546
2020 Defective Equip 0.05169
2020 Defective Equip 0.00515
2020 Cause Unknown 0.00050
我想绘出给定年份的价值分布。所以我通过使用以下代码使用了seaborn的distplot:
year_2016 = df[df['RY']==2016]
year_2018 = df[df['RY']==2018]
year_2020 = df[df['RY']==2020]
sns.distplot(year_2016['value'].values, hist=False,rug=True)
sns.distplot(year_2018['value'].values, hist=False,rug=True)
sns.distplot(year_2020['value'].values, hist=False,rug=True)
下一步,我要绘制给定年份内MAJ_CAT的相同值分布。所以我决定使用seaborn的Facetgrid,下面是代码:
g = sns.FacetGrid(df,col='MAJ_CAT')
g = g.map(sns.distplot,df[df['RY']==2016]['value'].values, hist=False,rug=True))
g = g.map(sns.distplot,df[df['RY']==2018]['value'].values, hist=False,rug=True))
g = g.map(sns.distplot,df[df['RY']==2020]['value'].values, hist=False,rug=True))
但是,当它运行上面的命令时,它将引发以下错误:
KeyError: "None of [Index([(0.00227, 0.04217, 0.043930000000000004, 0.07877999999999999, 0.00137, 0.0018800000000000002, 0.00202, 0.00627, 0.00101, 0.07167000000000001, 0.01965, 0.02775, 0.00298, 0.00337, 0.00088, 0.04049, 0.01957, 0.01012, 0.12065, 0.23699, 0.03639, 0.00137, 0.03244, 0.00441, 0.06748, 0.00035, 0.0066099999999999996, 0.00302, 0.015619999999999998, 0.01571, 0.0018399999999999998, 0.03425, 0.08046, 0.01695, 0.02416, 0.08975, 0.0018800000000000002, 0.14743, 0.06366000000000001, 0.04378, 0.043, 0.02997, 0.0001, 0.22799, 0.00611, 0.13960999999999998, 0.38871, 0.018430000000000002, 0.053239999999999996, 0.06702999999999999, 0.14103, 0.022719999999999997, 0.011890000000000001, 0.00186, 0.00049, 0.13947, 0.0067, 0.00503, 0.00242, 0.00137, 0.00266, 0.38638, 0.24068, 0.0165, 0.54847, 1.02545, 0.01889, 0.32750999999999997, 0.22526, 0.24516, 0.12791, 0.00063, 0.0005200000000000001, 0.00921, 0.07665, 0.00116, 0.01042, 0.27046, 0.03501, 0.03159, 0.46748999999999996, 0.022090000000000002, 2.2972799999999998, 0.69021, 0.22529000000000002, 0.00147, 0.1102, 0.03234, 0.05799, 0.11744, 0.00896, 0.09556, 0.03202, 0.01347, 0.00923, 0.0034200000000000003, 0.041530000000000004, 0.04848, 0.00062, 0.0031100000000000004, ...)], dtype='object')] are in the [columns]"
我不确定我在哪里犯错。有人可以帮我解决问题吗?
答案 0 :(得分:4)
import pandas as pd
import numpy as np
import seaborn as sns
# setup dataframe of synthetic data
np.random.seed(365)
data = {'RY': [random.choice([2016, 2018, 2020]) for _ in range(400)],
'MAJ_CAT': [random.choice(['Cause Unknown', 'Vegetation', 'Defective Equip']) for _ in range(400)],
'Value': [np.random.random() for _ in range(400)]}
df = pd.DataFrame(data)
for year in df.RY.unique():
values = df.Value[df.RY == year]
sns.distplot(values, hist=False, rug=True)
hue
添加到FacetGrid
g = sns.FacetGrid(df, col='MAJ_CAT', hue='RY')
p1 = g.map(sns.distplot, 'Value', hist=False, rug=True).add_legend()