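Pandas: impute missing values per group

Time: 2016-09-21 12:37:12

Tags: python pandas group-by missing-data imputation

How can I implement this kind of per-country imputation for each indicator in pandas?

I want to impute the missing values per group:

  • no-A-state should get the np.min per indicatorKPI
  • no-ISO-state should get the np.mean per indicatorKPI
  • for states with missing values, I want to impute the mean per indicatorKPI. Here, this would mean imputing the missing values for Serbia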

    import numpy as np
    import pandas as pd

    mydf = pd.DataFrame({'Country': ['no-A-state', 'no-ISO-state', 'germany', 'serbia', 'austria', 'germany', 'serbia', 'austria'],
                         'indicatorKPI': [np.nan, np.nan, 'SP.DYN.LE00.IN', 'NY.GDP.MKTP.CD', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN'],
                         'value': [np.nan, np.nan, 0.9, np.nan, 0.7, 0.2, 0.3, 0.6]})

Edit

The desired output should look like this:

mydf = pd.DataFrame({'Country':['no-A-state','no-ISO-state', 'no-A-state','no-ISO-state',
                                'germany','serbia','serbia', 'austria', 
                                'germany','serbia', 'austria',],
                   'indicatorKPI':['SP.DYN.LE00.IN','NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                                   'SP.DYN.LE00.IN','NY.GDP.MKTP.CD','SP.DYN.LE00.IN','NY.GDP.MKTP.CD','NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN','NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN'],
                     'value':['MIN of all for this indicator', 'MEAN of all for this indicator','MIN of all for this indicator','MEAN of all for this indicator', 0.9,'MEAN of all for SP.DYN.LE00.IN indicator',0.7, 'MEAN of all for NY.GDP.MKTP.CD indicator',0.2, 0.3, 0.6]
                   })

1 Answer:

Answer 0 (score: 2)

Based on your new example, the following works for me:

    In [185]:
    mydf.loc[mydf['Country'] == 'no-A-state', 'value'] = mydf['value'].min()
    mydf.loc[mydf['Country'] == 'no-ISO-state', 'value'] = mydf['value'].mean()
    mydf.loc[mydf['value'].isnull(), 'value'] = mydf['indicatorKPI'].map(mydf.groupby('indicatorKPI')['value'].mean())
    mydf

    Out[185]:
             Country    indicatorKPI     value
    0     no-A-state  SP.DYN.LE00.IN  0.200000
    1   no-ISO-state  NY.GDP.MKTP.CD  0.442857
    2     no-A-state  SP.DYN.LE00.IN  0.200000
    3   no-ISO-state  SP.DYN.LE00.IN  0.442857
    4        germany  NY.GDP.MKTP.CD  0.900000
    5         serbia  SP.DYN.LE00.IN  0.328571
    6         serbia  NY.GDP.MKTP.CD  0.700000
    7        austria  NY.GDP.MKTP.CD  0.585714
    8        germany  SP.DYN.LE00.IN  0.200000
    9         serbia  NY.GDP.MKTP.CD  0.300000
    10       austria  SP.DYN.LE00.IN  0.600000

Basically, this fills the missing values one condition at a time: we first set the 'no-A-state' rows to the minimum of 'value', then the 'no-ISO-state' rows to its mean. We then group by 'indicatorKPI', calculate the mean of each group, and assign it to the rows that are still null, using map to look up each row's indicator in the group means.
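The first two assignments in the answer take the min and mean over the whole 'value' column, whereas the desired output asks for the "MIN of all for this indicator" and "MEAN of all for this indicator". If per-indicator statistics are what is wanted, a minimal sketch could use groupby(...).transform. It assumes the input is the desired-output frame with NaN wherever that frame holds a descriptive string, and that all statistics should come from the observed values only. Both are assumptions, not something the answer states.

```python
import numpy as np
import pandas as pd

# Assumption: NaN stands in for every descriptive string of the desired output.
mydf = pd.DataFrame({
    'Country': ['no-A-state', 'no-ISO-state', 'no-A-state', 'no-ISO-state',
                'germany', 'serbia', 'serbia', 'austria',
                'germany', 'serbia', 'austria'],
    'indicatorKPI': ['SP.DYN.LE00.IN', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                     'SP.DYN.LE00.IN', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                     'NY.GDP.MKTP.CD', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                     'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN'],
    'value': [np.nan, np.nan, np.nan, np.nan, 0.9, np.nan,
              0.7, np.nan, 0.2, 0.3, 0.6],
})

# Per-indicator statistics, computed once from the observed (non-NaN) values
# and broadcast back to one entry per row.
by_indicator = mydf.groupby('indicatorKPI')['value']
ind_min = by_indicator.transform('min')
ind_mean = by_indicator.transform('mean')

is_a = mydf['Country'] == 'no-A-state'
is_iso = mydf['Country'] == 'no-ISO-state'

mydf.loc[is_a, 'value'] = ind_min[is_a]        # min of the row's indicator
mydf.loc[is_iso, 'value'] = ind_mean[is_iso]   # mean of the row's indicator
mydf['value'] = mydf['value'].fillna(ind_mean)  # remaining gaps: indicator mean

print(mydf)
```

Because the statistics are computed before anything is filled, imputed values do not feed into later means, which differs from the behaviour of the answer's code.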

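Here are the steps broken down:

  • mydf['value'].min() is taken over the whole 'value' column (NaN is skipped) and written to the 'no-A-state' rows
  • mydf['value'].mean() is evaluated after that first assignment, so it includes the values just filled in, and is written to the 'no-ISO-state' rows
  • mydf.groupby('indicatorKPI')['value'].mean() gives one mean per indicator; mydf['indicatorKPI'].map(...) looks up that mean for every row, and the isnull() mask restricts the assignment to the rows that are still missing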
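To inspect the intermediate results of the answer's three statements, they can be run one at a time as a standalone script. This is a sketch: it assumes the input frame is the desired-output frame with NaN in place of every descriptive string, and the names col_min, col_mean, group_means, lookup and still_missing are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: NaN stands in for every descriptive string of the desired output.
mydf = pd.DataFrame({
    'Country': ['no-A-state', 'no-ISO-state', 'no-A-state', 'no-ISO-state',
                'germany', 'serbia', 'serbia', 'austria',
                'germany', 'serbia', 'austria'],
    'indicatorKPI': ['SP.DYN.LE00.IN', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                     'SP.DYN.LE00.IN', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                     'NY.GDP.MKTP.CD', 'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN',
                     'NY.GDP.MKTP.CD', 'SP.DYN.LE00.IN'],
    'value': [np.nan, np.nan, np.nan, np.nan, 0.9, np.nan,
              0.7, np.nan, 0.2, 0.3, 0.6],
})

# Step 1: column-wide minimum (NaN skipped) for the 'no-A-state' rows.
col_min = mydf['value'].min()
mydf.loc[mydf['Country'] == 'no-A-state', 'value'] = col_min

# Step 2: column-wide mean, computed after step 1, for the 'no-ISO-state' rows.
col_mean = mydf['value'].mean()
mydf.loc[mydf['Country'] == 'no-ISO-state', 'value'] = col_mean

# Step 3a: one mean per indicator, as a Series indexed by indicator code.
group_means = mydf.groupby('indicatorKPI')['value'].mean()

# Step 3b: map looks up each row's indicator in that index, giving a Series
# with one group mean per row of mydf.
lookup = mydf['indicatorKPI'].map(group_means)

# Step 3c: only the rows that are still missing receive their group's mean.
still_missing = mydf['value'].isnull()
mydf.loc[still_missing, 'value'] = lookup[still_missing]

print(col_min, col_mean)
print(group_means)
print(mydf)
```

Under that assumption the final frame should agree with the values shown in Out[185].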