Question

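I am trying to create a new column for the lower confidence interval using other values in the same row. I have written (and published) the confidence interval calculations as the package public-health-cis on PyPI. These functions take float values and return a float.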

In my analysis script, I am trying to call this function from a pandas DataFrame. I have tried several approaches, but none of them work.

    df_for_ci_calcs = df[['Value', 'Count', 'Denominator']].copy()
    df_for_ci_calcs = df_for_ci_calcs.applymap(lambda x: -1 if x == '*' else x)
    df_for_ci_calcs = df_for_ci_calcs.astype(np.float)
    df['LowerCI'].apply(lambda x: public_health_cis.wilson_lower(df_for_ci_calcs['Value'].astype(float),
                                      df_for_ci_calcs['Count'].astype(float), 
                                      df_for_ci_calcs['Denominator'].astype(float), indicator.rate))

This gives the following traceback:

Internal Server Error: /

df['LowerCI'].apply(lambda x: public_health_cis.wilson_lower(df_for_ci_calcs['Value'].astype(float), df_for_ci_calcs['Count'].astype(float), df_for_ci_calcs['Denominator'].astype(float), indicator.rate))

TypeError: cannot convert the series to <class 'float'>

I have also tried using:

df['LowerCI'] = df_for_ci_calcs.applymap(lambda x: public_health_cis.wilson_lower(df_for_ci_calcs['Value'], df_for_ci_calcs['Count'],
                                                         df_for_ci_calcs['Denominator'], indicator.rate), axis=1)

which raises the error:

applymap() got an unexpected keyword argument 'axis'

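When I remove the axis kwarg, I get the same error as with the first approach. So how do I pass the values from each row into a function to get a value based on the data in that row?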

Answer 1

I think you need apply with axis=1 to process the DataFrame row by row, so that the function receives floats rather than whole columns:

df['LowerCI'] = df[['Value', 'Count', 'Denominator']] \
                .replace('*', -1) \
                .astype(float) \
                .apply(lambda x: public_health_cis.wilson_lower(x['Value'],
                                                                x['Count'],
                                                                x['Denominator'],
                                                                indicator.rate),
                       axis=1)
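
To see why the original attempt fails, here is a minimal sketch that contrasts the two call patterns. The function `lower_bound_stub` is a hypothetical stand-in for a float-only function such as `wilson_lower`; its arithmetic is a placeholder and not a real confidence interval. The data values are made up for illustration.

```python
import pandas as pd


def lower_bound_stub(value, count, denominator, rate):
    """Hypothetical stand-in for a float-only function (not the real wilson_lower)."""
    # float() raises TypeError when it is handed a multi-row Series.
    value, count, denominator = float(value), float(count), float(denominator)
    return rate * count / denominator - value  # placeholder arithmetic only


df = pd.DataFrame({'Value': [1.0, 2.0, 3.0],
                   'Count': [4.0, 5.0, 6.0],
                   'Denominator': [8.0, 10.0, 12.0]})

# Failing pattern: the lambda ignores x and passes entire columns (Series).
try:
    df['Value'].apply(lambda x: lower_bound_stub(df['Value'], df['Count'],
                                                 df['Denominator'], 100))
    whole_column_error = None
except TypeError as exc:
    whole_column_error = exc

# Working pattern: with axis=1, x is one row, so x['Value'] is a single number.
row_wise = df.apply(lambda x: lower_bound_stub(x['Value'], x['Count'],
                                               x['Denominator'], 100),
                    axis=1)
```

In the failing pattern the argument `x` is never used, so every call receives the full columns. With `axis=1` each call receives one row, and indexing that row by column name yields scalars.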

Sample (for simplicity I changed indicator.rate to the scalar 100):

df = pd.DataFrame({'Value':['*',2,3],
                   'Count':[4,5,6],
                   'Denominator':[7,8,'*'],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   Count  D Denominator  E  F Value
0      4  1           7  5  7     *
1      5  3           8  3  4     2
2      6  5           *  6  3     3

df['LowerCI'] = df[['Value', 'Count', 'Denominator']] \
                .replace('*', -1) \
                .astype(float) \
                .apply(lambda x: public_health_cis.wilson_lower(x['Value'],
                                                                x['Count'], 
                                                                x['Denominator'],  
                                                                100), axis=1)

print (df)
   Count  D Denominator  E  F Value    LowerCI
0      4  1           7  5  7     *  14.185885
1      5  3           8  3  4     2  18.376210
2      6  5           *  6  3     3  99.144602
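
As a possible alternative to apply with axis=1, the cleaned columns can be iterated in parallel with zip, which also passes one scalar per column to each call. This is only a sketch: `lower_bound_stub` is a hypothetical placeholder for a float-only function like `wilson_lower`, and the frame is assumed to be already converted to floats.

```python
import pandas as pd


def lower_bound_stub(value, count, denominator, rate):
    """Hypothetical float-only function; placeholder arithmetic, not a real CI."""
    return rate * float(count) / float(denominator) - float(value)


# Assumed to be already cleaned and cast to float, as in the answer above.
cleaned = pd.DataFrame({'Value': [1.0, 2.0, 3.0],
                        'Count': [4.0, 5.0, 6.0],
                        'Denominator': [8.0, 10.0, 12.0]})

# zip walks the three columns together, yielding one tuple of scalars per row.
cleaned['LowerCI'] = [
    lower_bound_stub(v, c, d, 100)
    for v, c, d in zip(cleaned['Value'], cleaned['Count'], cleaned['Denominator'])
]
```

Both forms call the function once per row; the list comprehension simply avoids building a row Series for each call.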
