Calculating a new column with Pandas

Date: 2016-09-26 13:07:47

Tags: python python-2.7 csv pandas

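Based on this question, I would like to know how to use a function defined with def to calculate a new Pandas column when the function takes multiple arguments (strings and integers).

Concrete example: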

df_joined["IVbest"] = IV(df_joined["Saison"], df_joined["Wald_Typ"], df_joined["NS_Cap"])

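"Saison" and "Wald_Typ" are strings; "NS_Cap" is an integer.

Now I want to run all of these values through the following function and get an x value back for each row: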

def IV(saison, wald, ns):
    if saison == "Sommer":
        if wald == "Laubwald":
            x = ns * 0.1
        elif wald == "Nadelwald":
            x = ns * 0.2
        elif wald == "Mischwald":
            x = ns * 0.3
    elif saison == "Winter":
        if wald == "Laubwald":
            x = ns * 0.01
        elif wald == "Nadelwald":
            x = ns * 0.02
        elif wald == "Mischwald":
            x = ns * 0.03
    return x

What is the best way to do this?

I have tried things like:
df_joined["IVbest"] = IV(df_joined["Saison", "Wald_Typ", "NS_Cap"])

df_joined["IVbest"] = df_joined["Saison", "Wald_Typ", "NS_Cap"].apply(IV)

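but nothing worked :(

1 Answer:

Answer 0 (score: 0)

I think in this case it is best to build six boolean masks, one per combination of season and forest type, and use them to perform the calculation on the matching rows: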

sommer_laub = (df_joined['Saison'] == 'Sommer') & (df_joined['Wald_Typ'] == 'Laubwald')
sommer_nadel = (df_joined['Saison'] == 'Sommer') & (df_joined['Wald_Typ'] == 'Nadelwald')
sommer_misch = (df_joined['Saison'] == 'Sommer') & (df_joined['Wald_Typ'] == 'Mischwald')
winter_laub = (df_joined['Saison'] == 'Winter') & (df_joined['Wald_Typ'] == 'Laubwald')
winter_nadel = (df_joined['Saison'] == 'Winter') & (df_joined['Wald_Typ'] == 'Nadelwald')
winter_misch = (df_joined['Saison'] == 'Winter') & (df_joined['Wald_Typ'] == 'Mischwald')
df_joined.loc[sommer_laub, 'IVbest'] = df_joined.loc[sommer_laub, 'NS_Cap'] * 0.1
df_joined.loc[sommer_nadel, 'IVbest'] = df_joined.loc[sommer_nadel, 'NS_Cap'] * 0.2
df_joined.loc[sommer_misch, 'IVbest'] = df_joined.loc[sommer_misch, 'NS_Cap'] * 0.3
df_joined.loc[winter_laub, 'IVbest'] = df_joined.loc[winter_laub, 'NS_Cap'] * 0.01
df_joined.loc[winter_nadel, 'IVbest'] = df_joined.loc[winter_nadel, 'NS_Cap'] * 0.02
df_joined.loc[winter_misch, 'IVbest'] = df_joined.loc[winter_misch, 'NS_Cap'] * 0.03
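
If you would rather keep a single function, as in the question, a row-wise apply should also work. The attempts in the question fail because IV receives whole columns (or, with apply on a DataFrame, one column at a time) instead of one value per argument. Below is a minimal sketch, not a definitive implementation. The sample data and the FACTORS lookup table are my own assumptions for illustration; only the column names and the factors come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_joined = pd.DataFrame({
    "Saison": ["Sommer", "Sommer", "Winter", "Winter"],
    "Wald_Typ": ["Laubwald", "Mischwald", "Nadelwald", "Laubwald"],
    "NS_Cap": [10, 20, 30, 40],
})

# Assumed lookup table holding the same factors as IV() in the question.
FACTORS = {
    ("Sommer", "Laubwald"): 0.1,
    ("Sommer", "Nadelwald"): 0.2,
    ("Sommer", "Mischwald"): 0.3,
    ("Winter", "Laubwald"): 0.01,
    ("Winter", "Nadelwald"): 0.02,
    ("Winter", "Mischwald"): 0.03,
}


def IV(saison, wald, ns):
    # Unknown combinations yield NaN instead of raising an error.
    return ns * FACTORS.get((saison, wald), float("nan"))


# Option 1: row-wise apply. axis=1 hands each row to the lambda,
# which unpacks the three columns into IV's three arguments.
df_joined["IVbest"] = df_joined.apply(
    lambda row: IV(row["Saison"], row["Wald_Typ"], row["NS_Cap"]), axis=1
)

# Option 2: look up one factor per row, then multiply the whole column at once.
factor = [
    FACTORS.get(key, float("nan"))
    for key in zip(df_joined["Saison"], df_joined["Wald_Typ"])
]
df_joined["IVbest_lookup"] = df_joined["NS_Cap"] * factor

print(df_joined)
```

The row-wise apply is the closest to what the question attempted, but it calls Python code once per row, so the mask or lookup variants are likely to be faster on large frames.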