Question

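I've been trying to figure out how to pass two variables (per row) to a function and get an output, but I'm having a lot of trouble with the syntax.

I've been banging my head against the wall all day. Here is what I've already looked at:

(I think I'm using the wrong apply) Pandas: How to apply a function to different columns

Difference between map, applymap and apply methods in Pandas

I re-read apply, but it didn't help. I'm working with the Titanic dataset (https://github.com/alexisperrier/packt-aml/blob/master/ch4/titanic.csv) and trying to replace the null ages with a set number for each class. I tried two ways of doing this: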

Titanic.loc[(Titanic['pclass'] == 1) & (Titanic['age'].isnull()), 'age'] = 35
Titanic.loc[(Titanic['pclass'] == 2) & (Titanic['age'].isnull()), 'age'] = 25
Titanic.loc[(Titanic['pclass'] == 3) & (Titanic['age'].isnull()), 'age'] = 20

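(This code works fine and replaces the null 'age' values with the predetermined numbers.) My first attempt, however, was to write a function and apply it. The function: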

def ClassAge(age,pclass):
    if age.isnull:
        if pclass == 1:
            n = 35
        if pclass == 2:
            n = 25
        if pclass == 3:
            n = 20
    return(n)

I tried to apply it like this:

Titanic.age.apply(ClassAge,Titanic['pclass'], axis=1)

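Output:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Based on what I read in other answers, I then tried this, since apply assumes a row is the input.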

Titanic[['age','pclass']].apply(ClassAge)

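That gave me:

TypeError: ("ClassAge() missing 1 required positional argument: 'pclass'", 'occurred at index age')

As mentioned above, I did solve the problem with .loc, but purely for educational purposes I'd like to understand what I'm doing wrong in writing the function, in calling it, or possibly in both.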

Answer 1

Apply a lambda across the rows. Instead of passing the whole pclass Series, pass only the values from the current row:

df.apply(lambda x: ClassAge(x['age'],x['pclass']), axis=1)
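Note that the row-wise call alone is probably not enough, because the function in the question has problems of its own: with axis=1 each age arrives as a plain scalar, which has no isnull attribute, and n is never assigned when the age is present. Also, without axis=1, apply hands the function one whole column at a time, which is why the second attempt complained about the missing pclass argument. Below is a minimal sketch of one way to repair it. The names class_age and DEFAULT_AGE and the small sample frame are made up for illustration; the default ages come from the .loc version in the question.

```python
import numpy as np
import pandas as pd

# Default ages per passenger class, copied from the question's .loc version.
# (DEFAULT_AGE and class_age are illustrative names, not from the question.)
DEFAULT_AGE = {1: 35, 2: 25, 3: 20}


def class_age(age, pclass):
    """Return the class default when age is missing, else the age unchanged."""
    # pd.isna works on scalars; a scalar float has no .isnull method.
    if pd.isna(age):
        # Rows of mixed int/float columns are upcast to float, so cast back.
        return DEFAULT_AGE[int(pclass)]
    return age


# Tiny stand-in for the Titanic data (made-up values, for illustration only).
df = pd.DataFrame({
    "pclass": [1, 2, 3, 1],
    "age": [np.nan, np.nan, np.nan, 40.0],
})

# Row-wise: with axis=1 the lambda receives one row at a time.
row_wise = df.apply(lambda row: class_age(row["age"], row["pclass"]), axis=1)

# Vectorized alternative: map each class to its default, then fill the gaps.
vectorized = df["age"].fillna(df["pclass"].map(DEFAULT_AGE))

print(row_wise.tolist())
print(vectorized.tolist())
```

Both approaches should give the same ages on this sample. On a frame the size of the Titanic data either one is fine, though the fillna/map form avoids calling a Python function once per row.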
