Question

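I would like to know how to convert a for-loop into a call to the .apply() method in Pandas. I am trying to iterate over one column of a DataFrame (df1) and return matches from a subset of a second DataFrame (df2). I have one function that does the matching (Matcher) and another that selects the right subset from df2 (Filter). Can I call both of these functions through Pandas' .apply() method?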

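I have worked out how to do this with a for-loop (see below), and it seems it could also be done with a list comprehension by first building a complete function (see here), but I cannot get it to work with the Pandas .apply() method and a lambda expression.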

from fuzzywuzzy import fuzz, process

# Here is my Filter, which selects titles from df2 for one year
# either side of a target year
def Filter(year):
    years = [year - 1, year, year + 1]
    return df2[df2.year.isin(years)].title

# Here is my matching routine, it uses the process module from
# fuzzywuzzy
def Matcher(title, choices):
    # choices is a Series, so three values are unpacked here:
    # the matched title, its score, and a third value that is not used
    title_match, percent_match, match3 = process.extractOne(
        title, choices, scorer=fuzz.token_sort_ratio)
    return title_match

# Here is my solution using a for-loop
for index, row in df1.iterrows():
    targets = Filter(row.year)
    df1.loc[index,'return_match'] = Matcher(row.title, targets)

# Here's my attempt with a lambda function
df1['new_col'] = df1.title.apply(lambda x: Matcher(x, Filter(df1.year)))

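When I use the lambda function, what seems to happen is that Filter is only called on the first iteration of .apply(), so every title is matched against that first filtered set. Is there a way around this?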

Answer 1

Welcome to SO, JP. I see a problem in this line:

# Here's my attempt with a lambda function
df1['new_col'] = df1.title.apply(lambda x: Matcher(x, Filter(df1.year)))

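You are calling Filter on the entire year column of the DataFrame, whereas in your for-loop solution you only call it on the year of the current row. So I suggest applying over the rows of df1 instead, with the following line: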

df1['new_col'] = df1.apply(lambda row: Matcher(row.title, Filter(row.year)), axis=1)
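
To try this out without your data, here is a minimal sketch of the same row-wise call. The two small frames are made-up sample data, and Matcher uses difflib from the standard library as a stand-in for fuzzywuzzy so that it runs without extra installs; that swap is my assumption, not your matching routine.

```python
import difflib

import pandas as pd

# Hypothetical sample data: the same title appears with two different years.
df1 = pd.DataFrame({"title": ["Dune", "Dune"], "year": [1984, 2021]})
df2 = pd.DataFrame({
    "title": ["Dune (1984 film)", "Dune (2021 film)", "Arrival"],
    "year": [1984, 2021, 2016],
})


def Filter(year):
    # Titles from df2 within one year either side of the target year.
    years = [year - 1, year, year + 1]
    return df2[df2.year.isin(years)].title


def Matcher(title, choices):
    # Stand-in matcher (assumption): difflib instead of fuzzywuzzy.
    # cutoff=0.0 means the best candidate is returned however weak it is.
    matches = difflib.get_close_matches(title, list(choices), n=1, cutoff=0.0)
    return matches[0] if matches else None


# axis=1 passes each row to the lambda, so Filter receives one year at a time.
df1["new_col"] = df1.apply(
    lambda row: Matcher(row.title, Filter(row.year)), axis=1
)
print(df1)
```

Each row of df1 is matched only against the df2 titles from its own year window, which is what your for-loop does.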

I hope this helps.
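
If you want to check for yourself what each kind of apply hands to the lambda, a small experiment like the one below may help. The two-row frame is made-up sample data and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, just for the experiment.
df1 = pd.DataFrame({"title": ["Dune", "Dune"], "year": [1984, 2021]})

# Series.apply passes one cell value (here a str) to the lambda at a time ...
cell_types = df1.title.apply(lambda x: type(x).__name__).tolist()

# ... so a reference to df1.year inside the lambda is still the whole column.
column_lengths = df1.title.apply(lambda x: len(df1.year)).tolist()

# DataFrame.apply with axis=1 passes the whole row, so row.year is one value.
row_years = df1.apply(lambda row: row.year, axis=1).tolist()

print(cell_types, column_lengths, row_years)
```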

How do I call a second function when using a lambda function in an .apply() call in Python?