Question

I have a custom function that I want to apply to each row of a DataFrame, but how do I pass the extra arguments it needs? Here is an example:

# Using df.apply
df = pd.DataFrame({"A": [1,2,3]})
sum_A = np.sum(df.A)

def calc_weight(row, total):
    row["weights"] = row["A"]/total

df.apply(calc_weight(row, sum_A), axis = 1)
# Gives NameError: name 'row' is not defined

df.apply(calc_weight, axis = 1)
# TypeError: calc_weight() missing 1 required positional argument: 'total'

The output I want looks something like this:

   A  weights
0  1    0.166
1  2    0.333
2  3    0.5

I have searched online but can't seem to find anything. Or do I have to fall back on a for loop for this kind of operation?

Answer 1

Try passing the extra argument through the args parameter of apply, like this:

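import pandas as pd
import numpy as np

df = pd.DataFrame({"A": [1, 2, 3]})
sum_A = np.sum(df.A)

def f(a, total):
    # a is a single value from column A; total is supplied through args
    return float(a) / total

# args must be a tuple, hence the trailing comma
df['weight'] = df['A'].apply(f, args=(sum_A,))
print(df)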

Output:

   A    weight
0  1  0.166667
1  2  0.333333
2  3  0.500000
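
Since the question asked about a function that receives a whole row, here is a minimal sketch of the same idea with DataFrame.apply and axis=1. It assumes the function returns the weight instead of assigning into row (which would not write back to df), and the names total, by_keyword and vectorized are only illustrative. The key point is that apply gets the function object itself, not the result of calling it.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})
total = df["A"].sum()

def calc_weight(row, total):
    # Return the value; assigning into `row` would not write back to df
    return row["A"] / total

# Pass the function object; extra positional arguments go in `args`
df["weights"] = df.apply(calc_weight, axis=1, args=(total,))

# Extra keyword arguments are forwarded to the function as well
by_keyword = df[["A"]].apply(calc_weight, axis=1, total=total)

# For simple arithmetic like this, plain column math needs no apply at all
vectorized = df["A"] / total

print(df)
```

For a calculation this simple the vectorized line is probably the better choice; apply with args is more useful when the per-row logic cannot be written as column arithmetic.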
