Question

While trying to answer another question, I've been playing around with column-wise multiplication operations in pandas.

import pandas as pd

A = pd.DataFrame({'Col1': [1, 2, 3], 'Col2': [2, 3, 4]})
B = pd.DataFrame({'Col1': [10, 20, 30]})

print(A)

   Col1  Col2
0     1     2
1     2     3
2     3     4

print(B)

   Col1
0    10
1    20
2    30

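I tried using df.apply to multiply each column of A by B's Col1. So my desired output is:

   Col1  Col2
0    10    20
1    40    60
2    90   120

My first attempt was to use a lambda, and it worked fine.

df_new = A.apply(lambda x: B.Col1.values * x, 0)
print(df_new)

   Col1  Col2
0    10    20
1    40    60
2    90   120

But lambdas are always slow, so I thought I could speed this up by passing B.Col1.values.__mul__ instead, but this is what it gave:

print(A.apply(B.Col1.values.__mul__, 0))

Col1    NotImplemented
Col2    NotImplemented
dtype: object

I printed out __mul__, and it is the magic method for multiplication on numpy arrays.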

Why do I get this error?

Answer 1

You can do this:

A.apply(B.Col1.__mul__,0)

which returns what you're after.

The difference is that B.Col1.values.__mul__ calls the numpy slot function, whereas B.Col1.__mul__ calls a pandas method.
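To illustrate why a direct call to a `__mul__` method can hand back NotImplemented, here is a minimal sketch of Python's binary-operator protocol using only built-in types. It is an analogy, not a reproduction of the numpy internals: the assumption is that the numpy slot function in the question declined the operand in a similar way, and whether it does so for a Series may depend on the numpy and pandas versions installed.

```python
# Calling the dunder method directly: int does not know how to multiply
# itself by a float, so it signals "not handled" by returning NotImplemented.
direct = (2).__mul__(1.5)

# Using the * operator: Python sees NotImplemented from int.__mul__ and
# falls back to the reflected method float.__rmul__, which succeeds.
via_operator = 2 * 1.5

print(direct)        # NotImplemented
print(via_operator)  # 3.0
```

NotImplemented is a return value rather than an exception, so when a bound `__mul__` is passed to apply, nothing triggers the fallback and the sentinel ends up stored in the result.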

Presumably the pandas method exists to avoid some of the low-level headaches that numpy can cause:

>>> print(inspect.getsource(pd.Series.__mul__))

def wrapper(left, right, name=name, na_op=na_op):

    if isinstance(right, pd.DataFrame):
        return NotImplemented

    left, right = _align_method_SERIES(left, right)

    converted = _Op.get_op(left, right, name, na_op)

    left, right = converted.left, converted.right
    lvalues, rvalues = converted.lvalues, converted.rvalues
    dtype = converted.dtype
    wrap_results = converted.wrap_results
    na_op = converted.na_op

    if isinstance(rvalues, ABCSeries):
        name = _maybe_match_name(left, rvalues)
        lvalues = getattr(lvalues, 'values', lvalues)
        rvalues = getattr(rvalues, 'values', rvalues)
        # _Op aligns left and right
    else:
        name = left.name
        if (hasattr(lvalues, 'values') and
                not isinstance(lvalues, pd.DatetimeIndex)):
            lvalues = lvalues.values

    result = wrap_results(safe_na_op(lvalues, rvalues))
    return construct_result(
        left,
        result,
        index=left.index,
        name=name,
        dtype=dtype,
    )

I couldn't find the source code for the numpy slot function, but it is probably something similar to this
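As a side note, if the goal is simply to multiply every column of A by B.Col1 without a lambda, the dunder methods can probably be avoided altogether. The sketch below uses DataFrame.mul with axis=0, which aligns the Series with the rows of A. It reuses the A and B frames from the question and makes no claim about being faster on any particular data.

```python
import pandas as pd

A = pd.DataFrame({'Col1': [1, 2, 3], 'Col2': [2, 3, 4]})
B = pd.DataFrame({'Col1': [10, 20, 30]})

# Multiply each column of A element-wise by B.Col1, matching on the row index.
df_new = A.mul(B['Col1'], axis=0)
print(df_new)
```

Because the multiplication is aligned on the index, the two frames need matching row labels for every row to get a value.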

Pandas v0.20 returns NotImplemented when multiplying dataframe columns
