Here is my test code:
# coding: utf8
import pandas as pd


def main():
    df = pd.DataFrame({
        'name': ['AA', 'BB', 'CC'],
        'age': [23, 33, 29],
    })
    print('Pandas version is', pd.__version__)
    print(df)

    print(' return tuple '.center(80, '='))
    df_out = df.apply(calc_discount, axis=1)
    print(df_out)

    print(' return series '.center(80, '='))
    df_out = df.apply(lambda row: pd.Series(calc_discount(row)), axis=1)
    print(df_out)


def calc_discount(row):
    print('~~~ row ~~~')
    label = row['name'][0] + '_label'
    discount = row['age'] // 3
    return label, discount


if __name__ == '__main__':
    main()
Here is the output for reference:
Pandas version is 1.0.5
  name  age
0   AA   23
1   BB   33
2   CC   29
================================= return tuple =================================
~~~ row ~~~
~~~ row ~~~
~~~ row ~~~
0     (A_label, 7)
1    (B_label, 11)
2     (C_label, 9)
dtype: object
================================ return series =================================
~~~ row ~~~
~~~ row ~~~
~~~ row ~~~
~~~ row ~~~
         0   1
0  A_label   7
1  B_label  11
2  C_label   9
When apply returns a tuple, calc_discount is called 3 times, as expected. But when I change the return type, the calls become strange: the function runs one extra time. Does anyone know why apply invokes the function an extra time when it returns a pd.Series? Thanks a lot!
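For comparison, here is a sketch of the same logic using `result_type='expand'`, a documented `DataFrame.apply` parameter (available in pandas 0.23+), which expands a tuple return into columns without wrapping it in `pd.Series` inside a lambda. This is not an explanation of the extra call, just an alternative way to get per-column output:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['AA', 'BB', 'CC'],
    'age': [23, 33, 29],
})

def calc_discount(row):
    # Same logic as above: first letter of the name plus a derived discount.
    return row['name'][0] + '_label', row['age'] // 3

# result_type='expand' turns each returned tuple into a row of the result
# DataFrame, so no pd.Series needs to be constructed per row.
df_out = df.apply(calc_discount, axis=1, result_type='expand')
print(df_out)
```

The resulting columns are labeled 0 and 1, just like in the `pd.Series` variant above; they can be renamed afterwards with `df_out.columns = [...]` if needed.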