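Passing a variable to apply() in pandas

Time: 2017-10-02 03:32:48

Tags: python pandas apply

I can't get a function to apply correctly to a DataFrame. I'm trying to create a new column in the DataFrame by joining the strings from two other columns, passing in a separator. I get this error: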

TypeError: ("apply_join() missing 1 required positional argument: 'sep'", 'occurred at index cases')

If I add sep to the apply_join() call, that fails too:

  File "unite.py", line 37, in unite
    tibble_extra = df[cols].apply(apply_join, sep)
NameError: name 'sep' is not defined

import pandas as pd
from io import StringIO

tibble3_csv = """country,year,cases,population
Afghanistan,1999,745,19987071
Afghanistan,2000,2666,20595360
Brazil,1999,37737,172006362
Brazil,2000,80488,174504898
China,1999,212258,1272915272
China,2000,213766,1280428583"""
with StringIO(tibble3_csv) as fp:
    tibble3 = pd.read_csv(fp)
print(tibble3)

def str_join_elements(x, sep=""):
    assert type(sep) is str
    return sep.join((str(xi) for xi in x))

def unite(df, cols, new_var, combine=str_join_elements):

    def apply_join(x, sep):
        joinstr = str_join(x, sep)
        return pd.Series({new_var[i]:s for i, s in enumerate(joinstr)})

    fixed_vars = df.columns.difference(cols)
    tibble = df[fixed_vars].copy()
    tibble_extra = df[cols].apply(apply_join)

    return pd.concat([tibble, tibble_extra], axis=1) 
table3_again = unite(tibble3, ['cases', 'population'], 'rate', combine=lambda x: str_join_elements(x, "/"))
print(table3_again)

2 Answers:

Answer 0 (score: 2)

If your function takes more than one argument, wrap it in a lambda:

df[cols].apply(lambda x: apply_join(x,sep),axis=1)

Or pass the extra arguments through the args parameter, i.e.

df[cols].apply(apply_join, args=(sep,), axis=1)
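
Both variants can be sketched on a small frame. This is a minimal illustration, not the asker's code: the frame `df`, the helper `join_row`, and the value of `sep` are all assumed names chosen for the example.

```python
import pandas as pd

# Hypothetical two-column frame, loosely modeled on the question's data.
df = pd.DataFrame({"cases": [745, 2666], "population": [19987071, 20595360]})

def join_row(row, sep):
    # Join the values of one row into a single string.
    return sep.join(str(v) for v in row)

sep = "/"

# Variant 1: the lambda closes over sep and forwards it explicitly.
via_lambda = df.apply(lambda row: join_row(row, sep), axis=1)

# Variant 2: apply() passes the contents of args as extra positional arguments.
via_args = df.apply(join_row, args=(sep,), axis=1)

print(via_lambda)
print(via_args)
```

In both cases `sep` has to exist in the scope where `apply` is called, which is what the NameError in the question is complaining about.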

Answer 1 (score: 1)

You just need to add it to the apply call as a keyword argument:

tibble_extra = df[cols].apply(apply_join, sep=...)

Also, you should specify the axis. It may work without it, but stating it explicitly is a good habit that prevents errors:

tibble_extra = df[cols].apply(apply_join, sep=..., axis=1)  # axis=1 ('columns') passes each row; axis=0 ('index', the default) passes each column
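
To make the effect of the axis concrete, here is a small sketch under assumed names (`df` and `join_values` are hypothetical, not taken from the question). It relies on `DataFrame.apply` forwarding unrecognized keyword arguments such as `sep` to the function. The "occurred at index cases" part of the traceback in the question looks consistent with the default `axis=0`, where the function is called once per column.

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"cases": [745, 2666], "population": [19987071, 20595360]})

def join_values(values, sep=""):
    # Join whatever apply() hands over (a row or a column) into one string.
    return sep.join(str(v) for v in values)

# axis=1: the function receives one row at a time; result is indexed by row label.
per_row = df.apply(join_values, sep="/", axis=1)

# axis=0 (default): the function receives one column at a time;
# result is indexed by column name.
per_col = df.apply(join_values, sep="/", axis=0)

print(per_row)
print(per_col)
```

For building a new column from two existing columns, the per-row form (`axis=1`) is the one that yields one value per row.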