Create new columns based on the values of other columns

Date: 2020-09-02 17:10:56

Tags: python python-3.x pandas data-science

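In my df, I create one column for each of the entities below (Grubhub, Toasttab, and Tenk), and the value in each row of that column indicates yes or no.

I have the following code, for example: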

df['Grubhub'] = df[['On GrubHub or Seamless?']].apply(lambda x: any(x == 'Yes'), axis = 1)

df['ToastTab'] = df[['On ToastTab?']].apply(lambda x: any(x == 'Yes'), axis = 1)

df['Tenk'] = df[['On Tenk?']].apply(lambda x: any(x == 'Yes'), axis = 1)

df['Udemy'] = df[['On Udmey?']].apply(lambda x: any(x == 'Yes'), axis = 1)

df['Postmates'] = df[['On Postmates?']].apply(lambda x: any(x == 'Yes'), axis = 1)

df['Doordash'] = df[['On DoorDash?']].apply(lambda x: any(x == 'Yes'), axis = 1)

df['Google'] = df[['On Goole?']].apply(lambda x: any(x == 'Yes'), axis = 1)

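This gives me a new column for each entity (Grubhub, Toasttab, Tenk), and each column holds a True or False value. Is there a more efficient way to do all of this in one line of code or in a function? Thanks in advance for your help.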

1 Answer:

Answer 0: (score: 2)

You can create a column map and apply the function inside a loop:

columns_map = (
    ('Grubhub', 'On GrubHub or Seamless?'),
    ('ToastTab', 'On ToastTab?'),
    ('Tenk', 'On Tenk?'),
    # etc ...
)

for new_col, alias in columns_map:
    df[new_col] = df[alias].apply(lambda x: x == 'Yes')
    # you can also easily drop the alias columns afterwards:
    # df = df.drop(columns=[alias])

Or you can write the values into the original columns and rename them as needed (no drop() required):

for new_col, alias in columns_map:
    df[alias] = df[alias].apply(lambda x: x == 'Yes')

df.rename(
    columns={alias: new_col for new_col, alias in columns_map},
    inplace=True
)
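
As a possible variation on the loop above, the comparison can be done on each whole column at once instead of through apply with a lambda, which also gets close to the "one line" the question asks for. This is only a minimal sketch: the two-row sample frame is made up for illustration, and the use of Series.eq and DataFrame.assign is my own choice, not something from the answer above.

```python
import pandas as pd

# Hypothetical sample data using three of the column names from the question.
df = pd.DataFrame({
    'On GrubHub or Seamless?': ['Yes', 'No'],
    'On ToastTab?': ['No', 'Yes'],
    'On Tenk?': ['Yes', 'Yes'],
})

columns_map = (
    ('Grubhub', 'On GrubHub or Seamless?'),
    ('ToastTab', 'On ToastTab?'),
    ('Tenk', 'On Tenk?'),
)

# Compare each source column to 'Yes' in one vectorized step and
# attach all the new boolean columns in a single assign() call.
df = df.assign(**{new_col: df[alias].eq('Yes') for new_col, alias in columns_map})

print(df[['Grubhub', 'ToastTab', 'Tenk']])
```

Any value other than the exact string 'Yes' (including missing values) ends up as False, which matches what the lambda-based versions do.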