I have a Pandas DataFrame like this:
         A        B          C         D
0    month  month+1  quarter+1  season+1
1   season  month+5  quarter+3  season+2
2      day  month+1  quarter+2  season+1
3     year  month+3  quarter+4  season+2
4  quarter  month+2  quarter+1  season+1
5    month  month+4  quarter+1  season+2
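I want to insert a new column named 'E' based on several IF conditions. If column 'A' equals 'month', return the value in 'B'; if column 'A' equals 'quarter', return the value in 'C'; if column 'A' equals 'season', return the value in 'D'; otherwise, return the value in column 'A'.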
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4
I'm having trouble doing this. I tried using a function, but it didn't work. Here is my attempt:
def f(row):
    if row['A'] == 'month':
        val = ['B']
    elif row['A'] == 'quarter':
        val = ['C']
    elif row['A'] == 'season':
        val = ['D']
    else:
        val = ['A']
    return val

df['E'] = df.apply(f, axis=1)
Edited: changed the final else to column 'A'.
Answer 0 (score: 4)
First, I suggest reading: When should I want to use apply() in my code.
I would use Series.replace:
df['E'] = df['A'].replace(['month','quarter','season'],
[df['B'], df['C'], df['D']])
Or numpy.select:
import numpy as np

cond = [df['A'].eq('month'), df['A'].eq('quarter'), df['A'].eq('season')]
values = [df['B'], df['C'], df['D']]
df['E'] = np.select(cond, values, default=df['A'])
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4
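For anyone who wants to run the numpy.select approach end to end, here is a minimal self-contained sketch. The frame is rebuilt from the sample data in the question; the variable names `conditions` and `choices` are my own, not part of any API.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["month", "season", "day", "year", "quarter", "month"],
    "B": ["month+1", "month+5", "month+1", "month+3", "month+2", "month+4"],
    "C": ["quarter+1", "quarter+3", "quarter+2", "quarter+4", "quarter+1", "quarter+1"],
    "D": ["season+1", "season+2", "season+1", "season+2", "season+1", "season+2"],
})

# One boolean mask per rule, paired position by position with the
# column to take values from when that mask is True.
conditions = [df["A"].eq("month"), df["A"].eq("quarter"), df["A"].eq("season")]
choices = [df["B"], df["C"], df["D"]]

# Rows matching none of the masks fall back to column A.
df["E"] = np.select(conditions, choices, default=df["A"])
print(df)
```

The whole column is computed in one vectorized call, with no Python function called per row as there is with apply.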
Answer 1 (score: 3)
Just use np.select:
c1 = df['A'] == 'month'
c2 = df['A'] == 'quarter'
c3 = df['A'] == 'season'
df['E'] = np.select([c1, c2, c3], [df['B'], df['C'], df['D']], df['A'])
Out[271]:
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4
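One detail of np.select worth knowing: the conditions are checked in order and the first one that is True wins, with the default used when none match. In the question the three conditions can never be True for the same row, so the order does not matter there. A small illustration with made-up numbers (the array `x` and the labels are purely hypothetical):

```python
import numpy as np

# Hypothetical values, only to illustrate how np.select resolves overlaps.
x = np.array([1, 5, 10, 15])

# 1 and 5 satisfy both conditions; the first matching condition is used.
labels = np.select([x < 6, x < 12], ["small", "medium"], default="large")
print(labels.tolist())  # ['small', 'small', 'medium', 'large']
```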
Answer 2 (score: 1)
You probably need to fix your code like this:
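def f(row):
    if row['A'] == 'month':
        val = row['B']
    elif row['A'] == 'quarter':
        val = row['C']
    elif row['A'] == 'season':
        val = row['D']
    else:
        # no rule matched: keep the value from column A,
        # as in the edited question and its expected output
        val = row['A']
    return val

df['E'] = df.apply(f, axis=1)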
Note: you forgot to include row:
val = ['B'] # before
val = row['B'] # after
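To make the difference concrete, here is a small sketch using a single row built by hand from the first line of the sample data (the names `literal` and `lookup` are only for illustration):

```python
import pandas as pd

# First row of the question's sample data, as a Series indexed by column name.
row = pd.Series({"A": "month", "B": "month+1", "C": "quarter+1", "D": "season+1"})

literal = ['B']     # a brand-new one-element list containing the string 'B'
lookup = row['B']   # the value stored under label 'B' in this row

print(literal)  # ['B']
print(lookup)   # month+1
```

So the original function never reads from the row at all; it returns a list holding the column name.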
Edit: this is only meant to point out the problem in your code; for a better approach, see the other answers that use numpy.select.