I have columns A, B, C, D, E, F. If col A == '', I want to make a new col G = col C, a new col H = col D, and a new col I = col E. If col A != '' and col B == 'some-value', I want col G = 0, col H = 0, col I = 0. I tried np.where, but it only supports two conditions. Any ideas?
# Attempt: each `if` tests a whole boolean Series, so pandas raises
# "ValueError: The truth value of a Series is ambiguous".
def change(dfr):
    if (dfr['A']==''):
        dfr['G'] = dfr['A']
        dfr['H'] = dfr['B']
        dfr['I'] = dfr['C']
    if ((dfr['A']!='') & (dfr['B']=='some-value')):
        dfr['G'] = dfr['A']
        dfr['H'] = dfr['B']
        dfr['I'] = dfr['C']
    if ((dfr['A']!='') & (dfr['B']=='value')):
        dfr['G'] = 0
        dfr['H'] = 0
        dfr['I'] = 0
Answer 0 (score: 0)
I'm not sure you need the if statements. You can do this with .loc. Here is a toy dataframe:
import pandas as pd

data = pd.DataFrame({"A": ['a', '', 'f', '4', '', 'z'],
                     "B": ['f', 'y', 't', 'u', 'o', '1'],
                     "C": ['a', 'b', 'c', 'd', 'e', 'f'],
                     "G": [1, 1, 1, 1, 1, 1],
                     "H": [6, 6, 6, 6, 6, 6],
                     "I": ['q', 'q', 'q', 'q', 'q', 'q']})
data
   A  B  C  G  H  I
0  a  f  a  1  6  q
1     y  b  1  6  q
2  f  t  c  1  6  q
3  4  u  d  1  6  q
4     o  e  1  6  q
5  z  1  f  1  6  q
It probably makes sense to pass the values to check in column B as parameters:
def change(dfr, b_firstvalue, b_secondvalue):
    new_df = dfr.copy()  # leave the caller's dataframe untouched
    # Rows where A is empty: copy A, B, C into G, H, I
    new_df.loc[new_df['A']=='', 'G'] = new_df['A']
    new_df.loc[new_df['A']=='', 'H'] = new_df['B']
    new_df.loc[new_df['A']=='', 'I'] = new_df['C']
    # Rows where A is not empty and B equals the first value: copy as well
    new_df.loc[((new_df['A']!='') & (new_df['B'] == b_firstvalue)), 'G'] = new_df['A']
    new_df.loc[((new_df['A']!='') & (new_df['B'] == b_firstvalue)), 'H'] = new_df['B']
    new_df.loc[((new_df['A']!='') & (new_df['B'] == b_firstvalue)), 'I'] = new_df['C']
    # Rows where A is not empty and B equals the second value: set to 0
    new_df.loc[((new_df['A']!='') & (new_df['B'] == b_secondvalue)), 'G'] = 0
    new_df.loc[((new_df['A']!='') & (new_df['B'] == b_secondvalue)), 'H'] = 0
    new_df.loc[((new_df['A']!='') & (new_df['B'] == b_secondvalue)), 'I'] = 0
    return new_df
data2 = change(data, '1', 'f')
data2
   A  B  C  G  H  I
0  a  f  a  0  0  0
1     y  b     y  b
2  f  t  c  1  6  q
3  4  u  d  1  6  q
4     o  e     o  e
5  z  1  f  z  1  f
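Obviously, the function depends entirely on how many columns you want to handle. This is just a solution to the example problem. If you want to replace values in more columns, there may be more efficient ways to do it.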
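One possible way to scale this to more columns, and to get past the two-branch limit of np.where mentioned in the question, is np.select, which takes a list of conditions and a matching list of choices. The following is only a sketch under assumptions: the name change_select and the pairs argument (a mapping from target column to source column) are made up for illustration and are not part of the answer above. It rebuilds each target column as an object array so that strings, zeros, and the previous values can sit in the same column.

```python
import numpy as np
import pandas as pd


def change_select(dfr, b_firstvalue, b_secondvalue, pairs=None):
    """Return a copy of dfr with the target columns rebuilt via np.select.

    `pairs` is a hypothetical helper argument mapping target -> source column;
    it defaults to the G/H/I <- A/B/C mapping used in the toy example.
    """
    if pairs is None:
        pairs = {"G": "A", "H": "B", "I": "C"}
    new_df = dfr.copy()

    # Compute each boolean mask once as a plain NumPy array
    a_empty = (new_df["A"] == "").to_numpy(dtype=bool)
    b_first = (new_df["B"] == b_firstvalue).to_numpy(dtype=bool)
    b_second = (new_df["B"] == b_secondvalue).to_numpy(dtype=bool)

    zero_rows = ~a_empty & b_second   # A not empty and B == second value
    copy_rows = a_empty | b_first     # A empty, or B == first value

    for target, source in pairs.items():
        # np.select uses the first matching condition for each row;
        # rows matching no condition keep their previous value.
        new_df[target] = np.select(
            [zero_rows, copy_rows],
            [0, new_df[source].to_numpy(dtype=object)],
            default=new_df[target].to_numpy(dtype=object),
        )
    return new_df


data = pd.DataFrame({"A": ['a', '', 'f', '4', '', 'z'],
                     "B": ['f', 'y', 't', 'u', 'o', '1'],
                     "C": ['a', 'b', 'c', 'd', 'e', 'f'],
                     "G": [1, 1, 1, 1, 1, 1],
                     "H": [6, 6, 6, 6, 6, 6],
                     "I": ['q', 'q', 'q', 'q', 'q', 'q']})

result = change_select(data, '1', 'f')
print(result)
```

Listing zero_rows before copy_rows keeps the same precedence as the .loc version, where the zero assignment runs last. Adding another column is then a matter of adding one entry to pairs rather than three more .loc lines.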