How do I apply an if statement across multiple DataFrames?

Time: 2019-11-06 07:05:30

Tags: python pandas dataframe

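I have two DataFrames with different shapes. I want to apply a conditional if statement to df1 and fill in values from df2. df1 contains repeated names, but those rows still need to be filled: every -9 in the "Code 1" column should be replaced with the df2 value for the same Name.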

df1:

Code 1    Name
2         Sam
5         James
7         Mark
6         Steven
-9        Michael
-9        Sarah
-9        Sam
5         James
-9        Mark
6         Steven
7         Michael
-9        Sarah
-9        Chris

df2:

Code 1    Name
20        Sam
30        James
40        Mark
50        Steven
70        Michael
45        Sarah

Expected output for df1:

Code 1    Name
2         Sam
5         James
7         Mark
6         Steven
70        Michael
45        Sarah
20        Sam
5         James
40        Mark
6         Steven
7         Michael
45        Sarah
-9        Chris
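
To make the example reproducible, the two frames could be built roughly as follows. The constructor calls are an assumption, since the question only shows printed tables. `is_placeholder` is an illustrative name for the "if" condition written as a boolean mask.

```python
import pandas as pd

# Assumed construction of the question's frames (only their printed form is given)
df1 = pd.DataFrame({
    "Code 1": [2, 5, 7, 6, -9, -9, -9, 5, -9, 6, 7, -9, -9],
    "Name": ["Sam", "James", "Mark", "Steven", "Michael", "Sarah", "Sam",
             "James", "Mark", "Steven", "Michael", "Sarah", "Chris"],
})
df2 = pd.DataFrame({
    "Code 1": [20, 30, 40, 50, 70, 45],
    "Name": ["Sam", "James", "Mark", "Steven", "Michael", "Sarah"],
})

# The condition from the question: rows whose code is the -9 placeholder
is_placeholder = df1["Code 1"] == -9
print(df1[is_placeholder])
```

Note that Chris has no entry in df2, so that row is expected to keep -9.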

2 answers:

Answer 0 (score: 1)

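Use Series.map to look up a new Series of codes by Name, assign it to the rows that match the condition, and finally replace the missing values left by unmatched names with -9: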

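# Boolean mask of the rows whose code is the -9 placeholder
m = df1['Code 1'] == -9
# Look up each masked row's Name in df2; names absent from df2 become NaN
df1.loc[m, 'Code 1'] = df1.loc[m, 'Name'].map(df2.set_index('Name')['Code 1'])
# Put -9 back for the unmatched rows and convert the column back to integers
df1['Code 1'] = df1['Code 1'].fillna(-9).astype(int)
print(df1)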
    Code 1     Name
0        2      Sam
1        5    James
2        7     Mark
3        6   Steven
4       70  Michael
5       45    Sarah
6       20      Sam
7        5    James
8       40     Mark
9        6   Steven
10       7  Michael
11      45    Sarah
12      -9    Chris
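
As a self-contained variant of the answer above, the sketch below rebuilds the question's frames (an assumed construction) and resolves the lookup before writing to the column, so the integer column never temporarily holds NaN. On newer pandas releases, writing NaN into an integer column through `.loc` may warn about an incompatible dtype, which this ordering sidesteps. It assumes each Name occurs only once in df2, which `Series.map` needs for the lookup index. `lookup`, `mapped`, and `needs_fill` are illustrative names.

```python
import pandas as pd

# Assumed construction of the question's frames
df1 = pd.DataFrame({
    "Code 1": [2, 5, 7, 6, -9, -9, -9, 5, -9, 6, 7, -9, -9],
    "Name": ["Sam", "James", "Mark", "Steven", "Michael", "Sarah", "Sam",
             "James", "Mark", "Steven", "Michael", "Sarah", "Chris"],
})
df2 = pd.DataFrame({
    "Code 1": [20, 30, 40, 50, 70, 45],
    "Name": ["Sam", "James", "Mark", "Steven", "Michael", "Sarah"],
})

# Name -> code lookup; requires unique names in df2
lookup = df2.set_index("Name")["Code 1"]

# Code that df2 offers for every row of df1 (NaN where the name is unknown)
mapped = df1["Name"].map(lookup)

# Only overwrite rows that hold the placeholder AND found a match in df2
needs_fill = df1["Code 1"].eq(-9) & mapped.notna()
df1["Code 1"] = df1["Code 1"].mask(needs_fill, mapped).astype(int)

print(df1)
```

Rows such as Chris, which have no counterpart in df2, are simply left at -9, so no separate fillna step is needed.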

Answer 1 (score: 0)

With a SQL mindset, consider a set-based approach using merge, assign, and np.where conditional logic (corresponding to SQL's JOIN, SELECT, and CASE):

import numpy as np
import pandas as pd

# LEFT JOIN df2 onto df1; df2's overlapping column is suffixed as 'Code 1_'
df1 = (df1.merge(df2, on="Name", how="left", suffixes=("", "_"))
          # CASE WHEN: take df2's value only for -9 rows that found a match
          .assign(**{"Code 1": lambda x: np.where((x["Code 1"] == -9) & x["Code 1_"].notna(),
                                                   x["Code 1_"],
                                                   x["Code 1"]).astype(int)})
          .drop(columns=["Code 1_"])
      )

#     Code 1     Name
# 0        2      Sam
# 1        5    James
# 2        7     Mark
# 3        6   Steven
# 4       70  Michael
# 5       45    Sarah
# 6       20      Sam
# 7        5    James
# 8       40     Mark
# 9        6   Steven
# 10       7  Michael
# 11      45    Sarah
# 12      -9    Chris
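
The merge-based answer above relies on each Name appearing only once in df2; otherwise the left join would duplicate rows of df1. A minimal self-contained sketch that makes this assumption explicit with merge's `validate` argument is shown below. The frame construction, the `_df2` suffix, and the names `merged`, `use_df2`, and `result` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Assumed construction of the question's frames
df1 = pd.DataFrame({
    "Code 1": [2, 5, 7, 6, -9, -9, -9, 5, -9, 6, 7, -9, -9],
    "Name": ["Sam", "James", "Mark", "Steven", "Michael", "Sarah", "Sam",
             "James", "Mark", "Steven", "Michael", "Sarah", "Chris"],
})
df2 = pd.DataFrame({
    "Code 1": [20, 30, 40, 50, 70, 45],
    "Name": ["Sam", "James", "Mark", "Steven", "Michael", "Sarah"],
})

# validate="many_to_one" raises a MergeError if a Name is duplicated in df2
merged = df1.merge(df2, on="Name", how="left",
                   suffixes=("", "_df2"), validate="many_to_one")

# Use df2's code only where df1 holds the placeholder and a match exists
use_df2 = merged["Code 1"].eq(-9) & merged["Code 1_df2"].notna()
merged["Code 1"] = np.where(use_df2, merged["Code 1_df2"], merged["Code 1"]).astype(int)

result = merged.drop(columns="Code 1_df2")
print(result)
```

A left merge keeps the row order of df1, so `result` lines up row for row with the original frame.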
