Pandas: compare specific rows

Time: 2020-02-28 20:21:03

Tags: python pandas dataframe

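I have a dataframe with two columns, winner and newcol2. Both columns contain the values white, black, and draw. I want to compare the two columns row by row. For example, if winner is white and newcol2 is black, return 1.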

  winner    newcol2
0  white     black
1  black     white
2  white     white
3  draw      white
4  black     draw

conditions1 = [
    (x['winner'] == 'white'),
    (x['winner'] == 'draw'),
    (x['winner'] == 'black')]
conditions2 = [
    (x['newcol2'] == 'white'),
    (x['newcol2'] == 'draw'),
    (x['newcol2'] == 'black')]

x['result'] = np.select(conditions1, conditions2, default='null')

I tried to solve my problem with the code above, but I am not sure whether my handling of the equal and not-equal cases is correct.
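
For context on the np.select call: np.select pairs each boolean condition with the value to emit where that condition is true, so a second list of boolean masks passed as the choices is treated as the values to pick from, not as a second test. A minimal sketch of the stated rule (winner white and newcol2 black gives 1) follows. It assumes every other row should get 0, which the question does not specify, and it rebuilds the sample data under the name df:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "winner":  ["white", "black", "white", "draw", "black"],
    "newcol2": ["black", "white", "white", "white", "draw"],
})

# Each condition tests BOTH columns at once; choices holds the value
# to emit where the matching condition is True.
conditions = [
    (df["winner"] == "white") & (df["newcol2"] == "black"),
]
choices = [1]

# default=0 is an assumption: the question only defines the "1" case.
df["result"] = np.select(conditions, choices, default=0)
print(df)
```

Further rules can be added by appending one combined condition and one value per rule.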

2 answers:

Answer 0 (score: 2)

As I understand it, you want to assign a value to each unique combination of the two columns in the DataFrame.

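If not every combination appears in your dataframe, you can build a dictionary of codes from the combinations that are present, as below, or generate the full dictionary with itertools.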

combs = set(zip(df['winner'], df['newcol2']))
codes = dict(zip(combs, range(len(combs))))
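
The itertools variant mentioned above could look like the sketch below. It assumes the only possible values are white, black, and draw, as stated in the question; the outcomes list and the resulting code numbers are illustrative choices, not something the answer specifies:

```python
from itertools import product

# Assumed full set of values for both columns (taken from the question).
outcomes = ["white", "black", "draw"]

# product(..., repeat=2) yields all 9 ordered (winner, newcol2) pairs,
# including pairs that never occur in the dataframe.
codes = {pair: code for code, pair in enumerate(product(outcomes, repeat=2))}

print(codes[("white", "black")])  # code assigned to winner=white, newcol2=black
```

Unlike codes built from a set, this numbering is the same on every run, because product follows the order of the outcomes list.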

Then use the apply method to replace each combination of the two columns with its code:

df['result'] = df.apply(lambda x: codes[x['winner'], x['newcol2']], axis=1)
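
An end-to-end sketch of this approach on the sample data is shown below. Sorting the combinations before numbering them is an addition made here so the codes are reproducible; the answer itself numbers them in set order. The groupby(...).ngroup() line is an alternative swapped in for comparison, not part of the answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "winner":  ["white", "black", "white", "draw", "black"],
    "newcol2": ["black", "white", "white", "white", "draw"],
})

# Unique (winner, newcol2) pairs, sorted so the numbering is stable.
combs = sorted(set(zip(df["winner"], df["newcol2"])))
codes = {pair: code for code, pair in enumerate(combs)}

# Row-wise lookup, as in the answer.
by_apply = df.apply(lambda row: codes[(row["winner"], row["newcol2"])], axis=1)

# Alternative: let pandas number the groups (sorted by key by default).
by_ngroup = df.groupby(["winner", "newcol2"]).ngroup()

df["result"] = by_apply
print(df)
```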

Answer 1 (score: 0)

I am not sure what all of your conditions are beyond the example you gave, but this will work:

In [23]: conditions = []


In [24]: for row in df.itertuples(): 
...:     if row.winner == 'white' and row.newcol2 == 'black': 
...:         conditions.append(1) 
...:     elif row.winner == 'black' and row.newcol2 == 'white': 
...:         conditions.append(1) 
...:     else: 
...:         conditions.append(0) 
...:                                                                        

In [25]: conditions                                                             
Out[25]: [1, 1, 0, 0, 0]

In [26]: df['conditions'] = conditions                                          

In [27]: df                                                                     
Out[27]: 
  winner newcol2  conditions
0  white   black           1
1  black   white           1
2  white   white           0
3   draw   white           0
4  black    draw           0

You can modify the code to fit your own conditions.
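
The same rule as the loop can also be written as a boolean mask over whole columns, which avoids iterating row by row. A sketch, assuming the sample data from the question and the two mixed white/black pairings used in the loop; the name is_mixed is hypothetical:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "winner":  ["white", "black", "white", "draw", "black"],
    "newcol2": ["black", "white", "white", "white", "draw"],
})

# True where one column is white and the other is black, in either order.
is_mixed = (
    ((df["winner"] == "white") & (df["newcol2"] == "black"))
    | ((df["winner"] == "black") & (df["newcol2"] == "white"))
)

# Convert True/False to 1/0, matching the loop's output.
df["conditions"] = is_mixed.astype(int)
print(df)
```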