Pandas: conditionally creating a DataFrame column based on multiple conditions

Time: 2020-06-23 02:03:35

Tags: python pandas numpy dataframe

I have a df:

   col1  col2  col3
0     1     2     3
1     2     3     1
2     3     3     3
3     4     3     2

I want to add a new column based on the following conditions:

 - if   col1 > col2 > col3   ----->  2
 - elif col1 > col2          ----->  1
 - elif col1 < col2 < col3   -----> -2
 - elif col1 < col2          -----> -1
 - else                      ----->  0

The result should look like this:

   col1  col2  col3  new
0     1     2     3   -2
1     2     3     1   -1
2     3     3     3    0
3     4     3     2    2

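I followed the approach in this post by unutbu. It works when each condition has a single greater-than or less-than comparison, but in my case, where a condition chains more than one comparison, building the conditions raises an error: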

conditions = [
       (df['col1'] > df['col2'] > df['col3']), 
       (df['col1'] > df['col2']),
       (df['col1'] < df['col2'] < df['col3']),
       (df['col1'] < df['col2'])]
choices = [2,1,-2,-1]
df['new'] = np.select(conditions, choices, default=0)


Traceback (most recent call last):

  File "<ipython-input-43-768a4c0ecf9f>", line 2, in <module>
    (df['col1'] > df['col2'] > df['col3']),

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__
    .format(self.__class__.__name__))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What should I do?

1 Answer:

Answer 0: (score: 2)

Change the code to the following, splitting each chained comparison into two comparisons combined with &:

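import numpy as np

# Conditions are checked in order; the first one that matches a row wins.
conditions = [
       (df['col1'] > df['col2']) & (df['col2'] > df['col3']),
       (df['col1'] > df['col2']),
       (df['col1'] < df['col2']) & (df['col2'] < df['col3']),
       (df['col1'] < df['col2'])]
choices = [2, 1, -2, -1]
# Rows that match none of the conditions get the default value 0.
df['new'] = np.select(conditions, choices, default=0)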
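
For anyone wondering why the original version fails: Python evaluates a chained comparison such as a > b > c as (a > b) and (b > c), and the `and` asks the first Series for a single truth value, which pandas refuses to give. The element-wise operator & avoids that. Here is a minimal, self-contained sketch that rebuilds the sample df from the question (the `chained_failed` flag is just an illustrative name) and checks both behaviours:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [2, 3, 3, 3],
    "col3": [3, 1, 3, 2],
})

# The chained form is evaluated as (a > b) and (b > c); `and` needs a single
# bool from the first Series, so pandas raises ValueError.
chained_failed = False
try:
    df["col1"] > df["col2"] > df["col3"]
except ValueError:
    chained_failed = True

# Element-wise version: each comparison gives a boolean Series, and & combines
# them row by row. The parentheses matter because & binds tighter than > and <.
conditions = [
    (df["col1"] > df["col2"]) & (df["col2"] > df["col3"]),
    (df["col1"] > df["col2"]),
    (df["col1"] < df["col2"]) & (df["col2"] < df["col3"]),
    (df["col1"] < df["col2"]),
]
choices = [2, 1, -2, -1]

# np.select picks, for each row, the choice of the first matching condition,
# which mirrors the if / elif / else order in the question.
df["new"] = np.select(conditions, choices, default=0)

print(chained_failed)
print(df)
```

Because np.select takes the first matching condition per row, the stricter two-part conditions have to come before the looser single comparisons, just as in the if/elif list in the question.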