Assign values to a column based on a condition in a pandas DataFrame

Time: 2018-01-02 14:46:31

Tags: python pandas conditional-statements

I have the following dataset:

device_id   A   B   C   Current Class   
1           70  35  40     C                
2           45  90  34     B

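Each device has a score for each class (A, B, C) and currently belongs to one of those classes. Depending on which class has the highest score, a class change is either recommended or not.

For example, device 1 belongs to class C, but its highest score is in class A, so its recommended class is A.

Expected output: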

device_id   A   B   C   Current Class   Class Change    Recommended
1           70  35  40  C                   Yes             A
2           45  90  34  B                   No              B

Can someone help me solve this?

3 answers:

Answer 0: (score: 1)

I think you need idxmax together with numpy.where:

a = df[['A','B','C']].idxmax(axis=1)
#more general solution is select all columns without first and last
#a = df.iloc[:, 1:-1].idxmax(axis=1)
print (df.iloc[:, 1:-1])
    A   B   C
0  70  35  40
1  45  90  34

df['Class Change'] = np.where(df['Current Class'] == a, 'No', 'Yes')
df['Recommended'] = a
print (df)
   device_id   A   B   C Current Class Class Change Recommended
0          1  70  35  40             C          Yes           A
1          2  45  90  34             B           No           B

Details:

print (a)
0    A
1    B
dtype: object

If the order of the new columns does not matter, they can be created in swapped order:

df['Recommended'] = df[['A','B','C']].idxmax(1)
df['Class Change'] = np.where(df['Current Class'] == df['Recommended'], 'No', 'Yes')
print (df)
   device_id   A   B   C Current Class Recommended Class Change
0          1  70  35  40             C           A          Yes
1          2  45  90  34             B           B           No
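
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the idxmax plus numpy.where approach. The DataFrame is rebuilt from the sample data in the question, and score_cols is only an illustrative name, not something from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'device_id': [1, 2],
    'A': [70, 45],
    'B': [35, 90],
    'C': [40, 34],
    'Current Class': ['C', 'B'],
})

# Illustrative name (an assumption): the columns holding per-class scores
score_cols = ['A', 'B', 'C']

# Column label of the highest score in each row
df['Recommended'] = df[score_cols].idxmax(axis=1)

# 'No' when the device is already in its best class, 'Yes' otherwise
df['Class Change'] = np.where(df['Current Class'] == df['Recommended'], 'No', 'Yes')

print(df)
```

Note that idxmax returns the first column when several scores tie for the maximum, so the order of score_cols decides ties.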

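Answer 1: (score: 1)

I would first find the column with the maximum score to get the Recommended column, then check whether it matches Current Class to get the Class Change column, like this: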

import pandas as pd

devices = pd.DataFrame({'A':[70, 45],
                       'B':[35, 90],
                       'C':[40, 34],
                       'Current Class':['C','B']})

devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(1)

# True where the recommended class differs from the current one
devices['Class Change'] = devices['Current Class'] != devices['Recommended']

print(devices)

Output:

    A   B   C Current Class Recommended  Class Change
0  70  35  40             C           A          True
1  45  90  34             B           B         False
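
To get the Yes/No labels from the expected output instead of booleans, the comparison result can be mapped to strings. This is a minimal sketch on the same sample data; changed is an illustrative name, and the sketch uses != so that True means the class should change.

```python
import pandas as pd

devices = pd.DataFrame({'A': [70, 45],
                        'B': [35, 90],
                        'C': [40, 34],
                        'Current Class': ['C', 'B']})

devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(axis=1)

# Boolean mask: True where the recommended class differs from the current one
changed = devices['Current Class'] != devices['Recommended']

# Translate the mask into the labels used in the question
devices['Class Change'] = changed.map({True: 'Yes', False: 'No'})

print(devices)
```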

Answer 2: (score: 1)

A numpy solution :-)

df['Recommended']=np.array(list('ABC'))[np.argmax(df[list('ABC')].values,1)]
df
Out[172]: 
   device_id   A   B   C CurrentClass Recommended
0          1  70  35  40            C           A
1          2  45  90  34            B           B
(df.CurrentClass==df.Recommended).map({False:'Yes',True:'No'})
Out[173]: 
0    Yes
1     No
dtype: object
df['Class Change']=(df.CurrentClass==df.Recommended).map({False:'Yes',True:'No'})
df
Out[175]: 
   device_id   A   B   C CurrentClass Recommended Class Change
0          1  70  35  40            C           A          Yes
1          2  45  90  34            B           B           No
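
The same numpy idea can also be written against the question's original column name 'Current Class'. Attribute access such as df.CurrentClass only works when the column name contains no space, so the session above presumably uses a renamed column. The following is a minimal sketch, not the answerer's exact code; score_cols, labels and best are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'device_id': [1, 2],
    'A': [70, 45],
    'B': [35, 90],
    'C': [40, 34],
    'Current Class': ['C', 'B'],
})

# Illustrative names (assumptions): score columns and their labels as an array
score_cols = ['A', 'B', 'C']
labels = np.array(score_cols)

# Position of the highest score in each row, then the matching label
best = np.argmax(df[score_cols].to_numpy(), axis=1)
df['Recommended'] = labels[best]

# 'No' when the current class is already the recommended one
df['Class Change'] = np.where(df['Current Class'] == df['Recommended'], 'No', 'Yes')

print(df)
```

np.argmax returns the first maximum along the axis, so when scores tie the earliest class in score_cols is recommended.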