I have the following dataset:
device_id  A   B   C   Current Class
1          70  35  40  C
2          45  90  34  B
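Each device has a score for each class (A, B, C) and currently belongs to one of those classes. Based on the class with the highest score, I want to determine whether a class change is recommended.
For example, device 1 currently belongs to class C, but its highest score is in class A, so its recommended class is A.
Expected output: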
device_id  A   B   C   Current Class  Class Change  Recommended
1          70  35  40  C              Yes           A
2          45  90  34  B              No            B
Can someone help me solve this?
Answer 0 (score: 1)
I think you need idxmax together with numpy.where:
import numpy as np

# df is the DataFrame from the question
a = df[['A','B','C']].idxmax(axis=1)
# a more general solution: select every column except the first and the last
#a = df.iloc[:, 1:-1].idxmax(axis=1)
print (df.iloc[:, 1:-1])
A B C
0 70 35 40
1 45 90 34
df['Class Change'] = np.where(df['Current Class'] == a, 'No', 'Yes')
df['Recommended'] = a
print (df)
device_id A B C Current Class Class Change Recommended
0 1 70 35 40 C Yes A
1 2 45 90 34 B No B
Details:
print (a)
0 A
1 B
dtype: object
If the order of the new columns does not matter and they can be swapped:
df['Recommended'] = df[['A','B','C']].idxmax(1)
df['Class Change'] = np.where(df['Current Class'] == df['Recommended'], 'No', 'Yes')
print (df)
device_id A B C Current Class Recommended Class Change
0 1 70 35 40 C A Yes
1 2 45 90 34 B B No
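For reference, the idxmax and numpy.where steps above can be put together into a self-contained sketch. The DataFrame is rebuilt from the question's sample data, and recommend_class is a hypothetical helper name used only for illustration, not something from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'device_id': [1, 2],
    'A': [70, 45],
    'B': [35, 90],
    'C': [40, 34],
    'Current Class': ['C', 'B'],
})

def recommend_class(frame, score_cols):
    """Hypothetical helper: return a copy with 'Class Change' and 'Recommended' added."""
    out = frame.copy()
    # idxmax(axis=1) returns, for each row, the label of the column holding the maximum.
    best = out[score_cols].idxmax(axis=1)
    # 'No' when the current class already has the highest score, 'Yes' otherwise.
    out['Class Change'] = np.where(out['Current Class'] == best, 'No', 'Yes')
    out['Recommended'] = best
    return out

result = recommend_class(df, ['A', 'B', 'C'])
print(result)
```

Passing the score columns explicitly avoids relying on column positions, which is the assumption behind the iloc[:, 1:-1] variant.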
Answer 1 (score: 1)
I would first find the column with the max to get the Recommended column, then check whether it matches Current Class to get the Class Change column, like so:
import pandas as pd

devices = pd.DataFrame({'A':[70, 45],
                        'B':[35, 90],
                        'C':[40, 34],
                        'Current Class':['C','B']})
devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(axis=1)
# True when the recommended class differs from the current one
devices['Class Change'] = devices['Current Class'] != devices['Recommended']
print(devices)
Output:
A B C Current Class Recommended Class Change
0 70 35 40 C A True
1 45 90 34 B B False
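If the Yes/No labels from the question's expected output are wanted rather than booleans, one possible sketch is to test for inequality and map the result to labels. The variable name needs_change is an assumption made for this example.

```python
import pandas as pd

devices = pd.DataFrame({'A': [70, 45],
                        'B': [35, 90],
                        'C': [40, 34],
                        'Current Class': ['C', 'B']})

devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(axis=1)
# A change is recommended only when the two classes differ.
needs_change = devices['Current Class'] != devices['Recommended']
# Translate the boolean mask into the labels used in the question.
devices['Class Change'] = needs_change.map({True: 'Yes', False: 'No'})
print(devices)
```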
Answer 2 (score: 1)
A numpy solution :-)
df['Recommended']=np.array(list('ABC'))[np.argmax(df[list('ABC')].values,1)]
df
Out[172]:
device_id A B C CurrentClass Recommended
0 1 70 35 40 C A
1 2 45 90 34 B B
(df.CurrentClass!=df.Recommended).map({False:'no',True:'yes'})
Out[173]:
0 yes
1 no
dtype: object
df['Class Change']=(df.CurrentClass!=df.Recommended).map({False:'no',True:'yes'})
df
Out[175]:
device_id A B C CurrentClass Recommended Class Change
0 1 70 35 40 C A yes
1 2 45 90 34 B B no
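The numpy approach can also be written as a self-contained sketch. It assumes the column is named 'Current Class' (with a space) as in the question, so bracket indexing is used instead of attribute access; score_cols is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'device_id': [1, 2],
    'A': [70, 45],
    'B': [35, 90],
    'C': [40, 34],
    'Current Class': ['C', 'B'],
})

score_cols = ['A', 'B', 'C']
labels = np.array(score_cols)

# argmax along axis 1 gives the position of the largest score in each row.
best_pos = df[score_cols].to_numpy().argmax(axis=1)
# Fancy indexing turns those positions back into class labels.
df['Recommended'] = labels[best_pos]
# 'Yes' when the recommended class differs from the current one.
df['Class Change'] = np.where(df['Current Class'] != df['Recommended'], 'Yes', 'No')
print(df)
```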