I have the following DataFrame and lists:
import pandas as pd
import numpy as np
df_merge = pd.DataFrame({'column1': ['a', 'c', 'e'],
'column2': ['b', 'd', 'f'],
'column3': [0.5, 0.6, .04],
'column4': [0.7, 0.8, 0.9]
})
bb = ['b','h']
dd = ['d', 'I']
ff = ['f', 'l']
I am trying to use np.where and np.select in place of an IF function:
condition = [
    ((df_merge['column1'] == 'a') & (df_merge['column2'] == df_merge['column2'].isin(bb))),
    ((df_merge['column1'] == 'c') & (df_merge['column2'] == df_merge['column2'].isin(dd))),
    ((df_merge['column1'] == 'e') & (df_merge['column2'] == df_merge['column2'].isin(ff))),
]
choices1 = [((np.where(df_merge['column3'] >= 1, 'should not have, ','correct')) & (np.where(df_merge['column4'] >= 0.45, 'should not have, ','correct')))]
df_merge['Reason'] = np.select(condition, choices1, default='correct')
However, when I run the line that defines choices1, I get the following error:
TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I am not sure whether np.where can be used inside the choices like this.
np.where should be applied to both columns. The expected output is:
df_merge = pd.DataFrame({'column1': ['a', 'c', 'e'],
'column2': ['b', 'd', 'f'],
'column3': [0.5, 0.6, .04],
'column4': [0.7, 0.8, 0.9],
'Reason': ['correct, should not have', 'correct, should not have', 'correct, should not have'],
})
Any help, guidance, or alternative approach is much appreciated.
Answer 0 (score: 0)
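First, the condition list must have the same length as choices1, so the last condition is commented out (removed) to bring the length down to 2.
Second, isin already returns a boolean mask, so comparing that mask with the column again makes no sense; use the mask directly.
The last problem is that choices1 needs to be a list of length 2, so & is replaced with a comma, and the extra parentheses inside the choices1 list are removed to avoid creating a tuple: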
condition = [(df_merge['column1'] == 'a') & df_merge['column2'].isin(bb),
(df_merge['column1'] == 'c') & df_merge['column2'].isin(dd)
# (df_merge['column1'] == 'e') & df_merge['column2'].isin(ff),
]
choices1 = [np.where(df_merge['column3'] >= 1, 'should not have','correct'),
np.where(df_merge['column4'] >= 0.45, 'should not have','correct')]
df_merge['Reason'] = np.select(condition, choices1, default='correct')
print (df_merge)
  column1 column2  column3  column4           Reason
0       a       b     0.50      0.7          correct
1       c       d     0.60      0.8  should not have
2       e       f     0.04      0.9          correct
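Note that the output above does not match the expected 'correct, should not have' from the question, because np.select picks one choice per row rather than combining two messages. If the intent is to join the column3 message and the column4 message for every row that matches one of the pairings, a minimal sketch could look like the following. This is one reading of the question, not the only one; the names matched, msg3, msg4 and combined are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

df_merge = pd.DataFrame({
    'column1': ['a', 'c', 'e'],
    'column2': ['b', 'd', 'f'],
    'column3': [0.5, 0.6, 0.04],
    'column4': [0.7, 0.8, 0.9],
})
bb = ['b', 'h']
dd = ['d', 'I']
ff = ['f', 'l']

# A row "matches" if its (column1, column2) pair fits any of the three pairings.
# isin() already returns a boolean mask, so it is used directly.
matched = (
    ((df_merge['column1'] == 'a') & df_merge['column2'].isin(bb))
    | ((df_merge['column1'] == 'c') & df_merge['column2'].isin(dd))
    | ((df_merge['column1'] == 'e') & df_merge['column2'].isin(ff))
)

# One message per column. Wrapping in a Series allows string concatenation with +
# (the & operator is not defined for string arrays, hence the original TypeError).
msg3 = pd.Series(np.where(df_merge['column3'] >= 1, 'should not have', 'correct'),
                 index=df_merge.index)
msg4 = pd.Series(np.where(df_merge['column4'] >= 0.45, 'should not have', 'correct'),
                 index=df_merge.index)
combined = msg3 + ', ' + msg4

# Keep the combined message where a pairing matched; otherwise fall back to 'correct'.
df_merge['Reason'] = combined.where(matched, 'correct')
print(df_merge)
```

With the sample data all three rows match a pairing, so every row gets the joined message. Rows that match none of the pairings would get the plain 'correct' default, mirroring the default argument of np.select.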