Question

I have the following DataFrame and lists of values:

import pandas as pd
import numpy as np
df_merge = pd.DataFrame({'column1': ['a', 'c', 'e'],
               'column2': ['b', 'd', 'f'],
               'column3': [0.5, 0.6, .04],
               'column4': [0.7, 0.8, 0.9]
               })

bb = ['b','h']
dd = ['d', 'I']
ff = ['f', 'l']

I am trying to use np.where and np.select in place of an IF function:

condition = [((df_merge['column1'] == 'a') & (df_merge['column2'] == df_merge['column2'].isin(bb))),
             ((df_merge['column1'] == 'c') & (df_merge['column2'] == df_merge['column2'].isin(dd))),
             ((df_merge['column1'] == 'e') & (df_merge['column2'] == df_merge['column2'].isin(ff)))]

choices1 = [((np.where(df_merge['column3'] >= 1, 'should not have, ','correct')) & (np.where(df_merge['column4'] >= 0.45, 'should not have, ','correct')))]

df_merge['Reason'] = np.select(condition, choices1, default='correct')

However, when I run the choices1 line, I get the following error:

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I am not sure whether np.where can be used inside the choices like this.

np.where should be applied to both columns. The expected output is:

df_merge = pd.DataFrame({'column1': ['a', 'c', 'e'],
               'column2': ['b', 'd', 'f'],
               'column3': [0.5, 0.6, .04],
               'column4': [0.7, 0.8, 0.9],
               'Reason': ['correct, should not have', 'correct, should not have', 'correct, should not have'],
               })

Any help, guidance, or alternative approach is much appreciated.

Answer 1

First, the condition list must have the same length as choices1, so the last condition is commented out (removed), leaving two conditions.
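A minimal sketch of the length rule, using made-up values rather than the question's data: np.select raises a ValueError when the number of conditions and choices differ, and succeeds once they match.

```python
import numpy as np

# Illustrative values only; not taken from the question's DataFrame.
values = np.array([0.5, 0.6, 0.04])
conditions = [values > 0.5, values < 0.1, values == 0.5]  # three conditions
choices = ['high', 'low']                                  # only two choices

try:
    np.select(conditions, choices, default='other')
    error = None
except ValueError as exc:
    # NumPy rejects condition and choice lists of different lengths.
    error = str(exc)

# With matching lengths the call succeeds; the first true condition wins.
labels = np.select(conditions[:2], choices, default='other')
print(error)
print(labels)
```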

Second, isin already returns a boolean mask, so comparing that mask with the column again makes no sense. Use the mask directly.
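To illustrate the point about isin, here is a small sketch that rebuilds the relevant columns from the question. The variable names mask and cond are only for illustration.

```python
import pandas as pd

df_merge = pd.DataFrame({'column1': ['a', 'c', 'e'],
                         'column2': ['b', 'd', 'f']})
bb = ['b', 'h']

# isin already yields one boolean per row, so it can be used as a condition as is.
mask = df_merge['column2'].isin(bb)

# Combine it with another boolean Series using &; there is no need to
# compare the string column against the mask a second time.
cond = (df_merge['column1'] == 'a') & mask

print(mask.tolist())
print(cond.tolist())
```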

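The last problem is the one behind the TypeError: each np.where call returns an array of strings, and & is a bitwise AND that NumPy does not define for strings. np.select needs a list of two choices here, so & is replaced with a comma, and the extra parentheses in choices1 are removed to avoid creating a tuple: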

condition = [(df_merge['column1'] == 'a') & df_merge['column2'].isin(bb),
             (df_merge['column1'] == 'c') & df_merge['column2'].isin(dd)
#             (df_merge['column1'] == 'e') & df_merge['column2'].isin(ff),
             ]

choices1 = [np.where(df_merge['column3'] >= 1, 'should not have','correct'),
            np.where(df_merge['column4'] >= 0.45, 'should not have','correct')]

df_merge['Reason'] = np.select(condition, choices1, default='correct')
print (df_merge)
  column1 column2  column3  column4           Reason
0       a       b     0.50      0.7          correct
1       c       d     0.60      0.8  should not have
2       e       f     0.04      0.9          correct
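The output above differs from the expected output in the question, which joins the results for column3 and column4 into one string per row. One possible way to get that, sketched under the assumption that any row matching one of the three conditions should receive the joined text and every other row should receive 'correct', is to concatenate the two np.where results as strings. The names matched, part3, part4 and combined are hypothetical.

```python
import numpy as np
import pandas as pd

df_merge = pd.DataFrame({'column1': ['a', 'c', 'e'],
                         'column2': ['b', 'd', 'f'],
                         'column3': [0.5, 0.6, .04],
                         'column4': [0.7, 0.8, 0.9]})
bb = ['b', 'h']
dd = ['d', 'I']
ff = ['f', 'l']

# Rows that satisfy any of the three (column1, column2) pairings.
matched = (((df_merge['column1'] == 'a') & df_merge['column2'].isin(bb)) |
           ((df_merge['column1'] == 'c') & df_merge['column2'].isin(dd)) |
           ((df_merge['column1'] == 'e') & df_merge['column2'].isin(ff)))

# Evaluate each column separately, wrapping the results in Series
# so they can be joined with ordinary string concatenation.
part3 = pd.Series(np.where(df_merge['column3'] >= 1, 'should not have', 'correct'),
                  index=df_merge.index)
part4 = pd.Series(np.where(df_merge['column4'] >= 0.45, 'should not have', 'correct'),
                  index=df_merge.index)
combined = part3 + ', ' + part4

# Keep the joined text for matched rows, fall back to 'correct' elsewhere.
df_merge['Reason'] = combined.where(matched, 'correct')
print(df_merge)
```

String concatenation with + replaces the & from the question, since the intent there appears to be joining two messages rather than combining boolean masks.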
