Question

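What is the most efficient way to create a new column based on NaN values in separate columns (considering the DataFrame is very large)? In other words, if any of the columns has a NaN in a given row, the corresponding value of the new column should be 1.

X A   B
1 2   3
4 NaN 1
7 8   9
3 2   NaN
5 NaN 2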

Note: the dtypes of the columns may be different objects, not just integers/floats.

This should give:

X A   B    C
1 2   3    0
4 NaN 1    1
7 8   9    0
3 2   NaN  1
5 NaN 2    1

Code I tried (thanks to some online help):

df['C'] = np.where(np.any(np.isnan(df[['A', 'B']])), 1, 0)

But it raises the following error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The following returns an empty DataFrame (since columns A and B never both have a NaN value in the same row):

df['C'] = np.where(np.any(pd.isnull(df[['A', 'B']])), 1, 0)

I found a workaround:

df['C1'] = np.where(np.isnan(df['A'].values), 1, 0)
df['C2'] = np.where(np.isnan(df['B'].values), 1, 0)
df['C'] = df[['C1','C2']].max(axis=1)

You can then drop C1 and C2.

Hope this helps~

Answer 1

You are missing axis=1 in the call to any:

np.where(np.any(np.isnan(df[['A', 'B']]),axis=1), 1, 0)
Out[80]: array([0, 1, 0, 1, 1])
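
To illustrate why the axis argument matters, here is a minimal sketch that rebuilds the question's sample frame (the construction of df and the names mask, any_overall, any_per_row, df_obj and c_obj are assumptions for illustration). It swaps np.isnan for pd.isna, since pd.isna also handles object-dtype columns, which the question says may occur:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed for illustration)
df = pd.DataFrame({
    "X": [1, 4, 7, 3, 5],
    "A": [2, np.nan, 8, 2, np.nan],
    "B": [3, 1, 9, np.nan, 2],
})

# Boolean array: True where A or B holds a missing value
mask = pd.isna(df[["A", "B"]]).to_numpy()

# Without axis, np.any reduces the whole array to a single boolean
any_overall = bool(np.any(mask))

# With axis=1, the reduction happens row by row
any_per_row = np.any(mask, axis=1)

df["C"] = np.where(any_per_row, 1, 0)

# The same approach on object-dtype columns, where np.isnan would raise TypeError
df_obj = df.astype({"A": object, "B": object})
c_obj = np.where(pd.isna(df_obj[["A", "B"]]).to_numpy().any(axis=1), 1, 0)
```

Here any_overall is one value for the whole frame, which is why the original attempt could not produce a per-row result.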

Answer 2

This is simpler than you might think. Hope this helps!

df['C'] = df.isna().sum(axis=1).apply(lambda x: 0 if x==0 else 1)
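
For a very large frame, a row-wise apply with a Python lambda can be slow. A possible fully vectorized variant is sketched below; it assumes only columns A and B should be checked (the cols list is a hypothetical name), whereas the line above counts NaN values across every column:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed for illustration)
df = pd.DataFrame({
    "X": [1, 4, 7, 3, 5],
    "A": [2, np.nan, 8, 2, np.nan],
    "B": [3, 1, 9, np.nan, 2],
})

cols = ["A", "B"]  # assumption: the columns to inspect for NaN

# isna() marks missing cells, any(axis=1) flags rows with at least one,
# and astype(int) turns True/False into 1/0
df["C"] = df[cols].isna().any(axis=1).astype(int)

# Same result as the sum/apply version, restricted to the same columns
c_apply = df[cols].isna().sum(axis=1).apply(lambda x: 0 if x == 0 else 1)
```

Both forms work with object-dtype columns, because DataFrame.isna does not depend on the values being numeric.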

Create a new column in a Pandas DataFrame based on the "NaN" values in other columns
