Question

    A           B           C           D
0   0.397333    Xor         0.569748    0.406415
1   0.319684    x           0.159117    0.522648
2   0.778038    0.486989    x           x
3   0.549993    0.896913    0.960814    0.430113
4   0.251655    0.802137    Xand        0.218265

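Here I need to compare all four columns and create a new column E that holds the result.

I need to check whether any of the columns contains x: if so, column E should be Yes, otherwise No.

I was thinking of using a where clause here, but I could not get it to work, and I also do not understand how I should write it with a lambda.

Expected output: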

    A           B           C           D          E
0   0.397333    Xor         0.569748    0.406415   No
1   0.319684    x           0.159117    0.522648   Yes
2   0.778038    0.486989    x           x          Yes
3   0.549993    0.896913    0.960814    0.430113   No
4   x           0.802137    Xand        0.218265   Yes

This is my code, which raises an error:

def YorN(stri):
    if stri =='x':
        return True
    else:
        return False

df['E'] = np.where(YorN(df.B) | YorN(df.C) | YorN(df.D)| YorN(df.A), 'Yes', 'No')

Edit 1: my other columns may contain other values as well.

Answer 1

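Your comparison function will not work because it tries to compare a scalar with an array. In any case, you can call apply and pass axis=1 to process the df row by row. Convert the dtype to str so that you can use the vectorised str.contains together with any to produce a boolean Series, then pass that as the condition argument to np.where, which returns 'yes' where it is True and 'no' where it is False: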

In [8]:
df['E'] = np.where(df.astype(str).apply(lambda x: x.str.contains('x').any(), axis=1), 'yes', 'no')
df

Out[8]:
          A         B         C         D    E
0  0.397333  0.245596  0.569748  0.406415   no
1  0.319684         x  0.159117  0.522648  yes
2  0.778038  0.486989         x         x  yes
3  0.549993  0.896913  0.960814  0.430113   no
4  0.251655  0.802137  0.024341  0.218265   no
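
To see why the original YorN fails and to try the approach above end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the input data in the question (the version with Xor and Xand); the names yor_n and row_has_x are only for illustration and do not come from the question or the answer.

```python
import numpy as np
import pandas as pd

# Data copied from the question (mixed floats and strings, so object dtype).
df = pd.DataFrame({
    "A": [0.397333, 0.319684, 0.778038, 0.549993, 0.251655],
    "B": ["Xor", "x", 0.486989, 0.896913, 0.802137],
    "C": [0.569748, 0.159117, "x", 0.960814, "Xand"],
    "D": [0.406415, 0.522648, "x", 0.430113, 0.218265],
})

def yor_n(value):
    # Same logic as the question's YorN: fine for a scalar, not for a Series.
    if value == "x":
        return True
    return False

try:
    # df.B == "x" is a boolean Series, so `if` cannot reduce it to one bool.
    yor_n(df.B)
except ValueError as exc:
    print("ValueError:", exc)

# Row-wise check: does any cell, viewed as a string, contain a lowercase "x"?
row_has_x = df.astype(str).apply(lambda row: row.str.contains("x").any(), axis=1)
df["E"] = np.where(row_has_x, "yes", "no")
print(df)
```

Because the substring test is case-sensitive, Xor and Xand are not flagged, which matches the output shown after the edit.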

Edit

The answer still works:

In [10]:
df['E'] = np.where(df.astype(str).apply(lambda x: x.str.contains('x').any(), axis=1), 'yes', 'no')
df

Out[10]:
          A         B         C         D    E
0  0.397333       Xor  0.569748  0.406415   no
1  0.319684         x  0.159117  0.522648  yes
2  0.778038  0.486989         x         x  yes
3  0.549993  0.896913  0.960814  0.430113   no
4  0.251655  0.802137      Xand  0.218265   no
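
Note that str.contains('x') is a case-sensitive substring test, so it would also flag a value such as 'max'. If only cells that are exactly x should count, as the original YorN seems to intend, one possible variant (not part of the answer above) compares the whole frame at once with DataFrame.eq and reduces across columns with any(axis=1). The small frame below is partly hypothetical: the third row is invented purely to show the difference.

```python
import numpy as np
import pandas as pd

# First two rows follow the question; the third row ("max") is hypothetical.
df = pd.DataFrame({
    "A": [0.397333, 0.319684, "max"],
    "B": ["Xor", "x", 0.486989],
})

# Substring test from the answer: also matches "max".
substring = df.astype(str).apply(lambda row: row.str.contains("x").any(), axis=1)

# Exact test: True only where a cell equals the string "x".
exact = df.eq("x").any(axis=1)

df["E"] = np.where(exact, "Yes", "No")
print(df)
print("substring:", substring.tolist())
print("exact:    ", exact.tolist())
```

Which of the two tests is appropriate depends on what the other values mentioned in Edit 1 can look like.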

How to compare columns in pandas and create a new column with yes or no

1 answer: