Here is a subset of my DataFrame:
index  id  drug   sentences  SS1  SS2
1      2   lex    very bad   0    1
2      3   gym    very nice  1    1
3      7   effex  hard       1    0
4      8   cymba  poor       1    1
I want to find the rows where SS1 and SS2 differ and build a new DataFrame from them. The output should look like this:
index  id  drug   sentences  SS1  SS2
1      2   lex    very bad   0    1
3      7   effex  hard       1    0
This is my code:
df [['index','id', 'drug', 'sentences', 'SS1', 'SS2' ]] = np.where(df.SS1 != df.SS2)
But it raises the following error: ValueError: Must have equal len keys and value when setting with an ndarray
Any suggestions?
Answer 0 (score: 5)
One option is to filter the rows with a boolean mask:
df_new = df[df.SS1 != df.SS2]
print(df_new)
Output:
   index  id   drug  sentences  SS1  SS2
0      1   2    lex   very bad    0    1
2      3   7  effex       hard    1    0
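For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question (treating "index" as an ordinary column is an assumption), and the names mask, df_renumbered and df_by_position are only illustrative. It also shows one way the original np.where attempt could be adapted: called with a single argument, np.where returns a tuple of positional index arrays rather than rows, which is why assigning it to six columns fails.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; "index" is assumed to be a regular column.
df = pd.DataFrame({
    "index": [1, 2, 3, 4],
    "id": [2, 3, 7, 8],
    "drug": ["lex", "gym", "effex", "cymba"],
    "sentences": ["very bad", "very nice", "hard", "poor"],
    "SS1": [0, 1, 1, 1],
    "SS2": [1, 1, 0, 1],
})

# Boolean mask: True for rows where the two columns differ.
mask = df["SS1"] != df["SS2"]

# Boolean indexing keeps the original row labels (0 and 2 here).
df_new = df[mask]

# Optional: renumber the rows of the result from 0.
df_renumbered = df_new.reset_index(drop=True)

# np.where(mask) returns a tuple holding an array of row positions,
# so it has to go through iloc to select rows.
positions = np.where(mask)[0]
df_by_position = df.iloc[positions]

print(df_new)
```

Boolean indexing is usually the simpler of the two; the np.where route is only worth it if the row positions are needed for something else.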
Using where:
df_new = df.where(df.SS1 != df.SS2).dropna()
print(df_new)
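One thing to watch with the where variant: the rows that fail the condition are filled with NaN before dropna removes them, so integer columns come back as floats (1.0 instead of 1). The sketch below, again using a hand-built copy of the sample data and the illustrative names df_where and restored, shows the effect and one possible way to cast the columns back.

```python
import pandas as pd

# Sample data rebuilt from the question; "index" is assumed to be a regular column.
df = pd.DataFrame({
    "index": [1, 2, 3, 4],
    "id": [2, 3, 7, 8],
    "drug": ["lex", "gym", "effex", "cymba"],
    "sentences": ["very bad", "very nice", "hard", "poor"],
    "SS1": [0, 1, 1, 1],
    "SS2": [1, 1, 0, 1],
})

# where() replaces non-matching rows with NaN; dropna() then removes them.
df_where = df.where(df["SS1"] != df["SS2"]).dropna()

# The NaN fill forces the integer columns to float.
print(df_where.dtypes)

# Cast every column back to its dtype in the original frame.
restored = df_where.astype(df.dtypes.to_dict())
print(restored)
```

Note that dropna would also discard any matching row that already contained a missing value, so plain boolean indexing is the safer choice when the data may have NaNs of its own.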