I have a simple dataset of the following form:
import pandas as pd
df = pd.DataFrame(
[
["Norway" , 7.537, 0.039, 11 , 31],
["Denmark" , 7.522, -0.004, 9 , 12],
["Switzerland", 7.494, None , 15 , 50],
["Finland" , 7.469, None , None, 29],
["Netherlands", 7.377, 1 , None, 77],
],
columns = [
"country",
"score A",
"score B",
"score C",
"score D"
]
)
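How can I filter this dataset with a condition that spans the values of several columns? Say I want to exclude all rows (countries) where both score B and score C are null. That should exclude only the Finland row.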
When I try the following, every row with a null in either column is excluded, so only the Norway and Denmark rows remain:
df[(df["score B"].notnull()) & (df["score C"].notnull())]
How can I do this?
Answer 0 (score: 1)
How about using or (|) instead of and (&):
df[(df["score B"].notnull()) | (df["score C"].notnull())]
Output:
       country  score A  score B  score C  score D
0       Norway    7.537    0.039     11.0       31
1      Denmark    7.522   -0.004      9.0       12
2  Switzerland    7.494      NaN     15.0       50
4  Netherlands    7.377    1.000      NaN       77
Right? You only want to exclude the rows where both are null (or did I misunderstand)?
Answer 1 (score: 1)
You need:
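To see why the two operators behave differently here, the following is a minimal sketch that rebuilds the DataFrame from the question and compares both masks. The variable names (b_ok, c_ok, both_present, any_present) are made up for illustration and are not from the original post.

```python
import pandas as pd

# Same data as in the question; None in a numeric column becomes NaN.
df = pd.DataFrame(
    [
        ["Norway", 7.537, 0.039, 11, 31],
        ["Denmark", 7.522, -0.004, 9, 12],
        ["Switzerland", 7.494, None, 15, 50],
        ["Finland", 7.469, None, None, 29],
        ["Netherlands", 7.377, 1, None, 77],
    ],
    columns=["country", "score A", "score B", "score C", "score D"],
)

b_ok = df["score B"].notnull()  # True where score B has a value
c_ok = df["score C"].notnull()  # True where score C has a value

# & keeps a row only if BOTH columns have a value.
both_present = df[b_ok & c_ok]

# | keeps a row if AT LEAST ONE column has a value,
# i.e. it drops only rows where both are null.
any_present = df[b_ok | c_ok]

print(both_present["country"].tolist())
print(any_present["country"].tolist())
```

With & a single null is enough to drop a row, which is why Switzerland and Netherlands disappear; with | a row is dropped only when both columns are null.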
df[~(df['score B'].isnull() & df['score C'].isnull())]
       country  score A  score B  score C  score D
0       Norway    7.537    0.039     11.0       31
1      Denmark    7.522   -0.004      9.0       12
2  Switzerland    7.494      NaN     15.0       50
4  Netherlands    7.377    1.000      NaN       77
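As an alternative that neither answer mentions, pandas also offers DataFrame.dropna with subset and how="all", which drops a row only when every listed column is null. The sketch below rebuilds the DataFrame from the question and checks that this gives the same rows as the negated isnull mask; the names cols, via_dropna and via_mask are illustrative only.

```python
import pandas as pd

# Same data as in the question; None in a numeric column becomes NaN.
df = pd.DataFrame(
    [
        ["Norway", 7.537, 0.039, 11, 31],
        ["Denmark", 7.522, -0.004, 9, 12],
        ["Switzerland", 7.494, None, 15, 50],
        ["Finland", 7.469, None, None, 29],
        ["Netherlands", 7.377, 1, None, 77],
    ],
    columns=["country", "score A", "score B", "score C", "score D"],
)

cols = ["score B", "score C"]

# Drop a row only if ALL of the listed columns are null.
via_dropna = df.dropna(subset=cols, how="all")

# The boolean-mask form: keep rows that are NOT (B null AND C null).
via_mask = df[~(df["score B"].isnull() & df["score C"].isnull())]

print(via_dropna["country"].tolist())
print(via_dropna.equals(via_mask))
```

The dropna form may scale more comfortably when more columns are involved, since only the cols list needs to change. Both forms keep the original index labels, so Finland's label 3 is simply missing from the result.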