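I want to filter a DataFrame by group, where each label in group1 acts as a marker: the NaNs that follow a should count as a, and the NaNs that follow b should count as b.
I have a short example: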
In [1]: import numpy as np
   ...: import pandas as pd
In [2]: df = pd.DataFrame({'group1': ['a', np.nan, np.nan, np.nan, np.nan, 'b', np.nan, np.nan, np.nan, np.nan],
   ...:                    'value1': [0.4, 1.1, 2, 3, 4, 5, 6, 7, 8, 8.8],
   ...:                    'value2': [6.4, 6.9, 7.1, 8, 9, 10, 11, 12, 13, 14]
   ...:                   })
The output I would like to get is:
In [3]: df[df.group1 == 'a']
Out[3]:
  group1  value1  value2
0      a     0.4     6.4
1    NaN     1.1     6.9
2    NaN     2.0     7.1
3    NaN     3.0     8.0
4    NaN     4.0     9.0
I would appreciate any hints!
Answer 0 (score: 1)
You can use ffill to forward-fill the column and compare the filled values against the label:
>>> df[df['group1'].ffill() == 'a']  # fillna(method='ffill') is deprecated in recent pandas
  group1  value1  value2
0      a     0.4     6.4
1    NaN     1.1     6.9
2    NaN     2.0     7.1
3    NaN     3.0     8.0
4    NaN     4.0     9.0
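For reference, here is a minimal self-contained sketch of that filter. It assumes the sample data from the question and uses `Series.ffill()` to build the filled labels; the names `labels` and `group_a` are only illustrative and do not appear in the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question: a label row followed by NaN rows.
df = pd.DataFrame({
    'group1': ['a'] + [np.nan] * 4 + ['b'] + [np.nan] * 4,
    'value1': [0.4, 1.1, 2, 3, 4, 5, 6, 7, 8, 8.8],
    'value2': [6.4, 6.9, 7.1, 8, 9, 10, 11, 12, 13, 14],
})

# Forward-fill into a separate Series (illustrative name); df is not modified.
labels = df['group1'].ffill()

# Boolean mask: True for every row that falls under the 'a' marker.
group_a = df[labels == 'a']
print(group_a)
```

Because the fill happens on a separate Series, the NaN values in `df['group1']` are left as they were, which matches the output asked for in the question.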
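However, perhaps a better solution is to forward-fill the column in the original DataFrame, so that every later filter works directly: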
>>> df['group1'] = df['group1'].ffill()  # assign back; inplace=True on a column selection may not update df
>>> df[df['group1'] == 'a']
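A short sketch of the fill-the-original variant, again assuming the question's sample data. It assigns the filled column back to `df` rather than relying on `inplace=True`, and then filters for b to show that any label can be selected afterwards; `group_b` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'group1': ['a'] + [np.nan] * 4 + ['b'] + [np.nan] * 4,
    'value1': [0.4, 1.1, 2, 3, 4, 5, 6, 7, 8, 8.8],
    'value2': [6.4, 6.9, 7.1, 8, 9, 10, 11, 12, 13, 14],
})

# Overwrite the marker column with its forward-filled version.
df['group1'] = df['group1'].ffill()

# Plain equality filters now select whole groups.
group_b = df[df['group1'] == 'b']
print(group_b)
```

Note that with this approach the group1 column shows the label on every row instead of NaN, so it differs slightly from the output shown in the question.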