Question

I need to get a subset of a pandas Series that starts at the cell just before the first non-blank cell.

For example, given this Series:

>>> s = pd.Series([np.nan, np.nan, 1], index=['a', 'b', 'c'])
>>> s

a    NaN
b    NaN
c    1.0
dtype: float64

I need the subset containing rows "b" and "c", like this:

b    NaN
c    1.0
dtype: float64

I have the following code:

import pandas as pd
import numpy as np

s = pd.Series([np.nan, np.nan, 1], index=['a', 'b', 'c'])
lst = s.index.to_list()
s[lst[lst.index(s.first_valid_index())-1:]]

Is there a simpler and/or faster way to do this? Note that the data may contain blanks instead of NA.

Answer 1

This is a bit easier to follow with get_loc (you also no longer need to rely on lst) and first_valid_index:

s[s.index.get_loc(s.first_valid_index())-1:]

b    NaN
c    1.0
dtype: float64

This works as long as your index values are unique.
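
One edge case worth guarding: if the first valid value is already at position 0, `get_loc(...) - 1` evaluates to `-1`, and a slice starting at `-1` returns only the last row. A minimal sketch of a guarded version follows. It uses `.iloc` so the slice is unambiguously positional; the helper name `from_before_first_valid` is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def from_before_first_valid(s):
    """Return s starting one row before its first non-NA value."""
    label = s.first_valid_index()
    if label is None:
        # No valid values at all: nothing to anchor on, return an empty slice.
        return s.iloc[0:0]
    pos = s.index.get_loc(label)  # an int only when the label is unique
    # Clamp at 0 so a valid first row does not turn into a slice from -1.
    return s.iloc[max(pos - 1, 0):]


s = pd.Series([np.nan, np.nan, 1], index=['a', 'b', 'c'])
result = from_before_first_valid(s)    # rows 'b' and 'c'

t = pd.Series([1, np.nan, 2], index=['a', 'b', 'c'])
result_t = from_before_first_valid(t)  # all three rows
```

Like the one-liner above, this sketch still assumes unique index labels, because `get_loc` returns a slice or a boolean mask when the label is duplicated.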

To handle blanks, use replace to turn them into NaN first:

s2 = pd.Series(['', np.nan, 1], index=['a', 'b', 'c'])
s2[s2.index.get_loc(s2.replace('', np.nan).first_valid_index())-1:]

b    NaN
c      1
dtype: object
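
If the blanks may also be whitespace-only strings such as `' '`, `replace('', np.nan)` will not catch them. The following is a sketch under that assumption: it builds a validity mask and works purely by position, so it does not require unique index labels. The names `from_before_first_nonblank` and `is_blank` are hypothetical, chosen for this example.

```python
import numpy as np
import pandas as pd


def from_before_first_nonblank(s):
    """Return s starting one row before its first non-blank, non-NA value."""
    # Treat NA and empty or whitespace-only strings as blank.
    is_str_blank = s.map(lambda v: isinstance(v, str) and v.strip() == '')
    is_blank = s.isna() | is_str_blank.astype(bool)
    valid = (~is_blank).to_numpy()
    if not valid.any():
        # Every cell is blank: return an empty slice.
        return s.iloc[0:0]
    first = int(valid.argmax())  # position of the first True
    return s.iloc[max(first - 1, 0):]


s2 = pd.Series(['', np.nan, 1], index=['a', 'b', 'c'])
result = from_before_first_nonblank(s2)      # rows 'b' and 'c'

# Duplicate labels are fine because only positions are used.
s3 = pd.Series([' ', '', 'x', ''], index=['a', 'a', 'b', 'b'])
result_dup = from_before_first_nonblank(s3)  # the last three rows
```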

Answer 2

I would use idxmax together with bfill:

s[s.loc[:s.idxmax()].bfill(limit=1).notna()]

b    NaN
c    1.0
dtype: float64
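
Note that `idxmax` returns the label of the largest value, which in the example happens to be the only valid one. When the maximum is not in the last row, the mask built from `s.loc[:s.idxmax()]` no longer covers the whole index. Below is a sketch of a mask-based variant that does not depend on where the maximum sits, assuming that NaN gaps after the first valid value should be kept: back-fill by one step, then use `cummax` to keep everything from the first True onward.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, 1], index=list('abcdef'))

# After a one-step back-fill, the row just before each valid value is non-NA.
near_valid = s.bfill(limit=1).notna()
# cummax switches the mask on at the first True and keeps it on afterwards,
# so NaN gaps later in the series are not dropped.
mask = near_valid.cummax()
result = s[mask]  # rows 'b' through 'f'
```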

Get a subset of a column starting from before the first non-blank value
