Question

For example, I have some data like this:

import numpy as np
import pandas as pd
column = pd.Series([1,2,3,np.nan,4,np.nan,7])
print(column)

Running this gives the following result:

0    1.0
1    2.0
2    3.0
3    NaN
4    4.0
5    NaN
6    7.0
dtype: float64

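Now I want to know the first value before each NaN, for example 3.0 before the first NaN and 4.0 before the second NaN. Is there a built-in function in pandas that does this, or should I write a for loop?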

Answer 1

Solution for non-consecutive NaNs: you can use boolean indexing with a mask created by isnull, shift and fillna:

print (column[column.isnull().shift(-1).fillna(False)])
2    3.0
4    4.0
dtype: float64

print (column.isnull())
0    False
1    False
2    False
3     True
4    False
5     True
6    False
dtype: bool

print (column.isnull().shift(-1))
0    False
1    False
2     True
3    False
4     True
5    False
6      NaN
dtype: object

print (column.isnull().shift(-1).fillna(False))
0    False
1    False
2     True
3    False
4     True
5    False
6    False
dtype: bool

For consecutive NaNs, the shifted mask also needs to be multiplied by the inverted c using mul:

column = pd.Series([np.nan,2,3,np.nan,np.nan,np.nan,7,np.nan, np.nan, 5,np.nan])

c = column.isnull()
mask = c.shift(-1).fillna(False).mul(~c)
print (mask)
0     False
1     False
2      True
3     False
4     False
5     False
6      True
7     False
8     False
9      True
10    False
dtype: bool

print (column[mask])
2    3.0
6    7.0
9    5.0
dtype: float64
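The two masks above can be folded into one small helper. This is only a sketch, not part of the original answer: the function name values_before_nan is hypothetical, it uses `&` instead of mul to combine the masks, and it assumes a pandas version where shift accepts fill_value (0.24 or later), which keeps the shifted mask boolean without a separate fillna step.

```python
import numpy as np
import pandas as pd


def values_before_nan(s: pd.Series) -> pd.Series:
    """Return the non-NaN values that are immediately followed by a NaN.

    Hypothetical helper for illustration; the name is not from the answer.
    """
    is_nan = s.isna()
    # True where the NEXT element is NaN; the last position has no successor,
    # so it is filled with False.
    next_is_nan = is_nan.shift(-1, fill_value=False)
    # Keep only positions that are themselves not NaN (handles runs of NaNs).
    return s[next_is_nan & ~is_nan]


simple = pd.Series([1, 2, 3, np.nan, 4, np.nan, 7])
runs = pd.Series([np.nan, 2, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan, 5, np.nan])

print(values_before_nan(simple))
print(values_before_nan(runs))
```

The original index labels are preserved, so the result also shows where each value sits in the column.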

Answer 2

Same idea as @jezrael... numpy-fied.

column[np.append(np.isnan(column.values)[1:], False)]

2    3.0
4    4.0
dtype: float64

With a full pd.Series reconstruction:

m = np.append(np.isnan(column.values)[1:], False)
pd.Series(column.values[m], column.index[m])

2    3.0
4    4.0
dtype: float64
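The numpy mask as written selects any element whose successor is NaN, so with runs of NaNs it would also pick up NaNs themselves. A possible extension, offered as a sketch rather than as the answerer's code, is to additionally require that the element itself is not NaN. The variable names here are illustrative.

```python
import numpy as np
import pandas as pd

column = pd.Series([np.nan, 2, 3, np.nan, np.nan, np.nan, 7, np.nan, np.nan, 5, np.nan])

values = column.to_numpy()
is_nan = np.isnan(values)
# Shift the NaN flags left by one; the last element has no successor.
next_is_nan = np.append(is_nan[1:], False)
# Require a NaN successor AND a non-NaN value at the position itself.
mask = next_is_nan & ~is_nan

result = pd.Series(values[mask], index=column.index[mask])
print(result)
```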

Not as fast, but intuitive: group by the cumsum of isnull and take the last value of each group. Then drop the last row of that result.
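The code for the grouping approach did not survive on this page, so the following is a reconstruction from the description only and may differ from what the answerer wrote. Each NaN starts a new group via the cumulative sum of isnull, and last() returns the last non-NaN value in each group. The trailing dropna() is an addition that is not in the description; it discards groups made up of a single NaN, which appear when NaNs are consecutive.

```python
import numpy as np
import pandas as pd

column = pd.Series([1, 2, 3, np.nan, 4, np.nan, 7])

# Every NaN increments the counter, so each group ends right before a NaN.
group_id = column.isnull().cumsum()
# last() skips NaN and returns the last valid value per group.
last_per_group = column.groupby(group_id).last()
# The final group is not followed by a NaN, so drop it; dropna() (an
# assumption, see above) removes all-NaN groups from consecutive NaNs.
result = last_per_group.iloc[:-1].dropna()

print(result.tolist())  # [3.0, 4.0]
```

Unlike the mask-based versions, the result is indexed by group number rather than by the original position in the column.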

How to find the first non-NaN value before a NaN in a pandas column
