Python Pandas Upsampling - Complicated problem: preserving some np.nan values

Time: 2018-02-27 00:13:11

Tags: python pandas time-series resampling

I have a DataFrame that looks like this:

              A        B
2010-01-01    6.5     3.2
2010-02-01    7.2     NaN
2010-03-01    8.1     NaN
2010-04-01    4.3     5.6
2010-05-01    3.7     6.1

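I want to upsample to daily frequency and forward-fill the values. In the case of df['B'], however, I want the forward fill to stop where the NaN values begin. I am looking for the following: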

               A       B
2010-01-01    6.5     3.2
2010-01-02    6.5     3.2
 ....
2010-01-31    6.5     3.2
2010-02-01    7.2     NaN
2010-02-02    7.2     NaN
 ....
2010-02-28    7.2     NaN
2010-03-01    8.1     NaN
2010-03-02    8.1     NaN
 ....
2010-03-31    8.1     NaN
2010-04-01    4.3     5.6
2010-04-02    4.3     5.6
 ....
2010-04-30    4.3     5.6
2010-05-01    3.7     6.1
2010-05-02    3.7     6.1
 ....
2010-05-31    3.7     6.1

If I apply the following code:

df['A'] = df['A'].resample('D').ffill()
df['B'] = df['B'].resample('D').ffill()

My result is as follows:

               A       B
2010-01-01    6.5     3.2
2010-01-02    6.5     3.2
 ....
2010-01-31    6.5     3.2
2010-02-01    7.2     3.2
2010-02-02    7.2     3.2
 ....   
2010-02-28    7.2     3.2
2010-03-01    8.1     3.2
2010-03-02    8.1     3.2
 ....
2010-03-31    8.1     3.2
2010-04-01    4.3     5.6
2010-04-02    4.3     5.6
 ....
2010-04-30    4.3     5.6
2010-05-01    3.7     6.1
2010-05-02    3.7     6.1
 ....
2010-05-31    3.7     6.1

df['B'] has been filled with the value 3.2 from 2010-01-01 through 2010-03-31, instead of "stopping" at 2010-01-31 and keeping NaN from 2010-02-01 through 2010-03-31.

I know I could do this with a very messy iterative process, but is there a simpler way that I am not seeing?

Thanks.

1 Answer:

Answer 0: (score: 0)

You can replace the NaN values with a placeholder value, and then replace the placeholder with np.nan again after the resample:
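import numpy as np
# Mark the real NaNs with a sentinel, upsample and forward-fill, then restore them
df = df.fillna('replaceNAN')
s = df.resample('D').ffill().replace('replaceNAN', np.nan)
# Show only the rows that still contain NaN
s.loc[s.isnull().any(axis=1)]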
Out[456]: 
              A    B
2010-02-01  7.2  NaN
2010-02-02  7.2  NaN
2010-02-03  7.2  NaN
2010-02-04  7.2  NaN
2010-02-05  7.2  NaN
2010-02-06  7.2  NaN
2010-02-07  7.2  NaN
2010-02-08  7.2  NaN
2010-02-09  7.2  NaN
2010-02-10  7.2  NaN
2010-02-11  7.2  NaN
2010-02-12  7.2  NaN
2010-02-13  7.2  NaN
2010-02-14  7.2  NaN
2010-02-15  7.2  NaN
2010-02-16  7.2  NaN
2010-02-17  7.2  NaN
2010-02-18  7.2  NaN
2010-02-19  7.2  NaN
2010-02-20  7.2  NaN
2010-02-21  7.2  NaN
2010-02-22  7.2  NaN
2010-02-23  7.2  NaN
2010-02-24  7.2  NaN
2010-02-25  7.2  NaN
2010-02-26  7.2  NaN
2010-02-27  7.2  NaN
2010-02-28  7.2  NaN
2010-03-01  8.1  NaN
2010-03-02  8.1  NaN
2010-03-03  8.1  NaN
2010-03-04  8.1  NaN
2010-03-05  8.1  NaN
2010-03-06  8.1  NaN
2010-03-07  8.1  NaN
2010-03-08  8.1  NaN
2010-03-09  8.1  NaN
2010-03-10  8.1  NaN
2010-03-11  8.1  NaN
2010-03-12  8.1  NaN
2010-03-13  8.1  NaN
2010-03-14  8.1  NaN
2010-03-15  8.1  NaN
2010-03-16  8.1  NaN
2010-03-17  8.1  NaN
2010-03-18  8.1  NaN
2010-03-19  8.1  NaN
2010-03-20  8.1  NaN
2010-03-21  8.1  NaN
2010-03-22  8.1  NaN
2010-03-23  8.1  NaN
2010-03-24  8.1  NaN
2010-03-25  8.1  NaN
2010-03-26  8.1  NaN
2010-03-27  8.1  NaN
2010-03-28  8.1  NaN
2010-03-29  8.1  NaN
2010-03-30  8.1  NaN
2010-03-31  8.1  NaN
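
As a possible alternative that avoids the string placeholder (which turns column B into an object column), here is a minimal sketch that builds the daily index explicitly and calls reindex with method="ffill". According to the pandas documentation for reindex, this kind of fill only compares the old and new index labels, so NaN values already present in the original frame are carried over rather than filled. The names end, daily_index and daily are hypothetical, and extending the range to the end of the last month is an assumption made only to match the desired output in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question: month-start index, two NaNs in B.
df = pd.DataFrame(
    {"A": [6.5, 7.2, 8.1, 4.3, 3.7], "B": [3.2, np.nan, np.nan, 5.6, 6.1]},
    index=pd.date_range("2010-01-01", periods=5, freq="MS"),
)

# Assumption: the last month should run through its final day,
# as in the desired output shown in the question.
end = df.index.max() + pd.offsets.MonthEnd(0)
daily_index = pd.date_range(df.index.min(), end, freq="D")

# For each new date, take the row at the last original date at or before it.
# Existing NaNs are copied as they are, not overwritten by earlier values.
daily = df.reindex(daily_index, method="ffill")

# Inspect the boundary where B should switch from 3.2 to NaN.
print(daily.loc["2010-01-30":"2010-02-02"])
```

Both columns keep their float dtype here, and no sentinel has to be chosen that is guaranteed not to occur in the data.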