Filling a DataFrame where a minimum interval is not met

Time: 2014-04-21 05:44:22

Tags: python pandas

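I have a series of data sampled roughly every 2 to 3 minutes. Sometimes there are huge gaps in the data, say several hours, because someone shut down the monitoring software. Wherever there is a gap of 5 minutes of missing data, I want to fill it with an invalid marker so that I can present the data accordingly. How can I do this?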

Edit: for example,

Time a b c d
12:01 1 1 0 1
12:10 1 0 0 0

would become something like

Time a b c d
12:01 1 1 0 1
12:06 -1 -1 -1 -1 or None or NaN
12:10 1 0 0 0

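That way, the parts of the chart without data show up grayed out, instead of the discontinuous data simply being joined together.

I'm not sure what the invalid marker should be, what pandas prefers, or what works well with plotting libraries.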

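1 Answer:

Answer 0: (score: 0)

I'm assuming that you want a NaN every few minutes rather than just a single NaN, and that you don't mind NaNs being added where there are no gaps, as long as they are also added in the gaps. Let me know if this solution fits your needs: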

# Imports

from datetime import datetime

import numpy as np
import pandas as pd

# Setup

initial_index = [datetime(2014,4,21,12,x) for x in [0,5,8,14,18,21,25]]
columns = ['A','B','C','D']

df = pd.DataFrame(np.random.randn(7,4), index=initial_index, columns=columns)

# The actual solution

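# Build the 5-minute grid on the same date as the data; a bare '12:00:00' would resolve to today's date
regular_interval_index = pd.date_range('2014-04-21 12:00:00', '2014-04-21 13:00:00', freq='5min')

# Index.union merges the two indexes; `+` no longer performs a set union in current pandas
df_reindexed = df.reindex(df.index.union(regular_interval_index))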

The difference:

print(df)
                            A         B         C         D
2014-04-21 12:00:00  0.422272  0.539352 -0.401912  0.163993
2014-04-21 12:05:00  0.896098 -0.396894 -1.356148  0.724784
2014-04-21 12:08:00 -0.882721 -0.820098  0.154705 -0.706515
2014-04-21 12:14:00 -0.008495 -0.326866  1.115965 -1.559558
2014-04-21 12:18:00  0.117228  0.030347  1.049639 -0.536378
2014-04-21 12:21:00 -0.762874 -1.592967 -0.088216 -0.897630
2014-04-21 12:25:00 -0.483685  1.298545 -0.008885 -0.481165

[7 rows x 4 columns]

print(df_reindexed)
                            A         B         C         D
2014-04-21 12:00:00  0.422272  0.539352 -0.401912  0.163993
2014-04-21 12:05:00  0.896098 -0.396894 -1.356148  0.724784
2014-04-21 12:08:00 -0.882721 -0.820098  0.154705 -0.706515
2014-04-21 12:10:00       NaN       NaN       NaN       NaN
2014-04-21 12:14:00 -0.008495 -0.326866  1.115965 -1.559558
2014-04-21 12:15:00       NaN       NaN       NaN       NaN
2014-04-21 12:18:00  0.117228  0.030347  1.049639 -0.536378
2014-04-21 12:20:00       NaN       NaN       NaN       NaN
2014-04-21 12:21:00 -0.762874 -1.592967 -0.088216 -0.897630
2014-04-21 12:25:00 -0.483685  1.298545 -0.008885 -0.481165
2014-04-21 12:30:00       NaN       NaN       NaN       NaN
2014-04-21 12:35:00       NaN       NaN       NaN       NaN
2014-04-21 12:40:00       NaN       NaN       NaN       NaN
2014-04-21 12:45:00       NaN       NaN       NaN       NaN
2014-04-21 12:50:00       NaN       NaN       NaN       NaN
2014-04-21 12:55:00       NaN       NaN       NaN       NaN
2014-04-21 13:00:00       NaN       NaN       NaN       NaN

[17 rows x 4 columns]
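
If you would rather have a single NaN row only where a gap actually exceeds the threshold (as in the question's 12:01 / 12:06 / 12:10 example), something along these lines might work. This is only a sketch: `mark_gaps` and its `max_gap` parameter are names made up for illustration, and it assumes the index is a sorted `DatetimeIndex` without duplicates. NaN is used as the marker because pandas treats it as missing data.

```python
import pandas as pd


def mark_gaps(df, max_gap="5min"):
    """Insert one all-NaN row into every gap longer than max_gap.

    Hypothetical helper; assumes df.index is a sorted, unique DatetimeIndex.
    """
    max_gap = pd.Timedelta(max_gap)
    index = df.index
    # Elementwise difference between each timestamp and the one before it.
    too_long = (index[1:] - index[:-1]) > max_gap
    # Put the marker max_gap after the last sample preceding each long gap.
    markers = index[:-1][too_long] + max_gap
    # Reindexing on the union leaves the new marker rows filled with NaN.
    return df.reindex(index.union(markers))


# The example from the question (the date is an assumption).
example = pd.DataFrame(
    {"a": [1, 1], "b": [1, 0], "c": [0, 0], "d": [1, 0]},
    index=pd.to_datetime(["2014-04-21 12:01", "2014-04-21 12:10"]),
)
marked = mark_gaps(example)
print(marked)
```

With the question's data this adds one row at 12:06 containing NaN in every column. Note that the integer columns become float, since NaN cannot be stored in a plain integer column.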