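I have a series of data sampled roughly every 2 to 3 minutes. Sometimes there are huge gaps in the data, for example several hours, because someone shut down the monitoring software. Wherever at least 5 minutes of data are missing, I want to fill the gap with an invalid marker so that I can present the data accordingly. How can I do this?
Edit: for example, this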
Time a b c d
12:01 1 1 0 1
12:10 1 0 0 0
should become something like
Time a b c d
12:01 1 1 0 1
12:06 -1 -1 -1 -1 or None or NaN
12:10 1 0 0 0
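That way, the parts of the graph with no data are shown greyed out instead of the discontinuous data simply being joined up.
I'm not sure what the invalid marker should be, i.e. what pandas prefers or what works well with plotting libraries.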
Answer 0: (score: 0)
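I'm assuming you want a NaN every few minutes rather than a single NaN, and that you don't mind NaN rows also being added where there is no gap, as long as they are added inside the gaps too. Let me know if this solution fits your needs: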
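# Imports
from datetime import datetime
import numpy as np
import pandas as pd
# Setup: seven samples at irregular minutes between 12:00 and 12:25
initial_index = [datetime(2014, 4, 21, 12, x) for x in [0, 5, 8, 14, 18, 21, 25]]
columns = ['A', 'B', 'C', 'D']
df = pd.DataFrame(np.random.randn(7, 4), index=initial_index, columns=columns)
# The actual solution: merge the original timestamps with a regular 5-minute grid.
# The grid carries the same date as the data; a bare '12:00:00' would default to today's date.
regular_interval_index = pd.date_range('2014-04-21 12:00:00', '2014-04-21 13:00:00', freq='5min')
# Index.union is a set union; in current pandas `+` on two indexes is not a union
df_reindexed = df.reindex(df.index.union(regular_interval_index))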
The difference:
print(df)
A B C D
2014-04-21 12:00:00 0.422272 0.539352 -0.401912 0.163993
2014-04-21 12:05:00 0.896098 -0.396894 -1.356148 0.724784
2014-04-21 12:08:00 -0.882721 -0.820098 0.154705 -0.706515
2014-04-21 12:14:00 -0.008495 -0.326866 1.115965 -1.559558
2014-04-21 12:18:00 0.117228 0.030347 1.049639 -0.536378
2014-04-21 12:21:00 -0.762874 -1.592967 -0.088216 -0.897630
2014-04-21 12:25:00 -0.483685 1.298545 -0.008885 -0.481165
[7 rows x 4 columns]
print(df_reindexed)
A B C D
2014-04-21 12:00:00 0.422272 0.539352 -0.401912 0.163993
2014-04-21 12:05:00 0.896098 -0.396894 -1.356148 0.724784
2014-04-21 12:08:00 -0.882721 -0.820098 0.154705 -0.706515
2014-04-21 12:10:00 NaN NaN NaN NaN
2014-04-21 12:14:00 -0.008495 -0.326866 1.115965 -1.559558
2014-04-21 12:15:00 NaN NaN NaN NaN
2014-04-21 12:18:00 0.117228 0.030347 1.049639 -0.536378
2014-04-21 12:20:00 NaN NaN NaN NaN
2014-04-21 12:21:00 -0.762874 -1.592967 -0.088216 -0.897630
2014-04-21 12:25:00 -0.483685 1.298545 -0.008885 -0.481165
2014-04-21 12:30:00 NaN NaN NaN NaN
2014-04-21 12:35:00 NaN NaN NaN NaN
2014-04-21 12:40:00 NaN NaN NaN NaN
2014-04-21 12:45:00 NaN NaN NaN NaN
2014-04-21 12:50:00 NaN NaN NaN NaN
2014-04-21 12:55:00 NaN NaN NaN NaN
2014-04-21 13:00:00 NaN NaN NaN NaN
[17 rows x 4 columns]
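If you would rather get a NaN row only where a gap is actually too long, as in the 12:01 / 12:06 / 12:10 example from the question, something along these lines might work. This is only a sketch: `insert_gap_markers` and `max_gap` are names made up for illustration, not part of pandas, and it assumes the index is a `DatetimeIndex` with unique timestamps. It places a single marker row `max_gap` after the last good sample before each gap.

```python
import pandas as pd


def insert_gap_markers(df, max_gap=pd.Timedelta(minutes=5)):
    """Return df with one all-NaN row inside every gap longer than max_gap.

    Hypothetical helper; assumes a DatetimeIndex with unique timestamps.
    """
    df = df.sort_index()
    idx = df.index
    # Time elapsed between each timestamp and the one that follows it
    deltas = idx[1:] - idx[:-1]
    # True where the wait until the next sample exceeds the threshold
    is_gap = deltas > max_gap
    # One marker per gap, placed max_gap after the sample that precedes it
    marker_times = idx[:-1][is_gap] + max_gap
    # Reindexing on the union adds the marker rows, filled with NaN
    return df.reindex(idx.union(marker_times))


# The data from the question: a 9-minute hole between 12:01 and 12:10
question_df = pd.DataFrame(
    {'a': [1, 1], 'b': [1, 0], 'c': [0, 0], 'd': [1, 0]},
    index=pd.to_datetime(['2014-04-21 12:01', '2014-04-21 12:10']),
)
marked = insert_gap_markers(question_df)
print(marked)
```

NaN is probably the most convenient marker here: pandas treats it as missing data, and matplotlib line plots leave a break at NaN values instead of joining the points on either side. Note that integer columns become float once NaN rows are added.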