I have a dataframe that looks like this:
value timestamp
18.832939 2019-03-04 12:37:26 UTC
18.832939 2019-03-04 12:38:26 UTC
18.832939 2019-03-04 12:39:27 UTC
18.955200 2019-03-04 12:40:28 UTC
18.784912 2019-03-04 12:44:32 UTC
18.784912 2019-03-04 12:45:33 UTC
20.713936 2019-03-04 17:59:36 UTC
20.871742 2019-03-04 18:08:31 UTC
20.871742 2019-03-04 18:09:32 UTC
20.873871 2019-03-04 18:10:32 UTC
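(For reference, a minimal construction of this frame, with the values copied from the table above, so the snippets below can be run as-is; modeling the UTC marker as a tz-aware column is an assumption.)

import pandas as pd

# Reproduce the sample frame; values and timestamps copied from the table above.
df = pd.DataFrame({
    'value': [18.832939, 18.832939, 18.832939, 18.955200, 18.784912,
              18.784912, 20.713936, 20.871742, 20.871742, 20.873871],
    'timestamp': pd.to_datetime([
        '2019-03-04 12:37:26', '2019-03-04 12:38:26', '2019-03-04 12:39:27',
        '2019-03-04 12:40:28', '2019-03-04 12:44:32', '2019-03-04 12:45:33',
        '2019-03-04 17:59:36', '2019-03-04 18:08:31', '2019-03-04 18:09:32',
        '2019-03-04 18:10:32',
    ], utc=True),  # utc=True makes the column timezone-aware
})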
I would like to get the following result, in which every gap larger than 2 minutes but smaller than 15 minutes (2 min < gap < 15 min) is identified and filled with NaN rows at a 1-minute frequency:

value timestamp
18.832939 2019-03-04 12:37:26 UTC
18.832939 2019-03-04 12:38:26 UTC
18.832939 2019-03-04 12:39:27 UTC
18.955200 2019-03-04 12:40:28 UTC
NaN 2019-03-04 12:41:28 UTC
NaN 2019-03-04 12:42:28 UTC
NaN 2019-03-04 12:43:28 UTC
18.784912 2019-03-04 12:44:32 UTC
18.784912 2019-03-04 12:45:33 UTC
20.713936 2019-03-04 17:59:36 UTC
NaN 2019-03-04 18:00:36 UTC
NaN 2019-03-04 18:01:36 UTC
NaN 2019-03-04 18:02:36 UTC
NaN 2019-03-04 18:03:36 UTC
NaN 2019-03-04 18:04:36 UTC
NaN 2019-03-04 18:05:36 UTC
NaN 2019-03-04 18:06:36 UTC
NaN 2019-03-04 18:07:36 UTC
20.871742 2019-03-04 18:08:31 UTC
20.871742 2019-03-04 18:09:32 UTC
20.873871 2019-03-04 18:10:32 UTC
This means that, to achieve that goal, I have to do two things: first identify the qualifying gaps, and then fill them with NaN rows at a 1-minute frequency.

I can do the first part with this:
df['aux_1'] = ((df['timestamp'].diff() > pd.Timedelta('2min')) &
               (df['timestamp'].diff() < pd.Timedelta('15min'))).astype(int)  # flags the row that ends a gap
df['aux_2'] = df['aux_1'].shift(-1)  # flags the row that begins a gap
df['intervals'] = df['aux_1'] + df['aux_2']  # start and end flags combined in a single column

However, I am not sure how to do the second part, at least not in a "pandas-like" way. Ideally I would somehow identify the start and end of each timestamp gap I intend to fill, apply asfreq('1m'), and then use that vector to fill the intervals I want. I am just not sure how to do this properly.

Can someone help me? Thanks in advance.
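(A minimal sketch of that idea, for reference; it is not from the original post. It assumes df is the frame shown at the top, and the names gap_ends and fill_stamps are illustrative.)

import pandas as pd

s = df.set_index('timestamp')['value']
diffs = s.index.to_series().diff()
# rows whose distance to the previous row is strictly between 2 and 15 minutes
gap_ends = diffs[(diffs > pd.Timedelta('2min')) & (diffs < pd.Timedelta('15min'))].index

fill_stamps = s.index
for end in gap_ends:
    start = s.index[s.index.get_loc(end) - 1]  # last real row before the gap
    fill_stamps = fill_stamps.union(pd.date_range(start, end, freq='min'))

result = s.reindex(fill_stamps)  # the newly inserted stamps get NaN values

Note that, like the answer below, this generates every whole-minute stamp up to the next real row, so it also produces 2019-03-04 12:44:28, which the hand-written target table above omits.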
Answer 0 (score: 1)
Not very pandas-like, but I would do the following.
new_timestamp = []
for i, row in df.iterrows():  # assumes the default RangeIndex, so df.iloc[i + 1] is the next row
    if row['aux_2'] == 1:  # this row begins a gap: emit 1-minute stamps up to the next real row
        new_timestamp += pd.date_range(row['timestamp'], df.iloc[i + 1]['timestamp'], freq='min').to_list()
    else:  # no gap begins here (also covers the NaN that shift(-1) leaves on the last row)
        new_timestamp.append(row['timestamp'])
new_df = df.set_index('timestamp')
new_df = new_df.reindex(new_timestamp)  # .loc with missing labels raises KeyError in recent pandas; reindex inserts NaN rows instead
This results in:
print(new_df)
                               value      aux_1  aux_2  intervals
timestamp
2019-03-04 12:37:26+00:00 18.832939 0.0 0.0 0.0
2019-03-04 12:38:26+00:00 18.832939 0.0 0.0 0.0
2019-03-04 12:39:27+00:00 18.832939 0.0 0.0 0.0
2019-03-04 12:40:28+00:00 18.955200 0.0 1.0 1.0
2019-03-04 12:41:28+00:00 NaN NaN NaN NaN
2019-03-04 12:42:28+00:00 NaN NaN NaN NaN
2019-03-04 12:43:28+00:00 NaN NaN NaN NaN
2019-03-04 12:44:28+00:00 NaN NaN NaN NaN
2019-03-04 12:44:32+00:00 18.784912 1.0 0.0 1.0
2019-03-04 12:45:33+00:00 18.784912 0.0 0.0 0.0
2019-03-04 17:59:36+00:00 20.713936 0.0 1.0 1.0
2019-03-04 18:00:36+00:00 NaN NaN NaN NaN
2019-03-04 18:01:36+00:00 NaN NaN NaN NaN
2019-03-04 18:02:36+00:00 NaN NaN NaN NaN
2019-03-04 18:03:36+00:00 NaN NaN NaN NaN
2019-03-04 18:04:36+00:00 NaN NaN NaN NaN
2019-03-04 18:05:36+00:00 NaN NaN NaN NaN
2019-03-04 18:06:36+00:00 NaN NaN NaN NaN
2019-03-04 18:07:36+00:00 NaN NaN NaN NaN
2019-03-04 18:08:31+00:00 20.871742 1.0 0.0 1.0
2019-03-04 18:09:32+00:00 20.871742 0.0 0.0 0.0
2019-03-04 18:10:32+00:00 20.873871 0.0 NaN NaN