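I have a CSV file with data. Link is here. It is a time series for 2013 at 5-minute granularity. However, values are missing for some timestamps.
I want to create a time series at 5-minute intervals, with a value of zero for the missing timestamps.
Please advise how to do this in pandas or Python.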
Answer 0: (score: 0)
In pandas, you just join on the index:
from io import StringIO
import numpy as np
import pandas
ts1_string = StringIO("""\
V1,V2
01/01/2013 00:05:00,10
01/01/2013 00:10:00,6
01/01/2013 00:15:00,10
01/01/2013 00:25:00,8
01/01/2013 00:30:00,11
01/01/2013 00:35:00,7""")
ts2_string = StringIO("""\
V1,V2
2013-01-01 00:00:00,0
2013-01-01 00:05:00,0
2013-01-01 00:10:00,0
2013-01-01 00:15:00,0
2013-01-01 00:20:00,0
2013-01-01 00:25:00,0""")
# parse the V1 column as datetimes and use it as the index of each frame
ts1 = pandas.read_csv(ts1_string, parse_dates=True, index_col='V1')
ts2 = pandas.read_csv(ts2_string, parse_dates=True, index_col='V1')
# here's where the join happens: a left join on the index, so only ts1's timestamps are kept
# (suffixes deal with overlapping column names; lsuffix applies to ts1, rsuffix to ts2)
ts_joined = ts1.join(ts2, lsuffix='_ts1', rsuffix='_ts2')
# and finally
print(ts_joined.head())
which gives:
                     V2_ts1  V2_ts2
V1
2013-01-01 00:05:00      10       0
2013-01-01 00:10:00       6       0
2013-01-01 00:15:00      10       0
2013-01-01 00:25:00       8       0
2013-01-01 00:30:00      11     NaN
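Note that the join above is a left join on ts1, so a timestamp that ts1 lacks (00:20 here) is still absent from the result. If the goal is a zero for every missing 5-minute slot, one possible alternative is to reindex the data onto a complete 5-minute grid. The following is only a minimal sketch under some assumptions: the sample rows are the ones from the answer, the dates are taken to be month-first (`%m/%d/%Y`), and the names `csv_text`, `full_index`, and `filled` are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the answer above; 00:20 is the missing slot.
csv_text = StringIO("""\
V1,V2
01/01/2013 00:05:00,10
01/01/2013 00:10:00,6
01/01/2013 00:15:00,10
01/01/2013 00:25:00,8
01/01/2013 00:30:00,11
01/01/2013 00:35:00,7""")

ts = pd.read_csv(csv_text)
# Assumption: the timestamps are month-first; adjust the format if they are not.
ts["V1"] = pd.to_datetime(ts["V1"], format="%m/%d/%Y %H:%M:%S")
ts = ts.set_index("V1")

# Build the complete 5-minute grid between the first and last observed timestamps.
full_index = pd.date_range(ts.index.min(), ts.index.max(), freq="5min", name="V1")

# Reindex onto the grid; timestamps that were absent from the data get 0.
filled = ts.reindex(full_index, fill_value=0)
print(filled)
```

To cover the whole of 2013 rather than just the observed range, the start and end passed to `pd.date_range` could be set explicitly. `ts.asfreq("5min", fill_value=0)` should give the same result between the first and last observed timestamps.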