My question is about .asfreq, which is part of the pandas DataFrame. I have stock data in files named after each stock's ticker. A file with data looks like this:
uri:/instrument/1.0/AAPL/chartdata;type=quote;range=1d/csv
ticker:aapl
Company-Name:Apple Inc.
Exchange-Name:NMS
unit:MIN
timezone:EST
currency:USD
gmtoffset:-18000
previous_close:114.6300
Timestamp:1417617000,1417640400
labels:1417618800,1417622400,1417626000,1417629600,1417633200,1417636800,1417640400
values:Timestamp,close,high,low,open,volume
close:115.1500,116.2500
high:115.2200,116.3500
low:115.1100,116.2000
open:115.1425,116.2450
volume:13400,3646700
1417617011,115.7498,115.8100,115.5707,115.7150,1622500
1417617060,115.6300,115.7500,115.5000,115.7300,284000
1417617179,115.3990,115.6600,115.3600,115.6500,349600
1417617180,115.6050,115.6400,115.3700,115.3990,300400
1417617299,115.7099,115.7700,115.6000,115.6401,279200
…
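For readers who want to reproduce the setup, here is a minimal sketch of how a file in this layout could be read. It is not the function used in the question (that code is not shown); the name read_chart_text is hypothetical, and the sketch assumes that header lines always contain a colon, that quote rows never do, and that the Timestamp column holds epoch seconds, which are read as UTC here without any time zone shift.

```python
import pandas as pd

# Shortened copy of the file shown above (header lines trimmed for brevity).
SAMPLE = """\
ticker:aapl
gmtoffset:-18000
values:Timestamp,close,high,low,open,volume
close:115.1500,116.2500
1417617011,115.7498,115.8100,115.5707,115.7150,1622500
1417617060,115.6300,115.7500,115.5000,115.7300,284000
"""


def read_chart_text(text):
    """Split 'key:value' header lines from comma-separated quote rows."""
    header, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ":" in line:
            # Header line: everything before the first colon is the key.
            key, _, value = line.partition(":")
            header[key] = value
        else:
            # Quote row: no colon, fields separated by commas.
            rows.append(line.split(","))
    # The 'values' header line names the columns of the quote rows.
    columns = header["values"].split(",")
    frame = pd.DataFrame(rows, columns=columns)
    # Assumption: Timestamp holds epoch seconds; they are read as UTC here.
    stamps = frame.pop("Timestamp").astype("int64")
    frame.index = pd.to_datetime(stamps, unit="s")
    return header, frame.astype(float)


header, quotes = read_chart_text(SAMPLE)
print(header["ticker"])
print(quotes["close"])
```

Any conversion to local time (for example using the gmtoffset header) is left out on purpose, since the question does not say how its timestamps were shifted.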
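My function takes a list of tickers (e.g. [AAPL, NGF15]), a data type (e.g. 'close') and a time range (e.g. ['2014-12-03 15:29:00', '2014-12-03 16:31:00']), pulls the matching data, and stores it in a nested dictionary called data. After I call the function, the nested dictionary looks like this: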
{'AAPL': {'2014-12-03 16:03:00': '115.4200', '2014-12-03 15:31:00': '115.6300', '2014-12-03 15:51:00': '116.1100', '2014-12-03 16:08:00': '115.4100', ...}, 'NGF15': {'2014-12-03 16:02:52': '3.8170', '2014-12-03 16:14:58': '3.8000', '2014-12-03 15:53:58': '3.8010', '2014-12-03 15:33:59': '3.7930', '2014-12-03 15:59:58': '3.8110', '2014-12-03 16:15:00': '3.8040', ...}}
Then the code goes like this:
a=DataFrame(data=data)
a.index.name = 'vrime'
The DataFrame looks like this:
AAPL NGF15
vrime
2014-12-03 15:29:59 NaN 3.7870
2014-12-03 15:30:11 115.7498 NaN
2014-12-03 15:30:54 NaN 3.7880
2014-12-03 15:31:00 115.6300 NaN
2014-12-03 15:31:57 NaN 3.7880
2014-12-03 15:32:58 NaN 3.7920
…
2014-12-03 16:21:59 115.5900 3.8090
…
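So I want to change the data frequency to every 15 seconds, so that the price at a given time (e.g. 15:30:15) is the last price of each ticker up to that time.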
a.index = pd.to_datetime(a.index)
print a.asfreq('15s', method='pad', how={'2014-12-03 15:30:00', '2014-12-03 16:30:00'})
But my result looks like this:
AAPL NGF15
2014-12-03 15:29:59 NaN 3.7870
2014-12-03 15:30:14 115.7498 NaN
2014-12-03 15:30:29 115.7498 NaN
2014-12-03 15:30:44 115.7498 NaN
2014-12-03 15:30:59 NaN 3.7880
2014-12-03 15:31:14 115.6300 NaN
2014-12-03 15:31:29 115.6300 NaN
2014-12-03 15:31:44 115.6300 NaN
2014-12-03 15:31:59 NaN 3.7880
2014-12-03 15:32:14 NaN 3.7880
It doesn't start at 15:30:00, and each row shows a value for only one ticker at a time. What seems to be the problem?
This is what I want:
AAPL NGF15
2014-12-03 15:30:15 115.7498 3.7870
2014-12-03 15:30:30 115.7498 3.7870
2014-12-03 15:30:45 115.7498 3.7870
2014-12-03 15:31:00 115.6300 3.7880
2014-12-03 15:31:15 115.6300 3.7880
2014-12-03 15:31:30 115.6300 3.7880
2014-12-03 15:31:45 115.6300 3.7880
2014-12-03 15:32:00 115.6300 3.7880
2014-12-03 15:32:15 115.6300 3.7880
Thanks in advance! Sorry if my English is bad!
Answer 0 (score: 0)
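The documentation for DataFrame.asfreq() says that the how keyword is "For PeriodIndex only". asfreq() is only a thin convenience wrapper (the pandas documentation describes it as generating a date range and calling reindex()), so it picks rows at fixed 15-second steps from your first timestamp instead of binning the data. Something like resample('15s', fill_method='pad') should work. Using some abbreviated data from above: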
In [49]: data
Out[49]:
0 1
2014-12-03 15:29:59 NaN 3.7
2014-12-03 15:30:11 115.7 NaN
2014-12-03 15:30:54 NaN 3.8
2014-12-03 15:31:00 115.6 NaN
[4 rows x 2 columns]
In [50]: data.resample('15s', fill_method='pad')
Out[50]:
0 1
2014-12-03 15:29:45 NaN 3.7
2014-12-03 15:30:00 115.7 3.7
2014-12-03 15:30:15 115.7 3.7
2014-12-03 15:30:30 115.7 3.7
2014-12-03 15:30:45 115.7 3.8
2014-12-03 15:31:00 115.6 3.8
If you want to start at 15:30:00, you can drop the first row from the DataFrame.
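Newer pandas releases no longer accept the fill_method keyword in resample(), so the same idea is usually written as a method chain. The following is a minimal sketch of one way to get the layout asked for in the question, not the code from the session above. It assumes the string prices are converted to floats first, and it uses closed='right' and label='right' so that each row is labelled with the end of its 15-second window and only contains quotes up to that time. The raw dictionary is a hand-made, abbreviated stand-in for the question's data.

```python
import pandas as pd

# Abbreviated stand-in for the nested dictionary from the question.
raw = {
    "AAPL": {"2014-12-03 15:30:11": "115.7498", "2014-12-03 15:31:00": "115.6300"},
    "NGF15": {"2014-12-03 15:29:59": "3.7870", "2014-12-03 15:30:54": "3.7880"},
}

df = pd.DataFrame(raw).astype(float)   # prices arrive as strings
df.index = pd.to_datetime(df.index)    # string keys -> DatetimeIndex
df = df.sort_index()

out = (
    df.resample("15s", closed="right", label="right")
      .last()    # last quote seen in each 15-second window, per column
      .ffill()   # carry the previous price through empty windows
)

# Keep only the rows from the first full window onwards.
out = out.loc["2014-12-03 15:30:15":]
print(out)
```

With the default closed='left' and label='left', the row labelled 15:30:00 would already include the 15:30:11 quote, which is why the defaults are overridden here. The final slice replaces the manual step of dropping the first row.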