I have the following code:
import pandas as pd
from pandas import DataFrame as df
import matplotlib
from pandas_datareader import data as web
import matplotlib.pyplot as plt
import datetime
import fxcmpy
import numpy as np
# `con` is an fxcmpy connection object created earlier (not shown here)
print(con.get_instruments())
symbols = con.get_instruments()
ticker = 'NGAS'
start = datetime.datetime(2015, 1, 1)
end = datetime.datetime.today()
# Fetch the last 10000 one-minute candles
data = con.get_candles(ticker, period='m1', number=10000)
data.index = pd.to_datetime(data.index, format='%Y-%m-%d %H:%M %S')
# Split the timestamp into hour and minute and use them as the index
data['hour'] = data.index.hour
data['minute'] = data.index.minute
data.set_index(['hour', 'minute'], inplace=True)
This gives me the following output:
bidopen bidclose bidhigh bidlow askopen askclose askhigh asklow tickqty
hour minute
10 52 2.2400 2.2395 2.2395 2.2390 2.2475 2.2470 2.2475 2.2470 3
53 2.2395 2.2415 2.2415 2.2395 2.2470 2.2490 2.2490 2.2475 8
54 2.2415 2.2415 2.2415 2.2410 2.2490 2.2490 2.2490 2.2485 4
56 2.2415 2.2415 2.2415 2.2415 2.2490 2.2490 2.2490 2.2490 2
57 2.2415 2.2410 2.2415 2.2400 2.2490 2.2485 2.2490 2.2480 8
... ... ... ... ... ... ... ... ... ... ...
21 39 2.3385 2.3385 2.3395 2.3380 2.3465 2.3460 2.3470 2.3460 10
41 2.3385 2.3375 2.3385 2.3370 2.3460 2.3460 2.3460 2.3460 4
42 2.3375 2.3365 2.3385 2.3360 2.3460 2.3440 2.3460 2.3440 10
43 2.3365 2.3375 2.3385 2.3360 2.3440 2.3450 2.3460 2.3440 15
44 2.3375 2.3365 2.3375 2.3360 2.3450 2.3445 2.3450 2.3440 5
10000 rows × 9 columns
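What I want is the mean of bidlow, arranged so that a single table holds the average bidlow for each minute of each hour, from minute 1 of hour 1 through minute 44 of hour 21. How can I do this?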
Answer 0 (score: 2)
I think the best approach here is to use DataFrame.between_time, which works on a DatetimeIndex:
data = con.get_candles(ticker, period='m1', number=10000)
# The index is already a DatetimeIndex, so no conversion is needed
# data.index = pd.to_datetime(data.index, format='%Y-%m-%d %H:%M %S')
data['hour'] = data.index.hour
data['minute'] = data.index.minute
# print(data)
First, filter the rows that fall between the two times of day:
data2 = data.between_time('01:01:00', '21:44:00').copy()
print (data2)
bidopen bidclose bidhigh bidlow askopen askclose \
date
2019-12-10 10:52:00 2.2400 2.2395 2.2395 2.2390 2.2475 2.2470
2019-12-10 10:53:00 2.2395 2.2415 2.2415 2.2395 2.2470 2.2490
2019-12-10 10:54:00 2.2415 2.2415 2.2415 2.2410 2.2490 2.2490
2019-12-10 10:56:00 2.2415 2.2415 2.2415 2.2415 2.2490 2.2490
2019-12-10 10:57:00 2.2415 2.2410 2.2415 2.2400 2.2490 2.2485
... ... ... ... ... ...
2019-12-20 21:39:00 2.3385 2.3385 2.3395 2.3380 2.3465 2.3460
2019-12-20 21:41:00 2.3385 2.3375 2.3385 2.3370 2.3460 2.3460
2019-12-20 21:42:00 2.3375 2.3365 2.3385 2.3360 2.3460 2.3440
2019-12-20 21:43:00 2.3365 2.3375 2.3385 2.3360 2.3440 2.3450
2019-12-20 21:44:00 2.3375 2.3365 2.3375 2.3360 2.3450 2.3445
askhigh asklow tickqty hour minute
date
2019-12-10 10:52:00 2.2475 2.2470 3 10 52
2019-12-10 10:53:00 2.2490 2.2475 8 10 53
2019-12-10 10:54:00 2.2490 2.2485 4 10 54
2019-12-10 10:56:00 2.2490 2.2490 2 10 56
2019-12-10 10:57:00 2.2490 2.2480 8 10 57
... ... ... ... ...
2019-12-20 21:39:00 2.3470 2.3460 10 21 39
2019-12-20 21:41:00 2.3460 2.3460 4 21 41
2019-12-20 21:42:00 2.3460 2.3440 10 21 42
2019-12-20 21:43:00 2.3460 2.3440 15 21 43
2019-12-20 21:44:00 2.3450 2.3440 5 21 44
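To see what the filter keeps, here is a minimal sketch on synthetic data. The frame `demo` and its values are made up for illustration and stand in for the fxcmpy candles. It suggests that between_time selects by time of day on every date and, with its default settings, includes both endpoints:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the candle data: one row per minute over two full days.
idx = pd.date_range("2019-12-10 00:00", periods=2 * 24 * 60, freq="min")
demo = pd.DataFrame({"bidlow": np.linspace(2.0, 3.0, len(idx))}, index=idx)

# Keep rows whose time of day lies between 01:01 and 21:44, on every date.
filtered = demo.between_time("01:01:00", "21:44:00")

first_time = min(filtered.index.time)
last_time = max(filtered.index.time)
print(first_time, last_time)
print(len(filtered))
```

Each day contributes 1244 minutes between 01:01 and 21:44 inclusive, which is where the 1244 groups in the final result come from.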
Then aggregate the mean for each hour and minute:
data3 = data2.groupby(['hour','minute'], as_index=False)['bidlow'].mean()
print (data3)
hour minute bidlow
0 1 1 2.290750
1 1 2 2.316000
2 1 3 2.305071
3 1 4 2.304857
4 1 5 2.302125
... ... ...
1239 21 40 2.284167
1240 21 41 2.328000
1241 21 42 2.323400
1242 21 43 2.291100
1243 21 44 2.315786
[1244 rows x 3 columns]
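The whole approach can be tried without an FXCM connection. The sketch below uses a synthetic `demo` frame (random values, purely illustrative) and, as a variation on the code above, groups directly on the hour and minute of the index instead of adding helper columns first:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the candle data: one row per minute over three days.
idx = pd.date_range("2019-12-10 00:00", periods=3 * 24 * 60, freq="min")
rng = np.random.default_rng(0)
demo = pd.DataFrame({"bidlow": 2.3 + rng.normal(0, 0.01, len(idx))}, index=idx)

# Step 1: keep only the 01:01 to 21:44 window of each day.
window = demo.between_time("01:01:00", "21:44:00")

# Step 2: group on the hour and minute taken from the index,
# so that rows from different days at the same time are averaged together.
hours = window.index.hour.rename("hour")
minutes = window.index.minute.rename("minute")
avg = window.groupby([hours, minutes])["bidlow"].mean().reset_index()

print(avg.head())
print(avg.shape)
```

Grouping on the index attributes leaves the original frame unchanged, which can be convenient if the DatetimeIndex is still needed afterwards.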