The data to be sampled comes from SQLite and is available here: https://pastebin.com/LU7YApkX
Code:
import sqlite3
import pandas as pd

conn = sqlite3.connect('sqlite_database.db')
query = "SELECT * FROM XXXX WHERE timestamp BETWEEN '2019-01-24 09:15:00' AND '2019-01-24 09:59:59'"
# Load the ticks, using the parsed timestamp column as a DatetimeIndex
df = pd.read_sql_query(query, conn, index_col=['timestamp'], parse_dates=['timestamp'])
# Build OHLC candles from the last traded price; change '5min' to try other periods
candles = df['ltp'].resample('5min').ohlc().bfill()
print(candles)
Good output (resample period = 3min):
$ python3 why_ohlc_failing.py
open high low close
timestamp
2019-01-24 09:15:00 286.55 286.70 285.85 286.20
2019-01-24 09:18:00 286.10 286.30 285.50 285.90
2019-01-24 09:21:00 285.90 286.25 285.65 285.85
2019-01-24 09:24:00 285.80 286.90 285.75 286.65
2019-01-24 09:27:00 286.65 286.85 286.35 286.60
2019-01-24 09:30:00 286.70 286.70 286.20 286.25
2019-01-24 09:33:00 286.25 286.95 286.20 286.95
2019-01-24 09:36:00 287.00 287.50 286.95 287.40
2019-01-24 09:39:00 287.45 287.50 287.00 287.45
2019-01-24 09:42:00 287.35 287.50 287.00 287.50
2019-01-24 09:45:00 287.40 288.15 287.40 288.05
2019-01-24 09:48:00 288.40 288.45 288.30 288.35
2019-01-24 09:51:00 288.40 288.45 288.30 288.35
2019-01-24 09:54:00 288.40 288.45 288.30 288.35
2019-01-24 09:57:00 288.40 288.45 288.30 288.35
Good output (resample period = 5min):
$ python3 why_ohlc_failing.py
open high low close
timestamp
2019-01-24 09:15:00 286.55 286.70 285.5 285.65
2019-01-24 09:20:00 285.65 286.25 285.6 285.95
2019-01-24 09:25:00 285.95 286.90 285.9 286.60
2019-01-24 09:30:00 286.70 286.70 286.2 286.60
2019-01-24 09:35:00 286.70 287.50 286.6 287.15
2019-01-24 09:40:00 287.15 287.50 287.0 287.50
2019-01-24 09:45:00 287.40 288.15 287.4 288.05
2019-01-24 09:50:00 288.40 288.45 288.3 288.35
2019-01-24 09:55:00 288.40 288.45 288.3 288.35
Bad output (resample period = 10min):
$ python3 why_ohlc_failing.py
open high low close
timestamp
2019-01-24 09:10:00 286.55 286.70 285.5 285.65
2019-01-24 09:20:00 285.65 286.90 285.6 286.60
2019-01-24 09:30:00 286.70 287.50 286.2 287.15
2019-01-24 09:40:00 287.15 288.15 287.0 288.05
2019-01-24 09:50:00 288.40 288.45 288.3 288.35
Good output (resample period = 15min):
$ python3 why_ohlc_failing.py
open high low close
timestamp
2019-01-24 09:15:00 286.55 286.90 285.5 286.60
2019-01-24 09:30:00 286.70 287.50 286.2 287.50
2019-01-24 09:45:00 287.40 288.45 287.4 288.35
Bad output (resample period = 20min):
$ python3 why_ohlc_failing.py
open high low close
timestamp
2019-01-24 09:00:00 286.55 286.70 285.5 285.65
2019-01-24 09:20:00 285.65 287.50 285.6 287.15
2019-01-24 09:40:00 287.15 288.45 287.0 288.35
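Problem:
If you look at the BAD outputs above, for the 10min and 20min resample periods, they start at 2019-01-24 09:10:00 and 2019-01-24 09:00:00 respectively.
This is wrong, because I do not have any data before 2019-01-24 09:15:01.
However, the same code works fine for the 3min, 5min and 15min resample periods.
Can you help me figure out what is going wrong here? My understanding is that, regardless of the resample period, the resampled data should always start at 2019-01-24 09:15:00. Anything else makes no sense, because no stock quotes are available before that time.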
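Answer 0 (score: 1)
When you resample to, for example, 10min, pandas creates 10-minute intervals aligned to the clock rather than to your first row, so the label 2019-01-24 09:10:00 corresponds to the interval 2019-01-24 09:10:00 - 2019-01-24 09:19:59: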
df['ltp'].resample('10min').ohlc().bfill()
Output:
open high low close
t
2019-01-24 09:10:00 286.55 286.70 285.5 285.65
2019-01-24 09:20:00 285.65 286.90 285.6 286.60
2019-01-24 09:30:00 286.70 287.50 286.2 287.15
2019-01-24 09:40:00 287.15 288.15 287.0 288.05
2019-01-24 09:50:00 288.40 288.45 288.3 288.35
Compare with:
print(
df.loc['2019-01-24 09:10:00':'2019-01-24 09:19:59', 'ltp'].iloc[0],
df.loc['2019-01-24 09:10:00':'2019-01-24 09:19:59', 'ltp'].max(),
df.loc['2019-01-24 09:10:00':'2019-01-24 09:19:59', 'ltp'].min(),
df.loc['2019-01-24 09:10:00':'2019-01-24 09:19:59', 'ltp'].iloc[-1])
Output:
286.55 286.7 285.5 285.65
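The alignment described above can be reproduced with plain arithmetic. The sketch below assumes the bins are counted from midnight of the first day, which is consistent with every output shown in the question; the helper name bin_start is made up for illustration and is not a pandas function.

```python
from datetime import datetime, timedelta

# First tick in the question's data
first_tick = datetime(2019, 1, 24, 9, 15, 1)
midnight = first_tick.replace(hour=0, minute=0, second=0, microsecond=0)

def bin_start(ts, minutes):
    """Left edge of the bin holding ts when bins are counted from midnight."""
    period = timedelta(minutes=minutes)
    # Number of whole periods elapsed since midnight, then back to a timestamp
    return midnight + ((ts - midnight) // period) * period

first_bins = {m: bin_start(first_tick, m) for m in (3, 5, 10, 15, 20)}
for m, start in first_bins.items():
    print(f"{m:>2}min -> {start}")
```

09:15 is 555 minutes after midnight, and 555 is a multiple of 3, 5 and 15 but not of 10 or 20. That is why only the 10min and 20min outputs start before the first tick.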
Note: if you want the resampled data to start at the first value:
tmin = df.index[0]
df.index = df.index - tmin
df = df.resample('10min').ohlc().bfill()
df.index = df.index + tmin
df
Output:
ltp
open high low close
t
2019-01-24 09:15:01 286.55 286.70 285.5 285.95
2019-01-24 09:25:01 285.95 286.90 285.9 286.70
2019-01-24 09:35:01 286.65 287.50 286.6 287.50
2019-01-24 09:45:01 287.40 288.15 287.4 288.05
2019-01-24 09:55:01 288.40 288.45 288.3 288.35
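On pandas 1.1 or newer, the same effect may be available without shifting the index, through the origin argument of resample. The following is a minimal sketch under that assumption. The prices are synthetic (not the pastebin data), and session_open is a hypothetical name for the desired first bin edge.

```python
import pandas as pd

# Synthetic one-minute ticks with made-up prices; first tick at 09:15:01 as in the question
idx = pd.date_range('2019-01-24 09:15:01', periods=45, freq='1min')
ltp = pd.Series([286.0 + 0.05 * i for i in range(45)], index=idx, name='ltp')

session_open = pd.Timestamp('2019-01-24 09:15:00')  # assumed desired first bin edge

first_labels = {}
for period in ('10min', '20min'):
    first_labels[period] = (
        ltp.resample(period).ohlc().index[0],                       # default alignment
        ltp.resample(period, origin='start').ohlc().index[0],       # anchored at the first tick
        ltp.resample(period, origin=session_open).ohlc().index[0],  # anchored at 09:15:00
    )
    print(period, first_labels[period])
```

Here origin='start' labels the first bin 09:15:01, like the index-shifting workaround, while passing a Timestamp as origin gives bins that start at exactly 09:15:00.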
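Answer 1 (score: 0)
The following works correctly across all intervals:
data = df['ltp'].resample('5min', base=15).ohlc().bfill()
I had to add base=15, although I am still trying to understand what is happening here.
I further found that, to get the desired result for various resample periods, I needed to use various base values, as follows:
resample('1min', base=15)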
resample('2min', base=15)
resample('3min', base=15)
resample('4min', base=15)
resample('5min', base=15)
resample('6min', base=15)
resample('7min', base=16)
resample('8min', base=19)
resample('9min', base=15)
resample('10min', base=15)
resample('11min', base=16)
resample('12min', base=15)
resample('13min', base=22)
resample('14min', base=23)
resample('15min', base=15)
resample('16min', base=27)
resample('17min', base=28)
resample('18min', base=33)
resample('19min', base=42)
resample('20min', base=15)
For 1min, 3min, 5min and 15min, no base is needed for the resampling to work.
I am still trying to understand base.
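One possible reading of the list, offered as an assumption rather than something the answer states: base shifts the bin edges by that many minutes from midnight, so a value works whenever it leaves the same remainder as 555 (the minutes from midnight to 09:15) when divided by the period. The check below uses only the (period, base) pairs from the list.

```python
# Minutes from midnight to the desired first bin edge, 09:15:00
target = 9 * 60 + 15  # 555

# (period in minutes -> base) pairs copied from the list in the answer
bases = {
    1: 15, 2: 15, 3: 15, 4: 15, 5: 15, 6: 15, 7: 16, 8: 19, 9: 15, 10: 15,
    11: 16, 12: 15, 13: 22, 14: 23, 15: 15, 16: 27, 17: 28, 18: 33, 19: 42, 20: 15,
}

# Every listed base is congruent to 555 modulo its period
for period, base in bases.items():
    assert base % period == target % period, (period, base)

# Smallest shift, in minutes, that would put a bin edge at 09:15:00
smallest = {period: target % period for period in bases}
print(smallest)
```

For 1min, 3min, 5min and 15min the remainder is 0, which matches the observation that those periods need no base. Note that base was deprecated in pandas 1.1 and removed in pandas 2.0; on newer versions the offset and origin arguments of resample take over this role.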