I have a .csv file of financial tick data with three columns: date, time, and price. The file has no header row.
For example:
01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
...
I want to use pandas to resample it to daily OHLC so that I can import it into my charting software in the correct format.
So far I have only opened the file with:
data = pd.read_csv('data.csv')
Could you help me use pandas resample to convert data in my format into OHLC? Thanks.
Answer 0: (score: 1)
If this is still relevant, the simplest way in pandas (once the data has a datetime index) is:
data.resample('1D').apply('ohlc')
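The one-liner above only works when `data` is indexed by timestamps, which a plain `pd.read_csv('data.csv')` does not give you for this file. Below is a minimal sketch of the whole pipeline under a few assumptions: the column names `date`, `time`, and `price` are my own labels (the file has none), the dates are month/day/two-digit-year as in the sample, and the sample rows are inlined through `io.StringIO` so the snippet runs on its own. It calls `.ohlc()` on the resampler rather than `.apply('ohlc')`.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice pass the path 'data.csv' instead.
raw = io.StringIO(
    "01/18/14, 04:09:28, 55.0\n"
    "01/18/14, 02:18:31, 55.4\n"
    "01/17/14, 10:42:34, 55.3\n"
    "01/17/14, 03:18:07, 55.2\n"
)

# header=None because the file has no header row; the column names are assumed labels.
# skipinitialspace=True drops the blank that follows each comma.
data = pd.read_csv(raw, header=None, names=["date", "time", "price"],
                   skipinitialspace=True)

# Combine the date and time strings into a single timestamp per row.
timestamps = pd.to_datetime(data["date"] + " " + data["time"],
                            format="%m/%d/%y %H:%M:%S")

# Build a price series indexed by timestamp; resample needs a datetime-like index.
prices = pd.Series(data["price"].to_numpy(), index=pd.DatetimeIndex(timestamps))

# Sort chronologically so the first and last tick of each day are the open and close.
ohlc = prices.sort_index().resample("1D").ohlc()

# Calendar days without any ticks come back as all-NaN rows; drop them.
ohlc = ohlc.dropna()

print(ohlc)
```

If the charting software expects a headerless file, something like `ohlc.to_csv('ohlc.csv', header=False)` should write one row per day with the date followed by open, high, low, and close.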
Answer 1: (score: 0)
Using Python, but without pandas:
#!/usr/bin/env python
import datetime
from decimal import Decimal


class Tick(object):
    """A single trade: a timestamp and a price."""
    pass


# Parse data.csv, skipping any line that does not have exactly three columns.
ticks = []
with open('data.csv') as f:
    for line in f:
        columns = [x.strip() for x in line.split(',')]
        if len(columns) != 3:
            continue
        tick = Tick()
        tick.time = datetime.datetime.strptime(
            columns[0] + '/' + columns[1], '%m/%d/%y/%H:%M:%S')
        tick.price = Decimal(columns[2])
        ticks.append(tick)

# The file is newest-first, so sort the ticks into chronological order.
ticks.sort(key=lambda x: x.time)

lines = []


def appendLine(day, o, h, l, c):
    # One output row per day: date, open, high, low, close.
    lines.append(','.join([day.strftime('%Y-%m-%d'),
                           str(o), str(h), str(l), str(c)]))


if ticks:
    day = ticks[0].time.date()
    o = h = l = c = ticks[0].price
    for tick in ticks[1:]:
        # Compare full dates, so a change of month or year also starts a new bar.
        if tick.time.date() != day:
            appendLine(day, o, h, l, c)
            day = tick.time.date()
            # Reset open, high, and low for the new day.
            o = h = l = tick.price
        c = tick.price
        h = max(h, tick.price)
        l = min(l, tick.price)
    # Emit the bar for the final day, even if the file holds a single tick.
    appendLine(day, o, h, l, c)

with open('ohlc.csv', 'w') as f:
    f.write('\n'.join(lines))
data.csv:
01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
ohlc.csv:
2014-01-17,55.2,55.3,55.2,55.3
2014-01-18,55.4,55.4,55.0,55.0