pandas: resample .csv tick data to OHLC

Time: 2014-12-12 20:27:32

Tags: python pandas

I have a .csv file of financial tick data with 3 columns: date, time, and price. The file has no header row.

01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
...

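I want to use pandas to resample this to daily OHLC so that I can import it into my charting software in the correct format.

So far I have only opened the file with: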

data = pd.read_csv('data.csv')

Could you help me use pandas resample to convert data in my format to OHLC? Thanks.

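2 Answers:

Answer 0: (score: 1)

If this is still relevant, the simplest way in pandas is the following, which assumes data holds the prices with a DatetimeIndex built from the date and time columns: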

data.resample('1D').apply('ohlc')
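
The one-liner above only works once the headerless file has been loaded and indexed by timestamp. A minimal sketch of those steps follows, using the sample rows from the question. The column names "date", "time" and "price" and the helper names load_ticks and daily_ohlc are assumptions made for illustration, and .ohlc() is used here in place of .apply('ohlc').

```python
import io

import pandas as pd

# Sample rows in the question's layout: date, time, price, with no header line.
SAMPLE = """01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
"""


def load_ticks(source):
    """Read headerless tick rows and return prices indexed by timestamp."""
    df = pd.read_csv(
        source,
        header=None,                      # the file has no header row
        names=["date", "time", "price"],  # assumed column names
        skipinitialspace=True,            # drop the blank after each comma
        dtype={"date": str, "time": str},
    )
    stamps = pd.to_datetime(df["date"] + " " + df["time"],
                            format="%m/%d/%y %H:%M:%S")
    df.index = pd.DatetimeIndex(stamps)
    return df["price"].sort_index()


def daily_ohlc(prices):
    """Build one OHLC bar per calendar day, dropping days without ticks."""
    return prices.resample("1D").ohlc().dropna()


ticks = load_ticks(io.StringIO(SAMPLE))
bars = daily_ohlc(ticks)
print(bars)
```

To work on the real file, pass 'data.csv' to load_ticks instead of the in-memory sample, and call bars.to_csv('ohlc.csv') to write the result. Whether the charting software expects a header row or a particular date format is not stated in the question, so that part may need adjusting.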

Answer 1: (score: 0)

Using Python, but without pandas:

#!/usr/bin/env python

import datetime
from decimal import Decimal

class Tick(object):
    pass    

ticks = []
with open('data.csv') as f:
    ticksTemp = []
    lines = [x.strip('\n') for x in f.readlines()]    

    for line in lines:
        columns = [x.strip() for x in line.split(',')]
        if len(columns) != 3:
            continue  # skip blank or malformed rows
        timeStr = columns[0] + '/' + columns[1]
        time  = datetime.datetime.strptime(timeStr, "%m/%d/%y/%H:%M:%S" )        
        price = columns[2]
        tick = Tick()
        tick.time = time
        tick.price = Decimal(price)
        ticksTemp.append(tick)
    ticks = sorted(ticksTemp, key = lambda x: x.time, reverse=False)


lines = []
first = ticks[0]
last = ticks[-1]
time = first.time
o,h,l,c = first.price, first.price, first.price, first.price
def appendLine():
    lines.append(time.strftime('%Y-%m-%d')+','+str(o)+ ','+str(h)+','+str(l)+','+str(c))
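for tick in ticks:
    # A new calendar day starts: write out the finished bar and reset
    # open, high and low so values from the previous day do not carry over.
    if tick.time.date() != time.date():
        appendLine()
        time = tick.time
        o = h = l = tick.price
    c = tick.price
    if tick.price > h:
        h = tick.price
    if tick.price < l:
        l = tick.price
# Write out the final day (also covers a file with a single tick).
appendLine()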
with open('ohlc.csv', 'w') as f:
    f.write('\n'.join(lines))

data.csv:

01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2

ohlc.csv:

2014-01-17,55.2,55.3,55.2,55.3
2014-01-18,55.4,55.4,55.0,55.0
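
The same day-by-day aggregation can also be sketched more compactly with itertools.groupby from the standard library. This is an alternative formulation rather than the answer's own code. The function names read_ticks and daily_bars are made up for illustration, and the sample rows are the ones shown above.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal
from itertools import groupby

# Same sample rows as data.csv above.
SAMPLE = """01/18/14, 04:09:28, 55.0
01/18/14, 02:18:31, 55.4
01/17/14, 10:42:34, 55.3
01/17/14, 03:18:07, 55.2
"""


def read_ticks(handle):
    """Yield (timestamp, price) pairs from headerless date,time,price rows."""
    for row in csv.reader(handle, skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        stamp = datetime.strptime(row[0].strip() + " " + row[1].strip(),
                                  "%m/%d/%y %H:%M:%S")
        yield stamp, Decimal(row[2].strip())


def daily_bars(ticks):
    """Return (date, open, high, low, close) tuples in chronological order."""
    ordered = sorted(ticks, key=lambda t: t[0])
    bars = []
    # groupby needs sorted input; each group holds the ticks of one calendar day.
    for day, group in groupby(ordered, key=lambda t: t[0].date()):
        prices = [price for _, price in group]
        bars.append((day, prices[0], max(prices), min(prices), prices[-1]))
    return bars


bars = daily_bars(read_ticks(io.StringIO(SAMPLE)))
rows = ["%s,%s,%s,%s,%s" % bar for bar in bars]
print("\n".join(rows))
```

Grouping on the full calendar date avoids comparing year, month and day separately, and each bar is computed only from that day's prices.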