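I'm new to Julia, but I'm giving it a try because benchmarks claim it is much faster than Python.
I'm trying to work with some stock price data in the format ["unixtime", "price", "amount"].
I managed to load the data and convert the unixtime to dates in Julia, but now I need to resample the data over a given period (hourly, 15 minutes, 5 minutes, etc.), using OHLC (open, high, low, close) for the price and the sum for the amount: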
julia> head(btc_raw_data)
6x3 DataFrame:
date price amount
[1,] 2011-09-13T13:53:36 UTC 5.8 1.0
[2,] 2011-09-13T13:53:44 UTC 5.83 3.0
[3,] 2011-09-13T13:53:49 UTC 5.9 1.0
[4,] 2011-09-13T13:53:54 UTC 6.0 20.0
[5,] 2011-09-13T14:32:53 UTC 5.95 12.4521
[6,] 2011-09-13T14:35:04 UTC 5.88 7.458
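I see there is a package named Resampling, but it doesn't seem to accept a time period, only the number of rows I want in the output data.
Are there any other options?
Answer 0 (score: 1)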
You can use https://github.com/femtotrader/TimeSeriesIO.jl to convert a DataFrame (from DataFrames.jl) to a TimeArray (from TimeSeries.jl):
using TimeSeriesIO: TimeArray
ta = TimeArray(df, colnames=[:price], timestamp=:date)
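The conversion above expects the date column to hold DateTime values already. As a minimal sketch of that earlier step, assuming the unixtime column stores seconds since the Unix epoch in UTC, the Dates standard library (Julia 0.7 and later) can do the conversion. The variable names here are illustrative and not taken from the question's code.

```julia
using Dates

# Unix timestamps in seconds (assumed unit). These two values correspond to
# the first two rows shown in the question.
unixtime = [1315922016, 1315922024]

# unix2datetime works on a single value; the dot broadcasts it over the vector.
dates = unix2datetime.(unixtime)

println(dates)
```

If the timestamps were in milliseconds instead, they would need to be divided by 1000 before the call.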
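You can then resample the time series (a TimeArray from TimeSeries.jl) with TimeSeriesResampler https://github.com/femtotrader/TimeSeriesResampler.jl and TimeFrames https://github.com/femtotrader/TimeFrames.jl:
# On Julia 0.7 and later, run `using Dates` first: DateTime and Dates.Minute
# live in the Dates standard library there.
using TimeSeriesResampler: resample, mean, ohlc, sum, TimeFrame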
# Define a sample timeseries (prices for example)
idx = DateTime(2010,1,1):Dates.Minute(1):DateTime(2011,1,1)
idx = idx[1:end-1]
N = length(idx)
y = rand(-1.0:0.01:1.0, N)
y = 1000 .+ cumsum(y)  # broadcast the scalar; plain `+` with a vector fails on Julia 1.x
#df = DataFrame(Date=idx, y=y)
ta = TimeArray(collect(idx), y, ["y"])
println("ta=")
println(ta)
# Define how datetime should be grouped (timeframe)
tf = TimeFrame(dt -> floor(dt, Dates.Minute(15)))
# resample using OHLC values
ta_ohlc = ohlc(resample(ta, tf))
println("ta_ohlc=")
println(ta_ohlc)
# resample using mean values
ta_mean = mean(resample(ta, tf))
println("ta_mean=")
println(ta_mean)
# Define another sample time series (volume, for example)
vol = rand(0:0.01:1.0, N)
ta_vol = TimeArray(collect(idx), vol, ["vol"])
println("ta_vol=")
println(ta_vol)
# resample using sum values
ta_vol_sum = sum(resample(ta_vol, tf))
println("ta_vol_sum=")
println(ta_vol_sum)
You should get:
julia> ta
525600x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:59:00
y
2010-01-01T00:00:00 | 1000.16
2010-01-01T00:01:00 | 1000.1
2010-01-01T00:02:00 | 1000.98
2010-01-01T00:03:00 | 1001.38
⋮
2010-12-31T23:56:00 | 972.3
2010-12-31T23:57:00 | 972.85
2010-12-31T23:58:00 | 973.74
2010-12-31T23:59:00 | 972.8
julia> ta_ohlc
35040x4 TimeSeries.TimeArray{Float64,2,DateTime,Array{Float64,2}} 2010-01-01T00:00:00 to 2010-12-31T23:45:00
Open High Low Close
2010-01-01T00:00:00 | 1000.16 1002.5 1000.1 1001.54
2010-01-01T00:15:00 | 1001.57 1002.64 999.38 999.38
2010-01-01T00:30:00 | 999.13 1000.91 998.91 1000.91
2010-01-01T00:45:00 | 1001.0 1006.42 1001.0 1006.42
⋮
2010-12-31T23:00:00 | 980.84 981.56 976.53 976.53
2010-12-31T23:15:00 | 975.74 977.46 974.71 975.31
2010-12-31T23:30:00 | 974.72 974.9 971.73 972.07
2010-12-31T23:45:00 | 972.33 973.74 971.49 972.8
julia> ta_mean
35040x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:45:00
y
2010-01-01T00:00:00 | 1001.1047
2010-01-01T00:15:00 | 1001.686
2010-01-01T00:30:00 | 999.628
2010-01-01T00:45:00 | 1003.5267
⋮
2010-12-31T23:00:00 | 979.1773
2010-12-31T23:15:00 | 975.746
2010-12-31T23:30:00 | 973.482
2010-12-31T23:45:00 | 972.3427
julia> ta_vol
525600x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:59:00
vol
2010-01-01T00:00:00 | 0.37
2010-01-01T00:01:00 | 0.67
2010-01-01T00:02:00 | 0.29
2010-01-01T00:03:00 | 0.28
⋮
2010-12-31T23:56:00 | 0.74
2010-12-31T23:57:00 | 0.66
2010-12-31T23:58:00 | 0.22
2010-12-31T23:59:00 | 0.47
julia> ta_vol_sum
35040x1 TimeSeries.TimeArray{Float64,1,DateTime,Array{Float64,1}} 2010-01-01T00:00:00 to 2010-12-31T23:45:00
vol
2010-01-01T00:00:00 | 7.13
2010-01-01T00:15:00 | 6.99
2010-01-01T00:30:00 | 8.73
2010-01-01T00:45:00 | 8.27
⋮
2010-12-31T23:00:00 | 6.11
2010-12-31T23:15:00 | 7.49
2010-12-31T23:30:00 | 5.75
2010-12-31T23:45:00 | 8.36
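To make the grouping step concrete, here is a minimal sketch of the same idea using only the Dates standard library (Julia 0.7 and later), applied to the six rows from the question. It is not how TimeSeriesResampler is implemented. The helper name `ohlcv` is hypothetical, and the sketch assumes the trades are already sorted by time.

```julia
using Dates

# The six trades shown in the question.
dates = [DateTime(2011, 9, 13, 13, 53, 36),
         DateTime(2011, 9, 13, 13, 53, 44),
         DateTime(2011, 9, 13, 13, 53, 49),
         DateTime(2011, 9, 13, 13, 53, 54),
         DateTime(2011, 9, 13, 14, 32, 53),
         DateTime(2011, 9, 13, 14, 35, 4)]
prices  = [5.8, 5.83, 5.9, 6.0, 5.95, 5.88]
amounts = [1.0, 3.0, 1.0, 20.0, 12.4521, 7.458]

# Hypothetical helper: one row per time bucket, with OHLC of the price and
# the sum of the amount. Assumes `dates` is sorted in ascending order.
function ohlcv(dates, prices, amounts, period)
    rows = NamedTuple[]
    i = 1
    while i <= length(dates)
        bucket = floor(dates[i], period)   # start of the bucket for trade i
        j = i
        # Extend j to the last trade that falls in the same bucket.
        while j < length(dates) && floor(dates[j + 1], period) == bucket
            j += 1
        end
        p = view(prices, i:j)
        push!(rows, (date = bucket, open = first(p), high = maximum(p),
                     low = minimum(p), close = last(p),
                     amount = sum(view(amounts, i:j))))
        i = j + 1
    end
    return rows
end

bars = ohlcv(dates, prices, amounts, Minute(15))
for b in bars
    println(b)
end
```

Passing Hour(1) or Minute(5) as the period gives the other timeframes mentioned in the question. Buckets with no trades produce no row in this sketch, so gaps in the data stay as gaps.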