Deedle - Weighted average after filtering with Frame.filterRowValues

Date: 2018-02-06 18:54:48

Tags: f# deedle

I'm new to F#. I'm trying to compute a weighted average after filtering my frame by two timestamps and an instrument_id.

Sample data:

| trade_qty | trade_price | trade_timestamp    | instrument_id 
|  1000     |  100.59     | 1/26/2018 16:00:00 |  1 
|  2000     |  105.10     | 1/26/2018 15:59:30 |  1 
|  3000     |  97.59      | 1/26/2018 15:59:00 |  1 

I found that filtering is easy, for example instrument 1 between two times:

frameVolume
|> Frame.filterRowValues (fun c -> c.GetAs<DateTime>("trade_timestamp") > DateTime(2018, 1, 27, 15, 31, 0))
|> Frame.filterRowValues (fun c -> c.GetAs<DateTime>("trade_timestamp") < DateTime(2018, 1, 27, 16, 0, 0))
|> Frame.filterRowValues (fun c -> c.GetAs<int>("instrument_id") = 1)

This is where I'm stuck. I haven't figured out how to compute 1 / sum(trade_qty) * sum(trade_price * trade_qty).
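For reference, that expression is the quantity-weighted average price. A minimal sketch in plain F# (no Deedle), using only the three sample rows above, shows the arithmetic: (100590 + 210200 + 292770) / 6000 = 603560 / 6000, roughly 100.5933. The names `trades`, `weightedSum`, `totalQty` and `weightedAvg` are illustrative and not part of the original code.

```fsharp
// Sample rows from the question as (trade_qty, trade_price) pairs.
let trades = [ (1000.0, 100.59); (2000.0, 105.10); (3000.0, 97.59) ]

// sum(trade_price * trade_qty)
let weightedSum = trades |> List.sumBy (fun (qty, price) -> qty * price)

// sum(trade_qty)
let totalQty = trades |> List.sumBy fst

// 1 / sum(trade_qty) * sum(trade_price * trade_qty)
let weightedAvg = weightedSum / totalQty

printfn "%.4f" weightedAvg
```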

I tried:

|>Frame.GetColumn<float>("trade_qty") * 
    Frame.GetColumn<float>("trade_price")

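For context, I want to pass this as a function into another function, so that I can compute the weighted average price over several intervals.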

Any ideas? Thanks!

1 Answer:

Answer 0: (score: 4)

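Deedle provides higher-order functions similar to the built-in ones for F# List, Array and Seq, which is nice. Knowing that makes the task simpler. Here is an implementation of the function you describe: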

#I "..\packages\Deedle.1.2.5"
#load "Deedle.fsx"

open System
open Deedle

let weightedAverage after before frame: float =
    // Keep rows strictly between `after` and `before`.
    // instrument_id is fixed to 1 here, as in the question's example.
    let filteredFrame =
        frame
        |> Frame.filterRowValues (fun r -> r.GetAs<DateTime>("trade_timestamp") < before)
        |> Frame.filterRowValues (fun r -> r.GetAs<DateTime>("trade_timestamp") > after)
        |> Frame.filterRowValues (fun r -> r.GetAs<int>("instrument_id") = 1)
    let quantities: Series<int, float> = filteredFrame |> Frame.getCol "trade_qty"
    let tradePrices: Series<int, float> = filteredFrame |> Frame.getCol "trade_price"
    // Series.zip pairs the values by row key as optional values;
    // OptionalValue.get unwraps them and throws if a value is missing.
    let weightedSum = 
        (quantities, tradePrices) 
        ||> Series.zip 
        |> Series.mapValues (fun (q, p) -> (OptionalValue.get q * OptionalValue.get p)) 
        |> Series.reduceValues (fun acc curr -> acc + curr)
    // Total traded quantity over the filtered rows.
    let total = 
        quantities 
        |> Series.reduceValues (fun acc curr -> acc + curr) 
    weightedSum / total 

let path = __SOURCE_DIRECTORY__ + "\data.csv"
let df = Frame.ReadCsv(path, separators = "|")
let ans = df |> weightedAverage (DateTime(2017, 1, 1)) (DateTime(2019, 1, 1))
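
Since the question mentions reusing this over several intervals, here is one possible variation, offered as a sketch rather than a definitive implementation. It takes the instrument id as a parameter instead of fixing it to 1, and uses element-wise series multiplication with `Stats.sum` in place of `Series.zip` and `Series.reduceValues`. The `Trade` record, `weightedAverageFor`, `trades`, `intervals` and `vwaps` are hypothetical names. The in-memory frame built with `Frame.ofRecords` stands in for the CSV file, and `#r "nuget: Deedle"` assumes a recent `dotnet fsi` rather than the `#I`/`#load` setup above.

```fsharp
#r "nuget: Deedle"

open System
open Deedle

// Hypothetical record type mirroring the columns in the question.
type Trade =
    { trade_qty: float
      trade_price: float
      trade_timestamp: DateTime
      instrument_id: int }

// Weighted average price for one instrument, strictly between `after` and `before`.
let weightedAverageFor (instrumentId: int) (after: DateTime) (before: DateTime) (frame: Frame<int, string>) : float =
    let filtered =
        frame
        |> Frame.filterRowValues (fun r ->
            let ts = r.GetAs<DateTime>("trade_timestamp")
            ts > after && ts < before && r.GetAs<int>("instrument_id") = instrumentId)
    let quantities = filtered.GetColumn<float>("trade_qty")
    let prices = filtered.GetColumn<float>("trade_price")
    // Series multiplication is element-wise, aligned on the row key.
    Stats.sum (quantities * prices) / Stats.sum quantities

// The three sample rows from the question, built in memory instead of read from CSV.
let trades =
    [ { trade_qty = 1000.0; trade_price = 100.59; trade_timestamp = DateTime(2018, 1, 26, 16, 0, 0); instrument_id = 1 }
      { trade_qty = 2000.0; trade_price = 105.10; trade_timestamp = DateTime(2018, 1, 26, 15, 59, 30); instrument_id = 1 }
      { trade_qty = 3000.0; trade_price = 97.59; trade_timestamp = DateTime(2018, 1, 26, 15, 59, 0); instrument_id = 1 } ]
    |> Frame.ofRecords

// Illustrative intervals as (after, before) pairs.
let intervals =
    [ (DateTime(2018, 1, 26, 15, 58, 0), DateTime(2018, 1, 26, 15, 59, 45))
      (DateTime(2018, 1, 26, 15, 58, 0), DateTime(2018, 1, 26, 16, 1, 0)) ]

// One weighted average per interval, for instrument 1.
let vwaps =
    intervals |> List.map (fun (after, before) -> weightedAverageFor 1 after before trades)
```

Partially applying `weightedAverageFor 1` gives a function of just the interval and the frame, which is convenient to pass into another function. An interval with no matching trades is not handled in this sketch.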