I have a Spark 2.2.0 DataFrame of currency prices, to which I have added returns:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
val spark = SparkSession.builder.getOrCreate()
val prices = spark.read.json("prices.json")
// make a window function and convert prices to returns
val window = Window.partitionBy("currency").orderBy("time")
val lagPrice = lag(col("close"), 1).over(window)
val percentReturn = col("close") / col("lastClose") - 1d
val logReturn = log(col("close") / col("lastClose"))
val returns = prices.withColumn("lastClose", lagPrice)
  .withColumn("return", percentReturn)
  .withColumn("logReturn", logReturn)
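Now I would like to use window functions to compute a rolling covariance matrix across all currencies (analogous to a moving average), but I cannot find any documentation or examples.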
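To make the goal concrete, here is a minimal sketch of one possible approach, not a confirmed solution: pivot the returns so each currency becomes a column, then apply `covar_samp` over a trailing row window for every currency pair. The helper name `rollingCovariance`, the window length `n`, and the `cov_<a>_<b>` output column names are my own assumptions for illustration. It also assumes the currency values are usable as plain column names (no dots or backticks).

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Hypothetical helper (not a Spark API): rolling sample covariance of the
// log returns for every currency pair over the trailing `n` rows.
def rollingCovariance(returns: DataFrame, n: Int): DataFrame = {
  // Long -> wide: one row per time, one column per currency.
  val wide = returns.groupBy("time").pivot("currency").agg(first("logReturn"))
  val currencies = wide.columns.filter(_ != "time").toSeq

  // Trailing window of n rows, current row included. With no partitionBy,
  // Spark moves all rows into a single partition, which may be slow.
  val w = Window.orderBy("time").rowsBetween(-(n - 1L), 0L)

  // Upper triangle including the diagonal; the matrix is symmetric.
  val covCols: Seq[Column] = for {
    (a, i) <- currencies.zipWithIndex
    b <- currencies.drop(i)
  } yield covar_samp(col(a), col(b)).over(w).as(s"cov_${a}_$b")

  wide.select(col("time") +: covCols: _*)
}
```

Calling `rollingCovariance(returns, 30)` would give one row per `time` with one column per currency pair, that is, a flattened upper triangle of the covariance matrix. The window counts rows rather than elapsed time, so it only matches a fixed time span if every currency has an observation at every timestamp.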