Question

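I want to train a model with caret using a custom performance metric. According to the documentation here, I can create a new performance metric. However, I want to pass additional information for each prediction to the performance.metric function below. I see that data contains the columns pred and obs, which hold the predicted and observed values respectively. I also see that weights and classProbs can be added, as the documentation states explicitly. Is it possible to pass extra information for each prediction?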

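Specifically, I want to evaluate the performance of an asset-prediction algorithm using the dollar return from the sequence of predictions the model produces. My predictions (data$pred) are the daily change in the asset. To get the dollar return for each day, I need to pass in the asset's daily change. I cannot figure out how to pass in the information held in the assetChange object.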

Here is the performance metric:

performance.metric = function(data, lev= NULL, model = NULL,
                              investment = 20000){

  if (!all(levels(data[, "pred"]) == levels(data[, "obs"]))) 
    stop("levels of observed and predicted data do not match")

  #custom performance metric
  assetChange = #this should be a vector of length nrow(data)
    #with the percentage change for the asset each day

  percReturn = ifelse(data[,"obs"] == data[, "pred"], abs(assetChange), -abs(assetChange) )
  #the strategy involves buying when predicting to increase and selling when predicted to decrease
  #so when the prediction is right, it gets the abs of the percent change and else loses that amount

  dollarReturn = rep(0, nrow(data))
  dollarReturn[1] = investment*percReturn[1]
  for (i in 2:length(dollarReturn)){
    dollarReturn[i] = dollarReturn[i-1]*percReturn[i]
  }

  out <- c(dollarReturn)
  names(out) <- c("dollarReturn")
  out
}

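I can imagine a (hackish) way of passing the information through the weights column of data, but more generally, is it possible to add columns to the data object from outside performance.metric so that the function has the data it needs?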

Answer 1

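No, I don't see a simple way to do this. I want to avoid refactoring the functions to pass auxiliary information, since that might break backward compatibility.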

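However, I added an issue to GitHub that might help. The data exposed to the function that computes performance contains columns for the observed outcomes, the predicted values, and the class probabilities. I will also pass in the row index so that you can tie the held-out values back to the original data and let lexical scoping do the rest.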
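As a rough illustration of the row-index idea, here is a minimal sketch in base R. It assumes the held-out data frame carries the index in a column named rowIndex (the column name is an assumption here), and make_return_metric is a hypothetical helper, not part of caret. Note also that it compounds returns as 1 + percReturn and reports a single number, which differs from the loop in the question.

```r
# Hypothetical factory: captures assetChange (one entry per row of the
# original training data) in a closure and returns a summary-style function.
make_return_metric <- function(assetChange, investment = 20000) {
  function(data, lev = NULL, model = NULL) {
    # Assumption: the held-out rows are identified by a rowIndex column
    # that points back into the original training data.
    if (!"rowIndex" %in% names(data))
      stop("data has no rowIndex column")

    ord <- order(data$rowIndex)                    # keep the days in time order
    change <- abs(assetChange[data$rowIndex[ord]]) # looked up via lexical scoping
    correct <- as.character(data$obs[ord]) == as.character(data$pred[ord])

    # Gain the absolute change when the prediction is right, lose it otherwise
    percReturn <- ifelse(correct, change, -change)

    # Compound the daily returns and report one named number
    c(dollarReturn = investment * (prod(1 + percReturn) - 1))
  }
}

# Synthetic check with a hand-built data frame instead of a real train() call
assetChange <- c(0.01, -0.02, 0.03, -0.01)
heldOut <- data.frame(
  obs      = factor(c("down", "up", "down"), levels = c("down", "up")),
  pred     = factor(c("down", "up", "up"),   levels = c("down", "up")),
  rowIndex = c(2, 3, 4)
)
metric <- make_return_metric(assetChange, investment = 20000)
result <- metric(heldOut)
print(result)
```

With caret, a function built this way would presumably be supplied through trainControl(summaryFunction = ...) and selected with metric = "dollarReturn" in train(); whether the row index is actually available depends on the installed version.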

It would help if you created a synthetic example and added it to the issue so that we can test the solution.

Max
