data.table

Time: 2015-11-19 16:00:35

Tags: r data.table kaggle

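I'm not familiar with this df[, .(...), Col] notation. Apologies if I'm missing something obvious, but I can't find a reference for this style of notation, even though it looks very useful.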

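It appears to be performing an aggregation. Given where the notation sits in the code below, I would expect it to come from R rather than from h2o, but I have checked both to no avail.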

This example is from a Kaggle competition, and the code works (to reproduce it, go here):

trainHex<-as.h2o(train[,.(
  dist   = mean(radardist_km, na.rm = T),
  refArea5   = mean(Ref_5x5_50th, na.rm = T),
  refArea9  = mean(Ref_5x5_90th, na.rm = T),
  meanRefcomp = mean(RefComposite,na.rm=T),
  meanRefcomp5 = mean(RefComposite_5x5_50th,na.rm=T),
  meanRefcomp9 = mean(RefComposite_5x5_90th,na.rm=T),
  zdr   = mean(Zdr, na.rm = T),
  zdr5   = mean(Zdr_5x5_50th, na.rm = T),
  zdr9   = mean(Zdr_5x5_90th, na.rm = T),
  target = log1p(mean(Expected)),
  meanRef = mean(Ref,na.rm=T),
  sumRef = sum(Ref,na.rm=T),
  records = .N,
  naCounts = sum(is.na(Ref))
),Id][records>naCounts,],destination_frame="train.hex")

I'd appreciate a pointer to the documentation and/or a good explanation of this.

1 Answer:

Answer 0: (score: 5)

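.() is a data.table convenience function that acts as a terse alias for list(). What complicates things a bit (mainly for those who, like you, are trying to find out what . does!) is that it is only interpreted this way within the scope of a call to [.data.table().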

The documentation for this is in ?data.table, the help page that covers [.data.table().
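
As a minimal sketch of the point above, assuming a small made-up table (the values are invented for illustration; only the column names Id, Ref and Expected are borrowed from the question), the following compares .() with list() in j and then chains a second [ ] to filter the aggregated rows, mirroring the [records>naCounts,] step in the question:

```r
library(data.table)

# Toy data standing in for the Kaggle table; values are hypothetical.
train <- data.table(
  Id       = c(1L, 1L, 2L, 2L, 3L),
  Ref      = c(10, NA, NA, NA, 30),
  Expected = c(1, 1, 2, 2, 3)
)

# Inside [.data.table, .() in j is shorthand for list():
# each named element becomes a column, computed once per group of Id.
agg_dot <- train[, .(meanRef  = mean(Ref, na.rm = TRUE),
                     records  = .N,                # number of rows in the group
                     naCounts = sum(is.na(Ref))),  # missing Ref values in the group
                 by = Id]

# The same call spelled out with list().
agg_list <- train[, list(meanRef  = mean(Ref, na.rm = TRUE),
                         records  = .N,
                         naCounts = sum(is.na(Ref))),
                  by = Id]

# Compare the two results with data.table's all.equal method.
print(isTRUE(all.equal(agg_dot, agg_list)))

# A second [ ] chained onto the result filters the aggregated rows,
# keeping only groups with at least one non-missing Ref.
kept <- agg_dot[records > naCounts]
print(kept)
```

In the question's code the third positional argument, Id, plays the role of by = Id here, so the whole expression is a per-Id aggregation followed by a row filter.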