How do I interpret the per-feature autoencoder anomaly SE for a one-hot encoded variable?

Time: 2017-06-08 16:54:06

Tags: r h2o autoencoder

Here is some reproducible code. I want to know how the per-feature SE is computed when a feature is one-hot encoded. If I try it myself:

library(h2o)
h2o.init()  # start a local H2O cluster

art <- data.frame(a = c("a", "b", "a", "c", "d", "e", "g", "f", "a"),
                  b = c("b", "c", "d", "e", "b", "c", "d", "e", "b"),
                  c = c(4, 3, 2, 5, 6, 1, 2, 3, 5))

dl <- h2o.deeplearning(x = c("a", "b", "c"),
                       training_frame = as.h2o(art),
                       autoencoder = TRUE,
                       reproducible = TRUE,
                       seed = 1234,
                       hidden = c(1),
                       epochs = 1)

sus.anon <- h2o.anomaly(dl, as.h2o(art), per_feature = TRUE)

It looks like some of the SEs are 1, which I assume means the reconstruction was 100% certain the value was one category when the row actually held another. For the fractional errors, do they represent varying degrees of error in the probability the softmax classifier assigned to that category?
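To make the question concrete: one plausible reading (an assumption, not H2O's documented internals) is that the reconstruction of a categorical column is a softmax over its levels, and the per-feature SE for each level is the squared difference between the one-hot target and the reconstructed probability. A minimal sketch of that interpretation, in plain Python:

```python
import math

# Hypothetical sketch (NOT H2O's documented internals): treat the
# reconstruction of a categorical column as a softmax over its levels,
# and the per-level SE as (one-hot target - reconstructed probability)^2.
levels = ["a", "b", "c", "d", "e", "f", "g"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def one_hot(level):
    return [1.0 if l == level else 0.0 for l in levels]

def per_level_se(actual_level, probs):
    # Squared error per level against the one-hot target.
    return [(t - p) ** 2 for t, p in zip(one_hot(actual_level), probs)]

# Reconstruction is almost certain the level is "b", but the row holds "a":
probs = softmax([0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0])
se = per_level_se("a", probs)
print(round(se[levels.index("a")], 3))  # close to 1: true level got ~0 probability
print(round(se[levels.index("b")], 3))  # close to 1: confidently wrong level
```

Under this reading, an SE of 1 corresponds exactly to the "100% certain but wrong" case described above, and fractional SEs correspond to intermediate softmax probabilities.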

1 Answer:

Answer 0 (score: 0)

I don't know the h2o autoencoder specifically, but in my experience autoencoders cannot make proper use of one-hot encoded variables; I have tried everything. The one thing I have not tried is a "categorical variable autoencoder" using the Gumbel-Softmax estimator (https://github.com/ericjang/gumbel-softmax).
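For context, the Gumbel-Softmax trick mentioned above can be sketched in a few lines (the linked repo is a TensorFlow implementation; this is a stand-alone illustrative version): perturb each logit with Gumbel(0, 1) noise and apply a temperature-scaled softmax, so samples approach one-hot vectors as the temperature drops while remaining differentiable.

```python
import math
import random

def gumbel_softmax_sample(logits, temperature=1.0):
    # Gumbel(0, 1) noise via inverse transform: -log(-log(U)), U ~ Uniform(0, 1).
    gumbels = [-math.log(-math.log(random.random())) for _ in logits]
    scaled = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    # Temperature-scaled softmax: a relaxed, differentiable one-hot vector.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

random.seed(0)
soft = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=0.1)
print(soft)  # low temperature: nearly one-hot, entries sum to 1
```

This relaxation is what would let a decoder emit something resembling a discrete category while gradients still flow through the sampling step.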