Here is some reproducible code. I would like to know how the per-feature SE is calculated when a feature is one-hot encoded. When I try it myself, it looks like some of the SEs are exactly 1, which I take to mean the reconstruction was 100% confident the value was one level when it was actually another. For the fractional errors, do they represent varying degrees of error in the probability that the softmax classifier assigned to that level? (A sketch of how I inspect the per-feature output follows the code below.)
library(h2o)
h2o.init()  # start a local H2O cluster

art <- data.frame(a = c("a","b","a","c","d","e","g","f","a"),
                  b = c("b","c","d","e","b","c","d","e","b"),
                  c = c(4, 3, 2, 5, 6, 1, 2, 3, 5))

# Tiny autoencoder; columns a and b are categorical and get one-hot encoded
dl <- h2o.deeplearning(x = c("a", "b", "c"), training_frame = as.h2o(art),
                       autoencoder = TRUE,
                       reproducible = TRUE,
                       seed = 1234,
                       hidden = c(1), epochs = 1)

# Per-feature reconstruction error
sus.anon <- h2o.anomaly(dl, as.h2o(art), per_feature = TRUE)
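This is a minimal sketch, under stated assumptions, of how I look at the result. The exact column names depend on the H2O version, and the manual calculation only illustrates the interpretation asked about above (squared difference between the 0/1 one-hot indicator and a reconstructed probability); it is not a confirmed description of H2O's internal formula.

# Pull the per-feature errors back into R
errs <- as.data.frame(sus.anon)
head(errs)   # one ".SE" column per one-hot-expanded level (names vary by H2O version)

# Illustration of the hypothesis in the question, with a hypothetical reconstructed probability:
p_hat <- 0.7                          # assumed softmax output for the level actually present
se_true_level  <- (1 - p_hat)^2       # fractional error for the level that was present (indicator = 1)
se_other_level <- (0.3 - 0)^2         # error for an absent level that still received probability mass
# Under this reading, an SE of exactly 1 would mean the reconstruction put probability 0 on the actual level.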
Answer (score: 0)
I don't know the h2o autoencoder specifically, but in my experience autoencoders do not work well with one-hot encoded variables. I have tried just about everything. What I have not tried is a 'categorical variable autoencoder' using the Gumbel-Softmax estimator (https://github.com/ericjang/gumbel-softmax); a rough sketch of that sampling trick is below.
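For reference, here is a minimal sketch of the Gumbel-Softmax relaxation in base R. The function name and the temperature value are illustrative assumptions; this only shows the sampling trick described in the linked repository, not anything built into h2o.

gumbel_softmax <- function(logits, temperature = 0.5) {
  # Draw Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
  g <- -log(-log(runif(length(logits))))
  # Perturb the logits, scale by temperature, and apply a softmax.
  # As the temperature approaches 0 the sample approaches a one-hot vector;
  # higher temperatures give smoother, differentiable approximations.
  z <- (logits + g) / temperature
  exp(z - max(z)) / sum(exp(z - max(z)))  # numerically stable softmax
}

# Example: a relaxed sample over 4 categories
set.seed(1)
gumbel_softmax(logits = c(1.0, 0.2, -0.5, 0.3), temperature = 0.5)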