Question

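I am using ksvm from the kernlab package in R to predict probabilities, using the type="probabilities" option of predict.ksvm. However, I find that predict(model,observation,type="r") sometimes does not return the class to which predict(model,observation,type="p") assigns the highest probability.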

Example:

> predict(model,observation,type="r")
[1] A
Levels: A B
> predict(model,observation,type="p")
        A    B
[1,] 0.21 0.79

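Is this correct behavior or a bug? If it is correct behavior, how can I estimate the most likely class from the probabilities?

An attempt at a reproducible example: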

library(kernlab)
set.seed(1000)
# Generate fake data
n <- 1000
x <- rnorm(n)
p <- 1 / (1 + exp(-10*x))
y <- factor(rbinom(n, 1, p))
dat <- data.frame(x, y)
tmp <- split(dat, dat$y)
# Create unequal sizes in the groups (helps illustrate the problem)
newdat <- rbind(tmp[[1]][1:100,], tmp[[2]][1:10,])
# Fit the model using the radial kernel (default)
out <- ksvm(y ~ x, data = newdat, prob.model = TRUE)
# Create some testing points near the boundary

testdat <- data.frame(x = seq(.09, .12, .01))
# Get predictions using both methods
responsepreds <- predict(out, newdata = testdat, type = "r")
probpreds <- predict(out, testdat, type = "p")

results <- data.frame(x = testdat, 
                      response = responsepreds, 
                      P.x.0 = probpreds[,1], 
                      P.x.1 = probpreds[,2])

Output of the results:

> results
     x response     P.x.0     P.x.1
1 0.09        0 0.7199018 0.2800982
2 0.10        0 0.6988079 0.3011921
3 0.11        1 0.6824685 0.3175315
4 0.12        1 0.6717304 0.3282696

Answer 1

If you look at the decision matrix and the votes, they seem to be more in line with the responses:

> predict(out, newdata = testdat, type = "response")
[1] 0 0 1 1
Levels: 0 1
> predict(out, newdata = testdat, type = "decision")
            [,1]
[1,] -0.07077917
[2,] -0.01762016
[3,]  0.02210974
[4,]  0.04762563
> predict(out, newdata = testdat, type = "votes")
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    0    1    1
> predict(out, newdata = testdat, type = "prob")
             0         1
[1,] 0.7198132 0.2801868
[2,] 0.6987129 0.3012871
[3,] 0.6823679 0.3176321
[4,] 0.6716249 0.3283751

The kernlab help page (?predict.ksvm) links to the paper Probability estimates for Multi-class Classification by Pairwise Coupling by T.F. Wu, C.J. Lin, and R.C. Weng.

Section 7.3 says that the decisions and the probabilities can differ:

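...We explain why the results by probability-based and decision-value-based methods can be so distinct. For some problems, the parameters selected by δDV are quite different from those by the other five rules. In waveform, at some parameters all probability-based methods give much higher cross validation accuracy than δDV. We observe, for example, the decision values of validation sets are in [0.73, 0.97] and [0.93, 1.02] for data in two classes; hence, all data in the validation sets are classified as in one class and the error is high. On the contrary, the probability-based methods fit the decision values by a sigmoid function, which can better separate the two classes by cutting at a decision value around 0.95. This observation sheds some light on the difference between probability-based and decision-value-based methods...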
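The effect described in the quote can be illustrated with a small sketch. It assumes a Platt-style sigmoid of the form 1 / (1 + exp(A * f + B)) is used to turn a decision value f into a probability, and that positive decision values correspond to class 1, as the output above suggests. The parameters A and B and the helper platt_prob are hypothetical and chosen only for illustration; they are not the values kernlab fitted for this model.

```r
# Hypothetical sigmoid parameters, chosen only for illustration;
# they are NOT the values kernlab fitted for the model above.
A <- -8
B <- 0.9

# Platt-style mapping from a decision value f to P(class = 1 | f)
platt_prob <- function(f, A, B) 1 / (1 + exp(A * f + B))

# Decision values copied from the "decision" output above
f <- c(-0.07077917, -0.01762016, 0.02210974, 0.04762563)

# Rule based on the sign of the decision value
by_sign <- ifelse(f > 0, "1", "0")
# Rule based on the sigmoid: pick class 1 only if its probability exceeds 0.5
by_prob <- ifelse(platt_prob(f, A, B) > 0.5, "1", "0")

print(data.frame(f, p1 = platt_prob(f, A, B), by_sign, by_prob))

# The sigmoid crosses 0.5 where A * f + B = 0, i.e. at f = -B / A, not at f = 0
cutoff <- -B / A
print(cutoff)
```

With a nonzero B the 0.5 crossing moves away from f = 0, so points whose decision value lies between 0 and the cutoff get one class from the sign rule and the other class from the probability rule.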

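I'm not familiar enough with these methods to understand the issue, but maybe you are. It looks like there are distinct methods for predicting with probabilities and for predicting by other means, and type=response corresponds to a different method than the one used for predicting probabilities.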
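As for the second part of the question, one possible way to get the most probable class is to take, for each row of the probability matrix, the column with the largest value. The sketch below is self-contained: it copies the "prob" and "response" outputs shown above instead of refitting the model, and the names probs, response and most_probable are introduced here only for illustration.

```r
# Probability matrix copied from the "prob" output above
probs <- matrix(c(0.7198132, 0.2801868,
                  0.6987129, 0.3012871,
                  0.6823679, 0.3176321,
                  0.6716249, 0.3283751),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("0", "1")))

# Response predictions copied from the "response" output above
response <- factor(c("0", "0", "1", "1"), levels = c("0", "1"))

# Most probable class per row: the column holding the largest probability
most_probable <- factor(colnames(probs)[max.col(probs, ties.method = "first")],
                        levels = colnames(probs))

# Rows where the two prediction rules disagree
disagree <- which(as.character(response) != as.character(most_probable))
print(data.frame(response, most_probable))
print(disagree)
```

With the fitted model, the same idea would presumably be applied to the matrix returned by predict(out, testdat, type = "p"), whose column names are the class labels in the output shown above. Whether the argmax of the probabilities or the response is the better prediction is a separate question that this sketch does not settle.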
