I am using svm from the e1071 package on a dataset like this:
sdewey <- svm(x = as.matrix(trainS),
              y = trainingSmall$DEWEY,
              type = "C-classification")
This works fine, but when I try to tune cost and gamma like this:
svm_tune <- tune(svm, train.x=as.matrix(trainS), train.y=trainingSmall$DEWEY, type="C-classification", ranges=list(cost=10^(-1:6), gamma=1^(-1:1)))
I get this error:
Error in tune(svm, train.x = as.matrix(trainS), train.y = trainingSmall$DEWEY, : Dependent variable has wrong type!
My training data is structured like this, but with many more lines:
'data.frame': 1000 obs. of 1542 variables:
$ women.prisoners : int 1 0 0 0 0 0 0 0 0 0 ...
$ reformatories.for.women : int 1 0 0 0 0 0 0 0 0 0 ...
$ women : int 1 0 0 0 0 0 0 0 0 0 ...
$ criminal.justice : int 1 0 0 0 0 0 0 0 0 0 ...
$ soccer : int 0 1 0 0 0 0 0 0 0 0 ...
$ coal.mines.and.mining : int 0 0 1 0 0 0 0 0 0 0 ...
$ coal : int 0 0 1 0 0 0 0 0 0 0 ...
$ engineering.geology : int 0 0 1 0 0 0 0 0 0 0 ...
$ family.violence : int 0 0 0 1 0 0 0 0 0 0 ...
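This is a multi-class problem. I am not sure how to fix this, or whether there is another way to find the best values for the cost and gamma parameters.
Here is an example of my data, and trainS is the data without the first 4 columns (DEWEY, D1, D2 and D3).
Thanks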
Answer 0: (score: 1)
library(e1071)
trainingSmall <- read.csv("trainingSmallExtra.csv")
# Feature columns: everything after DEWEY, D1, D2 and D3
sdewey <- svm(x = as.matrix(trainingSmall[, 5:ncol(trainingSmall)]),
              y = trainingSmall$DEWEY,
              type = "C-classification",
              kernel = "linear"  # svm() defaults to "radial" if kernel is omitted
)
This works because svm automatically converts DEWEY to a factor.
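To make the integer-versus-factor distinction concrete, here is a minimal base-R sketch. The label values are made up, and score_error is a hypothetical helper written only for illustration; it is loosely modeled on the kind of type check a tuning routine has to make when it compares predictions with true labels, and it is not e1071's actual source.

```r
# Hypothetical integer class labels standing in for DEWEY (values are made up)
y_int <- c(330L, 796L, 622L, 330L)

# Classification code typically works with a factor rather than raw integers
y_fac <- as.factor(y_int)

is.factor(y_int)  # FALSE
is.factor(y_fac)  # TRUE

# Hypothetical scorer (an assumption for illustration, not e1071 code):
# factor vs factor -> misclassification rate
# numeric vs numeric -> mean squared error
# anything else -> error
score_error <- function(truth, pred) {
  if (is.factor(truth) && is.factor(pred)) {
    mean(as.character(truth) != as.character(pred))
  } else if (is.numeric(truth) && is.numeric(pred)) {
    mean((truth - pred)^2)
  } else {
    stop("label and prediction types do not match")
  }
}

# A classifier's predictions come back as a factor
pred <- factor(c(330, 796, 330, 330), levels = levels(y_fac))

# Factor labels can be scored; integer labels against factor predictions cannot
err_fac <- score_error(y_fac, pred)
err_int <- tryCatch(score_error(y_int, pred),
                    error = function(e) conditionMessage(e))
```

Here err_fac is a misclassification rate, while err_int holds the error message, because integer labels paired with factor predictions match neither branch.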
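The tune call fails because tune is set up for user customization and relies on you supplying the correct data type. Since DEWEY is an integer rather than a factor, it fails. We can fix this: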
trainingSmall$DEWEY <- as.factor(trainingSmall$DEWEY)
svm_tune <- tune(svm, train.x = as.matrix(trainingSmall[, 5:ncol(trainingSmall)]),
                 train.y = trainingSmall$DEWEY,  # the way I'm formatting your
                 kernel = "linear",              # code is Google's R style
                 type = "C-classification",
                 ranges = list(
                   cost = 10^(-1:6),
                   gamma = 1^(-1:1)  # note: 1^(-1:1) evaluates to c(1, 1, 1)
                 )
)
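It may also help to look at what the ranges argument actually contains. The base-R sketch below evaluates the two grid expressions from the call above. The alternative gamma_alt is an assumption about what was probably intended; it does not appear in the original post.

```r
# Grids exactly as written in the tune() call
cost_grid  <- 10^(-1:6)   # 0.1, 1, 10, ..., 1e6
gamma_grid <- 1^(-1:1)    # 1 raised to any power is 1

# Number of distinct gamma values that would actually be tried
n_distinct_gamma <- length(unique(gamma_grid))

# Assumed alternative (not from the original post): powers of 10
gamma_alt <- 10^(-1:1)    # 0.1, 1, 10

# Every cost/gamma combination a grid search over these ranges would visit
grid <- expand.grid(cost = cost_grid, gamma = gamma_alt)
n_combinations <- nrow(grid)
```

As written, the gamma grid holds only one distinct value, so the search effectively tunes cost alone. Also note that the e1071 documentation describes gamma as a parameter needed for all kernels except linear, so a gamma grid only matters with a kernel such as "radial". After tuning, summary(svm_tune) and svm_tune$best.parameters show the results.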