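I have a bnlearn model in R, learned with the gs function, with 4 categorical and 8 numeric variables.
When I try to validate the model on a test set, I get this error when predicting some of the nodes:
Error in check.fit.vs.data(fits = object, data = data, subset = object[[node]]$parents):
'keyword' has different number of levels in the node and in the data.
Is it impossible to use numeric and categorical variables together in bnlearn? If it is possible, what am I doing wrong?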
mydata$A <- as.factor(mydata$A)
mydata$B <- as.numeric(mydata$B)
mydata$C <- as.numeric(mydata$C)
mydata$D <- as.numeric(mydata$D)
mydata$E <- as.factor(mydata$E)
mydata$F <- as.numeric(mydata$F)
mydata$G <- as.numeric(mydata$G)
mydata$H <- as.numeric(mydata$H)
mydata$I <- as.numeric(mydata$I)
mydata$J <- as.numeric(mydata$J)
mydata$K <- as.numeric(mydata$K)
mydata$L <- as.numeric(mydata$L)
mydata$M <- as.numeric(mydata$M)
mydata$N <- as.numeric(mydata$N)
mydata$O <- as.numeric(mydata$O)
mydata$P <- as.numeric(mydata$P)
mydata$Q <- as.numeric(mydata$Q)
# build every ordered pair of variables for the blacklist
temp1 <- rep(varnames, each = length(varnames))
temp2 <- rep(varnames, times = length(varnames))
# read the whitelisted arcs of the model
arcdata = read.csv("C:/users/asaf/desktop/in progress/whitearcs.csv", header = T)
wfrom=arcdata[,1]
wto=arcdata[,2]
whitelist = data.frame(from = wfrom,to =wto)
#block unwanted arcs
blacklist = data.frame(from = temp1, to = temp2)
# learn the structure with the Grow-Shrink algorithm
model = gs(mydata, whitelist = whitelist, blacklist = blacklist)
# fit the parameters by maximum likelihood, then plot the network
learntmodel = bn.fit(model,mydata,method = "mle",debug = F)
graphviz.plot(learntmodel)
myvalidation=read.csv("C:/users/asaf/desktop/in progress/val.csv", header = T)
# predict A
pred = predict(learntmodel, node="A", myvalidation)
myvalidation$A <- pred
# predict B
pred = predict(learntmodel, node="B", myvalidation)
myvalidation$B <- pred
At this point it throws the following error:
Error in check.fit.vs.data(fits = object, data = data, subset = object[[node]]$parents):
'A' has different number of levels in the node and in the data.
Answer 0 (score: 0)
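As far as I know, bnlearn cannot work with mixed (qualitative and quantitative) variables at the same time; I read that the deal package can. (Recent bnlearn releases do support mixed data through conditional Gaussian networks, with the restriction that a discrete node cannot have a continuous parent.)
Another possibility is to use discretize to turn the continuous variables into discrete ones: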
dmydata <- discretize(mydata, breaks = 2, method = "interval")
model <- gs(dmydata, whitelist = whitelist, blacklist = blacklist)
...and continue with the rest of your code.
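One thing to watch with this approach: the validation set has to be binned with the same cut points as the training set, otherwise the factor levels can end up mismatched again. Here is a minimal base-R sketch of equal-width binning with shared breaks. It assumes toy data frames `train` and `valid` (hypothetical names and values, not your data) and uses `cut()` rather than bnlearn's `discretize()`:

```r
# Toy data (hypothetical): one factor column and one numeric column.
train <- data.frame(A = factor(c("x", "y", "x", "y")),
                    B = c(1.0, 2.5, 4.0, 7.0))
valid <- data.frame(A = factor(c("y", "x"), levels = levels(train$A)),
                    B = c(0.5, 6.0))

# Compute equal-width breaks from the training data only.
n_bins <- 2
num_cols <- names(train)[sapply(train, is.numeric)]
breaks <- lapply(train[num_cols], function(x) {
  b <- seq(min(x), max(x), length.out = n_bins + 1)
  b[1] <- -Inf               # open the outer bins so unseen values still fit
  b[length(b)] <- Inf
  b
})

# Apply the same breaks to any data frame with those columns.
bin_with <- function(df) {
  for (col in num_cols) {
    df[[col]] <- cut(df[[col]], breaks = breaks[[col]])
  }
  df
}
dtrain <- bin_with(train)
dvalid <- bin_with(valid)

# The binned column has the same levels in both data sets.
identical(levels(dtrain$B), levels(dvalid$B))
```

The structure would then be learned on `dtrain` and predictions made on `dvalid`.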
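Answer 1 (score: 0)
I actually ran into the same problem today, and I solved it by making sure that the other nodes connected to the node in question (i.e. $A) also had the same levels.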
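A common way to end up here is that `read.csv` rebuilds the columns of the validation file on its own, so a factor such as `A` can arrive as character, as integer, or as a factor with fewer levels than in the training data. Below is a minimal base-R sketch of the fix. It assumes toy data frames `train` and `valid` standing in for `mydata` and `myvalidation`; the helper `align_levels` is a hypothetical name, not part of bnlearn:

```r
# Toy data (hypothetical): A is a factor in the training set ...
train <- data.frame(A = factor(c("low", "mid", "high", "mid")),
                    B = c(1.2, 3.4, 5.6, 7.8))
# ... but arrives as plain character, with one level missing, in the validation set.
valid <- data.frame(A = c("mid", "low"),
                    B = c(2.0, 4.0),
                    stringsAsFactors = FALSE)

# Re-encode every factor column of `new` with the levels used in `ref`.
align_levels <- function(new, ref) {
  for (col in names(ref)) {
    if (is.factor(ref[[col]]) && col %in% names(new)) {
      new[[col]] <- factor(as.character(new[[col]]),
                           levels = levels(ref[[col]]))
    }
  }
  new
}

valid <- align_levels(valid, train)

# Both columns now share the same level set.
identical(levels(valid$A), levels(train$A))
```

With the real data the call would presumably be `myvalidation <- align_levels(myvalidation, mydata)` before `predict()`. Values that never appeared in the training data become `NA`, so it is worth checking for those afterwards.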