I have a quick question, with a simple reproducible example, about prediction with bnlearn.
library(bnlearn)
#Build the data frame directly so that Cause is a factor and Cons is numeric
#(bn.fit needs factors for discrete nodes; R >= 4.0 no longer converts strings automatically)
Learning.set4=data.frame(Cause=factor(c("Yes","Yes","Yes","No","No","No")),Cons=c(9,10,8,3,2,1))
b.network=empty.graph(colnames(Learning.set4))
struct.mat=matrix(0,2,2)
colnames(struct.mat)=colnames(Learning.set4)
rownames(struct.mat)=colnames(struct.mat)
struct.mat[1,2]=1
bnlearn::amat(b.network)=struct.mat
haha=bn.fit(b.network,Learning.set4)
#Some predictions with "lw" method
#Here is the approach I know with a SET particular modality.
#(So it's happening with certainty, here for example I know Cause is "Yes")
classic_prediction=cpdist(haha,nodes="Cons",evidence=list("Cause"="Yes"),method="lw")
print(mean(classic_prediction[,c(1)]))
#What if I wanted to predict the value of Cons when Cause has a 60% chance of being "Yes" and a 40% chance of being "No"?
#I decided to do this, according to the help
#I could also make a function that generates "Yes" or "No" with proper probabilities.
prediction_idea=cpdist(haha,nodes="Cons",evidence=list("Cause"=c("Yes","Yes","Yes","No","No")),method="lw")
print(mean(prediction_idea[,c(1)]))
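Here is what the help page says:
"In the case of a discrete or ordinal node, two or more values can also be provided. In that case, the value for that node will be sampled with uniform probability from the set of specified values."
When I predict the value of a variable from a categorical variable, I have so far only fixed that variable to one particular modality, as in the first prediction in the example. (Setting the evidence to "Yes" makes Cons take a high value.)
But if I want to predict Cons without knowing the modality of Cause with certainty, only its probabilities, can I use what I did in the second prediction? Is this an elegant way to do it, or is there a better approach that I am not aware of?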
Answer 0 (score: 2)
I contacted the creator of the package, and I am pasting the part of his reply that relates to this question:
This call to cpdist() is wrong:
prediction_idea=cpdist(haha,nodes="Cons",evidence=list("Cause"=c("Yes","Yes","Yes","No","No")),method="lw")
print(mean(prediction_idea[,c(1)]))
A query with 40%-60% soft evidence requires you to first put these new probabilities into the network
haha$Cause = c(0.40, 0.60)
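and then run the query without an evidence argument. (You do not actually have any hard evidence, just a different probability distribution for Cause.)
Here is the code that gave me the fitted network I wanted for the example. Note that the levels of Cause are ordered alphabetically ("No", "Yes"), so the first entry is P(No) and the second is P(Yes).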
change=haha$Cause$prob
change[1]=0.4
change[2]=0.6
haha$Cause=change
new_prediction=cpdist(haha,nodes="Cons",evidence=TRUE,method="lw")
print(mean(new_prediction[,c(1)]))
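As a rough sanity check on the simulated mean, the expected value of Cons under the modified distribution of Cause can be worked out by hand with the law of total expectation. The sketch below uses only base R and assumes that the fitted conditional means of Cons equal the group sample means from the toy data (9 for "Yes", 2 for "No"); the variable names are mine, not part of bnlearn.

```r
# Toy data from the question: values of Cons observed under each value of Cause.
cons_yes <- c(9, 10, 8)
cons_no <- c(3, 2, 1)

# Conditional means of Cons given Cause (assumed to match the fitted parameters).
mean_yes <- mean(cons_yes)
mean_no <- mean(cons_no)

# Soft evidence from the question: P(Cause = "Yes") = 0.6, P(Cause = "No") = 0.4.
p_yes <- 0.6
p_no <- 1 - p_yes

# Law of total expectation: E[Cons] = P(Yes) * E[Cons | Yes] + P(No) * E[Cons | No].
expected_cons <- p_yes * mean_yes + p_no * mean_no
print(expected_cons)
```

The mean printed for new_prediction should land close to this value, up to sampling noise, if the probabilities were assigned to the intended levels.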