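How can I implement the enumeration algorithm for a Bayesian network, given a specific input?

Asked: 2016-10-20 05:53:28

Tags: python algorithm bayesian-networks

I have just started learning about Bayesian networks, and I have been trying to implement one in Python.

Given a specific input file (containing the network's nodes and the probability distribution table of each node), the program has to execute queries given as strings, apply the enumeration algorithm, and output the result of each query. The format is as follows: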

[Nodes]
Burglary, Earthquake, Alarm, JohnCalls, MaryCalls

[Probabilities]
+Burglary = 0.001
+Earthquake = 0.002
+Alarm|+Earthquake,+Burglary = 0.95
+Alarm|-Earthquake, +Burglary = 0.94
+Alarm|+Earthquake, -Burglary = 0.29
+Alarm|-Earthquake, -Burglary = 0.001
+JohnCalls|+Alarm = 0.9
+JohnCalls|-Alarm = 0.05
+MaryCalls|+Alarm = 0.7
+MaryCalls|-Alarm = 0.01

[Queries]
+Burglary|+Earthquake, +JohnCalls
+Earthquake
-MaryCalls|+Earthquake, +Alarm
+JohnCalls|-Earthquake, -MaryCalls, +Burglary
-Alarm|-Earthquake, -MaryCalls, +Burglary
-Alarm, +JohnCalls|-Earthquake, -MaryCalls, +Burglary
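
For reference, one way the query strings above could be turned into something easier to compute with is to split each one into a target part and an evidence part. This is only a minimal sketch: the helper names `parse_assignment` and `parse_query` are hypothetical, and it assumes every variable is boolean and always written with a leading `+` or `-`, as in the sample file.

```python
def parse_assignment(text):
    """Turn '+Earthquake, -MaryCalls' into {'Earthquake': True, 'MaryCalls': False}."""
    assignment = {}
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate empty parts, e.g. a query without evidence
        if token[0] not in "+-":
            raise ValueError("expected a leading '+' or '-': %r" % token)
        assignment[token[1:].strip()] = (token[0] == "+")
    return assignment


def parse_query(line):
    """Split 'targets|evidence' into two {name: bool} dictionaries."""
    targets, _, evidence = line.partition("|")
    return parse_assignment(targets), parse_assignment(evidence)


print(parse_query("-Alarm, +JohnCalls|-Earthquake, -MaryCalls, +Burglary"))
```

A query without a `|`, such as `+Earthquake`, simply yields an empty evidence dictionary.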

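So far I have been able to parse this text file and build:

  • A list of all the nodes, named network
  • A list containing all the query strings
  • A Node class containing a title, a list of parent nodes, and a dictionary holding the probability distribution table

The problem is that I don't know how to continue from here. I know how the algorithm works, but I am having a hard time mapping it to code.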

The algorithm works as shown here: https://www.youtube.com/watch?v=q5DHnmHtVmc
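
To make the idea concrete, here is a rough sketch of inference by enumeration in its usual textbook form: sum the product of the local probabilities over every variable that is not fixed, then divide the joint probability of targets and evidence by the probability of the evidence alone. It is not taken from the video. The names `NETWORK`, `ORDER`, `local_probability`, `enumerate_all` and `ask` are hypothetical, the network is hard-coded from the sample file instead of being parsed, and all variables are assumed to be boolean and listed parents-first.

```python
# Hypothetical representation: name -> (parents, table). Each table maps a
# tuple of parent truth values to P(variable = True | parents).
NETWORK = {
    "Burglary": ((), {(): 0.001}),
    "Earthquake": ((), {(): 0.002}),
    "Alarm": (("Earthquake", "Burglary"), {
        (True, True): 0.95, (False, True): 0.94,
        (True, False): 0.29, (False, False): 0.001}),
    "JohnCalls": (("Alarm",), {(True,): 0.9, (False,): 0.05}),
    "MaryCalls": (("Alarm",), {(True,): 0.7, (False,): 0.01}),
}
# Parents must come before their children (topological order).
ORDER = ["Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]


def local_probability(var, value, assignment):
    """P(var = value | parents), read from the variable's table."""
    parents, table = NETWORK[var]
    p_true = table[tuple(assignment[p] for p in parents)]
    return p_true if value else 1.0 - p_true


def enumerate_all(variables, assignment):
    """Sum the joint probability over every variable missing from assignment."""
    if not variables:
        return 1.0
    first, rest = variables[0], variables[1:]
    if first in assignment:
        return (local_probability(first, assignment[first], assignment)
                * enumerate_all(rest, assignment))
    total = 0.0
    for value in (True, False):
        extended = dict(assignment, **{first: value})
        total += (local_probability(first, value, extended)
                  * enumerate_all(rest, extended))
    return total


def ask(targets, evidence):
    """P(targets | evidence) = P(targets, evidence) / P(evidence)."""
    joint = dict(evidence)
    joint.update(targets)
    return enumerate_all(ORDER, joint) / enumerate_all(ORDER, evidence)


print(ask({"Burglary": True}, {"JohnCalls": True, "MaryCalls": True}))
```

With the probabilities from the sample file, the last line should print roughly 0.284. Because `ask` divides two joint sums instead of normalizing over a single variable, it also covers queries with several targets, such as `-Alarm, +JohnCalls|-Earthquake, -MaryCalls, +Burglary`.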

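What would be a possible implementation of this algorithm in Python? Where can I find a good Python example of it? I searched on Google, but most of the examples and algorithms I found are too complex for someone new to the topic.

This is my code so far: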

import copy
import fileinput

network = [] #List of all the nodes
queries = [] #List of all the required queries
allProbs = {} #Dictionary with all the probabilities, given or calculated

class Node:
    def __init__(self, title, parents, pdt):
        self.title = title #String that identifies the node
        self.parents = parents #List of nodes that are parents of this node
        self.pdt = pdt #Dictionary where { '+title': value, '-title': value }

def readInput():
    lines = fileinput.input() #Renamed so the built-in input() is not shadowed
    operation = 0
    for line in lines:
        line = line.rstrip('\n').rstrip('\r')
        #Parsing the line that contains the nodes
        #Compare with == / != here: "is" tests object identity, not equality
        if operation == 1 and "[Probabilities]" not in line and "[Queries]" not in line and line.strip() != "":
            nodes = line.split(',')
            for element in nodes:
                node = Node(element.replace(" ", ""), [], {})
                network.append(node)
        #Parsing the lines that contain the probabilities
        if operation == 2 and "[Probabilities]" not in line and "[Queries]" not in line and line.strip() != "":
            lineAux = line.replace(" ", "").split("=")
            nodes = lineAux[0].split("|")
            queryNode = nodes[0].strip("+")
            #Finding and assigning parent nodes to each node, based on the evidence nodes of the probabilities
            if len(nodes) > 1:
                evidenceNodes = nodes[1].split(",")
                for element in evidenceNodes:
                    rawElement = element.lstrip("+-")
                    for node in network:
                        if node.title == queryNode:
                            alreadyInList = False
                            if len(node.parents) > 0:
                                for parent in node.parents:
                                    if parent.title == rawElement:
                                        alreadyInList = True
                                if not alreadyInList:
                                    for parentNode in network:
                                        if parentNode.title == rawElement:
                                            node.parents.append(parentNode)
                            else:
                                for parentNode in network:
                                    if parentNode.title == rawElement:
                                        node.parents.append(parentNode)
            #Assigning the Probability to the correct node
            for node in network:
                if node.title == queryNode:
                    node.pdt[lineAux[0]] = float(lineAux[1])
                    node.pdt["-" + lineAux[0].strip("+")] = round(1.0 - float(lineAux[1]), 7)
        #Handling what to do based on the value of the line
        if "[Nodes]" in line:
            operation = 1
        if "[Probabilities]" in line:
            operation = 2
        if "[Queries]" in line:
            operation = 3

readInput()
for node in network:
    print("------------------------")
    print(node.title)
    print(node.pdt)
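
For reference, the enumeration step mostly needs one primitive on top of this structure: given a node and truth values for its parents, read the matching entry from `pdt`. Below is a minimal sketch of such a lookup. The helper names `sign` and `lookup` are hypothetical, and the sketch assumes that the keys look like `+Alarm|+Earthquake,+Burglary` (spaces removed, as the parser stores them) and that every probability line lists the parents in the same order as `node.parents`.

```python
class Node:
    # Same shape as the Node class above, repeated so the sketch runs on its own.
    def __init__(self, title, parents, pdt):
        self.title = title
        self.parents = parents
        self.pdt = pdt


def sign(flag):
    """Map True/False to the '+'/'-' prefix used in the table keys."""
    return "+" if flag else "-"


def lookup(node, value, assignment):
    """Return P(node = value | parents) for the parent values in assignment."""
    key = sign(value) + node.title
    if node.parents:
        # Assumes the parents appear in the key in the order of node.parents.
        key += "|" + ",".join(sign(assignment[p.title]) + p.title
                              for p in node.parents)
    return node.pdt[key]


earthquake = Node("Earthquake", [], {"+Earthquake": 0.002, "-Earthquake": 0.998})
burglary = Node("Burglary", [], {"+Burglary": 0.001, "-Burglary": 0.999})
alarm = Node("Alarm", [earthquake, burglary], {
    "+Alarm|-Earthquake,+Burglary": 0.94,
    "-Alarm|-Earthquake,+Burglary": 0.06,
})
print(lookup(alarm, True, {"Earthquake": False, "Burglary": True}))
```

If the input file could list the parents in a different order on different lines, the keys would have to be normalized first, for example by sorting the parent names.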

I know this is probably not the most efficient code, but for now I just want to find a working solution and optimize it afterwards.

Thanks in advance!

0 answers:

No answers