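I have a set of randomly generated formal graphs, and I would like to calculate the entropy of each one. The same question in different words: I have several networks and would like to calculate the information content of each one.
Here are two sources containing formal definitions of graph entropy:
http://www.cs.washington.edu/homes/anuprao/pubs/CSE533Autumn2010/lecture4.pdf (PDF)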
http://arxiv.org/abs/0711.4175v1
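I am looking for code that takes a graph as input (as either an edge list or an adjacency matrix) and outputs a number of bits or some other measure of information content.
Because I can't find an implementation of this anywhere, I plan to code it from scratch based on the formal definitions. If anyone has already solved this problem and is willing to share the code, it would be much appreciated.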
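Since the input may arrive as either an edge list or an adjacency matrix, here is a minimal sketch of converting one into the other in base R. The helper name `edgeListToAdjacency` and the example edges are made up for illustration, and the sketch assumes vertices are numbered 1..n and the graph is undirected and unweighted.

```r
# Hypothetical helper: build a symmetric 0/1 adjacency matrix from an edge list.
# `edges` is a two-column matrix of vertex indices in 1..n, one row per undirected edge.
edgeListToAdjacency <- function(edges, n) {
  adj <- matrix(0, nrow = n, ncol = n)
  adj[edges] <- 1                        # mark entry (i, j) for every edge
  adj[edges[, 2:1, drop = FALSE]] <- 1   # mark (j, i) too, so the matrix is symmetric
  adj
}

edges <- rbind(c(1, 2), c(2, 3), c(3, 4))  # made-up example: a path on 4 vertices
adj <- edgeListToAdjacency(edges, 4)
```

Indexing a matrix with a two-column matrix addresses individual (row, column) cells, which avoids an explicit loop over the edges.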
Answer 0 (score: 5)
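I ended up using different papers for the definitions of graph entropy:
Information Theory of Complex Networks: On Evolution and Architectural Constraints
R.V. Solé and S. Valverde (2004)
and
Network Entropy Based on Topology Configuration and Its Computation to Random Networks
B.H. Wang, W.X. Wang, and T. Zhou
The code to calculate each is below. It assumes you have an undirected, unweighted graph with no self-loops. It takes an adjacency matrix as input and returns the amount of entropy in bits. It is implemented in R and uses the sna package.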
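For reference, the two quantities the code computes can be written as follows. This is transcribed from the code's comments and arithmetic, with N the number of nodes, P_k the fraction of nodes with degree k, and p the link density; it has not been checked against the papers' own notation.

```latex
% Sole & Valverde: remaining-degree distribution and its entropy
q(k) = \frac{(k+1)\,P_{k+1}}{\sum_{j} j\,P_{j}},
\qquad
H(q) = -\sum_{k=1}^{N-1} q(k)\,\log_2 q(k)

% Wang et al.: entropy of a random network with N nodes and link probability p
E = -\frac{N(N-1)}{2}\,\bigl[\,p\,\log_2 p + (1-p)\,\log_2(1-p)\,\bigr]
```

Terms with q(k) = 0 contribute nothing to the first sum, and E is taken as 0 when p is 0 or 1.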
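library(sna)  # provides degree() and gden()

# Dispatch to one of the two entropy definitions; any type other than
# "SoleValverde" falls through to the Wang et al. version
graphEntropy <- function(adj, type="SoleValverde") {
  if (type == "SoleValverde") {
    return(graphEntropySoleValverde(adj))
  }
  else {
    return(graphEntropyWang(adj))
  }
}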
graphEntropySoleValverde <- function(adj) {
  # Calculate Sole & Valverde, 2004 graph entropy
  # Uses Equations 1 and 4
  # First we need the denominator of q(k)
  # To get it we need the probability of each degree
  # First get the number of nodes with each degree
  # sna's degree() counts in-ties plus out-ties by default, so halve it for an undirected graph
  existingDegrees = degree(adj)/2
  maxDegree = nrow(adj) - 1
  allDegrees = 0:maxDegree
  # Row 1: degree k, row 2: number of nodes with degree k, row 3: P(k)
  degreeDist = matrix(0, 3, length(allDegrees)+1) # Need an extra zero prob degree for later calculations
  degreeDist[1,] = 0:(maxDegree+1)
  for(aDegree in allDegrees) {
    degreeDist[2,aDegree+1] = sum(existingDegrees == aDegree)
  }
  # Calculate probability of each degree
  for(aDegree in allDegrees) {
    degreeDist[3,aDegree+1] = degreeDist[2,aDegree+1]/sum(degreeDist[2,])
  }
  # Sum of all degrees mult by their probability
  # (row 1 holds the degree itself; row 2 holds node counts and must not be used here)
  sumkPk = 0
  for(aDegree in allDegrees) {
    sumkPk = sumkPk + degreeDist[1,aDegree+1] * degreeDist[3,aDegree+1]
  }
  # Equivalent is sum(degreeDist[1,] * degreeDist[3,])
  # Now we have all the pieces we need to calculate graph entropy
  graphEntropy = 0
  for(aDegree in 1:maxDegree) {
    # q(k) = (k+1) * P(k+1) / sum(k * P(k)); P(k+1) is stored in column k+2
    q.of.k = ((aDegree + 1)*degreeDist[3,aDegree+2])/sumkPk
    # 0 log2(0) is defined as zero
    if (q.of.k != 0) {
      graphEntropy = graphEntropy + -1 * q.of.k * log2(q.of.k)
    }
  }
  return(graphEntropy)
}
graphEntropyWang <- function(adj) {
  # Calculate Wang, 2008 graph entropy
  # Uses Equation 14
  # bigN is simply the number of nodes
  # littleP is the link probability. That is the same as graph density calculated by sna with gden().
  bigN = nrow(adj)
  littleP = gden(adj)
  graphEntropy = 0
  # log2(0) is undefined, so empty and complete graphs keep an entropy of zero
  if (littleP != 1 && littleP != 0) {
    graphEntropy = -1 * .5 * bigN * (bigN - 1) * (littleP * log2(littleP) + (1-littleP) * log2(1-littleP))
  }
  return(graphEntropy)
}
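As a cross-check that does not depend on sna, both quantities can be sketched in base R. The function names `remainingDegreeEntropy` and `randomNetworkEntropy` and the star graph are made up for this example, and the sketch assumes a symmetric 0/1 adjacency matrix with a zero diagonal and at least one edge. The first function uses the mean degree as the denominator of q(k), as the comment "sum of all degrees mult by their probability" describes, and starts the sum at k = 1 to mirror the loop above.

```r
# Illustrative base-R versions; not part of the sna package or of the code above.
remainingDegreeEntropy <- function(adj) {
  n <- nrow(adj)
  deg <- rowSums(adj)                         # vertex degrees
  # P[k + 1] is the fraction of vertices with degree k, for k = 0..n
  # (degree n cannot occur; the extra zero keeps P_{k+1} in range)
  P <- tabulate(deg + 1, nbins = n + 1) / n
  meanDegree <- sum((0:n) * P)
  k <- 1:(n - 1)
  q <- (k + 1) * P[k + 2] / meanDegree        # q(k) = (k+1) P_{k+1} / mean degree
  q <- q[q > 0]                               # 0 * log2(0) is taken as zero
  -sum(q * log2(q))
}

randomNetworkEntropy <- function(adj) {
  n <- nrow(adj)
  p <- sum(adj) / (n * (n - 1))               # link density of a symmetric matrix
  if (p == 0 || p == 1) return(0)
  -0.5 * n * (n - 1) * (p * log2(p) + (1 - p) * log2(1 - p))
}

# Made-up example: a star with one hub and three leaves
star <- matrix(0, 4, 4)
star[1, 2:4] <- 1
star[2:4, 1] <- 1
hStar <- remainingDegreeEntropy(star)   # 0.5 bits
eStar <- randomNetworkEntropy(star)     # 6 bits
```

For the star, P_1 = 3/4 and P_3 = 1/4, so the mean degree is 1.5 and the only non-zero term in the sum is q(2) = 0.5, which gives 0.5 bits. The density is 0.5, so each of the 6 possible links contributes one bit to the second measure.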
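Answer 1 (score: 1)
If you have a weighted graph, a good start would be to sort and count all the weights. Then you can use the formula -log(p)/log(2), that is -log2(p) (http://en.wikipedia.org/wiki/Binary_entropy_function), to determine the number of bits needed for the code. Maybe this doesn't work because it is the binary entropy function?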
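A minimal sketch of that idea in base R, assuming the weights take a small number of distinct values. The weight vector is made up, and the sketch treats -log2(p) as the code length of each distinct weight and averages it to get the Shannon entropy of the weight distribution; it says nothing about the graph's topology.

```r
weights <- c(1, 1, 2, 3, 3, 3, 3, 2)      # made-up edge weights
counts <- table(weights)                  # sorts the distinct weights and counts them
p <- as.numeric(counts) / length(weights) # relative frequency of each distinct weight
bitsPerWeight <- -log2(p)                 # code length in bits for each distinct weight
weightEntropy <- sum(p * bitsPerWeight)   # average bits per edge weight
```

Here the frequencies are 0.25, 0.25 and 0.5, so the code lengths are 2, 2 and 1 bits and the average is 1.5 bits per weight.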
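Answer 2 (score: 0)
You can use Koerner's entropy (= Shannon entropy applied to a graph). A good reference to the literature is here. Note, however, that the computation is in general NP-hard (for the stupid reason that you need to search over all subsets of vertices).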
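To give a feel for that search, here is a brute-force sketch in base R that enumerates the independent sets of a small graph by testing every non-empty vertex subset. It covers only the enumeration step, not Koerner's entropy itself, and the function name `independentSets` and the path graph are made up for this example.

```r
# Illustrative only: tests all 2^n - 1 non-empty subsets, so it is usable for tiny graphs only.
independentSets <- function(adj) {
  n <- nrow(adj)
  found <- list()
  for (mask in 1:(2^n - 1)) {
    # decode the bits of `mask` into a set of vertex indices
    members <- which((mask %/% 2^(0:(n - 1))) %% 2 == 1)
    # a subset is independent if no edge joins two of its members
    if (sum(adj[members, members]) == 0) {
      found[[length(found) + 1]] <- members
    }
  }
  found
}

# Made-up example: the path 1 - 2 - 3
path <- matrix(0, 3, 3)
path[1, 2] <- path[2, 1] <- 1
path[2, 3] <- path[3, 2] <- 1
sets <- independentSets(path)   # {1}, {2}, {1, 3}, {3}
```

The number of candidate subsets doubles with every added vertex, which is why this approach stops being practical well before graphs reach a few dozen nodes.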