Question

I am exploring the behavior of random graphs and want to estimate the distribution of the number of edges.

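import networkx as nx

# n (number of nodes) and p (edge probability) are assumed to be defined earlier
l = []
for _ in range(10000):
    g = nx.gnp_random_graph(n=n, p=p)
    l.append(g.number_of_edges())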

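I then build a histogram with np.histogram(l, bins=150), and the distribution looks binomial. However, I would like to apply a statistical test to check this.

I tried scipy.stats.binom_test, but it requires additional parameters. Given the histogram, how can I test which distribution the data follows?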

Answer 1

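binom_test can be used to check whether a single Erdos-Renyi graph has a plausible number of edges for its parameters. What you are asking about, however, is a whole series of results and whether they fit the expected distribution.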
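To make the single-graph case concrete, here is a minimal sketch of such a check. It uses scipy.stats.binomtest, which to my knowledge replaces binom_test in recent SciPy releases. The values of n and p and the observed edge count are illustrative assumptions, not taken from the question.

```python
from scipy.stats import binomtest

n, p = 100, 0.1              # illustrative graph parameters (assumed)
n_edge = n * (n - 1) // 2    # number of possible edges in an undirected graph
observed_edges = 510         # hypothetical edge count of one generated graph

# Two-sided exact binomial test: is 510 out of n_edge "successes"
# plausible when each possible edge appears with probability p?
result = binomtest(observed_edges, n_edge, p)
print("p-value: {:.3f}".format(result.pvalue))
```

A large p-value here only says that this one graph is unremarkable; it says nothing about the shape of the distribution over many graphs.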

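For that, you can use a goodness-of-fit test, which compares the empirical distribution against a hypothesized null distribution. There are various GoF tests, but you need one that works for discrete data. statsmodels implements a chi-square (χ²) goodness-of-fit test (docs).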

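We need a null hypothesis to use this kind of test, and here it is a binomial distribution: an E-R graph has n*(n-1)/2 possible edges, each of which is added independently with probability p. The distribution is binomial by construction, so all you are really doing is checking that the random number generator behaves properly.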
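The construction described above can be sketched in plain Python: flip one biased coin per node pair and count the successes. This is a hand-rolled stand-in for illustration only, not how networkx generates its graphs, and the seed, n, p, and number of repetitions are arbitrary assumptions.

```python
import random
from itertools import combinations

def sample_edge_count(n, p, rng):
    """Count the edges of one G(n, p) graph by testing every node pair once."""
    return sum(1 for _ in combinations(range(n), 2) if rng.random() < p)

rng = random.Random(0)       # fixed seed, chosen arbitrarily
n, p = 50, 0.1               # illustrative values (assumed)
n_edge = n * (n - 1) // 2    # number of node pairs, i.e. possible edges

counts = [sample_edge_count(n, p, rng) for _ in range(300)]
mean_count = sum(counts) / len(counts)

# The sample mean should land close to the binomial mean n_edge * p.
print(mean_count, n_edge * p)
```

Because each pair is an independent Bernoulli(p) trial, the count is a sum of n_edge such trials, which is exactly the definition of a Binom(n_edge, p) variable.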

In any case, our null model is that the number of edges is distributed as ~Binom(n_edge, p), with an expected number of edges of p * n*(n-1)/2.
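As a quick worked example of this null model, the sketch below computes its mean and spread for n = 100 and p = 0.1 (the values used in the test code of this answer). The variance formula n_edge * p * (1 - p) is the standard binomial variance; the variable names are my own.

```python
import math

n, p = 100, 0.1
n_edge = n * (n - 1) // 2             # possible edges in an undirected graph

mean_edges = n_edge * p               # expected edge count under Binom(n_edge, p)
var_edges = n_edge * p * (1 - p)      # binomial variance
std_edges = math.sqrt(var_edges)      # typical deviation from the mean

print(n_edge, mean_edges, round(std_edges, 2))
```

The standard deviation gives a rough feel for how wide the histogram of edge counts should be if the null model holds.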

Here is the code that applies the test.

import networkx as nx
from statsmodels.stats import gof
from scipy import stats

# generate some graphs and measure edge count
p = 0.1
n = 100
n_edge = n * (n - 1) // 2  # integer number of possible edges

l = []
for i in range(1000):
    g = nx.gnp_random_graph(n=n, p=p)
    l.append(g.number_of_edges())

# We use a chi-square goodness-of-fit test for the measured edge counts.
# Null hypothesis: the data l comes from the binomial distribution,
# so if pval > alpha we do not reject the null.

alpha = 0.001
chi2, pval, sig_test, msg = gof.gof_chisquare_discrete(stats.distributions.binom, (n_edge, p), l, alpha, msg="Binom ")
print(msg)
print("\tpval: {:.3f}".format(pval))
print("\tgood fit to binom(N, p)? {}".format(pval > alpha))

This produces output like the following:

 >>> chisquare - test for Binom at arg = (4950, 0.1) with pval = 0.8129512780114826 
 >>>    pval: 0.813
 >>>    good fit to binom(N, p)? True

So the test does not reject the null hypothesis: the edge counts are consistent with a binomial distribution.
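As a cross-check, the same kind of test can be sketched by hand with scipy.stats.chisquare. This is not the statsmodels implementation, only an illustration of the mechanics under stated assumptions: to stay self-contained the edge counts are drawn directly from the null model with NumPy instead of networkx, and the choice of 20 quantile-based bins is arbitrary. With real data you would pass your own list of edge counts in place of counts.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)        # fixed seed, chosen arbitrarily
n, p = 100, 0.1
n_edge = n * (n - 1) // 2

# Stand-in for the networkx loop (assumption): sample edge counts from the null.
counts = rng.binomial(n_edge, p, size=1000)

# Cut the support at quantiles of the null distribution so that every bin
# has a comfortably large expected count; half-integer edges avoid ties.
n_bins = 20
inner = stats.binom.ppf(np.arange(1, n_bins) / n_bins, n_edge, p)
edges = np.concatenate(([-0.5], np.unique(inner) + 0.5, [n_edge + 0.5]))

observed, _ = np.histogram(counts, bins=edges)
expected = len(counts) * np.diff(stats.binom.cdf(edges, n_edge, p))

# No parameters were estimated from the data, so the default ddof=0 applies.
chi2, pval = stats.chisquare(observed, expected)
print("chi2 = {:.2f}, pval = {:.3f}".format(chi2, pval))
```

Pooling the sparse tails into wider bins matters because the chi-square approximation is unreliable when expected counts are very small.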

Statistical test for a binomial distribution of random graph edges