Question

I have a network to which I fitted a power law using the igraph package:

plf = power.law.fit(degree_dist, implementation = "plfit")

The plf variable now holds the following values:

$continuous
[1] TRUE
$alpha
[1] 1.63975
$xmin
[1] 0.03
$logLik
[1] 4.037563
$KS.stat
[1] 0.1721117
$KS.p
[1] 0.9984284

The igraph manual explains these values:

xmin = the lower bound for fitting the power-law
alpha = the exponent of the fitted power-law distribution
logLik = the log-likelihood of the fitted parameters
KS.stat = the test statistic of a Kolmogorov-Smirnov test that compares the fitted distribution with the input vector. Smaller scores denote better fit
KS.p = the p-value of the Kolmogorov-Smirnov test. Small p-values (less than 0.05) indicate that the test rejected the hypothesis that the original data could have been drawn from the fitted power-law distribution

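I would like to run a goodness-of-fit test on this power-law fit, but I don't know how to do it. I have seen this question asked on online forums, but it usually goes unanswered.

I think one way to do this would be a chisq.test(x, y). One input (say x) would be the degree_dist variable (the observed degree distribution of the network). The other input (say y) would be the fitted power-law equation, which should have the form P(k) = mk^a.

I'm not sure whether this is a sensible approach. If it is, I need advice on how to construct the fitted power-law equation.

In case it helps, the degree_dist of my network is: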

 0.00 0.73 0.11 0.05 0.02 0.02 0.03 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00        0.01 0.00 0.00 0.00 0.01

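(These are the frequencies with which degrees 0-21 occur in the network. For example, 73% of the nodes have degree 1 and 1% of the nodes have degree 21.)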

*********** EDIT ***********

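I'm not sure whether using degree_dist to compute plf was a mistake. In case it was, I also ran the same function on the degrees of the 100 nodes in my network:

plf = power.law.fit(pure_deg, implementation = "plfit")

where pure_deg is: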

  21  7  5  6 17  3  6  6  2  5  4  3  7  4  3  2  2  2  2  3  2  3  2  2  2  2  2  1  1  1  1  1  1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 1

This gives the output:

$continuous
[1] FALSE
$alpha
[1] 2.362445
$xmin
[1] 1
$logLik
[1] -114.6303
$KS.stat
[1] 0.02293443
$KS.p
[1] 1
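
For a discrete fit like this one, the fitted curve is a normalised probability mass function, P(k) = k^(-alpha) / zeta(alpha, xmin) for k >= xmin, rather than a free-form m k^a. The following is a minimal base-R sketch of how it could be built from the alpha and xmin printed above and compared with the observed frequencies. The cutoff k_max used to approximate the normalising sum and the helper name fitted_pmf are assumptions made for illustration; they are not part of igraph.

```r
# Fitted parameters copied from the power.law.fit output above
alpha <- 2.362445
xmin  <- 1

# Normalising constant zeta(alpha, xmin), approximated by a truncated sum.
# k_max is an assumed cutoff chosen so that the neglected tail is tiny.
k_max <- 1e6
norm_const <- sum((xmin:k_max)^(-alpha))

# Hypothetical helper: fitted probability of degree k (valid for k >= xmin)
fitted_pmf <- function(k) k^(-alpha) / norm_const

# The same 100 degree values as pure_deg, written compactly
pure_deg <- rep(c(1, 2, 3, 4, 5, 6, 7, 17, 21),
                times = c(73, 11, 5, 2, 2, 3, 2, 1, 1))

k     <- 1:21
obs   <- tabulate(pure_deg, nbins = 21) / length(pure_deg)  # observed frequencies
p_fit <- fitted_pmf(k)                                      # fitted probabilities

# Largest gap between the observed and fitted cumulative distributions;
# this should be close to the KS.stat value reported above
ks <- max(abs(cumsum(obs) - cumsum(p_fit)))

print(round(data.frame(k, observed = obs, fitted = p_fit)[1:7, ], 3))
print(ks)
```

Comparing cumulative distributions avoids the very small expected counts in the tail, which would make a chi-squared comparison on these 100 nodes unreliable.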

Answer 1

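Colin Gillespie has written an R package called poweRlaw. The package is well documented and includes many examples showing how to use each function. It is very straightforward.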

http://cran.r-project.org/web/packages/poweRlaw/

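For example, as the documentation describes, the following R code reads data from the file full_path_of_file_name, estimates xmin and alpha, and computes the p-value proposed by Clauset et al. (2009):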

library("poweRlaw")

words = read.table(<full_path_of_file_name>)
m_plwords = displ$new(words$V1)         # discrete power law fitting
est_plwords = estimate_xmin(m_plwords)  # get xmin and alpha
m_plwords$setXmin(est_plwords)          # store the estimates in the model

# here we have the goodness-of-fit test p-value
# as proposed by Clauset et al. (2009)
bs_p = bootstrap_p(m_plwords)
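
The same steps could be applied to the node degrees from the question. This is a sketch only: it assumes the poweRlaw package is installed, and the no_of_sims, threads and seed values are illustrative choices rather than recommendations.

```r
library("poweRlaw")

# The same 100 degree values as pure_deg in the question, written compactly
pure_deg <- rep(c(1, 2, 3, 4, 5, 6, 7, 17, 21),
                times = c(73, 11, 5, 2, 2, 3, 2, 1, 1))

m_deg <- displ$new(pure_deg)    # discrete power-law model
est   <- estimate_xmin(m_deg)   # estimate xmin and alpha together
m_deg$setXmin(est)              # store the estimates in the model

# Bootstrap goodness-of-fit test. A small p-value (for example below 0.1)
# suggests the power law is a poor description of the data.
bs_p <- bootstrap_p(m_deg, no_of_sims = 200, threads = 1, seed = 1)
print(bs_p$p)
```

More simulations give a more precise p-value at the cost of a longer run time.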

Goodness-of-fit test for a power-law distribution in R
