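Sampling from a graph's degree distribution

Time: 2014-04-25 19:55:50

Tags: python numpy scipy

I have a simple, silly Python question. Given a graph, I'm trying to sample from a random variable whose distribution is the same as the graph's degree distribution.

This seems like it should be pretty straightforward, yet somehow I'm still managing to mess it up. My code looks like this: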

import numpy as np
import scipy as sp
import graph_tool.all as gt

G = gt.random_graph(500, deg_sampler=lambda: np.random.poisson(1), directed=False)
deg = gt.vertex_hist(G,"total",float_count=False)

# Extract counts and values
count = list(deg[0])
value = list(deg[1])

# Generate vector of probabilities for each node
p = [float(x)/sum(count) for x in count]

# Load into a random variable for sampling
x = sp.stats.rv_discrete(values=(value,p))
print x.rvs(1)

However, running it returns an error:

Traceback (most recent call last):
  File "temp.py", line 16, in <module>
    x = sp.stats.rv_discrete(values=(value,p))
  File "/usr/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 5637, in __init__
    self.pk = take(ravel(self.pk),indx, 0)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 103, in take
    return take(indices, axis, out, mode)
IndexError: index out of range for array

I'm not sure why this happens. If, in the code above, I instead write:

x = sp.stats.rv_discrete(values=(range(len(count)),p))

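then the code runs fine, but it gives a strange result: given the way I've specified this distribution, the value "0" should clearly be the most common. Yet this code returns "1" with high probability and never returns "0", so something is getting shifted somehow.

Can anyone clarify what is going on here? Any help would be greatly appreciated!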

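1 Answer:

Answer 0: (score: 3)

I believe the first argument to x.rvs() is the loc argument. By calling x.rvs(1) you are setting loc=1, which adds 1 to every value returned.

Instead, you want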

x.rvs(size=1)
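
To make the shift visible, here is a minimal sketch that compares the two calls on a small distribution. The values and probabilities are made up for illustration and do not come from the question's graph; the random_state keyword is used only to make the two draws comparable and assumes a reasonably recent SciPy.

```python
import numpy as np
from scipy import stats

# Hypothetical degree values and probabilities (illustration only)
values = [0, 1, 2, 3]
probs = [0.4, 0.3, 0.2, 0.1]
x = stats.rv_discrete(values=(values, probs))

# Positional 1 is interpreted as loc=1, so every sample is shifted up by one
shifted = x.rvs(1, size=1000, random_state=0)

# Passing size by keyword leaves loc at its default of 0
plain = x.rvs(size=1000, random_state=0)

print("with loc=1:", shifted.min(), "to", shifted.max())
print("with size only:", plain.min(), "to", plain.max())
```

With the positional call, 0 can never appear in the output, which matches the behaviour described in the question.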

As an aside, I'd suggest replacing this:

# Extract counts and values
count = list(deg[0])
value = list(deg[1])

# Generate vector of probabilities for each node
p = [float(x)/sum(count) for x in count]

with:

count, value = deg       # automatically unpacks along first axis
p = count.astype(float) / count.sum()  # count is an array, so you can divide all elements at once
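
For reference, here is a sketch of the whole pipeline without graph_tool, using a hypothetical Poisson degree sequence and np.bincount as a stand-in for the histogram. One thing worth checking in the original code: I believe gt.vertex_hist returns bin edges rather than bin values (treat this as an assumption), in which case value would have one more entry than p. That would be consistent with the IndexError in the question, and trimming the last edge (value[:-1]) would be needed before building the random variable.

```python
import numpy as np
from scipy import stats

# Stand-in for the graph: a hypothetical degree sequence (graph_tool is not used here)
rng = np.random.RandomState(0)
degrees = rng.poisson(1, size=500)

count = np.bincount(degrees)            # count[k] = number of vertices with degree k
value = np.arange(len(count))           # exactly one degree value per count
p = count.astype(float) / count.sum()   # empirical degree probabilities

# value and p must have the same length for rv_discrete
x = stats.rv_discrete(values=(value, p))
samples = x.rvs(size=1000, random_state=1)   # size passed by keyword, loc stays 0

print(np.bincount(samples, minlength=len(count)) / 1000.0)
print(p)
```

The two printed vectors should be close to each other, since the samples are drawn from the empirical degree distribution itself.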