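I can't tell whether I'm misusing PyMC's missing-data functionality or whether this is a bug. Imputing through a masked array passes the value 1e20 for the missing element, whereas the less efficient Impute approach seems to return correct samples. A small example follows.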
import numpy as np
import pymc as py

# Random 3x3 data with one missing entry (stored as NaN).
disasters_array = np.random.random((3, 3))
disasters_array[1, 1] = np.nan

# The inefficient way, using the Impute function:
D = py.Impute('D', py.Normal, disasters_array, mu=.5, tau=1E5)

# The efficient way, using masked arrays:
# Generate masked array. Where the mask is true,
# the value is taken as missing.
print(disasters_array)
masked_values = np.ma.masked_invalid(disasters_array)

# Pass masked array to data stochastic, and it does the right thing
disasters = py.Normal('disasters', mu=.5, tau=1E5, value=masked_values, observed=True)

# Deterministic used only to print both versions for comparison.
@py.deterministic
def test(disasters=disasters, D=D):
    print(D)
    print(disasters)

mcmc = py.MCMC(py.Model(set([test, disasters])))
Output:
Original matrix:
[[ 0.23507836 0.2024624 0.90518228]
[ 0.95816 **nan** 0.43145808]
[ 0.99566308 0.25431568 0.25464137]]
D with imputation:
[[array(0.23507836309832741) array(0.20246240248367342)
array(0.9051822818081371)]
[array(0.9581599997650212) **array(0.5005324083232756)**
array(0.43145807852698237)]
[array(0.9956630757864052) array(0.2543156788973996)
array(0.25464136701826867)]]
Masked array approach:
[[ 2.35078363e-01 2.02462402e-01 9.05182282e-01]
[ 9.58160000e-01 **1.00000000e+20** 4.31458079e-01]
[ 9.95663076e-01 2.54315679e-01 2.54641367e-01]]
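
For what it's worth, 1e20 is also NumPy's default fill value for floating-point masked arrays, so the number itself may simply be a placeholder for the masked entry rather than something PyMC sampled. The NumPy-only sketch below (no PyMC involved; the array contents and variable names are illustrative assumptions, not taken from the example above) shows where that value appears. It does not establish what PyMC does internally with the masked element.

```python
import numpy as np

# Illustrative 3x3 array with one missing entry (values are arbitrary).
data = np.array([[0.1, 0.2, 0.3],
                 [0.4, np.nan, 0.6],
                 [0.7, 0.8, 0.9]])

# Mask every non-finite entry, as in the masked-array approach above.
masked = np.ma.masked_invalid(data)

# The mask flags the NaN position as missing.
is_missing = bool(masked.mask[1, 1])

# Default fill value NumPy assigns to a float masked array.
default_fill = masked.fill_value

# filled() substitutes the fill value wherever the mask is True.
filled = masked.filled()

print(is_missing)
print(default_fill)
print(filled[1, 1])
```

If the 1e20 in the printed output comes from this fill value, the mask on the stochastic's value should still be True at position [1, 1], which would be one way to check whether the element is being treated as missing.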