我正在努力建立一个模型,其中Dirichlet变量的集中因子取决于另一个变量。
情况如下:
由于组件故障而导致系统故障(共有三个组件,每次测试/观察仅一个故障)。
组件故障的可能性取决于温度。
以下是对此情况的简短评论:
import numpy as np
import pymc as pm
# Temperature data : 3 cold temperatures and 3 warm temperatures
T_data = np.array([10, 12, 14, 80, 90, 95])
# Data of failures of 3 components : [0,0,1] means component 3 failed
F_data = np.array([[0, 0, 1], \
[0, 0, 1], \
[0, 0, 1], \
[1, 0, 0], \
[1, 0, 0], \
[1, 0, 0]])
n_component = 3
# When temperature is cold : Component 1 fails
# When temperature is warm : Component 3 fails
# Component 2 never fails
# Number of observations :
n_obs = len(F_data)
# The number of failures can be modeled as a Multinomial F ~ M(n_obs, p) with parameters
# - n_test : number of tests (Fixed)
# - p : probability of failure of each component (shape (n_obs, 3))
# The probability of failure of components follows a Dirichlet distribution p ~ Dir(alpha) with parameters:
# - alpha : concentration (shape (n_obs, 3))
# The Dirichlet distributions ensures the probabilities sum to 1
# The alpha parameters (and the the probability of failures) depend on the temperature alpha ~ a + b * T
# - a : bias term (shape (1,3))
# - b : describes temperature dependency of alpha (shape (1,3))
# The prior on "a" is a normal distributions with mean 1/2 and std 0.001
# a ~ N(1/2, 0.001)
# The prior on "b" is a normal distribution zith mean 0 and std 0.001
# b ~ N(0, 0.001)
# Coding it all with pymc
a = pm.Normal('a', 1/2, 1/(0.001**2), size = n_component)
b = pm.Normal('b', 0, 1/(0.001**2), size = n_component)
# I generate 3 alphas values (corresponding to the 3 components) for each of the 6 temperatures
@pm.deterministic
def alphas(t=T_data, a=a, b=b):
return a + np.outer(t, b)
# I think I should get 3 probabilities (corresponding to the 3 components) for each of the 6 temperatures
# Why does pm.Dirichlet return a (1,3) array instead of (6,3) ?
p = pm.Dirichlet('p', alphas)
# Multinomial is observed and take values from F_data
F = pm.Multinomial('F', 1, p, value = F_data, observed = True)
# Running MCMC
M = pm.MCMC([a, b, alphas, p, F])
M.sample(iter = 10000, burn = 1000)
我遇到的问题如下:
检查结果:
print('Shape of a :', a.value.shape)
print('Shape of b :',b.value.shape)
print('Shape of alpha :',alphas.value.shape)
print('Shape of p :',p.value.shape)
print(' ')
print('Value of b :',b.trace[:].mean(axis = 0))
# The model clearly doesn't find the temperature dependency of the failure probability
# I would expect for b something like
# [positive value, close to zero value, negative value]
a的形状:(3,)
b的形状:(3,)
alpha形状:(6,3)
p的形状:(2,)
b的值:[1.11563624e-05 -1.86813307e-05 9.28859752e-05]
有什么建议吗?