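I am trying to estimate an item response theory (IRT) model in Python. More specifically, take the typical IRT example of students taking a test. For each student, we observe whether they gave the correct answer to each question they answered on the test. This gives us a matrix of observed outcomes X, from which we want to estimate, for each question, (1) a difficulty parameter α and (2) a discrimination parameter β, so that we can also estimate each student's latent ability Y as a function of whether they got each test question right, i.e. α + βX. The best example I can find of how to estimate this kind of Bayesian IRT model with MCMC in Python is this example. What I don't understand from the example is where the X matrix, which records whether a student got each test question right, enters the model. Here is a slightly modified version of that code, intended to estimate the latent ability of each student: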
#from pylab import * #Pylab will not install with pip so I just loaded numpy itself
from numpy import *
import numpy
from pymc import *
from pymc.Matplot import plot as mplot
numquestions = 300 # number of test items being simulated
numpeople = 10 # number of participants
numthetas = 1 # number of latent proficiency variables
generating = 0
theta_initial = zeros((numthetas, numpeople))
correctness = np.random.randint(2, size= numquestions * numpeople) == 1 #Produces Error
#correctness = np.random.randint(2, size= numquestions * numpeople) == -1 #all False code runs fine
#correctness = np.random.randint(2, size= numquestions * numpeople) != -1 #all True code throws error message
correctness.shape = (numquestions, numpeople)
# theta (proficiency params) are sampled from a normal distribution
theta = Normal("theta", mu=0, tau=1, value=theta_initial, observed= generating)
# question-parameters (IRT params) are sampled from normal distributions (though others were tried)
a = Normal("a", mu=1, tau=1, value=[[0.0] * numthetas] * numquestions)
# a = Exponential("a", beta=0.01, value=[[0.0] * numthetas] * numquestions)
b = Normal("b", mu=0, tau=1, value=[0.0] * numquestions)
# take vectors theta/a/b, return a vector of probabilities of each person getting each question correct
@deterministic
def sigmoid(theta=theta, a=a, b=b):
    bs = repeat(reshape(b, (len(b), 1)), numpeople, 1)
    return np.zeros_like(1.0 / (1.0 + exp(bs - dot(a, theta)))) #np.zeros_like fixes error
# take the probabilities coming out of the sigmoid, and flip weighted coins
correct = Bernoulli('correct', p=sigmoid, value=correctness, observed=not generating)
# create a pymc simulation object, including all the above variables
m = MCMC([a,b,theta,sigmoid,correct])
# run an interactive MCMC sampling session
m.isample(iter=20000, burn=15000)
mydict = m.stats()
print(mydict['theta']['mean']) #Get ability parameters for each student
When I run the script, I get this error message:
pymc.Node.ZeroProbability: Stochastic correct's value is outside its support,
or it forbids its parents' current values.
The traceback points to this line:
correct = Bernoulli('correct', p=sigmoid, value=correctness, observed=not generating)
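I checked the original script (which switches between generating outcomes from latent values and computing latent values from outcomes), and there the correctness variable, which I take to be the X matrix of test results described above, is filled with False values. When I set correctness to all False values, the script runs to completion. However, that seems to say that every student got every question wrong, which would not make much sense. I thought the matrix might instead mark the correct answers to the questions, so I set every value in correctness to True, but that produces the same error. What am I doing wrong? How can I use an IRT model in pymc to estimate latent ability from an X matrix of whether students got the test questions right?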
Answer 0 (score: 6)
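You have been bitten by a sneaky part of Python. The global import of pymc replaced numpy's exp with a different exp. To get the exp you want, you can use np.exp in your sigmoid deterministic. (Where did the np. come from, I wonder?)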
    return 1.0 / (1.0 + np.exp(bs - dot(a, theta)))
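The shadowing itself can be reproduced without pymc. The following is a minimal sketch that uses only the standard library, with math and cmath standing in for numpy and pymc (an assumption made purely for illustration). The star imports are run through exec into a dictionary called namespace, a hypothetical name, so that the rebinding can be inspected directly.

```python
import math
import cmath

# Stand-in demonstration: math and cmath play the roles of numpy and pymc.
# Running each star import into a plain dict lets us look at what the name
# "exp" is bound to after each step.
namespace = {}

exec("from math import *", namespace)    # first star import binds exp
exp_after_first = namespace["exp"]

exec("from cmath import *", namespace)   # second star import rebinds exp
exp_after_second = namespace["exp"]

# The bare name silently changed meaning; the namespaced names did not.
print("after first import, exp is math.exp:  ", exp_after_first is math.exp)
print("after second import, exp is cmath.exp:", exp_after_second is cmath.exp)
```

No warning is raised at any point, which is why this kind of bug is easy to miss in a script that opens with several star imports.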
It looks like you still have some debugging to do, but I hope this gets you unstuck. This is a great example of why I favor this pattern:
import numpy as np, pymc as pm
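With namespaced imports, the response probability and the role of the observed matrix can be sketched in plain NumPy. This is an illustration, not the pymc model: the function names response_probability and bernoulli_loglike, the small array sizes, and the random seed are all assumptions. The sketch shows that the outcome matrix enters only through the Bernoulli likelihood. It also shows that an all-zero probability matrix, which is what np.zeros_like returns, is compatible only with all-False outcomes. Any True entry then has log-likelihood minus infinity, which is consistent with the ZeroProbability error in the question.

```python
import numpy as np


def response_probability(theta, a, b):
    """P(correct) for every (question, person) pair under a logistic model.

    theta: (numthetas, numpeople), a: (numquestions, numthetas),
    b: (numquestions,). Returns an array of shape (numquestions, numpeople).
    """
    return 1.0 / (1.0 + np.exp(b[:, None] - np.dot(a, theta)))


def bernoulli_loglike(correctness, p):
    """Log-likelihood of a boolean outcome matrix given success probabilities."""
    with np.errstate(divide="ignore"):  # log(0) is allowed to produce -inf
        return np.sum(np.where(correctness, np.log(p), np.log1p(-p)))


rng = np.random.default_rng(0)  # assumed seed, for reproducibility only
numquestions, numpeople, numthetas = 5, 4, 1

theta = rng.normal(size=(numthetas, numpeople))  # latent abilities
a = np.ones((numquestions, numthetas))           # discrimination parameters
b = np.zeros(numquestions)                       # difficulty parameters

p = response_probability(theta, a, b)
correctness = rng.random(p.shape) < p  # simulated observed outcome matrix

# With probabilities strictly between 0 and 1, any outcome matrix is possible.
finite_ll = bernoulli_loglike(correctness, p)

# With all-zero probabilities, only the all-False matrix is possible.
zero_p = np.zeros_like(p)
all_false = np.zeros_like(correctness)
ll_all_false = bernoulli_loglike(all_false, zero_p)
ll_all_true = bernoulli_loglike(~all_false, zero_p)

print(np.isfinite(finite_ll), ll_all_false, ll_all_true)
```

In the pymc script, the analogous step is the Bernoulli node that receives the outcome matrix through value=correctness with observed set to true.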