我正在尝试在Stan中实现层次混合模型,该模型描述了任务的性能如何随时间变化。在模型中(参见下面的代码),假设有三个较低级别的参数是从两个法线的混合中提取的(dperf_int,dperf_sd和sf)。为了实现这些,我只是假设这些较低级别参数的先验是混合。还有两个参数(perf1_int和perf1_sd)仅在组级别进行估算。对于这些,我假设观察本身是从正常的混合物中提取的。
data {
int<lower=0> Nsubj; //Number of subjects (60)
int<lower=0> Nobs; //Number of observations per subject (10)
matrix[Nsubj,Nobs] perf; //Matrix of performance observations
}
parameters {
vector<lower=0,upper=1>[Nsubj] mixing_prop; // mixing proportion for each subject
ordered[2] perf1_int; //mean performance at time 1 (estimated at group level only)
vector<lower=0>[2] perf1_sd; //sd of performance at time 1 (estimated at group level only)
ordered[2] dperf_int_mean; //mean of performance change intercept for each mixture
vector<lower=0>[2] dperf_int_sd; //sd of performance change intercept for each mixture
vector<lower=0>[2] dperf_sd_mean; //mean of performance change sd
vector<lower=0>[2] dperf_sd_sd; //sd of performance change sd
vector[2] sf_mean; //mean self-feedback for each mixture
vector<lower=0>[2] sf_sd; //sd of self-feedback for each mixture
//subject level parameters
vector[Nsubj] dperf_int;
vector<lower=0>[Nsubj] dperf_sd;
vector[Nsubj] sf;
}
model {
real perf_change;
for(i in 1:Nsubj){
vector[4] increments;
increments[1]=log_mix(mixing_prop[i],
normal_lpdf(dperf_int[i] | dperf_int_mean[1], dperf_int_sd[1]),
normal_lpdf(dperf_int[i] | dperf_int_mean[2], dperf_int_sd[2]));
increments[2]=log_mix(mixing_prop[i],
normal_lpdf(dperf_sd[i] | dperf_sd_mean[1], dperf_sd_sd[1]),
normal_lpdf(dperf_sd[i] | dperf_sd_mean[2], dperf_sd_sd[2]));
increments[3]=log_mix(mixing_prop[i],
normal_lpdf(sf[i] | sf_mean[1], sf_sd[1]),
normal_lpdf(sf[i] | sf_mean[2], sf_sd[2]));
increments[4]=log_mix(mixing_prop[i],
normal_lpdf(perf[i,1] | perf1_int[1], perf1_sd[1]),
normal_lpdf(perf[i,1] | perf1_int[2], perf1_sd[2]));
target+= log_sum_exp(increments);
for(j in 2:Nobs){
perf_change = dperf_int[i] + perf[i,j-1]*sf[i];
perf[i,j] ~ normal( perf[i,j-1] + perf_change, dperf_sd[i]);
}
}
}
我很确定我已经正确地实现了组级参数的混合,因为我刚刚完成了Stan手册中的内容。但是,我想检查一下我是否正确完成了模型的层次混合组件。该模型效率很低。我在4000次迭代中运行了4个链,并且许多参数都具有&lt; 10 n_eff。所以我怀疑我错过了什么。
以下是用于运行模型的代码:
library(tidyverse)
library(rstan)
#load data
load(file="wide_data.RData")
#prepare data for stan
model_data = list(
Nsubj = dim(wide_data)[1],
Nobs = 10,
perf = as.matrix(select(wide_data ,perf1:perf10))
)
fit = stan(file="stan_univariate_hierarchical_mixture.stan",
data=model_data,cores=4,
thin=10,
warmup=1000,
chains=4,
iter=4000)
......数据在这里:https://www.dropbox.com/s/eqzw1lou6uba8i3/wide_data.RData?dl=0
非常感谢任何帮助!