Question

我是PK建模和pymc3的新手，但我一直在玩pymc3并试图将一个简单的PK模型作为我自己学习的一部分。特别是一个捕捉这种关系的模型......

其中C（t）（Cpred）在时间t是浓度，剂量是给定的剂量，V是分布的体积，CL是清除。

我已经生成了一些测试数据（30个受试者），其中CL = 2，V = 10，3个剂量100,200,300，并在时间点0,1,2,4,8,12生成数据，还包括一些CL上的随机误差（正态分布，0均值，ω= 0.6）和残余未解释误差DV = Cpred + sigma，其中sigma正态分布为SD = 0.33。此外，我已经包括关于重量的C和V的变换（均匀分布50-90）CLi = CL * WT / 70; Vi = V * WT / 70。

# Create Data for modelling
np.random.seed(0)
# Subject ID's
data = pd.DataFrame(np.arange(1,31), columns=['subject'])
# Dose
Data['dose'] = np.array([100,100,100,100,100,100,100,100,100,100,
                    200,200,200,200,200,200,200,200,200,200,
                    300,300,300,300,300,300,300,300,300,300])
# Random Body Weight
data['WT'] = np.random.randint(50,100, size =30)
# Fixed Clearance and Volume for the population
data['CLpop'] =2
data['Vpop']=10
# Error rate for individual clearance rate
OMEGA = 0.66
# Individual clearance rate as a function of weight and omega
data['CLi'] = data['CLpop']*(data['WT']/70)+ np.random.normal(0, OMEGA )
# Individual Volume as a function of weight
data['Vi'] = data['Vpop']*(data['WT']/70) 

# Expand dataframe to account for time points
data = pd.concat([data]*6,ignore_index=True )
data = data.sort('subject')
# Add in time points
data['time'] = np.tile(np.array([0,1,2,4,8,12]), 30)

# Create concentration values using equation
data['Cpred'] = data['dose']/data['Vi'] *np.exp(-1*data['CLi']/data['Vi']*data['time'])
# Error rate for DV
SIGMA = 0.33
# Create Dependenet Variable from Cpred + error
data['DV']= data['Cpred'] + np.random.normal(0, SIGMA )

# Create new df with only data for modelling...
df = data[['subject','dose','WT', 'time', 'DV']]

为模型创建阵列...

# Prepare data from df to model specific arrays
time = np.array(df['time'])
dose = np.array(df['dose'])
DV = np.array(df['DV'])
WT = np.array(df['WT'])
n_patients = len(data['subject'].unique())
subject = data['subject'].values-1

我在pymc3中构建了一个简单的模型....

pk_model = Model()

with pk_model:
    # Hyperparameter Priors     
    sigma = Lognormal('sigma', mu =0, tau=0.01)

    V = Lognormal('V', mu =2, tau=0.01)
    CL = Lognormal('CL', mu =1, tau=0.01)    

    # Transformation wrt to weight
    CLi = CL*(WT)/70    
    Vi = V*(WT)/70

    # Expected value of outcome
    pred = dose/Vi*np.exp(-1*(CLi/Vi)*time)

    # Likelihood (sampling distribution) of observations
    conc = Normal('conc', mu =pred, tau=sigma, observed = DV)

我的期望是我应该能够从数据中解析最初用于生成数据的常量和错误率，尽管我无法做到这一点，尽管我可以接近。在这个例子中......

data['CLi'].mean()
> 2.322473543135788
data['Vi'].mean()
> 10.147619047619049

跟踪显示......

所以我的问题是......

我的代码是否正确构建，是否有任何我忽略的明显错误可能会导致这种差异？
我可以构建pymc3模型以更好地反映我生成数据的关系吗？
您对改进模型的建议是什么？

提前致谢！

Answer 1

我要回答我自己的问题！

但是我按照这里的例子实现了一个层级模型......

GLM -hierarchical

它有效。另外，我注意到我在数据框中应用错误的方式存在错误 - 应该使用

data['CLer'] = np.random.normal(scale=OMEGA, size=30)

确保每个主题具有不同的错误值

PyMC3 PK建模。模型无法解析用于创建数据集的参数

1 个答案: