Question

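There is a great post on interpreting the output of predict.coxph(). However, when I compare the output of predict.coxph, the output of simPH, and the formula for relative risk, I keep getting different results. Since my hypothesis involves a quadratic effect, my example includes a polynomial of degree 2.

I use the example from this post.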

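library(survival)  # coxph(), Surv() and the lung data set
library(ggplot2)   # ggplot(), geom_smooth()
library(simPH)     # coxsimPoly(), simGG()

data("lung")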

Predicting the relative risk with predict()

# Defining the quadratic predictor
lung$meal.cal_q <- lung$meal.cal^2

# conduct a cox regression with the predictor meal.cal, its quadratic version and some covariates.
cox_mod <- coxph(Surv(time, status) ~
                 ph.karno + pat.karno + meal.cal + meal.cal_q,
                 data = lung)

# a vector of meal.cal values to predict for
meal.cal_new <- seq(min(lung$meal.cal, na.rm = TRUE),
                    max(lung$meal.cal, na.rm = TRUE), by = 1)

# the same values squared, for the quadratic effect
meal.cal_q_new <- meal.cal_new^2

# the length of the vector with the values to predict for
n <- length(meal.cal_new)

# a dataframe with all the values to predict for
lung_new <- data.frame(ph.karno= rep(mean(lung$ph.karno, na.rm= TRUE), n), 
                       pat.karno= rep(mean(lung$pat.karno, na.rm= TRUE), n), 
                       meal.cal= meal.cal_new, 
                       meal.cal_q = meal.cal_q_new)

# predict the relative risk
lung_new$rel_risk <- predict(cox_mod, lung_new,  type= "risk")

Predicting the relative risk with the formula (see the post mentioned above)

# Defining the quadratic predictor
lung$meal.cal_q <- lung$meal.cal^2

# run a cox regression with the predictor meal.cal, its quadratic version and some covariates.
cox_mod <- coxph(Surv(time, status) ~
               ph.karno + pat.karno + meal.cal + meal.cal_q,
             data = lung)

# a vector of meal.cal values to predict for
meal.cal_new <- seq(min(lung$meal.cal, na.rm = TRUE),
                    max(lung$meal.cal, na.rm = TRUE), by = 1)

# the same values squared, for the quadratic effect
meal.cal_q_new <- meal.cal_new^2

# length of the vector to predict for
n <- length(meal.cal_new)

# A dataframe with the values to make the prediction for
lung_new2 <- data.frame(
             ph.karno= rep(mean(lung$ph.karno, na.rm= TRUE), n), 
             pat.karno= rep(mean(lung$pat.karno, na.rm= TRUE), n), 
             meal.cal= meal.cal_new, 
             meal.cal_q = meal.cal_q_new)

# A dataframe with the values to compare the prediction with
lung_new_mean <- data.frame(
                 ph.karno= rep(mean(lung$ph.karno, na.rm= TRUE), n), 
                 pat.karno= rep(mean(lung$pat.karno, na.rm= TRUE), n), 
                 meal.cal= rep(mean(lung$meal.cal, na.rm= TRUE), n), 
                 meal.cal_q = rep(mean(lung$meal.cal_q, na.rm= TRUE), n))

# extract the coefficients
coefCPH <- coef(cox_mod)

# make the prediction for the values of interest
cox_risk <-
exp(coefCPH["ph.karno"] * lung_new2[ , "ph.karno"] +
    coefCPH["pat.karno"] * lung_new2[ , "pat.karno"] +
    coefCPH["meal.cal"] * lung_new2[ , "meal.cal"] +
    coefCPH["meal.cal_q"] * lung_new2[ , "meal.cal_q"])

# make the predictions for the values to compare with
cox_risk_mean <-
exp(coefCPH["ph.karno"] * lung_new_mean[ , "ph.karno"] +
    coefCPH["pat.karno"] * lung_new_mean[ , "pat.karno"] +
    coefCPH["meal.cal"] * lung_new_mean[ , "meal.cal"] +
    coefCPH["meal.cal_q"] * lung_new_mean[ , "meal.cal_q"])

# calculate the relative risk
lung_new2$rel_risk <- unlist(cox_risk)/ unlist(cox_risk_mean)
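
Written out, the calculation above is meant to be the following ratio. The notation is my own and not taken from the post: $\hat\beta$ is the coefficient vector from cox_mod, $x$ is a row of lung_new2, and $\bar{x}$ is the corresponding row of lung_new_mean.

```latex
\mathrm{RR}(x)
  = \frac{\exp\bigl(\hat\beta^{\top} x\bigr)}{\exp\bigl(\hat\beta^{\top} \bar{x}\bigr)}
  = \exp\bigl(\hat\beta^{\top} (x - \bar{x})\bigr)
```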

Plotting the relative risk predicted with predict() together with the one predicted with the formula:

ggplot(lung_new, aes(meal.cal, rel_risk)) +
       geom_smooth() +
       geom_smooth(data= lung_new2, col= "red")

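The plot shows that the two predictions differ. I don't understand why, because the post mentioned above suggests that predict() and the formula should give the same result.

Because of this confusion, I tried to solve the problem with the simPH package. Here is what I did: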

# Defining the quadratic predictor
lung$meal.cal_q <- lung$meal.cal^2

# run a cox regression with the predictor, its quadratic version and some covariates.

cox_mod <- coxph(Surv(time, status) ~
                 ph.karno + pat.karno + meal.cal + meal.cal_q,
                 data = lung)

# a vector of meal.cal values to predict for
meal.cal_new <- seq(min(lung$meal.cal, na.rm= TRUE),
                    max(lung$meal.cal, na.rm= TRUE), by= 1)

# length of the vector to predict for
n <- length(meal.cal_new)

# A vector with the values to compare the prediction with
meal.cal_new_mean <- rep(mean(lung$meal.cal, na.rm= TRUE), n)

# running 100 simulations per predictor value with coxsimPoly
Sim <- coxsimPoly(obj= cox_mod, b = "meal.cal", pow = 2,
                  qi = "Relative Hazard",
                  Xj = meal.cal_new,
                  Xl = meal.cal_new_mean,
                  ci = .95,
                  nsim = 100,
                  extremesDrop = FALSE)

# plot the result
simGG(Sim)

This gives an empty plot with the following warnings:

Warning messages:
1: In min(obj$sims[, x]) : no non-missing arguments to min; returning Inf
2: In max(obj$sims[, x]) : no non-missing arguments to max; returning -Inf

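The Sim$sims object is indeed empty.

My questions are:

Why do the results of predict() and of the formula differ?
Why does the simPH package not compute the relative risk?
Which approach should I choose? My hypothesis is a quadratic effect in a Cox regression, as in the example, and I need to plot the relative risk (relative to the predictor being at its mean) against that predictor.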

Answer 1

Quick answer to the simPH problem: the polynomial term needs to be specified inside the coxph call with I(), for example:

cox_mod <- coxph(Surv(time, status) ~
                 ph.karno + pat.karno + meal.cal + I(meal.cal^2),
                 data = lung)

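(The error handling for this use case is very poor.)

Using this modification (and 1000 simulations) in the code above should return the following:

Differences between simPH and `predict`

My guess about the difference is that simPH does not build confidence intervals around a transformed point estimate the way predict does. It draws simulations from the multivariate normal distribution specified by the fitted model and then shows the central 50% and 95% of that simulated distribution. The center line is simply the median of the simulations. This is clearly a different logic from predict. For a very non-monotonic quantity of interest like this one, the predict point estimates give highly misleading results compared with simPH. Based on 4 observations, there is little evidence for a curve of this form.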
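
To make that logic concrete, here is a rough sketch of the simulation idea. It is not simPH's actual code. It assumes MASS::mvrnorm for the draws, and the names draws, xj, xl, rel_haz and sim_summary are my own. It uses a coarser grid (steps of 50) than the question to keep it quick.

```r
library(survival)  # coxph(), Surv() and the lung data set
library(MASS)      # mvrnorm()

set.seed(1)  # make the illustrative draws reproducible

# Fit the model with the polynomial term written via I(), as above
cox_mod <- coxph(Surv(time, status) ~
                 ph.karno + pat.karno + meal.cal + I(meal.cal^2),
                 data = lung)

# Draw coefficient vectors from the multivariate normal implied by the fit
nsim  <- 1000
draws <- mvrnorm(nsim, mu = coef(cox_mod), Sigma = vcov(cox_mod))

# Values to evaluate and the reference value (mean of meal.cal)
xj <- seq(min(lung$meal.cal, na.rm = TRUE),
          max(lung$meal.cal, na.rm = TRUE), by = 50)
xl <- mean(lung$meal.cal, na.rm = TRUE)

# Relative hazard per draw: exp(b1 * (xj - xl) + b2 * (xj^2 - xl^2))
b1 <- draws[, "meal.cal"]
b2 <- draws[, "I(meal.cal^2)"]
rel_haz <- exp(outer(b1, xj - xl) + outer(b2, xj^2 - xl^2))  # nsim x length(xj)

# Median and central 50% / 95% bands of the simulations for each xj
bands <- t(apply(rel_haz, 2, quantile,
                 probs = c(0.025, 0.25, 0.5, 0.75, 0.975)))
sim_summary <- data.frame(meal.cal = xj, bands, check.names = FALSE)
```

The "50%" column of sim_summary plays the role of the center line, and the outer columns the 50% and 95% bands. None of these are built around a single transformed point estimate, which is the difference from predict described above.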
