Question

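I am trying to get the summary statistics (summary()) for a linear model (below) that uses 1000 permutations of the original dataset to create 1000 random datasets (a large matrix).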

random_model <- rep(NA,1000)

for (i in c(1:1000)) {
  random_data <- final_data
  random_data$weighted_degree <- rowSums(node.perm_1000[i,,],na.rm=T)
  random_model[i] <- coef(lm(weighted_degree ~ age + sex + age*sex, data=random_data))
}

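I am not simply trying to compare models to get an overall p-value; I would like to get a t value for each variable in the model, again using the random permutations.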

Answer 1

Try tidy() from the broom package. It returns the desired values like this (example):

# A tibble: 2 x 5
  term             estimate std.error statistic  p.value
  <chr>               <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)         6.53      0.479     13.6  6.47e-28
2 iris$Sepal.Width   -0.223     0.155     -1.44 1.52e- 1

In your case, following your definitions, the output shown above is stored as one element of a list on each pass through the loop:

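library(broom)
# Storage: a list, because each element holds a whole tibble
random_model <- vector("list", 1000)
# Loop
for (i in seq_len(1000)) {
  random_data <- final_data
  random_data$weighted_degree <- rowSums(node.perm_1000[i,,], na.rm = TRUE)
  # [[i]] stores the full tidy() table; [i] on an NA vector would not
  random_model[[i]] <- broom::tidy(lm(weighted_degree ~ age + sex + age*sex, data = random_data))
}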
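
Once the loop has run, the per-permutation tables can be stacked into one data frame so that the t values (the statistic column) are easy to compare across permutations. The following is only a minimal sketch under stated assumptions: final_data and node.perm_1000 are not available here, so mtcars with a shuffled response stands in for the permuted data, and the names n_perm, tidy_list and all_terms are made up for illustration. It also assumes broom is installed.

```r
library(broom)  # assumed to be installed

set.seed(42)    # arbitrary seed, only for reproducibility
n_perm <- 5     # small number of permutations for illustration
tidy_list <- vector("list", n_perm)

for (i in seq_len(n_perm)) {
  perm_data <- mtcars
  # Stand-in for the permuted response: shuffle mpg across rows
  perm_data$mpg <- sample(perm_data$mpg)
  fit <- lm(mpg ~ cyl * disp, data = perm_data)
  # Tag each tidy() table with the permutation it came from
  tidy_list[[i]] <- cbind(permutation = i, as.data.frame(tidy(fit)))
}

# One row per term and permutation; "statistic" holds the t value
all_terms <- do.call(rbind, tidy_list)
head(all_terms)
```

Filtering all_terms by term then gives the t values of a single variable across all permutations.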

Answer 2

You should store the results of interest (estimated coefficients and t values) in a list.

Here is a reproducible example on the mtcars dataset using 10 replications, each sampling 50% of the rows.

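The results of interest are retrieved with the $coefficients attribute of the output of summary() on the lm object.

After the loop below has run, the models fitted for replication 1 and replication 10, for example, are available as random_model[[1]] and random_model[[10]].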

# The data
data(mtcars)

# Define sample size of each replication
N <- nrow(mtcars)
sample_size <- floor(N/2)

# Number of replications (model fits) and initialization of the list to store the results 
set.seed(1717)
replications <- 10
random_model <- vector( "list", length=replications )
for (i in seq_along(random_model)) {
  shuffle = sample(N, sample_size)
  mtcars_shuffle = mtcars[shuffle, ]
  random_model[[i]] <- summary(lm(mpg ~ cyl + disp + cyl*disp, data=mtcars_shuffle))$coefficients
}
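
Since every element of the list is a coefficient matrix with a "t value" column, the t values can be collected into a single matrix. This is a sketch in base R only; it repeats the loop above so that it runs on its own, and the names t_values and rep1 ... rep10 are just illustrative choices.

```r
data(mtcars)
set.seed(1717)
N <- nrow(mtcars)
sample_size <- floor(N / 2)
replications <- 10

# Same loop as above: one coefficient matrix per replication
random_model <- vector("list", length = replications)
for (i in seq_along(random_model)) {
  rows <- sample(N, sample_size)
  fit <- lm(mpg ~ cyl + disp + cyl*disp, data = mtcars[rows, ])
  random_model[[i]] <- summary(fit)$coefficients
}

# Coefficient table of the first replication
random_model[[1]]

# Pull the "t value" column out of every table:
# terms end up in rows, replications in columns
t_values <- sapply(random_model, function(m) m[, "t value"])
colnames(t_values) <- paste0("rep", seq_len(replications))
t_values
```

A row such as t_values["cyl", ] then holds the t values of one term across all replications.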

Getting summary statistics from a model using permuted values in R
