I am analyzing a study with 40 individuals, each of whom rated 10 vignettes.
indiv vign score score2 gender
1 1 5 3 1
1 2 2 4 1
1 3 8 1 1
. . . . .
. . . . .
. . . . .
39 10 9 1 1
40 8 1 5 0
40 9 3 8 0
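I want to bootstrap, but I quickly realized that sampling vignettes makes no sense; we should sample individuals instead (so we take all of the roughly 10 rows belonging to each sampled person).
The following function works, but it is the bottleneck of the function that calls it. The question is: how can this be done more efficiently?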
ResampleMultilevel <- function(data, groupvar) {
  n <- length(unique(data[, groupvar]))
  index <- sample(data[, groupvar], n, replace = TRUE)
  resampled <- NULL  # one of the issues is that we do not know
                     # the size of the matrix yet, since it may vary.
  for (i in 1:n) {
    resampled <- rbind(resampled, data[data[, groupvar] == index[i], ])
  }
  return(resampled)
}
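One possible way to speed this up, offered as a sketch rather than a definitive answer: compute each group's row numbers once with split(), draw group ids, and index the data a single time instead of growing the result with rbind() inside a loop. The name ResampleMultilevelFast and the toy data frame d are assumptions made for illustration. Note that this version draws from the unique ids, so every group is equally likely even when group sizes differ.

```r
# Hypothetical faster variant; not part of the original post.
ResampleMultilevelFast <- function(data, groupvar) {
  ids <- unique(data[, groupvar])
  # Row numbers belonging to each group, computed once up front.
  rows_by_group <- split(seq_len(nrow(data)), data[, groupvar])
  # Draw whole groups with replacement.
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  # Look groups up by name; a group drawn twice contributes its rows twice.
  rows <- unlist(rows_by_group[as.character(drawn)], use.names = FALSE)
  data[rows, , drop = FALSE]
}

# Toy data shaped like the study: 40 individuals, 10 vignettes each.
set.seed(1)
d <- data.frame(indiv = rep(1:40, each = 10),
                vign  = rep(1:10, times = 40),
                score = rnorm(400))
boot <- ResampleMultilevelFast(d, "indiv")
```

With balanced data like d, boot always has 400 rows, and any individual drawn k times appears with 10 * k rows.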
The problem with subsetting is that I cannot find a way to keep the duplicates.
a <- cbind(rep(1:40, each = 10), rep(1:10, 4), rnorm(40), rnorm(40))
index <- c(1,1)
subset(a, a[,1] == index)
Answer 0 (score: 0)
Based on the comments, I am revising my answer.
a <- cbind(rep(1:40, each = 10), rep(1:10, 4), rnorm(40), rnorm(40))
index <- c(1, 1, 3, 4, 2)
a[a[, 1] %in% index, ]
## [,1] [,2] [,3] [,4]
## [1,] 1 1 0.28135473 0.47970116
## [2,] 1 2 -0.12628982 0.34862899
## [3,] 1 3 -0.41140740 1.30204100
## [4,] 1 4 -0.61163593 -1.13354157
## [5,] 1 5 -0.31538238 1.42701315
## [6,] 1 6 -0.20403098 2.13989392
## [7,] 1 7 0.37681973 0.65843232
## [8,] 1 8 -0.94062165 0.97246212
## [9,] 1 9 0.63377352 -0.48948273
## [10,] 1 10 -0.39817929 -1.03607028
## [11,] 2 1 0.54866153 -0.55127459
## [12,] 2 2 0.08410140 0.01457366
## [13,] 2 3 -1.19006851 1.33213116
## [14,] 2 4 -0.47210092 0.83369309
## [15,] 2 5 0.75968678 -0.48212390
## [16,] 2 6 -1.00205770 0.56376027
## [17,] 2 7 0.67251644 0.07234657
## [18,] 2 8 0.73165780 -0.51483172
## [19,] 2 9 -0.26022238 2.33181762
## [20,] 2 10 0.03370091 -0.71427295
## [21,] 3 1 0.60810461 0.15054307
## [22,] 3 2 -1.29363706 1.30510127
## [23,] 3 3 -0.20479713 -2.39797975
## [24,] 3 4 -0.86927664 -0.10845738
## [25,] 3 5 0.89040130 -0.08459249
## [26,] 3 6 -0.21511823 1.33960644
## [27,] 3 7 -0.32413278 -0.31691484
## [28,] 3 8 -0.61545941 -0.10457591
## [29,] 3 9 -1.85072358 0.93267270
## [30,] 3 10 0.38456423 0.76231047
## [31,] 4 1 0.76016236 1.63854054
## [32,] 4 2 -0.94463491 1.87271085
## [33,] 4 3 1.62451250 1.63298961
## [34,] 4 4 -1.96908559 0.89058201
## [35,] 4 5 1.66755533 0.10288947
## [36,] 4 6 -0.02182803 -0.91358891
## [37,] 4 7 -0.09382921 -0.54950093
## [38,] 4 8 0.74597002 2.31924468
## [39,] 4 9 0.64732694 0.29681494
## [40,] 4 10 -0.66535049 1.81285111
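Note that the output above contains individual 1 only once, even though index lists it twice: %in% tests set membership, so repeats in index are ignored. A minimal sketch of one way to keep the repeats, using a smaller made-up matrix so the row counts are easy to check (the names once, rows and kept are illustrative):

```r
set.seed(1)
# 4 individuals with 3 rows each.
a <- cbind(rep(1:4, each = 3), rep(1:3, times = 4), rnorm(12))
index <- c(1, 1, 3)

# %in% only asks whether a row's id occurs anywhere in index.
once <- a[a[, 1] %in% index, ]
nrow(once)   # 6: three rows for id 1, three for id 3

# Collect row numbers id by id, in the order drawn, then index once.
rows <- unlist(lapply(index, function(i) which(a[, 1] == i)))
kept <- a[rows, ]
nrow(kept)   # 9: id 1 twice, id 3 once
```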
Answer 1 (score: 0)
# a is the matrix defined in the question
index <- 5:10
This almost works, except that the result is a list rather than the matrix I want.
lapply(index, function(x) a[which(a[,1] == x),])
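If the list structure is the only obstacle, the pieces can be stacked back into a single matrix with do.call(rbind, ...). A small sketch, with a rebuilt here so the example is self-contained; the names pieces and stacked and the particular index values are illustrative:

```r
set.seed(1)
a <- cbind(rep(1:40, each = 10), rep(1:10, times = 40), rnorm(400))
index <- c(5, 5, 7)

# One matrix per drawn id; drop = FALSE keeps single-row groups as matrices.
pieces <- lapply(index, function(x) a[a[, 1] == x, , drop = FALSE])

# Bind all pieces row-wise in a single call.
stacked <- do.call(rbind, pieces)
dim(stacked)   # 30 3
```

This still loops over index internally, but it binds once at the end instead of reallocating the result on every iteration.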
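Also, the following nearly gets there. It would be nice to have a non-looping way to do this, because as written it only works for a single number such as 2: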
a[which(a[,1] == 2),] # works
a[which(a[,1] == index), ] # does not work
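The second line fails because == compares element by element and recycles the shorter vector along the longer one; it does not ask whether each id occurs anywhere in index. A tiny illustration with made-up vectors:

```r
ids <- c(1, 1, 2, 2, 3, 3)
index <- c(2, 3)

# index is recycled to 2, 3, 2, 3, 2, 3 and compared position by position.
by_position <- ids == index
by_position   # FALSE FALSE  TRUE FALSE FALSE  TRUE

# Set membership finds every matching row, but ignores repeats in index.
by_membership <- ids %in% index
by_membership # FALSE FALSE  TRUE  TRUE  TRUE  TRUE
```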