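I am trying to classify the observations in a data frame into 36 groups based on two continuous variables. More specifically, I want to split each of the two variables into six bins and then assign each observation to one of the 36 possible combinations.
My attempt is below, and it works. But is there a faster way that avoids the double loop?
Also, this is not essential, but how can I visualize the number of observations in each group as a 6-by-6 grid? I know table() on the group column lists the 36 possible groups with their counts, but not in grid form.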
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1,x2)
labs1 <- levels(cut(x1, 6))
ints1 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs1)),
upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs1)))
labs2 <- levels(cut(x2, 6))
ints2 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs2)),
upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs2)))
tmp <- expand.grid(labs1, labs2)
groups <- cbind(lower1 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,1])),
upper1 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,1])),
lower2 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,2])),
upper2 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,2])))
for (i in 1:1000){
  for (j in 1:36){
    if (x1[i] >= groups[j,1] & x1[i] <= groups[j,2] &
        x2[i] >= groups[j,3] & x2[i] <= groups[j,4]){
      data$group[i] <- j
    }
  }
}
Answer 0: (score: 0)
You can combine apply(), which iterates over the rows of your data.frame, with which(), which checks each row against all rows of your groups array at once:
data$group <- apply(data, 1, FUN=function(dataRow)
  which(
    dataRow[1] >= groups[,1] &
    dataRow[1] <= groups[,2] &
    dataRow[2] >= groups[,3] &
    dataRow[2] <= groups[,4]))
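If you want to drop the row-wise apply() as well, here is a minimal vectorized sketch. It assumes you want the same group numbering as expand.grid(labs1, labs2) in the question, where the x1 bin varies fastest; the names b1 and b2 are just illustrative. Note that it bins on the actual breakpoints used by cut() rather than on the rounded labels, so values very close to a bin edge may land in a different group than with the loop.

```r
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1, x2)

# labels = FALSE makes cut() return the bin index (1..6) instead of a factor
b1 <- cut(data$x1, 6, labels = FALSE)
b2 <- cut(data$x2, 6, labels = FALSE)

# Combine the two bin indices into one group id in 1..36,
# with the x1 bin varying fastest (same order as expand.grid)
data$group <- (b2 - 1L) * 6L + b1

head(data)
```

An equivalent one-liner should be as.integer(interaction(cut(x1, 6), cut(x2, 6))), since interaction() also lets the first factor vary fastest by default.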
Answer 1: (score: 0)
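You're overthinking this. Getting the 6x6 table is a one-liner with table(). (Use the factor that cut(..., 6) creates directly; don't throw the factor away and then manually re-parse its levels and tally the observations yourself):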
with(data, table(cut(x1, 6), cut(x2, 6)))
                 (-3.05,-1.97] (-1.97,-0.902] (-0.902,0.171] (0.171,1.24] (1.24,2.32] (2.32,3.4]
  (-2.82,-1.8]               2             10             11            7           3          0
  (-1.8,-0.793]              1             26             67           49          19          3
  (-0.793,0.216]            12             57            140          146          31          3
  (0.216,1.22]              11             49            109           95          36          6
  (1.22,2.23]                0             10             31           34          15          0
  (2.23,3.25]                0              3              5            6           2          1
# and to get the wide lines, you may need...
options('width'=199)
# or if you want more compact labels to keep it all narrow, use `cut(..., dig.lab = 2)`
with(data, table(cut(x1, 6, dig.lab=2), cut(x2, 6, dig.lab=2)))
               (-3.1,-2] (-2,-0.9] (-0.9,0.17] (0.17,1.2] (1.2,2.3] (2.3,3.4]
  (-2.8,-1.8]          2        10          11          7         3         0
  (-1.8,-0.79]         1        26          67         49        19         3
  (-0.79,0.22]        12        57         140        146        31         3
  (0.22,1.2]          11        49         109         95        36         6
  (1.2,2.2]            0        10          31         34        15         0
  (2.2,3.2]            0         3           5          6         2         1
Admittedly, neither the table() documentation nor the cut() documentation says directly that they can be used for a 2D case like this. => doc/enhancement bug
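If you want to go a little further with the grid, the following sketch may help. It stores the table, names its two dimensions, adds totals, and draws a rough heat map with base graphics. The dimension names x1 and x2, the object names tab and counts, and the axis labels are my own choices, not anything from the question.

```r
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1, x2)

# Name the dimensions so the printed grid shows which variable is on which axis
tab <- with(data, table(x1 = cut(x1, 6, dig.lab = 2),
                        x2 = cut(x2, 6, dig.lab = 2)))

# Row and column totals around the 6 x 6 grid
addmargins(tab)

# Long format: one row per cell, with columns x1, x2 and Freq
counts <- as.data.frame(tab)
head(counts)

# Optional: heat map of the counts (rows of tab on the horizontal axis)
image(1:6, 1:6, unclass(tab), xlab = "x1 bin", ylab = "x2 bin")
```

The long format from as.data.frame() is handy if you later want to merge the counts back onto the data or pass them to a plotting package.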