I have a data frame x laid out as follows:
date c1 c2 c3 c4 c5 c6 c7 c8 c9
Jan-08 12 23 12 11 10 1 49 34 23
Feb-08 14 33 11 11 20 11 29 44 23
and so on...
I also have a binary matrix:
1 3 6
1 0 0 1
2 0 0 0
3 0 1 0
4 1 0 0
5 0 1 0
6 1 0 0
7 0 0 0
8 1 1 0
9 0 1 1
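I want to go through my binary matrix and create one new table per column of that matrix, so that each new table holds only the columns of data frame x that are marked 1 in the corresponding matrix column (row i of the matrix corresponds to column ci of x). So here we would create 3 data frames, namely data_frame_1, data_frame_3 and data_frame_6, where data_frame_1 has the form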
date c4 c6 c8
Jan-08 11 1 34
Feb-08 11 11 44
and data_frame_3 would be
date c3 c5 c8 c9
Jan-08 12 10 34 23
Feb-08 11 20 44 23
Answer 0: (score: 1)
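Using lapply, we can loop over the columns of the binary matrix mat, converting each column to a logical vector that is then used to subset the columns of the data frame x. The date column, x[1], is bound back onto the front of each result.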
lapply(1:ncol(mat), function(i) cbind(x[1], x[-1][as.logical(mat[, i])]))
#[[1]]
# date c4 c6 c8
#1 Jan-08 11 1 34
#2 Feb-08 11 11 44
#[[2]]
# date c3 c5 c8 c9
#1 Jan-08 12 10 34 23
#2 Feb-08 11 20 44 23
#[[3]]
# date c1 c9
#1 Jan-08 12 23
#2 Feb-08 14 23
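The call above returns an unnamed list, whereas the question asks for data_frame_1, data_frame_3 and data_frame_6. A minimal sketch of one way to get there, assuming mat carries the column names "1", "3" and "6" shown in the question; the name frames and the way x and mat are rebuilt here are illustrative assumptions, not part of the original answer:

```r
# Rebuild the two rows of x shown in the question (values copied from the post).
x <- data.frame(
  date = c("Jan-08", "Feb-08"),
  c1 = c(12, 14), c2 = c(23, 33), c3 = c(12, 11),
  c4 = c(11, 11), c5 = c(10, 20), c6 = c(1, 11),
  c7 = c(49, 29), c8 = c(34, 44), c9 = c(23, 23)
)

# Binary matrix from the question: rows 1..9 map to c1..c9.
# matrix() fills column by column, so each line below is one matrix column.
mat <- matrix(
  c(0, 0, 0, 1, 0, 1, 0, 1, 0,   # column "1"
    0, 0, 1, 0, 1, 0, 0, 1, 1,   # column "3"
    1, 0, 0, 0, 0, 0, 0, 0, 1),  # column "6"
  nrow = 9,
  dimnames = list(as.character(1:9), c("1", "3", "6"))
)

# Same subsetting as in the answer; seq_len() also behaves well for 0 columns.
frames <- lapply(seq_len(ncol(mat)), function(i) {
  cbind(x[1], x[-1][as.logical(mat[, i])])
})

# Hypothetical naming scheme: prefix each matrix column name.
names(frames) <- paste0("data_frame_", colnames(mat))

names(frames)
frames$data_frame_6
```

Keeping the results in a named list, rather than creating separate global variables, means each table can still be reached as frames$data_frame_1 or frames[["data_frame_3"]] and the whole set can be looped over later.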
Answer 1: (score: 0)
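You can use apply to loop over the columns of the binary matrix bin and subset the data frame dat accordingly (the test data below has no date column):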
# create test data
set.seed(1)
dat <- as.data.frame(matrix(rnorm(18), nrow=2))
colnames(dat) <- paste0('c', 1:9)
dat
# c1 c2 c3 c4 c5 c6 c7 c8
# 1 -0.6264538 -0.8356286 0.3295078 0.4874291 0.5757814 1.5117812 -0.6212406 1.12493092
# 2 0.1836433 1.5952808 -0.8204684 0.7383247 -0.3053884 0.3898432 -2.2146999 -0.04493361
# c9
# 1 -0.01619026
# 2 0.94383621
bin <- matrix(sample(0:1, 27, replace = TRUE), nrow = 9)
bin
# [,1] [,2] [,3]
# [1,] 1 1 0
# [2,] 0 0 0
# [3,] 1 0 0
# [4,] 0 1 1
# [5,] 1 1 1
# [6,] 1 0 0
# [7,] 1 1 1
# [8,] 1 0 0
# [9,] 1 0 0
# subset columns of dat, using binary vector columns defined in bin;
# drop = FALSE is included to prevent any columns with only a single "1" from
# being cast to a vector
apply(bin, 2, function(x) { dat[, as.logical(x), drop = FALSE] })
# [[1]]
# c1 c3 c5 c6 c7 c8 c9
# 1 -0.6264538 0.3295078 0.5757814 1.5117812 -0.6212406 1.12493092 -0.01619026
# 2 0.1836433 -0.8204684 -0.3053884 0.3898432 -2.2146999 -0.04493361 0.94383621
#
# [[2]]
# c1 c4 c5 c7
# 1 -0.6264538 0.4874291 0.5757814 -0.6212406
# 2 0.1836433 0.7383247 -0.3053884 -2.2146999
#
# [[3]]
# c4 c5 c7
# 1 0.4874291 0.5757814 -0.6212406
# 2 0.7383247 -0.3053884 -2.2146999
#
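The drop = FALSE remark in the comments above is easiest to see on a binary column containing a single 1. A small sketch with made-up data; the names one_hot, as_vector and as_frame are hypothetical and used only for illustration:

```r
# Toy data frame and a binary selector that picks exactly one column.
dat <- data.frame(c1 = c(1, 2), c2 = c(3, 4), c3 = c(5, 6))
one_hot <- c(0, 1, 0)

# Matrix-style indexing drops a single selected column to a plain vector...
as_vector <- dat[, as.logical(one_hot)]

# ...unless drop = FALSE is given, which keeps a one-column data frame.
as_frame <- dat[, as.logical(one_hot), drop = FALSE]

is.data.frame(as_vector)
is.data.frame(as_frame)
colnames(as_frame)
```

List-style indexing with a single index, as in dat[as.logical(one_hot)], also returns a data frame, which is why the lapply version in the other answer does not need drop = FALSE.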