Question

I have data in three forms.

A data frame, info.data:

id.num <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 20, 21, 22, 23, 25, 30, 31, 32, 33, 34, 35) 
id.name <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fifteen", "twenty", "tyone", "tytwo", "tythre","tyfive", "thrty", "thrtyone", "thrtytwo", "thrtythree", "thrtyfour", "thrtyfiv") 
info.data <- data.frame(id.num, id.name) 
row.names(info.data)<- c("x1","x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9", "x10", "x11", "x12", "x13", "x15", "x20", "x21", "x22", "x23", "x25","x30", "x31","x32", "x33", "x34","x35")

A matrix, mat, which shares some of its row names with info.data:

# random 0/1 matrix (values differ on every run)
mat <- matrix(sample(0:1, 100, replace = TRUE), nrow = 10, ncol = 10)
diag(mat) <- 0
# mirror the upper triangle into the lower one so mat is symmetric
mat[lower.tri(mat)] <- t(mat)[lower.tri(mat)]
row.names(mat) <- paste("x", 3:12, sep = "")
colnames(mat) <- paste("x", 3:12, sep = "")

And a list, req.l, whose elements are id.name values from info.data:

req.l<- list(L1=info.data$id.name[2:8],LL1=(info.data$id.name[1:5]),LLL1=(info.data$id.name[8:21]))

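I want to pick one list element, say LL1, and extract the corresponding sub-matrix from mat (for whichever values are present in it), so that the output is a subset with the list values as row and column names, like this: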

      three four five
three     0    0    1
four      0    0    0
five      1    0    0

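I tried using %in% across several lines, and the code became lengthy. I also have to change the list name and so on every time, which gets confusing and makes my brain shut down! Is there a neat way to do this kind of task? Could grep be used in this case?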

Answer 1

There must be a better way, but this seems to work as well:

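lapply(req.l, 
       function(X) {
          # matrix-style names ("x3", ...) for the requested id.name values
          tmp = rownames(info.data)[match(X, info.data$id.name)]
          # labels for the result: only the requested names that occur in mat
          dmnms = replicate(2, as.character(X[tmp %in% unique(unlist(dimnames(mat)))]), simplify = FALSE)
          # mat[rows, cols], using the positions of tmp in each dimension
          ret = do.call("[", c(list(mat), 
                               lapply(dimnames(mat), 
                                         function(x) 
                                            na.omit(match(tmp, x)))))
          dimnames(ret) = dmnms
          ret
       })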
#$L1
#      three four five six seven eight
#three     0    0    0   0     0     1
#four      0    0    1   0     0     0
#five      0    1    0   1     1     0
#six       0    0    1   0     1     1
#seven     0    0    1   1     0     0
#eight     1    0    0   1     0     0
#
#$LL1
#      three four five
#three     0    0    0
#four      0    0    1
#five      0    1    0
#
#$LLL1
#       eight nine ten eleven twelve
#eight      0    0   0      1      0
#nine       0    0   1      0      1
#ten        0    1   0      1      1
#eleven     1    0   1      0      1
#twelve     0    1   1      1      0

Answer 2

A few of these steps could be skipped, but let's break it down.

First, we need to find which values in info.data are in the list element we chose. We can do that with

info.data$id.name %in% req.l[["L1"]]
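
To make the result of that expression concrete, here is a small self-contained sketch. The toy vectors `ids` and `chosen` are made up for illustration and are not the question's data. `%in%` returns a logical vector aligned with its left-hand side, so the selected values come back in the order of the data frame rather than the order of the list.

```r
# Toy data (hypothetical), standing in for info.data$id.name and req.l[["L1"]]
ids    <- c("one", "two", "three", "four", "five")
chosen <- c("four", "two")

# One TRUE/FALSE per element of ids
hits <- ids %in% chosen
hits

# Selected values follow the order of ids, not of chosen
ids[hits]

# match() instead returns positions in the order of chosen
match(chosen, ids)
```

If the order of the list matters for the output, `match()` is the variant to reach for. The walk-through here does not depend on it.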

Now we need to find the row names that correspond to those values, because those are the names used in the matrix.

rownames(info.data)[info.data$id.name %in% req.l[["L1"]]]

That does it. Now we only want the names that are also in the matrix, so we just need the overlapping values:

intersect(
    rownames(info.data)[info.data$id.name %in% req.l[["L1"]]], 
    colnames(mat)
)

This is, finally, the set of rows/columns we want from mat. Now we can do the subsetting:

mc <- intersect(
    rownames(info.data)[info.data$id.name %in% req.l[["L1"]]], 
    colnames(mat)
)
mat[mc,mc]
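
One edge case may be worth keeping in mind: if only a single name ends up in the selection, plain `[` indexing drops the result to a bare value instead of a 1x1 matrix. A minimal sketch, assuming a made-up 3x3 matrix `m` (not the question's `mat`):

```r
# Hypothetical symmetric 0/1 matrix with named dimensions
m <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3,
            dimnames = list(c("x3", "x4", "x5"), c("x3", "x4", "x5")))

one <- "x4"

# Default indexing collapses a 1x1 selection to a plain number
m[one, one]

# drop = FALSE keeps it a matrix, so its dimnames can still be reassigned
kept <- m[one, one, drop = FALSE]
dim(kept)
```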

Then we need to rename the dimensions, so here we go back to the data.frame to get the names:

out <- mat[mc,mc]
dimnames(out) <- replicate(2, info.data[mc,"id.name"], simplify=FALSE)
out

Since this is based entirely on the string "L1", you can easily replace that value with whichever name you want, or with a variable.
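
As a sketch of that idea, the steps can be wrapped in a small helper and applied to every element of the list at once. The function name `subset_by_ids` and the toy objects `info`, `m` and `lst` are assumptions made for illustration, not part of the question. `drop = FALSE` is added so that a single match still yields a matrix.

```r
# Hypothetical helper: subset `mat` by the names in `ids`, relabelled with id.name
subset_by_ids <- function(ids, info, mat) {
  # Row names of `info` whose id.name was requested and that also exist in `mat`
  mc <- intersect(rownames(info)[info$id.name %in% ids], colnames(mat))
  out <- mat[mc, mc, drop = FALSE]
  # Relabel both dimensions with the human-readable names
  nms <- as.character(info[mc, "id.name"])
  dimnames(out) <- list(nms, nms)
  out
}

# Small made-up data in the same shape as the question's objects
info <- data.frame(id.name = c("one", "two", "three", "four"),
                   row.names = c("x1", "x2", "x3", "x4"))
m <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3,
            dimnames = list(c("x2", "x3", "x4"), c("x2", "x3", "x4")))
lst <- list(A = c("one", "two", "three"), B = c("four", "one"))

# Apply to one element by name, or to the whole list at once
subset_by_ids(lst[["A"]], info, m)
res <- lapply(lst, subset_by_ids, info = info, mat = m)
```

With the question's objects, the equivalent call would be `lapply(req.l, subset_by_ids, info = info.data, mat = mat)`, which avoids editing the list name by hand each time.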

Subsetting when working with lists, data frames and matrices
